Anything you can do, AI can do better... Or can it? Comparing ChatGPT's Search Strategy Outputs with Cochrane Review Searches
2025-06-04, 2314
Language: English

Objective: Previous studies have measured ChatGPT’s capabilities for completing literature search tasks. This study seeks to assess ChatGPT’s capability to produce comprehensive search strategies for systematic reviews, specifically comparing AI-generated outputs against published Cochrane review searches for precision and recall.

Methods: We created a test set of 9 PubMed search strategies from recent Cochrane reviews. Using a script, we queried ChatGPT with each Cochrane review’s topic, research question(s), and inclusion criteria to generate a relevant PubMed search strategy. Precision and recall were measured using the Cochrane reviews’ PubMed search strategies and included articles as the standard, and the ChatGPT searches were evaluated using the Peer Review of Electronic Search Strategies (PRESS) guideline.
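
The abstract does not spell out the exact calculation, so the following is only a minimal sketch of how precision and recall might be computed for one search, assuming recall is the share of a review's included articles that the search retrieves and precision is the share of retrieved records that are included articles. The function name, the example PMIDs, and the convention of scoring an empty result set as 0 are illustrative assumptions, not details from the study.

```python
def precision_recall(retrieved_pmids, included_pmids):
    """Return (precision, recall) for a single search strategy.

    retrieved_pmids: PMIDs returned by running the search in PubMed.
    included_pmids:  PMIDs of the articles included in the review
                     (the reference standard).
    Assumption: an empty set on either side is scored as 0.0 rather
    than raising a division error.
    """
    retrieved = set(retrieved_pmids)
    included = set(included_pmids)
    hits = retrieved & included  # included articles the search found

    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(included) if included else 0.0
    return precision, recall


# Hypothetical PMIDs, for illustration only.
retrieved = ["101", "102", "103", "104"]
included = ["102", "104", "105"]
precision, recall = precision_recall(retrieved, included)
print(f"precision={precision:.2%}, recall={recall:.2%}")
```

Averaging these per-review values across the test set would give summary figures of the kind reported in the Results, under the same assumptions.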

Results: GenAI search strategies had lower recall and lower precision on average when compared to Cochrane search strategies. The GenAI search strategies had an average recall of 57.6% (ranging from 0% to 100%) and an average precision of 1.51% (ranging from 0% to 4.17%), while the Cochrane search strategies had an average recall of 93.7% and an average precision of 2.39%. PRESS evaluations revealed errors including hallucinated MeSH terms and issues with keywords. The results indicate that ChatGPT could be used to help develop comprehensive literature search strategies for systematic reviews, but not without librarian oversight.

Conclusion: The results of this project provide a current estimate of whether, and to what extent, ChatGPT could be used to develop literature search strategies for systematic reviews. This project adds to the literature on GenAI uses for systematic reviews and informs librarians of the potential of these tools for comprehensive literature search development.

Emily Jones, MLIS, AHIP, (epjones3@email.unc.edu) is the Dentistry Librarian and Systematic Review Coordinator at the University of North Carolina at Chapel Hill’s Health Sciences Library. She is passionate about leveraging artificial intelligence to improve systematic review processes, developing technologies that enhance workplace productivity and streamline workflows, and advancing educational research by exploring innovative teaching and learning approaches.
