Open Access

Can ChatGPT evaluate research quality?

May 27, 2024



eISSN: 2543-683X
Language: English
Publication timeframe: 4 times per year
Journal Subjects: Computer Sciences, Information Technology, Project Management, Databases and Data Mining