‘Obviously ChatGPT’ — how reviewers accused me of scientific fraud

A journal reviewer accused Lizzie Wolkovich of using ChatGPT to write a manuscript.  She had not — but her paper was rejected anyway

This is an article from the Nature Careers Community, a place for Nature readers to share their professional experiences and advice

E.M. Wolkovich writes: I have just been accused of scientific fraud.  Not data fraud — no one accused me of fabricating or misleadingly manipulating data or results.  This, I suppose, is a relief, because my laboratory, which studies how global change reshapes ecological communities, works hard to ensure that data are transparent and sharable, and that our work is reproducible.  Instead, I was accused of writing fraud:  passing off ‘writing’ produced by artificial intelligence (AI) as my own.

That hurts, because — like many people — I find writing a paper to be a somewhat painful process.  I read books on how to write — both to be comforted by how much these books stress that writing is generally slow and difficult, and to find ways to improve.  My current strategy involves willing myself to write and creating several outlines before the first draft, which is followed by writing and a lot of revising.  I always suggest this approach to my students, although I know it is not easy, because I think it is important that scientists try to communicate well.

Imagine my surprise when I received reviews on a submitted paper declaring that it was the work of ChatGPT.  One reviewer wrote that it was “obviously ChatGPT”, and the handling editor vaguely agreed, saying that they found “the writing style unusual”.  Surprise was just one emotion I experienced; I also felt shock, dismay and a flood of confusion and alarm.  Given how much work I put into writing, it was a blow to be accused of being a chatbot — especially without any evidence.

In reality, I had not written a word of the manuscript using ChatGPT.  I quickly brainstormed how I might prove my case.  Because I write in plain-text files (using the typesetting language LaTeX) that I track using the version-control system Git, I could show my text change history on GitHub (with commit messages including “finally writing!” and “Another 25 mins of writing progress!” that I never thought I would share).  I could also try to compare the writing style of my pre-ChatGPT papers with that of my submission.

Maybe I could ask ChatGPT itself whether it thought it had written my paper.  But then I realized I would be spending my time trying to prove that I am not a chatbot — which seemed a bad outcome to the whole situation.  What I really wanted to do was pick up my ball and march off the playground in a fury.  How dare they?  But first, I decided to get some perspectives from researchers who work on data fraud, co-authors on the paper and other colleagues.  Most agreed with my alarm.  One put it most succinctly: “All scientific criticism is admissible, but this is a different matter.”

Existential crisis

These reviews captured something inherently broken about the peer-review process and, more importantly to me, showed how AI could corrupt science without even trying.

People worry about AI gaining control over humanity, its potential to supercharge misinformation and how it might help to perpetuate insidious bias and inequality.  Some are trying to create safeguards to prevent this.  However, communities are also trying to create AI that helps where it should, and maybe that will include acting as a writing aid.  But, as my experience shows, ChatGPT corrupted the whole process simply by its existential presence in the world.  I was at once annoyed at being mistaken for a chatbot and horrified that reviewers and editors were so blasé about the idea that someone had submitted AI-generated text.

So much of science is built on trust and faith in the ethics and integrity of our colleagues.  We mostly trust that others do not fabricate their data, and I trust that people do not (yet) write their papers or grants using large language models without disclosing it.  I would not accuse someone of data fraud or statistical manipulation without evidence; however, a reviewer apparently felt no such qualms when accusing me.  Perhaps they did not intend this to be a harsh accusation, and the editor thought nothing of passing along and echoing their comments — but they had effectively accused me of lying by deliberately presenting AI-generated text as my own.  They also felt confident that they could discern my writing from that of an AI tool — but they obviously could not.

We need to be able to call out fraud and misconduct in science.  In my view, the costs to people who call out data fraud are too high, and the consequences for committing fraud are too low.  But I worry about a world in which a reviewer can casually level an accusation of fraud, and the journal and handling editor simply shuffle the review along and invite a resubmission.  It suggests not only that reviewers and editors have no faith in the scientific integrity of the submitting authors, but also that ethics are negotiable.  Such a world seems easy for ChatGPT to corrupt without even trying — unless we raise our standards.

Scientific societies can start by having conversations during their meetings and conferences, with the goal of developing more explicit, community-generated standards about when and how AI can be used in the manuscript-writing process, and how that help should be acknowledged.  Such standards could help editors to develop better processes for handling accusations of AI-generated text, ideally in a way that is less demoralizing for authors.

As for me, I now plan to use Git and GitHub for all my writing from day one, and to document changes every day.  It is not an ironclad system, but it has given me some peace of mind — not to mention a paper trail that clearly shows a manuscript written slowly and painstakingly, and without ChatGPT.

REFERENCE:  Nature; 05 FEB 2024; E.M. Wolkovich