Did AI just win its first literary award?

An argument has broken out over a short story prize. But the argument over whether the author was human has revealed an unsettling truth about computer-generated text

Even the best AI detection tools are far from infallible. Image: TNW/Getty

“She had the kind of walking that made benches become men.”

Take a moment to read and enjoy that sentence. Roll it around your mouth. Now take a look at it again and try and work out what, if anything, it actually means. Could it be that it doesn’t mean anything at all, because it was written by something that doesn’t have the faintest idea what it is like to walk, what a bench is, or indeed how something might become a man?

Granta recently published the top-rated entries in the Commonwealth Short Story Prize, an annual award given to fiction from five regions (Africa, Asia, Canada and Europe, the Caribbean, and the Pacific) – the opening line of this piece is taken from the winner of the Caribbean prize, a story titled The Serpent in the Grove by Jamir Nazir. Or at least that’s the name attached to the submission.

Within a few days, speculation was rife on social media that the story had been penned by AI. Ethan Mollick, a professor at the University of Pennsylvania, widely considered to be something of an expert on the technology, posted to Bluesky that “in a Turing Test of sorts, it looks like a 100% AI-generated story just won the Commonwealth Prize for the Caribbean region,” and that he had run it through best-in-class AI detection software, Pangram, which had flagged it as “100% AI-generated”. Beyond that, Mollick continued, “if you know, you know.”

In a slightly bizarre twist to an already-odd story, Granta hit back. The story, and a couple of others from the same contest that had also been the subject of “AIvestigation” from the online crowd, would remain online, because while “we showed Claude.ai the story and asked whether it was AI-generated… [its] response was long, concluding that it was ‘almost certainly not produced unaided by a human’.” 

“The AI-generated critique of these Commonwealth writers – more than one has been accused of basing their story on AI material – may conceivably itself reflect AI bias.” As such, Granta concluded, the stories would stay on their website until the Commonwealth Foundation came to a definitive decision on the pieces’ authorship. 

This illustrates one of the main, tangible results of the AI boom – we are losing the ability to distinguish between what has been made by man and what by machine.  

Online AI detection software is widespread, and improving, but it is not, whatever its promoters may claim, 100% accurate. Claude is not capable of telling you, definitively, whether a piece of text was written by Claude. Even the founder of the aforementioned “best-in-class” AI detector Pangram admits that the software “sometimes makes mistakes”, and that “we still don’t fully understand how to precisely measure how much AI altered [an] original text” – meaning the extent to which work has been amended or modified by AI remains impossible to determine.     

When an AI detector analyses text to determine whether it’s AI-generated, it’s not accessing some secret metadata visible solely to The Machine. LLM-generated text doesn’t come with a watermark that can be scried with the right lens. All these tools are doing is comparing the text they are fed with an internal idea of what AI-generated copy is like – vocabulary, structure, etc – and determining the degree to which there are similarities.

These tools tend to focus on two qualities of a text – “perplexity”, a measure of how unpredictable the wording is, and “burstiness”, a measure of how much that perplexity varies across the document – and the extent to which a document displays those qualities. More perplexity and more burstiness should, in theory, mean the words are more likely to be human-penned. Pangram employs a slightly different approach based on extracting patterns from text and comparing them with known patterns in AI-generated copy, but the principles are similar.
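To make those two terms concrete, here is a minimal Python sketch. It is a toy under loud assumptions: real detectors score text with a large language model's token probabilities, whereas this one uses simple word counts from a reference corpus, and the function names (`train_unigram`, `perplexity`, `burstiness`) are invented for illustration rather than taken from any actual tool.

```python
import math
import re
from collections import Counter
from statistics import pstdev


def _words(text: str) -> list[str]:
    """Lowercase word tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def train_unigram(corpus: str) -> tuple[Counter, int]:
    """Count word frequencies in a reference corpus (toy stand-in for a language model)."""
    words = _words(corpus)
    return Counter(words), len(words)


def perplexity(sentence: str, counts: Counter, total: int) -> float:
    """exp of the average negative log-probability per word, with add-one smoothing."""
    words = _words(sentence)
    if not words:
        raise ValueError("sentence has no words")
    vocab = len(counts) + 1  # one extra slot shared by all unseen words
    log_prob = sum(math.log((counts[w] + 1) / (total + vocab)) for w in words)
    return math.exp(-log_prob / len(words))


def burstiness(text: str, counts: Counter, total: int) -> float:
    """Standard deviation of per-sentence perplexity across the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if _words(s)]
    return pstdev([perplexity(s, counts, total) for s in sentences])


# Hypothetical reference corpus, chosen only to keep the arithmetic small.
counts, total = train_unigram("the cat sat on the mat. the dog sat on the mat.")
familiar = perplexity("the cat sat on the mat", counts, total)
strange = perplexity("zebras juggle quantum marmalade", counts, total)
print(familiar < strange)  # wording the model has seen before scores as less perplexing
```

A text whose sentences all score alike has burstiness near zero; one that mixes familiar and unfamiliar sentences scores higher, which is the pattern these tools associate with human writing.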

Unfortunately, though, these detectors are far from infallible. A 2023 University of Maryland paper argued that AI-text detectors are unreliable in practical settings and can be evaded by paraphrasing. OpenAI withdrew its own classifier in 2023 because of its “low rate of accuracy,” saying it correctly identified only 26% of AI-written text, while falsely flagging human text 9% of the time. The tools have improved, but they are by no means perfect – and that means certainty is out of reach when it comes to a text’s provenance.
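A rough calculation shows what error rates like those mean in practice. The sketch below applies Bayes' rule to the two OpenAI figures quoted above (26% of AI text caught, 9% of human text wrongly flagged). The share of submissions that are actually machine-written is not known, so the 10% and 50% base rates are purely hypothetical, and the function name is made up for this example. The numbers describe the withdrawn 2023 classifier, not current tools.

```python
def prob_ai_given_flag(base_rate: float, tpr: float = 0.26, fpr: float = 0.09) -> float:
    """Bayes' rule: probability a flagged text really is AI-written.

    base_rate: assumed share of all texts that are AI-written (hypothetical).
    tpr: share of AI texts the detector flags (26% in OpenAI's figures).
    fpr: share of human texts the detector wrongly flags (9%).
    """
    flagged_ai = tpr * base_rate
    flagged_human = fpr * (1.0 - base_rate)
    return flagged_ai / (flagged_ai + flagged_human)


for base_rate in (0.1, 0.5):
    share = prob_ai_given_flag(base_rate)
    print(f"{base_rate:.0%} AI submissions -> {share:.0%} of flags are correct")
```

Under those assumptions, if one submission in ten were machine-written, only about 24% of flagged texts would really be AI; even if half were, the figure would be about 74%, leaving roughly one flag in four pointing at a human.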

In the intervening years, the quality of models’ prose output has improved dramatically, to the point where a parallel class of AI tools has sprung up, designed to “de-AI” machine-generated texts to avoid detection. They do this by upping the perplexity and burstiness of the sentences.

Which means, fundamentally speaking, there’s no meaningful way of guaranteeing that words have been written by a human any more unless you’re standing behind them and watching them type. 

This detection problem applies to images and video, too. There are a few different technologies in play when it comes to watermarking AI-generated visuals, but many of these watermarks can be easily removed, either with software or simply by making a copy of the image or video in question.

The best-in-class is a technology called SynthID, which Google applies to all images created with its tools, but it’s not yet available to any other platforms (although OpenAI has plans to integrate it), and only works on Google-generated AI pictures (not video or text), making it significantly less effective at scale.

Reassuring as it might be to believe that there’s an effective arbiter of what is real and what isn’t, the sad fact is that there simply isn’t a reliable way to tell any more, other than with your own eyes and your own research. Oh, and if you think you can always spot AI-generated copy because there are tell-tale signs, then I have bad news for you: some people just write like that. 

When it comes to The Serpent in the Grove, the terrifying fact is not so much that it might have been machine-penned; it’s perhaps that a real, apparently human judge was able to read prose like “The shelf didn’t look like freedom – she couldn’t afford that word yet. It looked like not dying,” and think “yep, that deserves a literary award!” Maybe we deserve the slop. 

Matt Muir is writer of the webcurios.co.uk newsletter on tech and the internet
