
Wikipedia:WikiProject AI Cleanup/Guide

From Wikipedia, the free encyclopedia

This is a guide to finding and fixing AI-generated content on Wikipedia.

What are LLMs?


Why is LLM use a problem on Wikipedia?


Spotting AI


Identifying AI-assisted edits is difficult in most cases, since the generated text is often indistinguishable from human-written text. The exceptions are cases where the text contains phrases like "as an AI model" or "as of my last knowledge update", or where the editor copy-pasted the prompt used to generate the text together with the AI response. Another indication is the presence of obvious AI hallucinations.
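The phrase check described above can be sketched as a small script. This is only an illustration: the function name find_telltale_phrases is hypothetical, and the phrase list contains just the two examples quoted in this guide. A hit is a reason to look closer, not proof of AI use.

```python
import re

# The two phrases quoted in this guide; the list is illustrative, not exhaustive.
TELLTALE_PHRASES = (
    "as an AI model",
    "as of my last knowledge update",
)


def find_telltale_phrases(text):
    """Return (phrase, offset) pairs for each telltale phrase found in text.

    Matching is case-insensitive; results are sorted by position in the text.
    """
    hits = []
    for phrase in TELLTALE_PHRASES:
        pattern = re.compile(re.escape(phrase), re.IGNORECASE)
        for match in pattern.finditer(text):
            hits.append((phrase, match.start()))
    return sorted(hits, key=lambda hit: hit[1])
```

Running it over the text of a suspicious revision gives the positions to inspect by hand.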

  • AI content sometimes takes a promotional tone, reading like a tourism website.
  • When it lacks precise information, AI will often describe very generic and common features in detail, for example praising a village for its fertile farmlands, livestock and scenic countryside even though the village lies in an arid mountain range.
  • At other times, AI confuses its subject and writes about a hotel instead of a nearby village.
  • AI often invents fake references, so check to see if the URLs work and the cited books exist.
    • Example: the article Leninist historiography was written entirely by AI and previously included a list of completely fake sources in Russian and Hungarian at the bottom of the page. Google turned up no results for these sources.
    • Another example: the article Estola albosignata, about a beetle species, had AI-written paragraphs cited to real German and French sources. Although the cited articles existed, they were completely off-topic, with the French one discussing an unrelated genus of crabs.
  • Automatic AI detectors like GPTZero are unreliable and should only ever be used with caution. Given the high rate of false positives, deleting or tagging content purely because it was flagged by an automatic AI detector is not acceptable.

Style

  • AI usually capitalizes every word in section titles (title case), which should instead be written in sentence case.
  • A "bullet points with bold titles" style, which is virtually unknown on Wikipedia, is very typical of ChatGPT. Often, the content of each bullet point is just a longer rewording of the bolded keyword preceding it.
  • ChatGPT will often add a "Conclusion" section, usually arguing for the significance of the subject in a broader context. These sections do not add encyclopedic information, instead being more essay-like and subjective, and should not be present on Wikipedia.
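The three style signs above can be turned into a rough wikitext scan. This is a sketch under stated assumptions: the function names and flag labels are hypothetical, headings are assumed to use the standard == Heading == markup, and bold is assumed to use triple apostrophes. The title-case test is a crude heuristic that will also flag legitimate headings containing proper nouns, so every flag needs human review.

```python
import re

HEADING_RE = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$")  # == Heading ==
BOLD_BULLET_RE = re.compile(r"^\*+\s*'''[^']+'''")       # * '''Keyword''' ...


def is_title_case(heading):
    """Heuristic: every word after the first that is longer than 3 letters is capitalized."""
    words = re.findall(r"[A-Za-z]+", heading)
    long_words = [word for word in words[1:] if len(word) > 3]
    return bool(long_words) and all(word[0].isupper() for word in long_words)


def style_flags(wikitext):
    """Return (line_number, flag) pairs for lines showing the style signs above."""
    flags = []
    for number, line in enumerate(wikitext.splitlines(), start=1):
        heading = HEADING_RE.match(line)
        if heading:
            title = heading.group(2)
            if title.lower() == "conclusion":
                flags.append((number, "conclusion-section"))
            elif is_title_case(title):
                flags.append((number, "title-case-heading"))
        elif BOLD_BULLET_RE.match(line):
            flags.append((number, "bold-titled-bullet"))
    return flags
```

Requiring only words longer than three letters to be capitalized lets short words such as "and" or "of" stay lowercase, which matches how title case is usually written.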

Cleaning up


Warning editors
