
Wikipedia:Citation Watchlist

Citation Watchlist
A new edit matches the "warn" filter
Description: User script that adds visual indicators when a diff includes the addition of a URL from a questionable source
Author(s): Harej, Ocaasi
First released: April 23, 2024
Updated: September 25, 2024

The Citation Watchlist is a user script that adds visual indicators to watchlist, recent changes, user contributions, and page history entries when unreliable sources are added to articles. Indicators, including ❗ for warnings (more severe) and ✋ for cautions (less severe), appear only when an unreliable URL is added, not for URLs that are already in the article. This makes the Citation Watchlist an efficient tool for checking individual edits for unreliable sources.
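To illustrate the "additions only" behavior, here is a minimal sketch that compares the URLs found in two revisions and keeps those that appear only in the newer one. The function names (`extractUrls`, `addedUrls`) and the URL pattern are assumptions made for this example. Per the release notes, the script itself relies on the Wikimedia REST API for diffing, so this is not its actual code.

```javascript
// Hypothetical helper: pull http(s) URLs out of a blob of wikitext.
// The pattern is deliberately simple and is an assumption for this sketch.
function extractUrls(wikitext) {
  const matches = wikitext.match(/https?:\/\/[^\s\]|<>"}]+/g);
  return new Set(matches || []);
}

// Return URLs present in the new revision but absent from the old one.
function addedUrls(oldText, newText) {
  const before = extractUrls(oldText);
  const after = extractUrls(newText);
  return [...after].filter((url) => !before.has(url));
}

const oldRev = 'Claim.<ref>https://example.org/a</ref>';
const newRev = 'Claim.<ref>https://example.org/a</ref> More.<ref>https://example.com/b</ref>';

// Only the URL introduced by the edit is reported.
console.log(addedUrls(oldRev, newRev)); // [ 'https://example.com/b' ]
```

A URL that was already in the article before the edit never reaches the list check, which is why existing citations do not trigger an indicator.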

The vision for the Citation Watchlist is to make tracking changes in references, especially to unreliable sources, as easy as tracking other changes to articles of interest. Experienced editors typically track changes through their Watchlist, a customized feed of recent changes to the articles they follow. The Citation Watchlist adds a clear visual overlay highlighting the addition of spurious or prohibited sources within individual edits, with more information just a hover away.

Setup instructions

  1. Navigate to Special:MyPage/common.js
  2. Add this line: importScript('User:Harej/citation-watchlist.js');

Want to test? The deprecated sources recent changes filter should have lots of matches.

Indicators


There are three levels of indicators, suggesting how much caution or investigation is warranted upon finding a domain. The first two, Warn and Caution, are drawn from lists with community consensus, such as the Perennial Sources List. Everything else falls under Inspect, including lists from other source reliability tools.

  • ❗ Warn - Highest level of concern, considered "generally unreliable" by WP:RSP, or listed on WP:DEPS (deprecated)
  • ✋ Caution - Middle level of concern, "sometimes unreliable" but officially "no consensus" at WP:RSP
  • 🔎 Inspect - Not listed at WP:RSP, more of a heads up to look closer
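The three levels can be thought of as an ordered lookup in which the most severe match wins. The sketch below assumes a hypothetical `LEVELS` table with placeholder domains; the real lists live on wiki pages, and the script's internal data structures may differ.

```javascript
// Hypothetical data for illustration only; domains are placeholders.
// Levels are ordered from most to least severe.
const LEVELS = [
  { name: 'warn', emoji: '❗', domains: new Set(['unreliable.example']) },
  { name: 'caution', emoji: '✋', domains: new Set(['mixed.example']) },
  { name: 'inspect', emoji: '🔎', domains: new Set(['unknown.example']) },
];

// Return the most severe level whose list contains the domain,
// or null when the domain is not on any list.
function indicatorFor(domain) {
  for (const level of LEVELS) {
    if (level.domains.has(domain)) return level;
  }
  return null;
}

const hit = indicatorFor('mixed.example');
console.log(hit ? `${hit.emoji} ${hit.name}` : 'no indicator');
```

A `null` result corresponds to an edit that gets no indicator at all.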

All editors should use discretion and judgment with each source and usage of that source. Even many 'bad' sources can be used in very specific contexts.

Report bugs


Phabricator board

Lists


Citation Watchlist screens added URLs against lists. The list of these lists is defined at Wikipedia:Citation Watchlist/Lists. These lists are formatted so that the script can process them easily. The default list is the Perennial Sources List.

Editors have developed other lists covering predatory journals, pseudoscience, political influence, etc. The goal of our approach is that anyone who wants to change a list only needs to update a regular wiki page instead of diving into the JavaScript where the code lives.

Once the script is capable of drawing from multiple lists at once, we will add the ability for users to enable/disable those lists per their individual preferences.

The next step of this work, which is in progress, is to add support for community members to add their own lists on top of the standard ones. (Ad-blocking tools use a similar approach: instead of a single trove of links, there are lists of links that users can opt in to or out of. Of course, instead of blocking links, we are revealing them for scrutiny and intervention.)
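The opt-in model could look something like the following sketch, which merges only the lists a user has enabled. The list names, domains, and `mergeLists` function are hypothetical. The rule that the more severe level wins when two lists disagree is an assumption, not a documented behavior of the script.

```javascript
// Hypothetical list registry; names and domains are placeholders.
const availableLists = {
  'Perennial sources': { warn: ['deprecated.example'], caution: ['mixed.example'] },
  'Predatory journals': { inspect: ['journal.example'] },
  'My custom list': { warn: ['mixed.example'], inspect: ['blog.example'] },
};

// Ordered from least to most severe.
const SEVERITY = ['inspect', 'caution', 'warn'];

// Merge only the enabled lists into a domain -> level map.
// Assumption: on conflict, the more severe level is kept.
function mergeLists(lists, enabledNames) {
  const merged = new Map();
  for (const name of enabledNames) {
    const list = lists[name] || {};
    for (const level of SEVERITY) {
      for (const domain of list[level] || []) {
        const current = merged.get(domain);
        if (!current || SEVERITY.indexOf(level) > SEVERITY.indexOf(current)) {
          merged.set(domain, level);
        }
      }
    }
  }
  return merged;
}

const mine = mergeLists(availableLists, ['Perennial sources', 'My custom list']);
console.log(mine.get('mixed.example')); // 'warn'
```

Because each list stays a separate entry, enabling or disabling one is just a change to the array of names passed in.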

Known limitations

  • Bugs
    • If you refresh the watchlist via the "new changes" button, the script does not re-run. In fact, the refresh overwrites the annotations the script has made. You need to manually reload the watchlist page for the new highlights to show up.
  • Limitations
    • Data is not persisted between sessions, so the script has to run again whenever you reload the page.
  • Unsupported features
    • Differentiated judgments for different paths on the same domain. "news.com/news/" and "news.com/opinion/" would be considered the same source.
    • Supports additional lists, but there is no option to toggle between lists; everyone gets every list.
    • Does not support sources that are not identified by domain name, such as books or journal articles.
  • Intentional product decisions
    • Does not highlight the addition of "good" sources. This would create unnecessary visual clutter; the goal is to make it easier to highlight the addition of sources considered unreliable.

Release notes

  • 1.11 (diff): Indicators now only appear for articles and drafts.
  • 1.10 (diff): Adds support for user contribution pages. Re-adds support for pages with exactly one revision.
  • 1.9 (diff): Fixes bug when viewing long article histories
  • 1.8 (diff): Significant performance improvement; no more rate limit pauses.
  • 1.7 (diff): Improves performance by batching, caching, and/or removing API requests
  • 1.6 (diff): Improves performance, discards invalid URLs from processing, and globally tracks the Wikimedia REST API request count to avoid rate limits that could lead to your IP being temporarily blocked from making requests.
  • 1.5 (diff): Batched API requests, significantly improving performance
    • Rolled back
  • 1.4 (diff): Adds support for page histories; replaces internal diffing engine with Wikimedia REST API.
  • 1.3 (diff): Changed top-level domain matching strategy to catch a broader variety of URLs
  • 1.2 (diff): All subdomains of a given root domain now match. Before, URLs matched strictly based on the specific (sub)domain linked to.
  • 1.1 (diff): Adds support for third category Inspect as a neutral alternative to Warn or Caution
  • 1.0: Initial version deployed to English Wikipedia
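The subdomain matching introduced in 1.2 and the discarding of invalid URLs from 1.6 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the script's actual code: the `flagged` set and the function names are hypothetical, and the real top-level domain handling changed in 1.3 is not reproduced here.

```javascript
// Hypothetical flagged root domains; placeholder data.
const flagged = new Set(['news.example']);

// Parse a URL's hostname; return null for strings that are not valid URLs.
function hostnameOf(url) {
  try {
    return new URL(url).hostname.toLowerCase();
  } catch (err) {
    return null;
  }
}

// Walk from the full hostname toward the root, checking each suffix,
// so that any subdomain of a flagged root domain also matches.
function matchFlagged(url) {
  const host = hostnameOf(url);
  if (host === null) return null; // invalid URLs are discarded
  const labels = host.split('.');
  for (let i = 0; i < labels.length - 1; i++) {
    const candidate = labels.slice(i).join('.');
    if (flagged.has(candidate)) return candidate;
  }
  return null;
}

console.log(matchFlagged('https://www.news.example/opinion/1')); // 'news.example'
console.log(matchFlagged('not a url')); // null
```

Only the hostname is examined, so different paths on the same domain are treated as one source, consistent with the limitation noted under Known limitations.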

Development roadmap


Phase 2 of the project needs more support for development of a mature, integrated toolset that can be used widely across Wikipedia as a built-in "Gadget", or even adopted directly into the watchlist filtering system.

What's missing, and would take additional time and funding?

  1. False negatives. There are a few specific cases of errors where the tool 'misses' a bad source addition.
  2. Speed. The tool is reliable but works line by line and is not instantaneous.
  3. Customization. The tool does not present options to the user to turn on and off certain source lists or other features.
  4. UI sophistication. The tool uses built-in tooltips that lack enhanced design or added information such as list provenance.
  5. List development. There are many lists that could be added, but using them would take time and rigorous or manual parsing.

Etc.

  • Enhanced tooltip when hovering over indicators
  • Display of provenance of domains appearing in a report, whether RSP or another list.
  • Link to lists that are invoked in a report
  • Deployment to Wikipedia as a gadget, making the script available in user preferences
  • Internationalization for deployment to non-English Wikipedias
  • Server-side processing
  • Propose offering as an extension to MediaWiki's own watchlist/recent changes filtering capability
  • Curate ALL the relevant citation lists that we know of

See also
