User talk:Iridescent/Archive 31

This is an archive of past discussions with User:Iridescent. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Notification about notification

I hope I didn't step on anyone's toes with the referral to this talk page in this MediaWiki discussion where I've referred a developer (?) seeking comments on mw:JADE. The reason being that in my experience people with presumably little interest in a certain topic and a lot of background knowledge tend to have the best eye for spotting potential problems. Jo-Jo Eumerus (talk, contributions) 18:42, 4 October 2018 (UTC)

I'm probably not the best one to be asking about this; I was skeptical of ORES, am skeptical of the WMF's paranoia that bots and AI are threatening to undermine Wikipedia's integrity (human stupidity is doing that just fine without assistance, but the notion of automated systems to detect bias I consider potty) and am even more skeptical of any further attempts to get a computer to decide between right and wrong in the context of a multicultural project. The person who would probably have the most useful input to make would be Gurch if you can winkle him out of wherever he's hiding, as IMO he's the first and only person ever to write a "potentially problematic edit" detector that didn't cause more problems through false positives (and the equally problematic false negatives; I'm already seeing a lot of "ORES didn't flag this so it must be OK" crap slipping through) than it solved.

Personally, I think it would be a much better use of time and money to have a handful of paid professional moderators monitoring Special:RecentChanges, regularly dip-sampling edits from all active editors, and investigating more closely if anything problematic were found and flagging any potentially problematic editors for admin attention. Yes, the WMF is paranoid about losing §230 protection if they take a more active role in directly patrolling content, but the big social media and blogging firms directly employ moderators and their worlds haven't imploded. If anything, a "we recognized that even though we didn't have a legal obligation we had an ethical responsibility to know what we were disseminating" now would probably stand the WMF in good stead when the rising tide of backlash against perceived corporate irresponsibility and Wild West attitudes on the internet—which is currently destroying the viral content farms, washing over Facebook, lapping at Twitter's feet, and headed steadily towards Wikipedia and Google—finally reaches us. (If WAID is still watching this page from a couple of threads up, her opinions would probably be worth hearing here; even if she isn't, it would probably be worth asking for her input.) ‑ Iridescent 12:12, 6 October 2018 (UTC)

Today's officially another holiday. I can take a break from this week's AFD fun, trying to convince someone that a centuries-old philosophical idea shouldn't get WP:INTEXT attribution to a still-living encyclopedia author, and trying to figure out what promotionalism actually is, to say that I think the AI question is a good one. That system will have GIGO problems, and the only real solution is to get the garbage out of it. JADE might make a useful way to do that. That last discussion is highly relevant: If people who do NPP and AFC work regularly reject direct, factual statements as {{db-spam}} rather than sending the articles to AFD on the grounds of non-notability, then we're going to end up with an AI system that believes articles about average companies are spam, and that only those with long sections about scandals could be considered "neutral". There needs to be some way to say that yes, it was deleted as "spam", but it isn't technically spam.

Personally, and noting that I probably know less about §230 than anyone who has actually read the relevant Wikipedia article, and noting that the people who think that "They have more than X stores" should be counted as unambiguous promotionalism are all volunteers, I would rather have this moderation done by editors than by WMF staff, and it sounds like that's the plan. Providing a tool that lets volunteers correct the ORES database doesn't sound like controlling content to me. WhatamIdoing (talk) 18:38, 8 October 2018 (UTC)

The parable of the shitty early-2000s website

@WhatamIdoing: "Moderation done by editors" is fine in theory, but in practice that means "moderation by self-appointed busybodies who see it as their mission to purify the site" as they're the ones who'll devote their time to patrolling, so you end up introducing a huge systemic bias against anything that anyone, anywhere, might consider controversial. (If you haven't already, I'd recommend reading this thread to get a feeling for just how broad a range of articles the self-appointed Defenders of the Wiki consider 'inappropriate'.)

Way back before the dawn of time, I did some work for an early dating website. We discovered early on that we needed some kind of moderation system to filter out the dick-pics and inappropriate profile comments, and we also discovered early on that such a process couldn't be automated as it needed people with a good knowledge of popular culture both to spot people using celebrity photos, and to differentiate between genuinely offensive comments and jocular banter and youth-culture references. We also discovered that when you're running your site on a free-registration model, you'd need to hire a small army to moderate the flood of profiles being created.

The solution seemed obvious; offer people who'd been members of the site for a few months the opportunity to become volunteer moderators, on the grounds that these people obviously had too much time on their hands, and that human nature being what it is many of them would jump at the chance to work for free for anything that made them feel important and gave them a position of apparent authority. It was set up such that any new profile or newly-added photo would be passed in front of multiple moderators, and those moderators whose opinions were regularly out of step with consensus would have their opinions disregarded without their even knowing it, until such time as it was obvious that they were voting in line with consensus again.
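The scheme described above (several moderators per item, with persistently out-of-step moderators silently ignored until they fall back in line) can be sketched roughly as follows. This is only an illustration, not the site's actual code: the class name, the 0.7 agreement threshold and the five-vote grace period are all assumptions, since the post gives no figures.

```python
from collections import defaultdict

AGREEMENT_THRESHOLD = 0.7  # hypothetical cut-off; the post gives no figure
MIN_HISTORY = 5            # hypothetical: votes needed before a moderator can be muted


class ModerationPool:
    """Tracks how often each moderator agrees with the majority outcome."""

    def __init__(self):
        self.agreed = defaultdict(int)  # votes that matched the outcome
        self.total = defaultdict(int)   # all votes cast

    def is_trusted(self, moderator):
        # New moderators are trusted until they have enough history.
        if self.total[moderator] < MIN_HISTORY:
            return True
        return self.agreed[moderator] / self.total[moderator] >= AGREEMENT_THRESHOLD

    def decide(self, votes):
        """votes maps moderator name -> True (approve) or False (reject)."""
        counted = [v for m, v in votes.items() if self.is_trusted(m)]
        if not counted:
            # Nobody trusted voted, so fall back to counting everyone.
            counted = list(votes.values())
        approved = sum(counted) * 2 > len(counted)  # strict majority approves
        # Every vote, counted or not, updates the record, so a muted
        # moderator who starts agreeing again regains trust silently.
        for moderator, vote in votes.items():
            self.total[moderator] += 1
            if vote == approved:
                self.agreed[moderator] += 1
        return approved
```

A moderator never sees whether their vote was counted; `decide` returns only the outcome. The weakness described further down the thread is visible here too: if enough like-minded accounts join at once, they become the "consensus" against which everyone else is measured.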

The whole thing worked stunningly well at first, with some of the moderators literally reviewing tens of thousands of uploads per day, independent online communities growing up where the moderators would chat and exchange tips on what was and wasn't acceptable and problematic users to watch out for, the moderators recommending the site to their friends which in turn generated more ad traffic, and so on. With minimal staff costs the site boomed, and became a multi-million dollar business.

Then things started to get out of control. The volunteers became increasingly worried about the risk of being the one that let something inappropriate through, and more and more legitimate profiles started to be rejected. The offsite message boards became breeding grounds for paranoia with the moderators posting increasingly lurid speculation about the employees. As the site grew in popularity, religious groups who were opposed to the site on general principle began to figure out that if their members signed up en masse, they could systematically disrupt the system and block anyone they thought looked slutty from posting. Within a year, the volunteer-based system had to be abandoned, and a bunch of low-paid but paid interns took their place, as even though it cost the site more it was the only way to keep it functioning without either allowing a bunch of cranks to determine what was and wasn't hosted, or abandoning moderation altogether, trusting to §230, and developing a reputation as the cesspit of the internet.

The moral of this story is, the kind of people who want to act as volunteer moderators aren't always the people you would want as volunteer moderators. Wikipedia is still to this day suffering from the after-effects of the early days when Jimmy was handing out admin bits to his friends; allowing the small handful of people who see themselves as Fearless Spam Hunters to set the tempo for Wikipedia's attitude towards what constitutes promotion could do irreparable damage, but because they're by and large the only ones who care enough to have input into what ORES et al consider inappropriate (the silent majority are writing articles, not prowling around looking for good faith new editors to harass with A7 and G11 tags), these automated systems are handing the policy agenda to a tiny clique of Free Culture cranks who don't want Wikipedia to host anything that doesn't coincide with their particular view of what it ought to be. This is the point where I ping SoWhy who can probably articulate this better than I can. ‑ Iridescent 02:32, 9 October 2018 (UTC)

I think you articulated that very well 👏 I myself was active in the early 2000s in a number of message boards as a moderator and admin, I even ran a German support board for two major message board softwares. Moderation on such pages always hinged on the fact that it was people with too much free time doing most of the work, which logically included my teenage and tweenage self. Luckily for Wikipedia, the user base is still large enough to not fall into the same patterns but I do see the risks. Regards So Why 07:37, 9 October 2018 (UTC)

As always, I feel that switching to a Git style system (or really any semi-modern source control system) with revision control for individual pages would help this problem. Right now it's nearly impossible to tell whether any semi-competent editor has reviewed a version of a page; pending changes has too many problems to be a feasible site-wide solution. I estimate it would cost at least $100 million/year to have paid staff review every change; the Foundation does not have that kind of money. So we make do with free labor. The most disruptive forms of vandalism (fake references, BLP violations, and the like) will not be detectable by AI anytime soon. power~enwiki (π, ν) 02:42, 9 October 2018 (UTC)

In a paid-moderation model, paid staff wouldn't need to review every change, any more than Facebook's or Instagram's paid moderators are inspecting every restaurant review or photograph of your cat you post. They'd do random dip-samples of edits, and whenever they found something problematic would look into that editor's other contributions in more detail, and they'd pay particular attention to edits that added, removed or changed large blocks of text on topics on which that editor hadn't previously worked. This is what already happens, we'd just be making it less haphazard and ensuring that unfashionable topics that aren't on the watchlists of multiple editors also get monitored for problematic edits. There are legitimate grounds for arguing against paid moderation, on the grounds that it would potentially demoralize unpaid admins and RC patrollers to know that other people are being paid to do identical work and that some people were receiving formal training to do a job in which other people were just being thrown in at the deep end and expected to pick it up as they go along, but cost isn't an issue; it would probably take no more than ten full-time-equivalent posts to have a significant impact on Wikipedia's quality, and those posts could be anywhere and wouldn't need Bay Area—or even Biloxi Area—wages. (The WMF is sitting on roughly $30 million surplus cash and the figure rises every year—we quite literally have more money than we know what to do with.) ‑ Iridescent 02:59, 9 October 2018 (UTC)

Ten FTE = 2.5 people concurrently, which is the lower end of being able to patrol Special:RecentChanges for vandalism (even with AI aids); in my experience a single person can watch IP edits, or can watch non-ECP edits, or can watch ECP edits and do something else. Doing that for 8 hours straight is borderline-unreasonable at any wage level; I don't think even the top Huggle-users manage that. And then you need other people (or "the community") to manage things that aren't insta-revert vandalism. If the WMF were willing to pay for such a thing, I'd rather them deal with patrolling/verifying references on articles on Indian films, rugby players, Chinese cars, etc. first. power~enwiki (π, ν) 03:10, 9 October 2018 (UTC)
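The "ten FTE = 2.5 people concurrently" figure can be checked with a little arithmetic. The sketch below assumes round-the-clock cover and a 40-hour full-time week (neither is stated in the comment), and ignores leave and breaks.

```python
HOURS_PER_WEEK = 7 * 24     # the site needs cover around the clock
FTE_HOURS_PER_WEEK = 40     # assumed length of one full-time working week


def concurrent_staff(fte_posts, fte_hours=FTE_HOURS_PER_WEEK):
    """Average number of people on duty at any moment, ignoring leave and breaks."""
    return fte_posts * fte_hours / HOURS_PER_WEEK


print(round(concurrent_staff(10), 2))  # 2.38
```

Under these assumptions ten posts give about 2.4 people on duty at once; a 42-hour week would give exactly 2.5, so the estimate in the comment is in the right range.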

Again, they'd not be expected to review everything. RC patrol—with or without semiautomation—is a staggeringly inefficient method; the hypothetical reviewers would be dip-sampling a couple of edits from each editor with a slight bias towards newer accounts and a stronger bias towards newer accounts making large changes. This isn't some kind of crazy blue-sky thinking; virtually every major social media site, blogging platform, advertising site, information site including user-submitted content (e.g. Google Maps) etc with the exception of Wikipedia already does this. ‑ Iridescent 03:19, 9 October 2018 (UTC)
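As a rough illustration of the dip-sampling idea, the sketch below draws a fixed review budget from a batch of recent edits, weighting newer accounts slightly more and newer accounts making large changes much more. The 30-day and 2,000-byte thresholds, the weights, and the field names are all hypothetical, and for simplicity it samples from the whole batch rather than taking a couple of edits per editor.

```python
import random

NEW_ACCOUNT_DAYS = 30      # assumed definition of a "newer" account
LARGE_CHANGE_BYTES = 2000  # assumed definition of a "large" change


def sampling_weight(account_age_days, bytes_changed):
    """Relative chance of an edit being picked for review."""
    weight = 1.0
    if account_age_days < NEW_ACCOUNT_DAYS:
        weight = 2.0                                  # slight bias: newer account
        if abs(bytes_changed) >= LARGE_CHANGE_BYTES:
            weight = 10.0                             # stronger bias: newer account, large change
    return weight


def dip_sample(edits, budget, rng=None):
    """Weighted sample without replacement of up to `budget` edits.

    `edits` is a list of dicts with keys "account_age_days" and "bytes_changed".
    """
    rng = rng or random.Random()
    pool = list(edits)
    picked = []
    while pool and len(picked) < budget:
        weights = [sampling_weight(e["account_age_days"], e["bytes_changed"]) for e in pool]
        index = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        picked.append(pool.pop(index))
    return picked
```

Established editors making small changes still get sampled, just less often, so nobody is exempt from review.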

This seems like the class of problem where the good Rev. Bayes could help. Shock Brigade Harvester Boris (talk) 04:05, 9 October 2018 (UTC)

As I understand it, that's what they're aiming for. The issue with all these machine-learning approaches is that Wikipedia is too complex a system to model well—if an IP removes a large block of text, are they a driveby vandal who should be summarily blocked and reverted, or an expert copyeditor who's realized that the point can be made far more elegantly, in which case blocking and reverting will likely drive away someone who could have gone on to do great things? With ORES and edit filters in particular, we also have blowback from what it doesn't detect; any WP:LTA case worth their salt can figure out how to word things such that an edit won't be flagged as potentially problematic, and if the RC patrollers are relying on the automated systems to decide what warrants further attention, the next Morning277 could be active for years before anyone even notices there's a problem. When even the human volunteers can't always agree on what is and isn't problematic, trusting in machines is unlikely to end well. ‑ Iridescent 13:37, 9 October 2018 (UTC)
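The Bayes point and the false-positive worry can be made concrete with a small calculation. The function below applies Bayes' theorem to a flagging tool; the names are illustrative, and the numbers used afterwards are invented for the example rather than measurements of ORES or any real filter.

```python
def probability_problematic_given_flag(base_rate, sensitivity, false_positive_rate):
    """P(edit is problematic | edit was flagged), by Bayes' theorem.

    base_rate: share of all edits that are actually problematic
    sensitivity: share of problematic edits that the tool flags
    false_positive_rate: share of good edits that the tool flags anyway
    """
    flagged_and_bad = sensitivity * base_rate
    flagged_and_fine = false_positive_rate * (1.0 - base_rate)
    return flagged_and_bad / (flagged_and_bad + flagged_and_fine)
```

With made-up inputs of a 5% base rate, 90% sensitivity and a 10% false-positive rate, the result is about 0.32: roughly two in three flagged edits would be fine. Under the same inputs, one in ten genuinely problematic edits is never flagged at all.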

In theory individual revisions can be reviewed (meta does this). On Iri’s point, while I’m certainly more on the “Self-appointed defender of the wiki” end, re: promotion, I also largely agree with his point on this more broadly: take any stroll through SPI or AIV when both are crowded and you’ll find a fair number of “why do we care about this?” and “no. I will not indulge your bloodlust for blocks 11 months after the fact.” cases. I call it Wikipedia-the-videogame, and CAT:CSD probably suffers from similar issues. Part of the problem is that very few admins feel like getting yelled at by the person reporting/tagging/requesting action because it’s not worth the hassle of spending 48 hours on your talk page explaining in explicit detail how their interpretation of the policy/guideline in question is either wrong or controversial. TonyBallioni (talk) 03:02, 9 October 2018 (UTC)

This one is the worst example I've ever come across of "I haven't heard of it so we should delete it". ‑ Iridescent 03:19, 9 October 2018 (UTC)

Oh, I remember that one. The worst I’ve seen was an A7 on an Indian Catholic bishop who was the driving force behind translating the Bible into Kashmiri. Also trying to delete the elections of the Holy Roman Emperor (I think I’ve bitched here about that one.) TonyBallioni (talk) 03:28, 9 October 2018 (UTC)

Since I'm feeling grumpy (hmm, maybe it's bedtime? No, I'll post on your talk page first), I'm going to say that we can't even get support for proper version control for the actual software we're all running (i.e., JS/CSS; see phab:T165981 and related requests), so I'm not going to think about it for content.

I want to add another item to the list of problems: The problem is not just that edits are reviewed by bored busybodies (like me). The problem is also that the subset of bored busybodies who review RecentChanges in general have approximately zero incentive to support the addition of content. And perhaps even more importantly, a change gets reviewed, and re-reviewed, and re-re-reviewed, until someone reverts it. So if I happen to review a change, and I happen to believe that it's a net improvement to an article, that doesn't stop someone else from wandering by and reverting it anyway. The way most editors handle their watchlists is to check the net changes, rather than stepping through each change individually, so a change–revert cycle becomes invisible to them. I believe that we lose a fair bit of desirable (if perhaps not perfectly presented) content that way. Each change is subject to repeated review by people who "don't want to be the person who approved that" and whose only significant form of feedback is being Special:Thanked for things that they reverted (but never, ever thanked for things that they correctly accepted or ignored, because nobody knows about that).

At this wiki, anyway. Smaller wikis don't have this problem. There's too much work to be done, and changes are normally reviewed by only one or two people, who review all edits. WhatamIdoing (talk) 04:43, 11 October 2018 (UTC)

I totally agree with every word of the above. The way our software is set up creates a huge inbuilt systemic bias towards action over inaction; as well as "changes keep getting reviewed until someone either reverts it or makes another change so it ceases to be the most recent", there's a serious problem (on which I've commented before) at the admin boards, in which it doesn't matter that a dozen admins have decided that no action needs to be taken if another admin comes along afterwards and decides that protections or blocks are in order. It might be a lingering collective memory from the early days; on Nupedia (and its successor Citizendium) articles were marked as drafts until Larry or one of his cronies signed off on them and pronounced them "ready" whereas on Wikipedia the articles were live from the moment someone clicked "Save Changes", and it's something of an article of faith among the old guard who still largely determine policy that anything Nupedia did was wrong (you presumably remember how much shouting and arguing it took even to get such a thing as the Draft: namespace to exist, which was surely a no-brainer), so there's maybe a cultural subconscious opposition to any form of "I approve this change".

Ultimately, we're limited by the fact that despite 15 years of additions and enhancements, MediaWiki is at its heart the same software that was designed for use by a small community of friends and colleagues in which everyone knew each other (it's not that long since the 'blocking mechanism' was to leave a polite notice on the editor's talkpage that if they made any further edits, consideration would be given to reverting them), and it's never really scaled to an anonymous community with thousands of active editors at any given time.

Because removing stuff—in the sense of "come revert vandals" and "come nominate stuff for deletion"—is one of the key routes en-wiki has traditionally offered to people who want to get involved but don't really feel confident writing their own content, the cynic in me says nothing will change. (This is not to belittle the bored busybodies in any way; I was one myself and still from time to time fire up the RecentChanges patrol scripts or the mass typo search-and-replace tools,* or go prowling around Special:Random looking for things to nominate for deletion.) Community Engagement can scream until they're blue in the face that constantly having time wasted with nonsense like Wikipedia:Articles for deletion/Fownes Hotel or with constantly reviewing the actions of trigger-happy admins is a disincentive that drives editors away, but the rest of the WMF ultimately probably won't listen—the quick-buzz of "I reverted/tagged something and it disappeared! Something I did made a lasting change to the Sum of All Human Knowledge and it took minimal effort!" is one of the tools that keeps the new account registrations and the donor funds flowing.
*I was present at the birth of semiautomation—the sudden spike to 15,000 edits in a month in May 2007 and this talk thread mark the birth and growing pains of Huggle, the first credible attempt at a "likelihood that this edit is problematic" based system for reviewing recent changes. It's only with a decade of hindsight, watching people try and fail to come up with something better, that I truly appreciate what a work of genius Gurch's original incarnation of Huggle was.

Other than you and Maggie/MRG, most of the WMF staff, board and the volunteer devs don't actually have much experience with Wikipedia/MediaWiki as a writing medium (just gonna put this here), and even those who do have experience editing Wikipedia tend to do so from the revert-and-report admin-hurling-lightning-bolts-at-the-peasants-below perspective rather than from the perspective of someone trying to create and improve content from the bottom up—our supply of Doc James's is limited. If you hold your nose and try to read discussions at Meta and Phabricator, it's obvious that the prevailing attitude is towards technical rather than social fixes to problems, and towards a raw-participant-numbers social network approach in which a new account who does nothing but make hundreds of posts on talkpages is worth more than a new account who sits quietly in the background writing articles, because the editor who's made a thousand trivial posts to talkpages is more "engaged" in terms of raw metrics than the editor who's made fifty long contributions to articles. (Since these are your official statistics, it's reasonable to assume that they're the statistics you consider important, and you don't even differentiate between edits to talk and edits to articles.)

To be honest, as long as Jimmy Wales remains in post I don't see the problem ever being addressed, as he creates a huge chilling effect from the top down that freezes the life out of any serious "what do we want to be and how are we going to get there?" discussion from Board level down, especially since the Knowledge Engine farce burned his fingers. The WMF really needs people with the nerve to say "the existing model is ultimately going to reach the point where we can't keep patching and making do, what will Wikipedia 2.0 look like?", but his sitting in the center dismissing any suggestion that the sites aren't perfect as "trolling" means that anyone with a vision that goes beyond "more of the same" doesn't last. ‑ Iridescent 15:15, 11 October 2018 (UTC)

(ec) I must say I don't see too much excess reversion, with an insanely large watchlist, but mostly of relatively low-view articles. I try to look at the previous edit(s) if recent, and if such a reversion seems a net negative will of course revert, without knowing if it is a reviewer or not. Regarding Iri's points (which in general I agree with), it would help if WMF had a board with editorial/academic backgrounds, rather than just techie/activist type ones. AFAIK none of the "outside" board members has ever had such a background. Johnbod (talk) 15:24, 11 October 2018 (UTC)

I believe there have been a few journalists/media folks on the board, and a couple whose day jobs were in the education industry. I'm not sure that, say, a Professor of Education would be all that valuable. The board sets the budget (total amount + how much to spend in each of several major areas), and they set the overall direction ("Let's develop a strategy!" or "Give more attention to developing countries"), but none of them are involved in the day-to-day operations of the WMF, and the WMF is only involved in content tangentially (e.g., processing a DMCA takedown) or accidentally (developing software that increases or decreases the likelihood of a particular kind of content being created). WhatamIdoing (talk) 17:25, 12 October 2018 (UTC)

On the topic of stats, we do have a breakdown of edits to talk versus edits to articles HERE

Have considered the idea of having medical / nursing students systematically review edits to medical content. We are looking at about 500,000 article edits on EN WP per year which would not be impossible. Was thinking to trial with a summer student if I could find interest. Doc James (talk · contribs · email) 00:50, 13 October 2018 (UTC)

@WhatamIdoing I'm not sure either that a Professor of Education would be all that valuable, but what would be valuable would be a few textbook writers, museum curators, librarians and people with a background in "summarize current thinking for a mass audience" periodicals like Natural History, New Scientist and History Today; basically, people who reflect what Wikipedia actually does, rather than people who reflect the "better living through science" fantasy that any problem can be solved by throwing programmers at it. Because of the unique nature of Wikipedia, the WMF board has always had a tendency to reflect the values of the libertarian and Randroid lunatic fringe from which it came, and the influence of Silicon Valley types and sycophantic journalists is a negative, not a positive. As a great thinker once said, My ideal recruits to Wikipedia would be the people who write travel guidebooks, museum catalogs and children's nonfiction; they all understand the "absorb a lot of information and summarize the salient points in brief and neutral form" and if the WMF really want to spend money reaching out externally they'd do much better trying to recruit the people who write children's books and the people who write museum labels, as it's the ability to summarize material for people with little prior knowledge of the topic, not the ability to defend a point logically, that Wikipedia needs. (The ideal Wikipedia editor would be the authors of Cliff's Notes and the For Dummies books.); that goes just as much for the board as it does for the editor base, as it's that disconnect between the incompatible mentalities of "what can we do to get more readers and editors?" on the one hand and "what should we be doing to ensure we're as useful to readers as possible?" on the other that's at the heart of pretty much every systemic problem on Wikipedia and the other large WMF projects. 
It's no good having a board on which at most two members actually understand what it is that Wikipedia does. (I make no apologies for conflating Wikipedia with the WMF in this context. When it comes to the big policy issues, the big Wikipedias are the only games in town; no policy decision no matter how drastic made on Commons, Wikidata etc would have any significant impact on us other than a temporary inconvenience. Despite their protests to the contrary, the other projects exist to be a support mechanism for Wikipedia.)

@Doc James: The content/non-content ratio as a raw figure isn't that valuable—when I racked up c. 20,000 edits in a few days a few years ago search-and-replacing "and and" (something that needs human supervision as there are some instances in computing and logic articles where the term has a legitimate use) then in terms of raw edit count I must have appeared to be Wikipedia's greatest asset, whereas someone like Newyorkbrad or Moonriddengirl who don't make many article-space edits but do a lot of behind-the-curtain necessary stuff appear a total waste of space; likewise, a recent changes patroller who always stops to explain to each editor why they've been reverted and what they need to be doing differently is of considerably more value to Wikipedia than some human-bot hybrid running STiki and mindlessly machine-gunning the 'revert' button with one hand with 90% of their attention on the TV, even though the former will appear in terms of raw statistics to be someone treating Wikipedia as a social network (since each mainspace edit will be accompanied by multiple talk edits as they talk the new editor through what they should be doing).

I quite like the idea of picking a small field and getting people to regularly conduct systematic reviews, but medicine might be too broad a field, as well as too atypical a field if part of the aim is to conduct a genuine quality assessment of Wikipedia. (There are some absolutely fucking awful medical articles, but they're rarer than in most other fields because they tend to be more heavily patrolled and the standards more strictly enforced.) It might be better to start with relatively small and specialist fields in areas where Wikipedia already has good working relationships with relevant academic institutions (there must be some museums we haven't managed to piss off yet), and once we have the assessment and review processes up and running for teratology, 18th-century German porcelain or the comparative linguistics of Mediterranean island dialects, we then start rolling it out to broader fields like "medicine", "painting" and "astronomy". ‑ Iridescent 02:02, 13 October 2018 (UTC)

arbitrary editing convenience break: medical articles

There you go again, with all that high-minded stuff about being useful to readers. Intelligible, even.

Since enwiki's medical articles are the subject I know best, I'll use them as my example. What exactly does "useful to readers" mean for an article about a disease?

I have one simple answer, and it pretty much means that when you get a note from your friend saying "We just left the doctor's office, and it turns out that she has scaryitis", you'll be able to find that in Wikipedia and easily calibrate your response on a scale that runs from "What a relief" to "I'm so very sorry". But (a) that's not the only answer I have, and (b) not everyone agrees, even though a ==Prognosis== section is officially recommended.

Here's another simple answer I have: When you read about scaryitis in the news, you should be able to find out what the patient experience of that condition is. We do this in a very few articles. Off hand, Hyperhidrosis says (or at least used to) that severely sweaty palms make it inadvisable to take certain jobs, because knives slip out of wet hands, and Cancer probably still says that patients have a lot of emotional stuff around it. But with the exception of a few big subjects, such as cancer, our sourcing guidelines push us firmly away from that kind of content. You can get a good meta-analysis on whether drug A or drug B reduces cholesterol more. You can't get a good meta-analysis on whether sweaty palms is a disabling condition for a butcher, or whether parents of premature babies are just "really stressed" or "practically going insane from worry".

That's only two of the answers I would give, and neither of them are things that we're handling well.

BTW, WT:MEDMOS a while ago had a discussion about reading levels that might interest you. Since I've given up on my watchlists (both accounts, all wikis), I tend to wander in and out of discussions based on whether I remember them (pings help :-), so I have no idea what it ended up like. At the point I last read it, though, we had some reasonable consensus that articles ought to have a range of information (e.g., simpler introductory sentences, but still leaving room for jargon-filled paragraphs later). WhatamIdoing (talk) 02:45, 13 October 2018 (UTC)

Interestingly, I'm exactly the same with regards to watchlists—I tend to use my recent contribution history as a mini-watchlist. I do still keep my big watchlist, but treat it as I would a social media feed, occasionally dipping in to it when I'm bored rather than checking it top-to-bottom on a regular basis. (And after months of grumbling about Echo when it was introduced, I now completely see the point of it; not just the direct pings, but the thanks and links notifications serve as notices that someone else has taken an interest in something I've done so I probably ought to have a look and see why.)

No apologies for banging on about usefulness; too many people seem to see Wikipedia as an exercise in how much obscure knowledge they can show off, and forget that our readers in most cases just want to know more about insert topic without feeling that they're unwelcome because they don't know all the jargon. Years and years ago when I first started, something Giano said stuck with me; assume that every reader is a bright 14-year-old with no prior knowledge of the topic but who's interested in learning. In my experience, other than a few very technical subjects which are usually subsidiary to something else, that rule works consistently both in terms of how articles should be targeted in terms of reading comprehension, and in terms of how we format articles to try to keep readers engaged (mention anything that sounds particularly interesting or unusual in the lead even if it's relatively unimportant so readers keep reading after they've skimmed the lead, top-load the most attractive or interesting images at the beginning of the article, stop to explain anything that might not be obvious to every reader even if it means a footnote section that's as long as the article, if a word has synonyms always use the one that a child is most likely to understand unless you absolutely need to mix them up to avoid repetition).

I agree with you about medical condition articles—we can scream and shout as much as we like that readers shouldn't be using Wikipedia as a medical source, but I've seen enough people using it in the wild to know that Wikipedia is one of the most trusted medical websites in the world no matter how many disclaimers we post. (For most illnesses and medications, the top hits are Mayo Clinic, WebMD, CDC, Wikipedia and the NHS; whether fairly or not, the 95% of the world that isn't the US have spent their entire life hearing horror stories about the American medical system so will automatically discount CDC and the Mayo Clinic, WebMD looks and feels like a dodgy commercial outfit with its "click here to subscribe" popups and adverts everywhere, and the NHS is so specific to England and Wales that it's not necessarily useful to people in other countries.)

As a very-long-term project a lot of the medical coverage should probably get a two-level approach with separate articles aimed at patients and aimed at practitioners/students, at least with regards to common illnesses and commonly-used medications—if I wake up with a sniffle, stomach cramps and aching limbs and wonder if I have flu and if so whether I need to be worried, I don't want the lead of the article to include the terms "polymerase chain reaction" or "neuraminidase inhibitor oseltamivir". (I also am going to be seriously misled and upset by an article whose lead gives the strong impression that influenza has a 10% fatality rate, since if I'm a typical reader I'm going to parse three to five million cases of severe illness and about 250,000 to 500,000 deaths as "one in ten of the people who get it die" and not understand that what Wikipedia means by "severe illness" almost certainly doesn't include whatever I happen to have.*) It's been tried experimentally on a few broad topics like Virus and Genetics, but with limited success as in both cases the specialist technical article has been given the primary title and the non-specialist article hidden away at [[Introduction to...]], meaning the search engines are driving readers to the technical rather than the non-technical articles. ‑ Iridescent 14:22, 13 October 2018 (UTC)
*I'm hating on the Influenza article because this is a featured article and consequently considered to be one of the best articles Wikipedia has to offer, as determined by Wikipedia's editors and used by editors as examples for writing other articles, but the same points could be made about almost any article about a medical condition. In particular, the unnecessarily traumatizing language about mortality/recovery rates without putting it in context front-and-center that for most conditions the risk of death or long-term disability are heavily weighted towards people with already-compromised immune systems, is a particular bugbear of mine. (I don't know if figures exist, but any primary care physician, 911 operator or ER admissions staff can confirm anecdotally that "but I read on the internet that this might be fatal!!!" is a significant driver of people seeking unnecessary treatment and diverting services away from people who actually need them.)
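For anyone wanting to check the arithmetic behind the "one in ten" misreading described above, here is a minimal sketch that uses only the two ranges quoted from the lead. The function name `naive_ratio` is made up for illustration. The calculation deliberately repeats the reader's mistake of dividing deaths by severe cases rather than by all infections, a figure the quoted sentence does not give.

```python
# Figures as quoted from the Influenza lead in the comment above.
SEVERE_CASES = (3_000_000, 5_000_000)   # "three to five million cases of severe illness"
DEATHS = (250_000, 500_000)             # "about 250,000 to 500,000 deaths"


def naive_ratio(deaths: int, severe_cases: int) -> float:
    """Deaths divided by severe cases: the misreading, not a true fatality rate."""
    return deaths / severe_cases


# Pairing low with low and high with high, as a skimming reader might.
matched_low = naive_ratio(DEATHS[0], SEVERE_CASES[0])
matched_high = naive_ratio(DEATHS[1], SEVERE_CASES[1])

# The widest spread the two quoted ranges allow.
widest_low = naive_ratio(DEATHS[0], SEVERE_CASES[1])
widest_high = naive_ratio(DEATHS[1], SEVERE_CASES[0])

print(f"matched ends: {matched_low:.1%} to {matched_high:.1%}")   # 8.3% to 10.0%
print(f"widest spread: {widest_low:.1%} to {widest_high:.1%}")    # 5.0% to 16.7%
```

Whichever way the ends are paired, the result is a ratio among severe cases only, which is why the sentence misleads a reader who assumes "severe illness" covers their own sniffle.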

Huh. I do make thorough passes through my not particularly long watchlist. I wonder if the issue with making articles interesting is that a) many dedicated editors can't tell when writing that they are using unnecessarily dense language (something people have said about my articles, most recently on Wōdejebato) and b) not all topics can necessarily be made interesting (Ita Mai Tai has an interesting etymology but Taapaca, Tutupaca and Ubinas don't have much to offer in kind) with the source material available. Jo-Jo Eumerus (talk, contributions) 14:40, 13 October 2018 (UTC)

As a quick-and-dirty way to bring the volcano articles more in line with Giano's guide:

Avoid words like "Holocene" in the lead and instead say "last erupted around 2300 years ago"; the former makes readers give up after just reading the lead and go find something that looks less like geography homework, while the latter conjures up images of comely Inca peasants fleeing the oncoming ash cloud astride their trusty llama steeds, even though they're semantically identical;
Although most of the volcanoes in northern Chile are far from towns and inhabited areas, nowhere is entirely uninhabited; search round for some photos of people who live on the slopes or in areas at risk from pyroclastic flows, preferably attractive women in colorful native costumes, handsome men with wistful expressions, or cute children, and put one near the top. They're encyclopedically justified, and adding a human element can transform a boring technical article into something that engages the reader. If you have trouble finding anything on Commons, Flickr is usually a good bet (nudge); whatever the license, Flickr users are almost always happy to CC BY-SA relicense a photo if you point out that an appearance on a Wikipedia TFA will generate between 20,000 and 200,000 views, at least some of which will be interested enough in the image to click through and view the rest of that user's Flickr photostream. Even interesting looking buildings would do; "the historic town of Putre is likely to be destroyed if Taapaca erupts" makes it clear that this is a story with a human impact rather than a technical article about magma flow rates;
Even if this volcano hasn't erupted recently, that doesn't mean other similar volcanoes haven't; find some volcanoes of the same type and upload images with lots of impressive-looking lava streams and steam venting, to give the reader an idea of what it must have looked like when it was active.

Basically, what you need to remember is that your primary audience isn't "I am writing a history of the volcanoes of South America and want to know about all of them", it's "I went to Chile on vacation and saw this really cool looking mountain and want to know more about it", and if it does wind up at TFA it's "I have absolutely no idea what a Tutupaca is but I've learned from experience that if I click this link in the middle of the main page I sometimes see something interesting". The technical stuff needs to be there, but bury it at the bottom; despite what WP:LEAD may tell you the purpose of the lead isn't an introduction to the article and a summary of its most important contents (although it needs to be that as well), but as a sales pitch to convince people landing on the article that reading it will be worth their while (and equally importantly, to notify people who've landed on the article that this isn't a topic in which they'll be interested to save them wasting their time). ‑ Iridescent 15:18, 13 October 2018 (UTC)

Really on these, which mostly get fewer than 10 views a day, you are preparing for the day they blow up and get 50,000 overnight. Then you have provided the world's media with stuff to repeat confidently to camera. Ubinas looks promising. Johnbod (talk) 15:39, 13 October 2018 (UTC)

arbitrary break: page view spikes

Har. I did find some good images of Putre for Taapaca and added one of them. Now I got sidetracked by the spectacular imagery of the landscape... Jo-Jo Eumerus (talk, contributions) 19:38, 13 October 2018 (UTC)

It's not just preparedness for the day they blow up; all it takes is one high-profile event to happen there. (My shitty and ill-maintained Broadwater Farm article, which normally gets only a few page views a year, was briefly the most-viewed article on Wikipedia during some unpleasantness in 2011.) It doesn't even take anything interesting to happen; all it takes is a celebrity to take an interest in a Wikipedia page and tweet a link to it, and the page can go viral within seconds. Tarrare is my usual go-to example of the power of Twitter to affect Wikipedia page views; it normally gets the deservedly low page views you'd expect from an article on a case study of a patient with multiple metabolic digestive disorders in 18th-century France, but with metronomic regularity some celebrity or other finds the story interesting, tweets a link, and it becomes one of the most popular articles on Wikipedia for a day or two. (During his last spike his pageviews for that single day made him more popular over the entire week than China, Michael Jackson, World War II or Elvis Presley.) ‑ Iridescent 01:44, 14 October 2018 (UTC)

Ha. I am familiar with the "high profile event" thing since 1257 Samalas eruption - one of the most popular articles I've written with several translations - got a little more attention in 2017 during the eruptions of the neighbouring volcano of Agung. Too bad that the eruption was only discovered in 2013 and that there is no review source analyzing its region-by-region impact; if there was one it might stand a chance at FAC. Jo-Jo Eumerus (talk, contributions) 06:44, 14 October 2018 (UTC)

It would probably stand a chance at FAC regardless, provided you can demonstrate that you looked for a review source analyzing its region-by-region impact. WP:What is a Featured Article is one of the more misunderstood pages on Wikipedia, even by experienced editors, and the "neglects no major facts or details" language is a little misleading. To be an FA an article doesn't need to say everything there is to know about the article subject, it needs to reflect the state of current scholarship on the subject, so if something hasn't been covered in the literature it's perfectly OK to omit it. (The "neglects no major facts or details" wording means that we don't omit the findings of Researcher A just because we prefer the conclusions of Researcher B.)

In general, don't get too hung up on complying with the letter of the law of WP:WIAFA, except for 1d (NPOV); criteria 1a, 1b, 1e and 4 are purely subjective, 1c is an impossibility to comply with for any but the narrowest topic so is disregarded, while 2 and 3 are just common sense with which every article should be complying. In reality, the FA criteria are "is this confusingly or badly written?", "is there anything obviously missing that ought to be there?", "does it fairly reflect the various schools of thought?", "are the images correctly licenced?" and "do the sources say what the articles claim they say?". ‑ Iridescent 13:27, 14 October 2018 (UTC)

Jo-Jo, have you seen any readability discussions? I'm partial to http://hemingwayapp.com/ (which requires copying and pasting plus removing the numbers leftover from the ref tags), but other people prefer http://readabilityofwikipedia.com/ (which sometimes gives nearly random results if there's a parsing error due to unsupported wikitext constructions; check its reported word count to see if it "lost" most of the article). In the case of Taapaca, Hemingway says that the lead rates as the last year of high school, and the whole article as the first year of university, and it highlights almost half the sentences as being "very hard to read". ROW says that only 28% of Wikipedia articles are more complicated to read than this one, and that it is "difficult" (but not the most difficult category).

On the subject of medical articles, figuring out that 98% of people survive non-melanoma skin cancer, even if you have one of those, isn't medical advice. "You personally should use ____" is medical advice; "Overall, the most effective treatment is ____" is medical information. I think that we need to provide much clearer medical information.

A solid training program on how to write might be useful. For example, most people understand that "98% of people with non-melanoma skin cancer can be completely cured", but they don't necessarily make the leap from that to "2% of people with non-melanoma skin cancer die from it." And while 98% is fairly well understood as meaning "practically everyone", it's sometimes clearer to write "one person out of 50" than to write "2%".

If you want to get even more complicated, then there are subtle effects. 98% is a simple number that everyone on this page grasps easily – we're not talking about something complicated, like the possibility that the used Barbie doll in the neighbor's garage sale will be one of the few that says "Math class is tough" (about 0.0003%, if you're curious) – but presentation matters, because "2% eventually die" is more salient than "98% are cured". For an affected person and their loved ones, it's not a simple math equation. WhatamIdoing (talk) 17:46, 14 October 2018 (UTC)

Regarding the 98%/2% thing, "98% of people can be completely cured" and "2% die from it" aren't at all the same thing. "98% of people can be completely cured" could just as well mean "2% of people will suffer mild headaches occasionally for the rest of their lives as a side effect of the treatment". For cancers it's not such an issue as for most of them the prognosis splits between "full recovery" and "dead" without much in between, but for something like neurological disorders, where the result is a dice-roll on the spectrum between "full recovery within a few days" to "dead within weeks" and the full range from "minor inconvenience" to "lifelong debilitating disability" in between, making it crystal clear whether we're talking about "the percentage who make a total recovery" or "the percentage who survive" is of utmost importance if you (rightly) see a significant function of Wikipedia's medical coverage as letting people know how worried they ought to be. ‑ Iridescent 03:06, 15 October 2018 (UTC)

Sidetrack within a sidetrack on readability scoring

I don't think I've seen thorough readability discussions, although "hemingwayapp" rang a bell. I am guessing that it might be harder for me since I am ESL and learned much of it from academic text too. Jo-Jo Eumerus (talk, contributions) 20:13, 14 October 2018 (UTC)

Warning; hit nerve alert. I know from previous discussions at WT:MEDMOS that you (WhatamIdoing) are less skeptical than I, but I find the application to Wikipedia of Flesch–Kincaid and similar schemes that claim to assess readability to be very misguided. As with standard IQ tests, F-K doesn't actually measure anything that's particularly useful to Wikipedia, it measures how closely something conforms to American cultural expectations. (The very fact that its results are given in "grades", something completely meaningless in the rest of the world, is a giveaway.) As one very obvious case in point, the syllable and word counts are absolutely key to F-K, but all it takes is a couple of mentions of "laboratory", "Israel", "military" etc to send the syllable counts differing wildly between dialects, and that's before you get to the joys of English place names—want to see what a computer readability program makes of "from Hunstanton to Godmanchester via Costessey and Wymondham" (13 syllables, if you're counting)? If the article happens to include a foreign language quotation, does glasflächenreinigung (one word, five syllables) really improve the readability whilst nettoyage de la surface en verre (six words, nine syllables) wrecks it? The whole "short sentences are always better" thing is a cultural construct rather than an actual rule of writing; Dickens—the absolute exemplar of English language populist writing targeted at people who weren't necessarily avid readers—had an average sentence length just above 20 words and seems to have survived. "If the typical reader is likely to have forgotten how the sentence began when they reach the end, consider breaking it" is the only real rule of good writing when it comes to sentence length that isn't just snobbishness, and even that depends on the reader; if the reader finds a topic interesting, long sentences can increase comprehension as they're easier to focus on than a barrage of staccato short bursts.

The language used in Wikipedia articles should be as simple as it's possible to make them without losing meaning and not one step further. There's a legitimate argument to be had about when we should be assuming the reader has the background knowledge, when we should be explaining terms and background that might not be familiar,* and when we do stop to explain jargon and background whether it's better to do so in the footnotes where people might not notice it or inline where it disrupts the flow of text and risks appearing patronizing to those who are already familiar with the topic—but largely arbitrary scoring systems shouldn't be playing a part in it. (If you want more of me flying off the handle at Wikipedia's culture of 'improving the readability' of articles at the cost of losing the meaning, see User talk:Johnbod/36#Problems?.) ‑ Iridescent 02:11, 15 October 2018 (UTC)
*On an article that mentioned a railroad—I can't remember which—I remember once being quite surprised when Ealdgyth called me out for using "track lifting" instead of "removed or demolished the railroad-related structures". In hindsight she was quite correct, as even though "track lifting" is simpler in terms of readability, it's a false economy as enough readers will need to stop and look up what that means that it disrupts more readers than it helps.
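To put a number on the syllable-count sensitivity described above, here is a minimal sketch of the standard Flesch–Kincaid grade-level formula applied to the place-name example. The function name `fk_grade` is made up for illustration. The 13-syllable figure is the spoken count given in the comment; the 18-syllable figure is a hand count made by reading the same eight words as spelled, and is an assumption rather than the output of any particular readability tool.

```python
def fk_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# "from Hunstanton to Godmanchester via Costessey and Wymondham"
WORDS = 8
SENTENCES = 1

SPOKEN_SYLLABLES = 13    # count given in the comment above
SPELLED_SYLLABLES = 18   # assumed hand count, reading each name as written

spoken_grade = fk_grade(WORDS, SENTENCES, SPOKEN_SYLLABLES)
spelled_grade = fk_grade(WORDS, SENTENCES, SPELLED_SYLLABLES)

print(round(spoken_grade, 1))                   # 6.7
print(round(spelled_grade, 1))                  # 14.1
print(round(spelled_grade - spoken_grade, 1))   # 7.4
```

Under those assumptions the same eight words swing by more than seven "grades" depending only on how the syllables are counted, since each extra syllable adds 11.8 divided by the word count to the score.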

One of the reasons that I prefer the Hemingway app is that the "grade level" score is just shiny chrome (added last year, I believe), and the core function is actually in evaluating the structure of individual sentences. "Go from Hunstanton to Godmanchester via Costessey and Wymondham" is high school reading level, but who cares? What is more important is that that sentence is not highlighted as being very difficult to read. Also, the Hemingway app provides sentence-by-sentence information, rather than an overall score. If you write a lengthy FA for this Wikipedia, and you don't get any sentences highlighted in red, then you probably have not accomplished "brilliant prose". The point isn't to have no difficult sentences; the point is to put your complex sentences where you want to have them. WhatamIdoing (talk) 00:43, 16 October 2018 (UTC)

I still can't really see the point of scoring in the context of Wikipedia. It makes sense for things like school textbooks, political pamphlets and news reportage, where you're trying to ensure that you communicate the pertinent points to readers who lack interest in the topic, before they lose attention. On Wikipedia, except for a very few limited exceptions such as material linked from the main page, readers are reading only what they've chosen to read, so if they've ended up at Guillain–Barré syndrome, it doesn't matter that the lead says "In those with severe weakness, prompt treatment with intravenous immunoglobulins or plasmapheresis, together with supportive care, will lead to good recovery in the majority" rather than "If it's serious, injections and replacing ooky blood with clean blood will probably help", since the reader is obviously interested enough that provided they actually know what the words mean, they'll make the effort to understand. IMO, oversimplification is generally more of a problem on Wikipedia than overcomplication, particularly in talk pages; especially on hot-topic or high-traffic subjects there's a tendency to oversimplify, which means that particularly on medical and legal topics, where accuracy is more important than legibility, people tend to cut corners and change the meaning of things. For instance, although our Directive on Copyright in the Digital Single Market is correct, you try explaining to Jimmy Wales that the European Parliament has no legislative powers and their voting to support it has no impact on whether the individual countries of the EU will introduce it into their legal systems.
As I said somewhere above in this wall of text, Wikipedia articles should be in as basic a wording as it's possible to get them without losing meaning (until someone discreetly removed it a few months ago, it was a source of irritation to me for years that an article supposedly distinguished by professional standards of writing, presentation, and sourcing included the word "decollated", defended and regularly re-added by the author on the grounds that intentionally using obscure words was advancing people's understanding and knowledge of the English language), but not simplified the slightest bit further. (Paging Newyorkbrad, who I know has had strong opinions in the past about balancing readability and ambiguity, in the different-but-related field of how Wikipedia's internal policy pages and arbcom sanctions are worded.)

I know there are good reasons we got rid of the Article Feedback Tool, and I'd never advocate its return, but if you (with your WMF hat on) want an idea for something on which the WMF can spend money, which isn't very glamorous but likely to be far more useful in the long term than adding another member to the 45 "Community Engagement" staff (at least three of whom I wouldn't trust to count their fingers and get the same answer twice), commission and publish the results of some in-depth polling of what readers do and don't like about Wikipedia's articles and in particular Wikipedia's Featured Articles. I don't mean pop-up "did you find what you were looking for?" yes/no boxes displayed to readers when they leave a page, or "How would you improve this article" feedback boxes at the end of pages for readers to express their desire for more tits. I mean commission some paid, independent focus groups that represent the actual population (not the somewhat undiverse community that makes up the editor base), give them a big stack of printouts of Featured Articles and high-traffic articles, and ask them whether they found the articles easy or difficult to read, comprehensive, unbiased, interesting… and why. Then, start an equally independent group at the www.wikipedia.org search page on a database dump stripped of any indication of article assessment, ask them to independently navigate Wikipedia reading topics they find interesting for a few hours, and discreetly note whether they spend more time on and are more likely to follow internal links from articles with higher quality assessments. And then, use that same database dump stripped of quality assessment (maybe omitting the obvious one-line stubs), give people a genuinely random selection of Wikipedia articles and ask which they thought were the best-written, most readable, most comprehensive, and see how closely that correlates with Wikipedia's own article assessments.

While the wording and formatting have changed over the years, WP:Featured article criteria is still largely based on the arbitrary rules Raul made up in 2004 (in turn based on some equally arbitrary assumptions inherited from Larry Sanger), and carries with it a huge stack of assumptions that "articles meeting these criteria are what the readers are looking for". Because FAs are pushed as a model for other articles to follow, these assumptions leak through into the rest of Wikipedia; because Wikipedia is—rightly or wrongly—seen as a model for other websites to follow, those assumptions leak through into the rest of the internet; because the internet affects so much of the news agenda and everyday life nowadays, those assumptions leak through into reality. AFAIK these assumptions as to what readers want and what readers find useful have never been empirically tested among actual readers. While it might bruise some egos, it would be good to have an empirical list of "things readers want in articles" and "how technical should the language be?"; my gut instinct is that while there might be some surprises about how detailed readers feel articles should be, readability would rank fairly low as a concern. (Also paging @FAC coordinators: @FAR coordinators: @TFA coordinators in case anyone can think of a reason this is actually a Really Bad Idea, or can point out somewhere that it's already been done.) ‑ Iridescent 17:27, 18 October 2018 (UTC)

Are we sure that we can treat "readers" as a bloc? I suspect that there are distinct groups of readers with different priorities and interests and that if you drill deep you'll notice some granularity. Also, at the risk of hijacking this talk page further, I shall apply some of the advice offered here to the Samalas article. Of course, the seamount articles have priority in terms of any FAC nomination. Jo-Jo Eumerus (talk, contributions) 18:43, 18 October 2018 (UTC)

I've felt for a decade that Wikipedia would be better served if content contributors were provided better data about how readers use articles, and if such things were the subject of discussion from time to time at the FA talk pages.--Wehwalt (talk) 18:51, 18 October 2018 (UTC)

Yeah I'd agree with that too - get a bunch of folks who'd never edited wikipedia, bribe 'em with some money or beers and get them to peruse articles. Not too fussed between either printing them out or just reading them online and posting comments somewhere. You could do a bunch of college students in one batch, a group of high school kids in another batch, and (say) a group of senior citizens another day etc. BTW although the FA criteria are arbitrary, they strike me as pretty sensible and generic. Cas Liber (talk · contribs) 19:03, 18 October 2018 (UTC)

I agree that they seem sensible in general (although you know my feelings on "it is a thorough and representative survey of the relevant literature"—you're not going to convince me that the editors who took Islam or Sea through FAC actually thoroughly read all relevant literature, much less included it in the bibliography). However, in both my and your case it's a gut feeling; we might well find that readers consistently don't want comprehensiveness and would rather have three short articles instead of one long one (or conversely that they find sub-articles confusing and would rather have a single enormous Banksia article rather than 179 individual articles on the species, even if it meant a megabyte-long page), or that the articles we consider our best work are consistently judged less readable than the rest—we have no way of knowing. We might even discover the shocking fact that readers don't care if citations are formatted identically, whether we use dashes or hyphens, or if an article with below 15,000 characters of readable prose has more than two paragraphs in the lead, provided the article is interesting and accurate. ‑ Iridescent 19:49, 18 October 2018 (UTC)

(edit conflict) We have some of that information, and it's basically discouraging. People generally read the introduction, or they're looking for a very specific detail (so they look in the infobox, and then skip to a relevant-sounding section). So if "write what readers want" is the goal, then 99% of readers want much shorter articles, and the other 1% want all the possible infoboxes and lists and trivia about exactly which wrestling-entertainment-actor had which colors and theme songs and whatever else in which seasons. Also, a quick way to find out the name of that TV show that Joe Film had that six-second cameo in.

It's not exactly all about pop culture (e.g., people look things up for work), but the idea that lots of people are excited about spending half an hour (or more) reading about breast cancer awareness and its social effects is not exactly realistic. Even though this is the month for peak page views, I'd be surprised if more than a dozen people actually read it from start to finish. I'm not sure that I've ever done that, and I wrote the thing.

Other research has indicated that users want more media, and especially more interactive media. A timeline that you could swipe through and zoom in on areas that interest you the most, in-article calculators, or infographics would be popular.

I don't think that reader motivation is the key factor. Motivation does not turn you into a fluent reader of English. The writers of patient information leaflets are usually advised to assume that the reader has a functional grasp of English equivalent to what 13-year-old students are given in English class. This recommendation does not change when you think that the patient is "motivated", e.g., you are writing about a life-threatening condition. If anything, you try to write in simpler language in that case. Women who are "motivated" to read about DCIS do not need a bunch of complicated language. They need a sign that says "NOBODY DIES FROM THIS. YOU ARE GOING TO LIVE."

The Simple Measure of Gobbledygook, which I link because of its delightful name, is the recommended standard for pharmacy labels. I think my favorite study on pharmacy labels was from 2010. It took a reasonably representative sample of Americans, showed them labels on drug bottles, and concluded in a very upbeat tone that all the prior studies were wrong and the old style of prescription labels was perfectly fine. The only little caveat, barely worth mentioning, was that you might need to make some changes to accommodate "special" populations of patients, like those who didn't have a university education (i.e., most people).

With a glance at my staff hat, I'm not sure about the practical utility of any such research. We (the experienced editors) are attached to long-form articles. We like writing them. We idealize them. Making those is Why We Are Here. And if the research says that long-form articles don't get read, or they don't educate readers, or that most readers really need, want, and benefit from infoboxes, then we think the problem is with the readers, not with our beautiful articles. The WMF has talked about encouraging other approaches, and the core editing community at this wiki has not been receptive to this idea. We (the editors) are a bit like the authors of that paper: Everything's fine with our current practices, unless you're trying to accommodate a few "special" groups of readers ...like 99% of them. WhatamIdoing (talk) 19:59, 18 October 2018 (UTC)

"Actually I put that typo there intentionally as a tripwire, kind of like the monolith in 2001, to notify me when intelligent life on Wikipedia had evolved to the point of reading articles all the way through." E Eng 01:38, 19 October 2018 (UTC)

That wasn't really what my research (when at Cancer Research UK) suggested. Most of the 30 subjects asked to imagine that someone they slightly cared about had developed pancreatic cancer, and then find out about it on the web, went first to the top hits, namely specialized charities, or the NHS. Some later looked at WP, & if they didn't they were asked to at the end, but they had already got the infobox stuff elsewhere, and on the whole rather appreciated the extra depth on WP. Johnbod (talk) 21:13, 18 October 2018 (UTC)

Yes, I'd go along with that. Someone looking up Cancer might only skim the lead, but someone interested enough in a particular variant is likely interested enough to want detailed information. The same goes in all fields including pop culture; someone looking up The Beatles might just want to know when they split up or who their manager was, but someone looking up You Know My Name (Look Up the Number) probably wants detailed information about this song, why it came to be written and what it's about. To repeat, my point is we don't know how people use Wikipedia and consequently whether we're wasting time doing things readers don't want. ‑ Iridescent 15:35, 19 October 2018 (UTC)

Yes, the one thing all 30 subjects had in common is that they looked at a range of sites (but nearly always from the first page of results); of course in this case there were actually plenty of very good sites in that first page. Johnbod (talk) 17:26, 20 October 2018 (UTC)

WhatamIdoing, could you link to the research you called discouraging? I'd be interested to see what the questions were. What would help a lot would be knowing how many unique visitors each article gets, which section headings readers click on, and how much time they spend at the article. Then we'd have some data. You wrote that people look at the lead and infobox, then skip to the section they care about. But you conclude from that that they want shorter articles. A shorter article might not contain the section they decided to skip to. SarahSV ^(talk) 16:21, 19 October 2018 (UTC)

How readers use articles

Yes, I'd agree it would be interesting to see the raw data or at least a summary, if the WMF is willing to release it. When it comes to WMF research, there's something of a history of the WMF concluding that the data supports whatever the WMF party line happens to be (remember the huge support the WMF claimed there was for Flow and Winter, or the search engine unpleasantness?); while things may have got better since the Wikimedia Civil War had the side-effect of purging the noisiest Anything-We-Do-Is-A-Force-For-Good-And-Thus-Anyone-Opposing-Anything-We-Do-Must-Be-Evil cultists from the WMF, I haven't seen enough of the new regime to know if old habits have continued. Given that it's not that long since the WMF was producing stuff like this and twisting research to claim it's "what people wanted", I can believe that it's possible that they've listened to the Wikidata/Reasonator clique claiming "short articles and lots of infoboxes" are what people want and decided in advance that this is what the research will conclude. ‑ Iridescent 21:51, 19 October 2018 (UTC)

I assume Google Analytics data is available for Wikipedia articles. Kaldari, is this something you can help with? We're wondering what research is available about how readers behave when reading Wikipedia articles: how many unique viewers, how long they spend on each page, etc. SarahSV ^(talk) 17:08, 20 October 2018 (UTC)

Some of it gets published in great detail, such as m:Research:Wikipedia Readership Survey 2011/Results. Others are in such bits and pieces that publication is not the relevant concept, because you won't find it even though it's been published. Some of it is also platform-specific. For example, they can find out how many sections people read from mobile, where most sections start off collapsed. People read slightly more pages on desktop than on mobile within the same reading session. Hovercards reduces page views (but should increase the proportion that reads more than the first line.) How long they spend on each page (on average) is known, but I can't remember the numbers beyond "short", and I can't find it quickly. Coming soon: 90% of readers read Wikipedia in a single tab, and if they click on a second article, they don't open a second tab for it. If you're interested in this, then stalking Tilman on Meta might be worth your while.

(We won't get unique viewers, because Legal refused.) WhatamIdoing (talk) 15:53, 23 October 2018 (UTC)

Sorry, only just noticed this in the morass, and thanks for replying. Pinging SarahSV as well. What I'd be really interested in—although it would be a pain to do—would be if in the next reader survey, the WMF actually asked "did you think this article was too long/too short/about right" for a variety of articles, and whether it varies between core and obscure. Per User:Johnbod's cancer examples, and my own experience with art and artists, I'd be willing to bet that readers typically skim the "core" articles like Cancer until they find the particular piece of information or internal link they're looking for, but when they reach the specific subtopic are much more likely to read top-to-bottom. ‑ Iridescent 18:24, 30 October 2018 (UTC)

Responding to the ping (way) above, there's always a trade-off in writing anything between a simple, straightforward presentation (which will eliminate some facts and details), and a more complicated one (which will be "more accurate" but will take longer to read and be harder to understand).

I've written about this before in the context of drafting policies and ArbCom decisions. Do we want a simple and straightforward statement of what the rule is (which will invariably fail to anticipate some possible scenarios), or a more developed and complicated presentation (which will provide specific guidance for a greater range of possible events, but take longer to read and be harder to understand)? On the one hand, oversimplification leads to more disputes later on, and at best just kicks the can down the road. On the other hand, it is impossible to anticipate every possible set of facts even in theory, so there's a limit to how hard we should try. We should also remember that we are writing policies and guidelines for a website, not a criminal statute or a chapter of the Code of Federal Regulations.

The same trade-off exists in articles. If we write "the sky is blue," we're immediately half-wrong on average: unlike the dog in "Silver Blaze" we do things in the night-time. If we write "the sky is blue during the daytime," we're perennial optimists (or drought-lovers) who have wished the clouds away. If we write "the sky is blue during the daytime on a clear day," we're astronomical idiots who've never heard of solar eclipses. And if we write "the cloudless sky is blue during most daylight hours except during a total or annular eclipse of the sun," the reader will either be impressed by our attention to detail or bemused by how we can overcomplicate almost anything.

(Now I'm curious: how do we explain it? The lede of sky gives "During daylight, the sky appears to be blue...." Thus only one of the (at least) three qualifications is given. But then again, I could push back against including a reference to clouds because when I see a cloud I'm not seeing the sky; I'm seeing an obstruction that's in the way of seeing the "sky." So we need to spend more time defining "sky." The first sentence of sky is "The sky (or celestial dome) is everything that lies above the surface of the Earth, including the atmosphere and outer space." That is unclear as to whether the "sky" includes the clouds or doesn't, plus we have the extra bonus distraction of "celestial dome." I think I'll stop there for now, but perhaps I've made my point.)

Another oversimplified example—choose one: "Leap year is every fourth year." "Leap year is every fourth year, except that the years divisible by 100 aren't leap years." "Leap year is every fourth year, except that the years divisible by 100 aren't leap years, except that the years divisible by 400 are leap years." "Leap year is every fourth year, except that the years divisible by 100 aren't leap years, except that the years divisible by 400 are leap years, except that we'll probably skip a leap year one time about 3,000 years from now." Which is the most useful to the reader? Obviously all this information needs to be included in an article, but how to lay it out comprehensibly requires more writing skill than is generally appreciated.

By the way, on a different topic that has concerned both you (Iridescent) and me: I've come to conclude that variable and overcomplicated systems of referencing are one of the major deterrents to gaining and keeping contributors. I've been here however many years by this point, but I was drawn in partly because in 2006-2007 it was easy to add information to an article. If I found Wikipedia today I'm not sure I'd stick around after I'd messed up the referencing templates for the seventeenth time. But I digress. Newyorkbrad (talk) 23:11, 18 October 2018 (UTC)
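The first three leap-year statements quoted above can be compared side by side. This is only an illustrative sketch: the function names are invented for the example, and the fourth statement (a possible skipped leap year some 3,000 years from now) is left out because it is not part of the Gregorian rule as currently defined.

```python
def leap_every_fourth(year: int) -> bool:
    """Statement 1: every fourth year is a leap year."""
    return year % 4 == 0


def leap_except_centuries(year: int) -> bool:
    """Statement 2: every fourth year, except years divisible by 100."""
    return year % 4 == 0 and year % 100 != 0


def leap_gregorian(year: int) -> bool:
    """Statement 3: as statement 2, except years divisible by 400 are leap years."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


if __name__ == "__main__":
    # Years chosen to show where the three statements disagree.
    for year in (1900, 2000, 2018, 2020):
        print(
            year,
            leap_every_fourth(year),
            leap_except_centuries(year),
            leap_gregorian(year),
        )
```

The simplest statement is wrong only for century years such as 1900, and the second is wrong only for years such as 2000, which gives some sense of how rarely each added qualification matters to a reader.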

Continuing the digression, if one looks at the history of green, blue and red I have had an ongoing difference of opinion with an editor about blue skies, red blood and green leaves.....Cas Liber (talk · contribs) 23:25, 18 October 2018 (UTC)

I like the idea of a naturalistic study - get a bunch of people on Wikipedia and track what they read and later ask them what they read and why. NB: if everyone reads bits and pieces of big articles, if they are all different bits and pieces...then surely being comprehensive is a good thing, right? Cas Liber (talk · contribs) 23:25, 18 October 2018 (UTC)

m:Research:Which parts of an article do readers read

Quick link on this subject: m:Research:Which parts of an article do readers read was updated last month, to include information about the effects of Page Previews. WhatamIdoing (talk) 00:09, 10 November 2018 (UTC)

That's interesting—thanks. Interesting to see that they're also using my technique of assessing whether a reader is engaging with main page content by seeing how much of a spike there is in related topics (e.g. when Candaules, King of Lydia, Shews his Wife by Stealth to Gyges, One of his Ministers, as She Goes to Bed was TFA, it got about 90,000 pageviews, Candaules got about 13,000 and William Etty and Gyges of Lydia both got about 11,000, implying that of the people who clicked on it—and I have no illusions that most of those clicks were from people either intrigued by the peculiar title, or drawn in by the naked buttocks, rather than people with an actual interest in 19th-century history painting—about one reader in seven found the topic interesting enough that they wanted to know more).

I don't suppose there's been any research (either automated or by survey) regarding why readers leave pages—that is, of the 60.1% who viewed a mainspace page without opening a section, is it because all they wanted to know was in the infobox or lead, because they realized this wasn't the topic they were looking for, or because they don't realize that the apparently blank paragraphs are collapsed sections which they can open rather than sections that have yet to be written? (Don't discount that last one; I've no idea how common it is, but I can certainly anecdotally confirm that I've had people ask why the mobile versions of articles only include the text from the lead.)

I'm not entirely convinced by their conclusion that because the links in the lead are most likely to be clicked, that means readers aren't reading past the lead. Because the lead summarizes the most important content of the article, that's also where the links that are most relevant to readers of the article are going to appear first, pretty much by definition. One would expect readership of any article, in any reference work, to tail off towards the end, as assuming most readers work from top to bottom, anyone looking for a specific piece of information becomes progressively more likely to have found it the more of the article they read. (I'm astonished that as many as 10% of readers using the Android app are getting as far as the external links section.)

Incidentally, how on earth do they work out that "Links located on the left side of the screen are more likely to be clicked"? Surely what appears on the left and right of the screen is going to vary wildly depending on the user's font settings and browser window size? ‑ Iridescent 01:29, 10 November 2018 (UTC)

On the last question, the answer is that I don't know, but I know that the researchers have talked about the pros and cons of mouse-tracking and eye-gaze studies, so perhaps someone did one of those. You could also make a few assumptions, e.g., the first word in a paragraph is on the left, and links in captions are on the right.

Directly asking real-world users why they leave a page would require phab:T89970, which everybody wants to use, but nobody wants to spend a year building. You could ask people for their general recollections later.

I assume that many readers really only want the first bit of the article. The drop-off in page views that's attributable to the NAVPOPS-like tool demonstrates that pretty conclusively, as does our own experience (Did we put her article at George, Georgie, or Georgina? Let me go look...).

The problem of people thinking that the rest of the article doesn't exist is probably something that the Readers team should take a look at. I wonder if there is any way to document the existence of the problem. (Maybe someone complaining in a web forum or something?) WhatamIdoing (talk) 21:56, 16 November 2018 (UTC)

I can't think of any obvious way to measure the motivations for inactions, and whether someone not expanding the subsections isn't doing so because the lead has given them all they need, because seeing the lead has made them realize that this isn't actually the topic they were looking for, or because they aren't aware that the ∨ opens subsequent sections. It would probably need an a/b test in which a high-traffic article was configured such that half the readers were presented with the article as it normally appears and half the readers with the second section also expanded, and see if click-throughs to links that only appear in the second section are higher for those in the second group than the first. That said, a/b tests are a lot of work for what I imagine the WMF would consider a fairly limited return. A not-quite-as-accurate but much easier method would be to see if click-throughs to links that only appear in the later parts of the article are significantly higher from readers in desktop view compared to click-throughs from readers in mobile view in which these sections are hidden. Per my comments somewhere else in this morass, what I'd be really interested in seeing is how many readers feel they found what they were looking for, and whether that varies between people using different devices and between desktop/mobile view. ‑ Iridescent 22:35, 16 November 2018 (UTC)
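The a/b comparison described above could be evaluated with a standard two-proportion z-test. The sketch below is an assumption about how such an analysis might be done, not a description of any WMF tooling; the function name and every number in the usage example are invented for illustration.

```python
import math


def two_proportion_z_test(clicks_a: int, readers_a: int,
                          clicks_b: int, readers_b: int) -> tuple[float, float]:
    """Compare click-through rates of two groups of readers.

    Group A sees the article as it normally appears (sections collapsed);
    group B sees it with the second section also expanded.
    Returns (z statistic, two-sided p-value).
    """
    rate_a = clicks_a / readers_a
    rate_b = clicks_b / readers_b
    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (clicks_a + clicks_b) / (readers_a + readers_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / readers_a + 1 / readers_b))
    if std_err == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: clicks on a link that appears only in the second section.
    z, p = two_proportion_z_test(clicks_a=120, readers_a=10_000,
                                 clicks_b=180, readers_b=10_000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A positive z with a small p-value would suggest that readers shown the expanded section click its links more often. The same function would serve for the easier desktop-versus-mobile comparison, although there the two groups are not randomly assigned, so any difference could also reflect who uses which device.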

Here's another bit of research on this subject: https://www.youtube.com/watch?v=RKMFvi_CCB0 (slides at File:(WtWRW-20181211) Research Showcase Presentation.pdf) One result: How much you read depends upon how many options you have. I think we can assume that most, if not all, previous research was done at the English Wikipedia. WhatamIdoing (talk) 21:57, 13 December 2018 (UTC)

I suppose we have an issue (which that seems to bear out), in that our readers come for different things; some want to find out a particular piece of information and don't care about context, some want a brief summary of the topic, and some want to find out all there is to know about the topic. My aspiration is that the lead of an article should wherever possible sell the topic, such that people who intended only to find out a bit of brief background decide that it looks interesting and they'd like to know more, and end up reading the whole thing. (This goes somewhat against the MOS, which says that the lead should give away all the goods up front such that the reader doesn't need to bother reading the whole thing.) "How much you read depends on how many options you have" can go both ways; should we be aiming to provide simultaneously for those people who just want the Cliff's Notes version and who want the full story, or should we be trying to avoid dumbing down even though it will inconvenience some readers? ‑ Iridescent 15:23, 14 December 2018 (UTC)

If there's anyone still following this (other than WAID who's presumably already aware), this article by some guy at the WMF I'd never previously heard of might be of interest. I'd be very interested to see if the "25% of traffic is generated by people clicking on blue links" figure went up or down once hover-to-preview was rolled out as the default—e.g., was the effect of previewing "that looks interesting, I'd like to know more" or "actually, that's not what I was looking for"? ‑ Iridescent 19:32, 2 January 2019 (UTC)

Huh. Here on TVTropes the website owners believe that the net effect of previewing articles is to reduce readership, and thus refused to implement previewing. Jo-Jo Eumerus (talk, contributions) 20:38, 2 January 2019 (UTC)

TV Tropes is funded by ads, and as such has an obvious interest in keeping pageviews high. For Wikipedia, it's more important that readers find what they want. If someone reading Simeon Monument is confused by "the cupola would in turn be topped with a caduceus" they can hover over cupola and caduceus to see that "a cupola is a relatively small, most often dome-like, tall structure on top of a building" and "The caduceus is a short staff entwined by two serpents, sometimes surmounted by wings"; that's all they need to know so it doesn't break the flow of their reading the main article as clicking the link would, saving them time; that's consequently a win for the reader as they've saved time and seen enough of the linked articles to decide if they're topics that would interest them, and it's a win for Wikipedia as that reader goes away thinking "hey! Wikipedia is really informative and helpful, maybe I shouldn't keep dismissing that annoying yellow popup begging for money". Consequently it's a win-win for us, even if it means the total number of readers drops, but that wouldn't be the case for a site that depends for its survival on visitors seeing as many ads as possible.

What I'd be interested in is whether popups have encouraged rabbit hole browsing habits ("wow, a staff entwined by two serpents, that sounds really fascinating, let's click this link to find out more") or discouraged it ("I came here to read about street lighting in the early 19th century, not entwined staffs, so I'm not going to bother clicking that link"). Since that WMF blog post says that they measure the total number of readers clicking blue links (as opposed to coming to a page via the search bar or links from outside), it would presumably be easy enough to see if that number went up or down on the day previews were switched on. ‑ Iridescent 21:03, 2 January 2019 (UTC)

section break: the result of a decade of WP:CITEVAR

(re Newyorkbrad) On the referencing, I've said and will continue to say that we should have a sole house style for references. There would be some grumbling at first but people would get used to it soon enough; it's ridiculous that we expect editors to be familiar with a couple of thousand different templates

The contents of Category:Citation templates

(yes, some of those are sandboxes or miscategorized things that shouldn't be there, but most of them aren't; "a couple of thousand" isn't an exaggeration) and chide them for being unfamiliar with every obscure referencing convention in the world. I'd have thought the introduction of Visual Editor would be the perfect opportunity for this; only have it support a single reference template, meaning the only options for the next generation of editors are to comply with the template, manually format the citation themselves if they absolutely must display it in Bluebook for some reason, or use a bare URL.

(Re Casliber) Probably, although my point is we don't know. It might be that all those readers would prefer their bits and pieces spread across multiple small articles (i.e. instead of a single "Rail stations of Dutchess County" article we have eighteen separate and very similar articles), or they might prefer everything be merged into a single article so they only have to look in one place for whichever bit or piece they need (i.e. Infrastructure of the Brill Tramway lists all the paraphernalia, even though the reader is likely only interested in a specific part of it). The point is, we don't know. ‑ Iridescent 23:50, 18 October 2018 (UTC)

courtesy break so further replies don't need to scroll through the above

Because I was here for something else, I have read the above long thread with great interest. There's so much that is important, it's really a shame that the discussion is not in a more prominent venue. Kudpung กุดผึ้ง (talk) 03:16, 21 October 2018 (UTC)

It's Wikipedia and consequently all CC BY-SA—feel free to cherrypick the juicy parts if you want to post a summary somewhere else. This is actually one of Wikipedia's more prominent venues; the Signpost may have 256 active watchers whereas I have only 206, but most of those 206 represent the closest thing Wikipedia has to an elite. ‑ Iridescent 10:13, 21 October 2018 (UTC)

Comment and another parable re: paid moderators: Hi, I'm new to this thread, and just now skimmed it. I read Iridescent's parable of his dating site with great interest, as it has many parallels with the saga of the IMDB message boards, on which I was quite active from 2005 through their demise in February 2017. They started out with paid moderators who reviewed each notification of the user-reported system and took whatever action or non-action they deemed appropriate. Eventually people started gaming the reporting system, creating sockpuppets to double-down on reporting people they didn't like, but by and large the system worked well to remove and keep away troll posts. Then IMDB cheaped out and let the deletions become more and more automated, removing the paid moderators; at that point gaming obviously became easier. Then in 2007 IMDB added a buttload of new message boards to the system: boards that had nothing to do with films; unfortunately, many of the new boards were troll magnets, like Video Games and that sort of youth-skewing stuff, and boards on Politics, Religion, and other dens of iniquity. At the same time, so-called moderation became completely automated -- no human moderators. The kids and trolls from the boards like Video Games soon discovered they could wreak havoc not only on their favorite boards, but all over the message boards, with impunity. Automated reporting got overwhelmed and virtually ceased to work at all, because it punished people who conscientiously reported a lot (there was a lot to report!) by ignoring their reports after a certain number. Also, since IMDB allowed people to create an infinite number of sockpuppets, the sockpuppets not only overran the site, they overran the reporting system and fairly easily got anyone deleted or even blocked whom they didn't like. Long story short: By 2012 or so, IMDB was Troll Heaven. Everyone on the internet knew it, so anyone who wanted to troll headed for IMDB. 
One of Wikipedia's most notorious sockpuppeting trolls, with hundreds of socks, abandoned Wikipedia to do the same thing on IMDB. Many good people left in disgust, unable to have a civil conversation amidst the barrage of trolling. Finally, when the bad press got too great, IMDB gave two weeks' notice and then deleted all the boards completely. Of course this could have all been prevented had they simply created a subscription service to use the boards; legitimate film fans would have gladly paid $5 or $10 a month to use the message boards. That would have cut out the trolls and socks, and would have provided revenue to return to paid human moderators. But they didn't do that. Softlavender (talk) 04:22, 21 October 2018 (UTC); edited 10:06, 21 October 2018 (UTC)

You know, is it important whether these moderators are paid or not? I am a moderator on the website known as TV Tropes and we do not get paid there. Jo-Jo Eumerus (talk, contributions) 09:10, 21 October 2018 (UTC)

As Iridescent mentioned, it's a matter of size. Once a website becomes one of the largest user-generated sites on the internet, the volume tends to militate against volunteer moderation working adequately. Although TV Tropes is a pretty well-known site, it does not even approach a fraction of the amount of user-generated input as the second-by-second barrage of input that Wikipedia gets or that the hundreds of thousands of now-deleted IMDB message boards got. Or Facebook, etc. Softlavender (talk) 09:49, 21 October 2018 (UTC)

(edit conflict) Go back to where we started: the kind of people who want to act as volunteer moderators aren't always the people you would want as volunteer moderators. On something like TV Tropes which has a fairly tightly focused remit, and isn't high-profile or controversial enough for outside interests to have an interest in infiltrating or destabilizing it, it's probably not an issue as nobody would ever get to the "become a moderator" stage unless they had a strong interest in deconstructing TV, and consequently shared the site's purpose and values. (You won't even know TV Tropes exists unless you have an interest in the topics it covers; you'll know Wikipedia exists if you've ever done a Google search.) For a site like Wikipedia, which by its nature attracts a lot of "I don't want to do any of the work, I just like the idea of criticizing other peoples' work" types and is a constant target for spammers, not so much; since people who enjoy or are good at writing, coding, difficult cleanup or important maintenance are more likely to want to devote their free time to writing, coding, difficult cleanup or important maintenance, that means there's a constant tendency for the routine patrolling to be done by busybody "we must clean up all the trash!" types whose values don't really align with the rest of Wikipedia. But because the busybody types are the ones who hang round the noticeboards, Meta, RFA, the talkpages of Signpost articles etc, if you're not deeply familiar with the culture of en-wiki—which most of the WMF aren't—they're the ones who appear to be representative of the internal culture of Wikipedia, so the WMF assume that the interests of the busybody-patroller types are synonymous with the interests of Wikipedia. (We're talking about a situation where the WMF can conclude that "Ping users from the edit summary" and "Allow 'thanks' notification for a log entry" are higher priority than allowing VisualEditor to handle named references.) 
Hell, as I write an RFA for someone who's openly running on a platform of "I don't care about the content of Wikipedia, I just like reverting and deleting" is about to pass. ‑ Iridescent 10:13, 21 October 2018 (UTC)

VE and referencing

Visual editor not completely fucking up notifications would be a start, let alone letting it level up:

...via [[Plymouth]]{{Sfn|Schöttler|2010|p=417}}]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTESch%C3%B6ttler2010417-62|[54]]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTESch%C3%B6ttler2010417-61|[53]]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTESch%C3%B6ttler2010417-61|[53]]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTESch%C3%B6ttler2010417-61|[53]]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTESch%C3%B6ttler2010417-61|[53]]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTESch%C3%B6ttler2010417-59|[51]]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTESch%C3%B6ttler2010417-59|[51]]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTESch%C3%B6ttler2010417-59|[51]]] and [[Cherbourg-Octeville|Cherbourg{{Sfn|Lyon|1985|p=186}}]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTELyon1985186-63|[55]]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTELyon1985186-62|[54]]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTELyon1985186-62|[54]]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTELyon1985186-62|[54]]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTELyon1985186-62|[54]]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTELyon1985186-60|[52]]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTELyon1985186-60|[52]]][[User:Serial Number 54129/sandbox#cite%20note-FOOTNOTELyon1985186-60|[52]]] almost immediately.{{Sfn|Schöttler|2010|p=417}}

...any takers?! ——SerialNumber54129 11:54, 21 October 2018 (UTC)

I believe VE fucking things up falls into WAID's pigeonhole, but I don't really want to ping her as I'm sure she's sick of the sight of this thread. I do support the principle of VE 100%, but honestly if Lila had actively decided to run a feature launch to generate as much ill feeling as possible, she couldn't have gone about it better—"run fast and break things" is a great slogan but a disaster in practice. ‑ Iridescent 12:04, 21 October 2018 (UTC)

If you want me to repeat "sfn is not supported, and won't be" on the clock, then you'll have to ping my staff account. I can tell you that it wasn't just perverse willfulness or personal preference behind the decision; there's some complicated something or another that makes this simple-looking template actually be a royal pain. (They did try to explain once, several years ago, but my eyes glazed over fairly quickly, and I remember nothing of the explanation.) Now, of course, that team is doing little except mobile editing, so no new capabilities are expected there for another year. WhatamIdoing (talk) 03:36, 23 October 2018 (UTC)

Look up; I agree with the devs that VE shouldn't be expected to support all the variant reference formats. The problem is that at the moment we have two different systems (wikitext editing and VE) which are almost incompatible when it comes to referencing, as they handle them so differently, and there hasn't really been any effort to work towards a single style which both sets of editors can work with. ‑ Iridescent 09:55, 23 October 2018 (UTC)

Have you seen m:WMDE Technical Wishes/Book referencing? That should let us combine ref tags and {{rp}}, which is a step towards unification. WhatamIdoing (talk) 01:44, 26 October 2018 (UTC)

Just use {{r}} with |p= and be done with it. E Eng 02:14, 26 October 2018 (UTC)

Or just use {{sfn}} and stick to bloody source mode... ——SerialNumber54129 17:10, 30 October 2018 (UTC)

But r is 1000 times easier to use than sfn. E Eng 03:14, 31 October 2018 (UTC)

But (if I'm reading the documentation correctly), that requires list-defined references to function correctly, and LDRs bring a whole slew of problems of their own. New editors find the <ref>...</ref> system confusing, but they find LDR utterly incomprehensible (this is how Phineas Gage appears to someone using VE; see if you can work out how to add or change a reference). Consequently, converting a page to LDRs has the de facto effect of indefinitely semi-protecting it. From the editor point of view that's a good thing as anything which reduces the flow of well-intentioned noobs reduces the workload, but from Wikipedia's point of view it's strangling the next generation in the cradle. ‑ Iridescent 09:08, 31 October 2018 (UTC)

In fact you're not reading the {{r}} documentation correctly, but I've rewritten it to clarify [1]. As for PG, we've been over this before: editors can add, and have added, new refs just by doing what any editor would do i.e. use the familiar <ref></ref> machinery, which is perfectly compatible with what's already there and works exactly as expected. E Eng 22:13, 31 October 2018 (UTC)

But the {{r}} template still doesn't address the central issue, which is that VE can't cope with it so anyone attempting to amend the references using VE will turn the wikitext source into a mess of subst'ed codes.

It's more of an issue with {{r}} than with {{harv}} or {{sfn}}. As long as Wikipedia allows multiple different citation styles there will be people putting a steady stream of pressure on the devs to support sfn and the harv/harvnb templates, but if anyone approaches them telling them to support {{r}}, they'll (quite reasonably) say that they have enough on their plates trying to get VE to cope with the templates that are still live, and it's not a good use of their time supporting a legacy template whose use was deprecated almost a decade ago. (Sure, consensus can change, and one could theoretically hold a second RFC to un-deprecate it, but 26–3 was a clear enough supermajority that the WMF and devs are completely reasonable in assuming "en-wiki doesn't want this so we won't support it".) ‑ Iridescent 09:16, 1 November 2018 (UTC)

RFA and internal moderation

Good point on the last sentence. I found that worrisome as well but didn't want to go against the tsunami of support. People wring their hands about "RFA being broken", by which they mean too hard to pass, but in my mind it's way too easy to pass now, especially when there's peer pressure, unthinking bandwagonism, excessive participation by newbies, and a low percentage mandate. Softlavender (talk) 10:31, 21 October 2018 (UTC)

There are few successful RFAs not because RFA is hard to pass but because there are so many people saying RFA is hard to pass and it consequently discourages people from applying; plus, Wikipedia has been around for so long now that most of the active editors have either run at RFA or decided not to so there's a smaller pool to choose from. Looking at Wikipedia:Requests for adminship by year, I can't see any obvious miscarriages in the RFAs that didn't pass, whereas those candidates who weren't carrying some kind of problem-editing baggage tended to sail through with extremely high support. (Receiving 200 supports in an RFA was once so unusual that we created a dedicated page to document the phenomenon; this year five of the eight successful RFAs got that number.)

On the earlier point, how a site moderates itself is of great importance—in the current climate, arguably of paramount importance. It's obvious just from reading the news that the wild-west internet is becoming a political issue; when the Federal Communications Commission, the European Data Protection Board or the Information Commissioner's Office decides to call the big companies in for a chat, and Zuckerberg, Dorsey and Bezos point to the army of paid moderators they've hired to clean up the user-generated content and the teams of programmers they have working on scripts to spot fakes and libels before they go live, and Katherine Maher can only say "well, we kind of hoped some volunteers would take care of it", who's going to come out of that meeting worst? ‑ Iridescent 10:56, 21 October 2018 (UTC)

Just a tiny note re: number of vote(r)s in RFA: It's so high now because of the watchlist notification. Before that was implemented, unless you had the template on your userpage it was very easy to miss the fact that an RfA was occurring, even if you had WP:RFA watchlisted, because it's just a single blip there; whereas via watchlist everyone sees it, and the fewer pages one has watchlisted the easier it is to see the notice, hence all the newbies voting. Softlavender (talk) 11:09, 21 October 2018 (UTC)

In this particular case I'm not seeing many newbies, just a lot of "I've taken the candidate's word for it regarding his contributions and haven't bothered to check for myself and see that the purported creations were things like this, were all years ago, and that his only content creation this year was this". You can't legislate against laziness, but in this case the laziness is on the part of established editors not eager newcomers who don't know any better. ‑ Iridescent 12:19, 21 October 2018 (UTC)

I mean, as with everything on Wikipedia, it only works in practice. So "we kind of hoped some volunteers would take care of it" sounds bad, but who exactly is doing a better job of, say, handling bots or copyvios? I don't know about Facebook, but Reddit certainly abounds with obvious copyright violations, which Wikipedia at least tries to take care of. Galobtter (pingó mió) 11:05, 21 October 2018 (UTC)

We're better than the social networks at handling bots, as the format of Wikipedia articles isn't as conducive as that of Twitter and Facebook. We're not great at handling the Russian and Macedonian troll factories (who as I speak are duking it out on Jimmy's talkpage), and we're certainly not great at differentiating between constructive and unconstructive editing. (This thread has diverged slightly, but it was originally about mw:JADE and mw:ORES, and the possibilities and pitfalls of AI automating recent changes patrol.) ‑ Iridescent 11:33, 21 October 2018 (UTC)

We're "not great" but Facebook/Twitter seem to be absolutely awful at dealing with Russian trolls, with all the fake news/disinformation/ads/propaganda and so on spread through them. (I've been reading this thread as it pops up on my watchlist, and "diverged slightly" is probably the understatement of the year.) Galobtter (pingó mió) 11:50, 21 October 2018 (UTC)

Sure, but the point I'm making is that FB, Twitter, Amazon etc are now hiring people to patrol full-time, assess the nature of the problem, delete what they need to and develop strategies for preventing it in future. When the FCC calls the big players in, they're the ones who can point to the positive action they're taking to try to address the issue; we're the ones whose solution is ten paid Trust & Safety staff backed up by 511 volunteer admins of varying degrees of activity, and whose figurehead boss openly tolerates assorted trolls, racists and crackpots on his talk page. ‑ Iridescent 11:59, 21 October 2018 (UTC)

Or one could say we're not doing anything because we don't have as much of a problem. Not that I think the FCC is going to do anything as long as this person sits in office. Galobtter (pingó mió) 12:19, 21 October 2018 (UTC)

Heh—I'd say the FCC is considerably more likely to do something while this person sits in office; Bush Jr and Obama might not have cared for the press or open internet but had enough respect for the constitution to largely keep their mitts away from them, whereas the current administration is openly hostile to any medium they don't control. If the Republicans hold both houses in the midterms (a big if) I wouldn't give §230 more than eighteen months. In any case, when it comes to the internet it's the EU and UK who largely make the running, as they have the clout, and more importantly the willingness, to enforce rulings extraterritorially, and don't have the American fetishization of free speech above all else. ‑ Iridescent 12:25, 21 October 2018 (UTC)

I haven't kept up with this thread, but as I was about to say before I accidentally deleted the whole thing, here is a new article I just saw on teh interwebz on the subject of humans versus algorithms (on the chance it may be of interest): https://www.nytimes.com/2018/10/25/technology/apple-news-humans-algorithms.html. -- Softlavender (talk) 11:59, 27 October 2018 (UTC)

There was a mass-market book just published on the topic—the name escapes me, but there are posters for it on the side of buses so the publisher presumably expects it to sell. (I imagine they see it as the next Brief History of Time, as a book bought by people who are struggling to find gift ideas for nerds, and equally destined to languish unread on shelves.) ‑ Iridescent 16:57, 30 October 2018 (UTC)

Especially useful for those people who have coffee tables with one leg an inch too short :) ——SerialNumber54129 17:10, 30 October 2018 (UTC)

Found it. ‑ Iridescent 17:53, 30 October 2018 (UTC)

Infermedica

Hi there, I was on holidays recently and by the time I got back here, the article that I published has already been deleted and it seems you pulled the trigger on it. I'd like to rewrite it so it will comply with Wikipedia policies and publish it again - can you please help me access the last version of it? Klim3k 13:28, 3 January 2019 (UTC)

Temporarily restored to Draft:Infermedica (and a courtesy notification to User:DGG, as the editor who tagged it as spam, that I've done so). If you're going to restore it to the article space, it at minimum needs to demonstrate significant coverage in multiple, independent, reliable sources (my emphasis), and to be sourced to sources that meet these criteria. Reprinted press releases and coverage of routine announcements don't qualify; basically, we don't care what any article subject or people connected to the article subject say about themselves, we only care about what independent reliable sources have said about any given topic. ‑ Iridescent 16:17, 3 January 2019 (UTC)

Thank you, I have copied the content from the draft. I also appreciate your feedback about the sources - I will work harder on that matter the next time before I publish the article. Klim3k 16:32, 3 January 2019 (UTC)

May I suggest putting it through the Articles for creation process? That will provide more eyes and thus more feedback, and, once accepted, the article will be harder to delete. ——SerialNumber54129 16:50, 3 January 2019 (UTC)

Giant Snowman and NFOOTY

I don't want to clutter up the case, but in regards to your comments about the crap that goes on in the lower realms of football... isn't that really an indication that our threshold for notability on football-related articles is far too low? We're wasting our time fighting these battles because we've sunk below the level of true notability, whatever that is. Thanks for listening. Jacona (talk) 12:41, 3 January 2019 (UTC)

Yeah, we wouldn't need to worry as much about bad stats/updating stats if we didn't have football "biographies" that are basically entries in a database. Galobtter (pingó mió) 16:23, 3 January 2019 (UTC)

@Jacona, in my view WP:NFOOTY shouldn't exist and footballers should be subject to the same "the onus is on the article's author to demonstrate the non-trivial coverage in multiple, independent, reliable sources" test as anywhere else.^[1] Someone who's never played at international level or in a fully professional league can still be unquestionably notable in Wikipedia terms (someone who scores twenty goals in a match, the striker for the plucky non-league team who scored the goal that put Arsenal out of the FA Cup, the goalie who played the last 30 minutes with a broken leg because his team had used up all their subs and he didn't want to let the side down, the guy whose goal celebration is to rip off his shorts and run a lap of the pitch with his meat-and-two-veg flapping in the breeze); someone who unquestionably meets WP:NFOOTY can be completely non-notable in Wikipedia terms (someone who made a single international appearance for Micronesia in a dead-rubber scoreless draw with Kiribati, the teenager who was brought on for the final three minutes of the last game of the season in the German 3. Liga because his manager wanted to blood him, who then injured his knee and never played again, and to be brutally honest about 90% of women's football^[2]).

I get why these guidelines were introduced—without them we'd have endless squabbles about whether each individual player is notable enough—but they're not working. The discussion that created that guideline took place at a time when Wikipedia was a third of the size it is now, and football coverage was largely focused on top-level leagues and the lower leagues of major footballing countries. As such, it was reasonable to assume that provided one could demonstrate that someone had made a league appearance for Midtable United, there would be at minimum a biography of him in the local paper and an article about him in The League Paper, that the absence of them as sources was just an artefact of nobody having bothered to dig it out yet, and that consequently we should work on the assumption that professional footballers in professional leagues are always notable.^[3] Unfortunately, some people have understood this to mean "every person who has ever played in a professional league must have a biography on Wikipedia", and as a consequence we've acquired a huge stack of unread and unmaintained but undeletable microstub biographies.

As it's not practical to watch them all, the proliferation of microstubs and unwatched articles means the handful of admins who try to prevent the football project degenerating completely end up having to revert en masse when they see something problematic going on. In my opinion GiantSnowman wasn't really to blame for the current fuckup; he found himself in a position where he was trying to enforce two mutually contradictory policies, had to choose whether WP:BLP trumps WP:ADMINACCT and WP:BOTPOL, and in good faith made what turned out to be the wrong call.^[4] This case is almost certainly going to end in GS's desysop or being banned from reverting (which in the context of the work he does, may as well be an indefinite block, since it's impossible to keep biographies clean without reverting), as the usual crowd of Defenders of the Wiki Against the Evil Admin Cabal have all turned up baying for his head and I doubt the incoming committee will have the nerve for their very first action to be standing up to them and guaranteeing they spend the next two years reading hit-pieces against themselves in the Signpost (which in its most recent incarnations has abandoned any pretence at impartiality and become the de facto mouthpiece of the "the WMF are stifling my rights and the admins are their stooges except for those admins who are my friends who are all paragons of virtue!" bell-ends who lost their spiritual home when the old Wikipedia Review closed).

If it's any consolation, the situation is even worse with cricketers. The cricket project has the same "every professional is notable" and the same "our local consensus trumps Wikipedia's normal notability rules and we'll fight anyone who tries to demand notability be demonstrated" mentality; unfortunately, the English cricket archives at Lord's were destroyed in a fire in 1825, meaning we have huge stacks of biographies from before that date like Clifton (1817 cricketer) where we literally know no information at all (not even the person's name) but can't delete them "Because WP:NCRIC".

(Pinging Dweller, Ymblanter and The Rambling Man, as the three people I most often see trying to stop the football articles degenerating into complete gray goo, for their thoughts.) ‑ Iridescent 17:44, 3 January 2019 (UTC)

^[1] This goes for most of the other biographical special notability guidelines as well, as they have the same issue of encouraging indiscriminateness and stand-alone articles for topics that would be better served as part of a list. And don't get me started on whoever decided "every listed building in England is notable" was a sensible idea (at the last count there were around half a million of the things).
^[2] Even a top-flight high-profile women's team like Liverpool has an average gate of 724. Outside the US, women's football never took off and aside from the full internationals most of these women aren't household names in their own households, but it's considered somehow improper on Wikipedia to suggest the men's and women's games aren't equivalent in importance.
^[3] This system does work well for things like railway stations, where it's a reasonable assumption that even the most obscure station will at minimum have had "New Station Opens" and "Station Closes" appearing in the local newspaper, but doesn't translate well to people.
^[4] The cynic in me says that had GS gone the other way, and decided that it was more important to assume good faith even if it meant allowing problematic edits to remain live, then not only would there then be a crowd baying for his blood on the grounds that he'd knowingly allowed potentially untrue statements to stand, the crowd would consist of exactly the same people.

I don't edit football but I handle too many football bios at AfC so I've got a lot of football pages on my watchlist. This "two games in a fully professional team" business qualifies way too many players for bios. Lots of football players don't attract significant coverage. Contrast that with people who have built up a large successful business over many years, generating all kinds of press, yet it is almost impossible to get them an article that is not quickly labeled as spam. People need to realize that being a football player is a business. The players need attention to get hired and make more money, even more than business founders do generally.

I disagree with your evidence at the GS case. Football is not a special topic. Football is a "fan topic" like music, movies, celebs, crypto, and many others. All fan topics attract similar kinds of fan edits. Generally someone sees a fact on TV and rushes to edit the page or reads the page and sees something they "know" is wrong so they fix it. We should not be suspending our general editing policies to accommodate how a few people want to manage football. Legacypac (talk) 16:40, 3 January 2019 (UTC)

No, football is unique on Wikipedia. Most pop-culture topics—film, music, sport, visual arts—are specific to a particular culture or region; what makes football a special case is both that its coverage is fairly evenly spread worldwide, and that the players tend to move freely around the world and (outside players in the top flights of the big four leagues and PSG) only tend to get coverage in the media of the countries in which they're currently playing. Consequently, to write a biography of Nam Tae-hee in detail would require the author to be able to read—and to access—Korean, English, French and Arabic sources; good luck finding someone who can fact-check that. ‑ Iridescent 17:44, 3 January 2019 (UTC)

I didn't think it was held that "every listed building in England is notable" - it used not to be, and should not be. Grade 1 and probably Grade 2-star are different, but there are not so many of these. What has been defended is that every Dutch Rijksmonument is notable - there's a couple of screens-full on some aged planks over a small irrigation ditch somewhere. And of course all those 1940s gas stations and 1840s brick boxes on American official listings. Johnbod (talk) 18:28, 3 January 2019 (UTC)

@Johnbod it was unilaterally changed in 2012, to slip it in; it now reads Artificial geographical features that are officially assigned the status of cultural heritage or national heritage, or of any other protected status on a national level and which verifiable information beyond simple statistics are available are presumed to be notable (my emphasis). In my experience this has consistently been interpreted to mean "any article on a listed structure is undeletable", which is why Wikipedia is now graced by heaps of untouchable drivel like 7 & 9 Bounds Green Road, Ye Olde Dolphin Inne and 128 New King's Road. ‑ Iridescent 18:45, 3 January 2019 (UTC)

(adding) This isn't just a UK problem, although the English predilection for dishing out listed building status to assorted fences and signposts makes it particularly problematic in England. See Category:Stub-Class National Register of Historic Places articles if you want reassurance that this kind of undeletable stub crapflood affects the US as well. ‑ Iridescent 18:52, 3 January 2019 (UTC)

Ye Olde Dolphin Inne is not crapflood drivel, it's Grade II listed!! (...and it's haunted!! so there) Martinevans123 (talk) 18:54, 3 January 2019 (UTC)

Looking more closely at that took me—by way of Category:National Inventory Pubs—to Boleyn Tavern, which has now supplanted Radcliffe & Maconie as my new favourite unintentionally hilarious Wikipedia article. ‑ Iridescent 18:58, 3 January 2019 (UTC)

Gandhi drinking Cream soda?? What's not to love! Martinevans123 (talk) 19:01, 3 January 2019 (UTC)

The FIFA PR-piece masquerading as a "source" is an equally rich vein of comedy gold. ("Sadly, there is no evidence proving that Gandhi ever turned out himself for any of the teams or took on any coaching roles".) ‑ Iridescent 19:15, 3 January 2019 (UTC)

Nice pub, though some of the antics we used to get up to there would have fallen slightly foul of his doctrine of peaceful protest ;) ——SerialNumber54129 19:20, 3 January 2019 (UTC)

The Old Dolphin is one of the oldest pubs in the country. With proper research, there's a lot more that could be said about that. I agree we don't need an article on every listed building, but a lot of them will be notable, and half a million is only about 3% of all the buildings in England (probably including most of its churches, which will have enough coverage elsewhere to sustain an article). Certainly anything grade I or II* (the top <10% of the 3%) could justify an article. I'd rather not have hundreds of three-sentence articles about each identical house on a street, but even that would be less problematic than all the microstubs on people who have moved a ball from one place to another. HJ Mitchell | Penny for your thoughts? 19:25, 3 January 2019 (UTC)

"Dolphins are people too, you know." dig it? Martinevans123 (talk) 20:46, 3 January 2019 (UTC)

Thanks for pinging me. I generally agree with your analysis. In my understanding, WP:GNG is a general policy, and specific criteria (including NFOOTY) should just serve as indicators of which articles are likely to be notable because sources exist. Footballers who fail NFOOTY but pass GNG should be (and are already considered as) notable; I have seen cases when an article was kept at AfD if it failed NFOOTY but passed GNG. On the other hand, if a reasonable effort was made to look for sources, and the conclusion was that sources do not exist, the article should be deleted or draftified even if it passes NFOOTY. If a player for instance was not notable, and the sources did not exist, the mere fact that he played one match in a low-level professional league is unlikely to generate plenty of sources. However, I have no idea how it could be implemented.--Ymblanter (talk) 18:41, 3 January 2019 (UTC)

Iridescent - thanks for your comments/support. If I get de-mopped or topic banned etc. I'll most likely retire tbh.

Ymblanter - there is plenty of AFD consensus that passing NFOOTBALL but not passing GNG (and not being likely to either) is not sufficient for notability - see eg this and this, which both contain lists of others. Giant Snowman 19:19, 3 January 2019 (UTC)

What complicates matters in that regard is that any reporting on players is now discounted as "routine coverage", which takes the initial assumption (that a player in x-league will get coverage, therefore it must exist) and turns it into the opposite of its intent. Agathoclea (talk) 19:29, 3 January 2019 (UTC)

<personal opinion> I'd say that "routine coverage" would be along the lines of "Midtable Rovers have signed Carlos Kickaball from Pateadores de Pelota for an undisclosed sum" and "in the 87th minute Carlos Kickaball came on as a substitute for Midtable Rovers, player rating 6.8". "Non routine coverage" would be "In an exclusive interview for the Midtable Echo, Rovers legend Arthur Flattcapp discusses promising youngster Kickaball's bright future" and "Carlos Kickaball speaks of his homesickness and submits a transfer request". As a very rough rule of thumb (for all articles, not just sports), if the entire text of the article could be generated from Wikidata by a bot were Wikidata to be given the appropriate facts and figures, it's probably not an article Wikipedia should be hosting. ‑ Iridescent 19:48, 3 January 2019 (UTC)

Seen the ping, TLDR, what's the question? --Dweller (talk) Become old fashioned! 20:13, 3 January 2019 (UTC)

'Does WP:NFOOTY's relatively low bar mean we have so many stubs on marginally notable players that it's impossible to check every change in detail, particularly for players in overseas leagues where the sources are patchy and often not in English, meaning the kind of "revert if you're not sure" action that got GS in trouble is the only way to ensure WP:BLP is enforced, and if so should we consider enforcing the "multiple, independent, non-trivial coverage" criteria more rigorously even though it will mean deleting articles like Ephron Mason-Clark (not singling him out, picked at random), which will in turn cause bad feeling?' ‑ Iridescent 20:33, 3 January 2019 (UTC)

That's several questions in one. Do we have too many stub articles? Yes. Is the solution to delete them? No. The solution is to find more editors and not to chase away those who pop up over the parapet. Does it make BLP impossible to enforce? Not really, we're quite good at catching serious infractions of BLP because we're quite good at catching vandalism generally across the squillions of articles on all topics. Did I answer all the questions? --Dweller (talk) Become old fashioned! 21:05, 3 January 2019 (UTC)

I guess the root question is "how do we stop this happening again?", since even if GS isn't sanctioned this case will presumably discourage other people from reverting questionable football-related edits. When the NFOOTY guideline (along with many others) was written we had 2.1 million articles and 44127 active editors (47 articles per editor); we now have 5.8 million articles and 30900 active editors (188 articles per editor), and at some point the elastic is going to snap. (None of this is really a conversation for which my talk page is the appropriate place, but it's ended up here as a result of my comments at the GS arb case.) ‑ Iridescent 21:20, 3 January 2019 (UTC)

I don't think the GS case will prevent people doing anything that policy says should be done in football or any other kind of article. I still don't understand the problem. Give me an example? Happy to discuss this anywhere. --Dweller (talk) Become old fashioned! 10:07, 4 January 2019 (UTC)

Sidetrack (sic) about rail stations

Related to several of the things you’ve said, but I’ve personally never gotten why rail stations and roads are probably the most sacrosanct articles on Wikipedia. I don’t consider them particularly harmful and that’s basically my standard for deletion, but they’ve always been in the same boat as the guy who hurt his knee in the Micronesian football match for me: things that have an intense niche following but that no one really cares about outside of that. I’m sure they can meet the GNG, but I personally don’t find that a particularly compelling guideline (the squirrel your cat killed likely meets it if taken literally). All that to say, I’m actually curious about your thoughts here since rail is your thing. Again, I’m not advocating deletion, just musing that as a practical matter the defunct railway station in middle-of-nowhere Kansas isn’t any different than the rando footballers in my view. TonyBallioni (talk) 13:43, 4 January 2019 (UTC)

Passenger railway stations are inherently notable. Especially for stations that were open before automobiles became widespread, even the most isolated strip-of-concrete station was a key part of the local economy, and in many cases the reason the local community existed in the first place. This is particularly true in North America, where the railroads built stations in open countryside and waited for communities to grow up around them, a process documented fairly accurately in the Little House books—look at a map of the Great Plains or western Canada and you can see that to this day, the population centers are strung out like beads along the former railroad lines.

Although the process was most pronounced in the Americas, it also happened in the Old World; see Metro-land for the most famous example in England, or look at a map of Asian Russia and see how all the new cities developed in the Soviet era are strung out along the Trans-Siberian Railway.

Even for stations that were built to serve already-existing communities, connection to the railway network was a major event with drastic implications on everything from house prices (as commuters move in), to the makeup of local industries (railroads make bulk shipment economically viable, allowing industries like brickworks, mining and grain processing that aren't viable by road transport, to operate in the area). Where towns have lost their rail connection, losing the connection invariably had a significant impact on the local economy; factories closing, people moving out, a steep drop in tourism for those towns that had a tourist trade.

And all of these economic and cultural impacts are invariably documented; even for places that don't have local newspapers, any station will have been the subject of extensive coverage in both the specialist railway press (for its opening and closing, if nothing else) and in the regional media for its broader impact. When you see stubs like East Hampton station or Poyle Halt railway station it's not because the sources don't exist, it's because nobody has yet bothered to dig the sources out. (Before I started expanding it, Droxford railway station looked like this.)

Where I disagree with the trains project is when it comes to the "one station, one article" policy. In my view, in many cases it's more useful to the readers either to have a single article on all the stations in a town allowing readers to compare and contrast the different services (example), or a single article on all the stations on a particular line allowing readers to see the impact of the route as a whole rather than as a series of disconnected pages (example). This is a battle that's been lost, however. Some of the other trains people (@Slambo, Redrose64, and Mjroots:) might be better able than me to give an opposing view explaining the benefits of stand-alone pages for each station. ‑ Iridescent 16:59, 4 January 2019 (UTC)

I tend to agree, Iridescent. The vast majority of stations should be capable of sustaining a stand-alone article, but it depends on the sources available. Take the stations on the Réseau des Bains de Mer network in France. Few sources (in English at any rate), so they are dealt with in the article on the system. Mjroots (talk) 17:11, 4 January 2019 (UTC)

Entirely agree, though it's a pity the articles so rarely live up to this noble vision! The place I've lived in for 20 years+ was all orchards before the railway arrived and started building on the land they'd prudently acquired around the station site. Johnbod (talk) 18:47, 4 January 2019 (UTC)

I was about to chime in with a "surely you don't mean tiny countryside halts" (an example from my favourite cycling route, although now I've named a specific station you'll probably dig up enough sources to write an FA just to prove me wrong! ;)), but then I read the suggestion about grouping them by line. But as a rule, I'd say it's possible to write a decent encyclopaedia article on all but the smallest of stations, at least those in the UK that are still open. The oldest of them are over 150 years old (originally at least), so they've had plenty of time to be written about and many of them are listed buildings (526 in England according to a quick search, but I haven't checked how accurate the search results are). HJ Mitchell | Penny for your thoughts? 20:16, 4 January 2019 (UTC)

Even the tiny halts; the main source for Watergate Halt would be North Devon and Cornwall Junction Light Railway by C.F.D. Whetmath and Douglas Stuckey, with Vic Mitchell and Keith Smith's Branch Lines to Torrington for the technical stuff. I wouldn't consider it a sensible use of anyone's time, but especially in England—where the trainspotting subculture means these things have been obsessively documented—it's possible to expand even the most obscure country halt. Per my comment above, I think Watergate Halt railway station would be far more useful as a subsection of North Devon and Cornwall Junction Light Railway so it can be compared and contrasted with other stations on the route, but from a technical perspective expanding it wouldn't be difficult.

If anything, the abandoned Devon lines are probably a higher priority than most, given that NWR are seriously considering giving up the endless fight against the climate at Dawlish and rerouting the main line to Plymouth onto one of the disused North Devon routes (although when the day comes, a rerouted main line would be much more likely to go via Okehampton than Barnstaple/Bideford, as more of the rights-of-way remain in place). ‑ Iridescent 20:52, 4 January 2019 (UTC)

There's talk of reinstating the old LSWR route via Okehampton but that's still about 30 miles from me. There's next to no chance of anything further north reopening; besides the Exeter-Barnstaple branch line northern Somerset/Devon/Cornwall is a railway desert. Certainly Watergate will never see a train again. HJ Mitchell | Penny for your thoughts? 23:13, 4 January 2019 (UTC)

Stranger things have happened. (Did anyone ever think Tavistock railway station or Stow railway station would exist again?) Assuming things go to plan in a sector which has a notorious reputation for falling behind schedule, Virgin Orbit will move in to NQY in 2021; if Britain stomps out of the ESA and follows through with Theresa May's posturing to create its own rival GPS constellation, there will suddenly be a need to bring large quantities of heavy equipment and larger quantities of residents into North Cornwall, without overwhelming the existing rickety infrastructure. The far southwest could become Silicon Valley with pasties. ‑ Iridescent 23:42, 4 January 2019 (UTC)

Grouping them together sounds reasonable, and your explanation makes a lot of sense. I suppose my point is that from the perspective of an outsider who likes Amtrak but doesn’t really care about much else, one article per station makes about as much sense as one article per footballer. Any niche topic area with inherent notability is going to have that. I could write an essay on why every bishop in a major Christian Church is inherently notable (I’ll spare you), but I once had to explain to a semi-experienced editor why the Capuchin bishop who translated the Bible into Kashmiri didn’t qualify for A7 by virtue of the episcopal consecration. I think you’ve made the very valid point that Wikipedia is best for collecting detailed information on niche topic areas, and I guess my general view is that if the niche articles aren’t harming the project, I’m fine with the permastubs if the people who care about them think they make sense. TonyBallioni (talk) 20:46, 4 January 2019 (UTC)

I think the point (or at least my point, not that I'm a station aficionado although I do have a penchant for a nice bridge, even if it's only just possible to scrape together 200 words), is that they shouldn't be permastubs. There's enough to say about their history, impact on the area, location, and architecture to produce at least a couple of hundred words, and the sources exist (general railway histories, books on railway architecture or the engineer/architect responsible, railway magazines, local newspapers, histories of the local area, histories of the railway company responsible, NHLE entries, etc), it just takes someone with the inclination and access to the material to pull it all together. HJ Mitchell | Penny for your thoughts? 21:03, 4 January 2019 (UTC)

@TonyBallioni, the sports biographies have two issues that the railway stations and medieval bishops don't; firstly, that if they're still active they need to be updated every week (at least while their sport is in season); secondly, that they're living people. If we say that the disused station of Buttmunch, ND was used to ship 30,000 tons of grain and 100,000 tons of coal per year when it actually shipped 100,000 tons of grain and 30,000 tons of coal, that's an embarrassing mistake but one that's easily corrected. If we say that Carlos Kickaball scored 30 goals in 100 appearances when he actually scored 100 goals in 30 appearances, we're arguably defaming a living person by misrepresenting someone extraordinarily gifted as someone unexceptional, and we're certainly making an untrue statement in a BLP and consequently "should be removed immediately and without waiting for discussion" comes into force. The issue we have is that when it comes to soccer biographies we're talking about a lot of articles (112,994, to be precise; the full list crashes the server to list in full, but there are over 20,000 biographies of English footballers alone), a limited and shrinking pool of people to monitor them, and a general head-in-sand "eventually the Wisdom Of Crowds will fix every error so we needn't worry" attitude from too many people.

@HJ Mitchell, yes, exactly. The reason I claim stations have inherent notability is because every station is significant enough that what I did to Wood Siding or Droxford could be done to any one of them regardless of how apparently insignificant it is. ‑ Iridescent 22:34, 4 January 2019 (UTC)

You have actually managed to explain the issues people have with NSPORT in a way that’s practical and isn’t crazy wiki-idealogues yelling at each other. This makes so much sense now. TonyBallioni (talk) 23:53, 4 January 2019 (UTC)

I'm still a little split on the need for an article on every individual station that ever existed, especially for rapid transit and tram/light rail lines; but I'm leaning more toward seeing them of value rather than harm. When the articles started to appear around 2006, I thought it was overkill. I didn't push the issue because I didn't think the argument was strong enough to delete them all and replace them with summary list articles, and we've seen articles on apparently random stations become high quality content for strange reasons (Jordanhill, as one obvious example developing from a single sentence stub to a GA and an FA candidate). It was the potential for station articles to become high quality content that kept me on the keep side. Small country stations in the US in the 19th century often formed the nucleus of town and city development, so it seemed plausible to me that this could happen for a lot of articles.

However, rapid transit and light rail lines are often all planned at once, so the individual stops and stations aren't necessarily special in their own rights. Many such articles (e.g. Sawtelle) are still stubs, and a lot of research is still needed to find the reliable sources to bring the articles up in quality. A lot of articles like this are now more than a decade old and still stubs. The obvious counterpoints to this are the multitude of articles on New York City Subway and London Underground stations. Each of those station articles is being built out to high quality content.

I guess the thing for me is that having all of the stations have individual articles provides a framework for editors to build on. It takes time to do good research and to put high quality content together. The stubs are placeholders until someone has the time to work on them. The real problem is finding editors who have the time and resources to work on them. The vast majority of editors are volunteering their time, and after being in leadership positions at several wholly volunteer run organizations, I've found the more reliable way to get work done is to recognize and thank the volunteers for the work that they are able to do rather than complain about the work that doesn't get done. Slambo (Speak) 22:18, 4 January 2019 (UTC)

I do not think tram stops are notable. I am not a native English speaker, and I may miss the exact meaning of rapid transit, but if these are what we call metro or subway (or, well, the Tube) in Europe, I do not see how they can be not notable. I created a number of articles of metro stations in Russia, and my standard to start an article is a couple of paragraphs and two-three sources. If a new station gets opened there are dozens of articles in media about the event.--Ymblanter (talk) 22:40, 4 January 2019 (UTC)

(ec, re to Slambo) I'd agree that every station could theoretically be a full article, and obviously something like Droxford railway station wouldn't sit comfortably as an entry in a list at Meon Valley Railway. My point is that where the long articles don't exist, in many cases it would make more sense to have them as entries on a list, rather than have a dozen identical articles saying nothing more than "Foo station was opened by the Bar Rail Corporation in 1896 as part of the Placeholder Valley Line. It had two platforms and a nondescript building built to exactly the same design as every other station on the line", which is too often the current situation. (This is even the case with the core-interest topics like the London Underground; for every Quainton Road or Green Park there's an Upney or Burnt Oak.) ‑ Iridescent 22:42, 4 January 2019 (UTC)

@Ymblanter, Tram stops can be notable—particularly in England, it's not uncommon for them to be on converted railway lines and have inherited the former railway station's buildings and history. As far as I know, the only tram networks for which Wikipedia routinely treats the individual stops as automatically notable are Muni Metro (which serves the Wikimedia Foundation's home city, so good luck getting garbage like Judah and 43rd Avenue station deleted even though it's nothing more than a bus shelter), and Manchester Metrolink which is something of a special case as there's actually quite a bit to say about every individual stop. ‑ Iridescent 22:48, 4 January 2019 (UTC)

Yes, indeed, my point was that while some tram stops can be notable (especially if there are buildings attached or whatever), most are not, sometimes there is not even a shelter and they are getting moved on a regular basis. (Which again can be different country to country or even system to system). For metro systems, I am sure there is sufficient coverage for every station - though it does not necessarily mean that having individual articles is the best solution.--Ymblanter (talk) 09:18, 5 January 2019 (UTC)

I think we both agree. I'd argue that some tram stops are notable—and for some networks like Tramlink in south London the trams are so heavily used and so strongly integrated into the transport network that they serve as de facto metro trains and we should cover every stop, while in other places like Brussels (where the trams run underground) the line between "tram" and "train" is hopelessly blurred. However, nobody will convince me that Duboce and Noe station is really a viable topic just because they call it "station" instead of "stop" and because the buses that stop there happen to be on rails rather than tires. This tram stop is 3km from the Wikimedia Foundation and on the direct route between the WMF offices and the beach; if the article is only 77 words long, it's almost certainly because there are only 77 words to say about it.

That said, the metro station that's quite literally on the WMF's doorstep only has a 263-word stub; maybe the transport in SF really is that boring. If anyone reading is hoping to jump aboard the gravy train (So nice to know that $150,000 of donor funds per year go to an "Agile Coach"), you could do worse than expand that article, as presumably that's a page our insect overlords look at regularly. ‑ Iridescent 16:10, 5 January 2019 (UTC)

DYK for Simeon Monument

On 4 January 2019, Did you know was updated with a fact from the article Simeon Monument, which you recently created, substantially expanded, or brought to good article status. The fact was ... that, upon completion of the Simeon Monument (pictured), a local resident complained that "among the generality of the inhabitants it is called a p****** post"? The nomination discussion and review may be seen at Template:Did you know nominations/Simeon Monument. You are welcome to check how many page hits the article got while on the front page (here's how, Simeon Monument), and it may be added to the statistics page if the total is over 5,000. Finally, if you know of an interesting fact from another recently created article, then please feel free to suggest it on the Did you know talk page.

Alex Shih (talk) 00:01, 4 January 2019 (UTC)

Good job on the article. Six stars for you. Newyorkbrad (talk) 14:16, 4 January 2019 (UTC)

Six out of how many? I'm mildly frustrated at this one, as a topic which ought to be quite interesting turns out to be so boring. ‑ Iridescent 17:04, 4 January 2019 (UTC)

How so? You beat down the opposition at DYK :) ——SerialNumber54129 17:15, 4 January 2019 (UTC)

I don't have a problem with TRM raising objections; he can be grumpy but understands that not everyone agrees with him. Conversations with him tend to be along the lines of (1) "I don't think you should do it like that" (2) "I did it like that for a reason, here's why" (3) either "Oh, OK, I see your point" or "I still think it's a problem, how about this third way that addresses both issues". I have no issue with that and don't see it as beating down the opposition; where I have an issue is with the outright trolls like Kevin who make up non-existent issues just to give themselves something to complain about. (You have one of his made-up complaints heading your way next week, incidentally; "There is no evidence that he was prosecuted for either crime; prostitutes were not usually arrested in London during this period, and sodomy was pursued in ecclesiastical courts" clearly means that you're trying to claim that people committed sodomy in court.) ‑ Iridescent 17:35, 4 January 2019 (UTC)

Me neither; ERRORS2 is a veritable university compared to the creche of DYK (which is where I meant, apologies for any confusion +ASPERSIONS @TRM!).
Yeah that's par for the course I suppose. Although, ironically, I've already got my !vote in early elsewhere ;) ——SerialNumber54129 17:47, 4 January 2019 (UTC)

Having not read the article or the nomination, I assumed that there would be a WP:CENSORED hue and cry somewhere about the use of asterisks and was mildly disappointed not to find anything (having to be satisfied with the Ultima Thule renaming discussion instead). I should have realised that censoring of that type would never be allowed to happen on Wikipedia, and it is explained here on the talk page (and in the article and at the nomination). Carcharoth (talk) 17:25, 4 January 2019 (UTC) P.S. Wasn't there a thread on your talk page about some church architectural feature that was commonly used for urination? I remember it, but can't find it right now. Maybe it was somewhere else? Carcharoth (talk) 17:33, 4 January 2019 (UTC)

TRM raised a concern that the asterisks might confuse readers, but per my comments there I don't consider it an issue. Either someone understands it to mean "pissing post", or they don't understand it and click the article to see the sentence in question next to an explanation that people probably urinated against it (the closest we can get to drawing a connection without straying into OR, in the absence of a source saying "no, Man definitely meant pissing"), or they don't think the topic looks interesting, don't click on it, and it's of no relevance to the reader. Are you thinking of the mighty Anti urination devices in Norwich? ‑ Iridescent 17:36, 4 January 2019 (UTC)

That's the one! :-) Carcharoth (talk) 17:41, 4 January 2019 (UTC)

Once you know they exist, you start noticing them everywhere you get the combination of historic architecture, dark alleyways and large numbers of drunks. There's probably quite a good book to be written about how their design varies by region and over time. ‑ Iridescent 17:44, 4 January 2019 (UTC)

*Crawls out from underneath Lava balloons and African humid period* I am sure I've seen such objects in Regensburg... Jo-Jo Eumerus (talk, contributions) 19:20, 4 January 2019 (UTC)

It would make sense; Regensburg has the trifecta of historic architecture, narrow dark streets, and a university to provide an endless supply of drunk students. The Norwich ones—and I imagine the Regensburg ones—make an effort to blend in; those in less scenic locations show no such concern. Other parts of Germany have adopted a more innovative approach to the problem. (Defensive architecture is fascinating once you start looking out for it. The anti-urination devices, Camden benches, and strategically placed plant-pots to discourage cyclists and skaters are fairly straightforward, but the field is a deep rabbit-hole; this and this are both devices to prevent prostitutes working in your doorway, for instance.) ‑ Iridescent 19:42, 4 January 2019 (UTC)

16,213 views on the day. Nobody's bothered to update WP:DYKSTATS for January yet, but unless something currently in the news or something particularly interesting comes along that has a fighting chance of being the most-viewed DYK of the month. I guess there's a previously-untapped vein of interest in early 19th-century street lighting in Berkshire among our readership, since the alternative—that I actually know what I'm doing when it comes to writing hooks or blurbs that are factually accurate but are interesting enough to draw readers into clicking to read articles on topics that are well outside most readers' comfort zone—is one which most of the self-appointed owners of the main page have long ago dismissed out of hand. ‑ Iridescent 16:31, 5 January 2019 (UTC)

Completely apropos nothing, but do you have any recommendations for the Regents Park area? It's both geographically and intellectually some miles from me :) ——SerialNumber54129 16:12, 6 January 2019 (UTC)

Do you mean recommendations in the sense of "where are some good pubs round there?" or "how can that article be improved"? ‑ Iridescent 16:14, 6 January 2019 (UTC)

Well, the former would be most fun :) but, yeah, more accurately, the history of its development, and, indeed, its pre-Nash days? All things considered, that article scores pretty high on the "Quality Inverse to Importance" scale. ——SerialNumber54129 16:28, 6 January 2019 (UTC)

If I were doing it, I'd:

Round up a copy of Paul Rabbitts's Regents Park and write a new article from scratch using Rabbitts as the framework;
Go to either the London Canal Museum or the National Waterways Museum and scrounge a history of Regent's Canal, as one can't write a history of one without covering the other in detail;
Get brief histories of the Zoo and Winfield House (or persuade someone else to write them);
Then, and only then, go through the existing article and see if there's anything mentioned there that isn't in your new article; decide whether they should be included or not.
Overwrite the existing PoS with the contents of your sandbox.

Personally, I wouldn't even mention Nash and Burton's buildings. Whoever added that section to the existing article has confused the broad areas Camden and Westminster councils call "Regent's Park" (confusingly, they each have a ward called "Regent's Park") with the park from which the districts take their name. Either the article should be about the park, in which case the only buildings that should be mentioned are those within the park, or it's about the larger district of Regent's Park in which case it also needs to include the Regent's Park Estate, Regent's Park Barracks, the assorted churches, houses, shops and police stations along Albany Street, the tourist traps around Baker Street station—pretty much everything in the rough square bounded by Park/Wellington/Finchley Roads, the West Coast Main Line, and the A501.

If you want my honest opinion, I wouldn't consider it worth the effort. I'd imagine the pageview count is deceptively high and most readers are just looking for directions from the station to the zoo, and it will be virtually impossible to keep it in shape as you'll have an unending stream of spammers from the diploma mill collecting fat wads of cash from gullible Americans who don't understand the English academic system and rich Sloanes whose grades were too low to be accepted by any legitimate institution "private university" that's taken up residence in the old Bedford College building. If you want to improve a high-importance London park, Hampstead Heath would almost certainly be easier both to write and to maintain. ‑ Iridescent 17:04, 6 January 2019 (UTC)

Apologies, I misled you; I wasn't thinking about rewriting either of them (which isn't to say they don't need it), but I hadn't saved to sandbox at that point. It was more specifically the ice house—niche!—and I was looking for some historical/social/political/architectural/anythingreally stuff for context. I'd gone to the Regents Park page to mine it for sources...and found it wanting.

Incidentally (even more apropos nothing) to provide further grist to your mill/criticism of ODNB, their article doesn't even give an author. Unbelievable. Someone should remind them of WP:ATT :) OK, that might be a minor figure, but if you have access, have a look at their entry on Wallis Simpson. Probably the most famous woman in the world of her time (if for many of the wrong reasons) and they don't even credit their writer! Anyway, sorry to waste your time re. the Park. And way to crush the so-called uni :) ——SerialNumber54129 20:44, 6 January 2019 (UTC)

Take the hysterical coverage in the papers of the ice house, who are treating it like some kind of Victoriana equivalent of the discovery of Tutankhamun's pyramid, with an extreme pinch of salt. Everyone's always known that there was an ice-house there for storing ice shipped up the canal for onwards shipping by canal and rail, to the extent that the Wetherspoons up the road is called "The Ice Wharf"; what's happened is "construction works lead to old building being re-exposed" rather than any kind of amazing archaeological discovery. (We had the exact same hysterical "amazing lost building discovered!" a couple of years ago with Southwark Park railway station despite the fact that they could have read about it on Wikipedia any time since 2008 and the 'long lost' building is clearly visible from the main road; we'll no doubt see a steady stream of similar stories once the HS2 demolitions in London and Birmingham start in earnest.) ‑ Iridescent 21:15, 6 January 2019 (UTC)

Yeah I really want more from MOLA, but I guess it's far too soon for them to have published anything. Or do they release draft / preliminary reports? ——SerialNumber54129 21:18, 6 January 2019 (UTC)

You could ask them; even if nothing's been published they might be able to send you something. Personally, I find MOLA/MOLAS works virtually useless for Wikipedia purposes, as they're so hyper-specialist; they're great if you want to write List of the dimensions of each individual brick in London's Roman amphitheatre, but useless if you want to write Public entertainment in Roman London (the sources for which have been sitting in front of me for two years but which I keep finding pretexts not to start). ‑ Iridescent 21:23, 6 January 2019 (UTC)

Brilliant :) Might be worth asking for a photo; one of those from within the thing would be good. Free WP:PROMO for MOLAs! Thanks for the advice though. ——SerialNumber54129 21:37, 6 January 2019 (UTC)

cough Tutankhamun's pyramid? A. Parrot (talk) 21:46, 6 January 2019 (UTC)

Bike Shedding and Deck Chair arranging

Where do you get all these idioms? How can I have got through life missing all these. Thanks for armchair quarterback by the way. Edaham (talk) 10:34, 16 January 2019 (UTC)

From the articles we have that discuss these idioms, perhaps? Jo-Jo Eumerus (talk, contributions) 17:21, 16 January 2019 (UTC)

Beautiful story about an animator at Interplay Entertainment working on the game Battle Chess. I'm surprised the term "pet ducking" hasn't come about, since it has a nice pun built in around the idea of ducking a pointless redrafting, by including a readily apparent and easy to fix blunder. Edaham (talk) 02:35, 17 January 2019 (UTC)

"Armchair quarterback" entered the language in the 1950s, and is now standard idiomatic English, gradually replacing the earlier equivalent "armchair general" and "armchair critic" which have been idiomatic English since the 19th century; I would imagine its rise correlates pretty much exactly to the spread of televised football. ("A person who thinks he knows how to direct affairs in which he is not taking part. In a similar sense one talks of “back-seat drivers.”" has been the definition in Brewer's since at least the 1920s.) Variations on "rearranging the deckchairs on the Titanic" first appeared in the 1970s and again have become a standard enough part of English that "Titanic" is now implicit. "Bike-shedding" is a slightly more niche term, and is used primarily when discussing collaborative decision-making by people with varying degrees of competence in the field in which their decisions are being made; however, Wikipedia is one of the canonical examples of collaborative decision-making in which the decision-makers have varying levels of competence and experience regarding the topics on which they're making decisions, and as such the phrase is firmly embedded within our own idiolect. "Duck feature" is indeed a term used by programmers for exactly this kind of intentionally unnecessary sacrificial software feature (the notion that it originated with Battle Chess is entirely apocryphal). ‑ Iridescent 15:23, 20 January 2019 (UTC)

Here's a reasonable little talk about C. Northcote Parkinson and "bikeshedding": [2] Martinevans123 (talk) 15:32, 20 January 2019 (UTC)

On researches and talk-pages

Some months ago, I came across a WMF-research project that sought to build a tool to help editors and administrators deal with incivility in uniform ways. Among the aims were also to design a system that automatically flags certain behaviours as civil or uncivil. The stuff about Study 2 (specifically), is worth reading........:(

Luckily, the research concluded a month ago, with the observation ....what the language of incivility is continues to be a difficult question to answer.......recommend against building any kind of automatic detection tool to deal with incivility.

Incidentally, it simultaneously observed (on another locus):- .....the NPOV policy was sometimes used to prevent women from writing articles, or articles from being written about women. According to our participants, many editors view men's point of view as inherently neutral, while women's point of view is inherently biased. This, (in turn) leads them to state that policies are (mis)used by editors to silence others, particularly women, or to prevent speech about women.

WMF-researches have produced some brilliant out-of-the world observations in the past and it does not help that I am yet to come across any case where somebody was invoking NPOV and imparting excess scrutiny to certain edits just for the very-premises of a woman editing a woman-subject. And, the proclaimed-prevalence of the community-thoughts that a particular gender is inherently biased, seems too weird. On the other hand, I might be missing some fundamental point or they are oversimplifying stuff or they might be basing their observations on some quite-outlying article; ('GG-stuff' maybe?)

Any ideas? ∯WBG^converse 17:42, 6 January 2019 (UTC)

I can't think of any examples of anyone—individually or collectively—actively discriminating against female subjects; if anything, we tend to relax the notability rules a little more in recognition of the fact that historical biographies of women can be harder to source. There does seem to be an attitude among a small but still statistically significant faction of editors that "she was a woman so anything she did probably wasn't important", but not really enough to skew decision making to any great degree. There is a meta-issue in that historically women were excluded from a lot of occupations and consequently had less opportunity to engage in activity that would make them notable in Wikipedia terms, but that's a bias only in the sense that reality is biased and we need to reflect reality; I consider things like Welsh Wikipedia's boast of having equal numbers of male and female biographies to be evidence of massive systemic bias on their part, since it's a straightforward statement of fact that there are more men than women in the historical record.

To be honest, this "research proposal" just looks like someone with an axe to grind rather than someone with a genuine concern, and I wouldn't be surprised if one of the long-term cranks who occasionally try to piggyback their pet grudges onto legitimate discussions about gender coverage is somehow involved. (More specifically, I suspect this proposal had "find a pretext to block Eric Corbett" in mind, and Eric's effective resignation means the WMF have no reason to take it any further.)

As with virtually every piece of research that comes out of the WMF, you can safely ignore it; in my experience WMF-sponsored research invariably consists of the researcher and/or the WMF deciding what conclusion they want to come to, and then cherry-picking facts to support that conclusion. If anyone doesn't feel that's the case here, I'd urge them to have a look at the questions which were asked which are blatantly leading; there's literally no way that pseudo-research could have come to any conclusion other than "everyone on Wikipedia is rude". Reading the conclusions, I don't see anything there other than generic "I found it upsetting that someone had an opinion that differed from my own" snowflakery. (That may be an artefact of the conclusions being badly written rather than the evidence not existing, but certainly if they're going to come to the conclusion that incivility is rife you'd expect them to provide some examples. It's always been my experience that the people who shout loudest about how everyone else on Wikipedia is being rude to them, go strangely silent when asked to provide an actual example of a fight they were involved in which they didn't start.) ‑ Iridescent 18:14, 6 January 2019 (UTC)

Paging SlimVirgin who might know more about this than me; I've been ignoring the GGTF and their associated projects since they went off the rails a couple of years ago. ‑ Iridescent 18:27, 6 January 2019 (UTC)

Also, going to put this here, since until just now I certainly wasn't aware of it. ‑ Iridescent 19:41, 6 January 2019 (UTC)

I was reminded of your comment here about there being more men than women in the historical record (an obvious point that I've belaboured as well in the past) when looking over: Clerk of the House of Commons (which nearly had a woman in the role recently). I ended up there because the Speaker and his ilk are in the news at the moment. I wonder if much can be made of the red-links Archibald Milman, Thomas Lonsdale Webster, Horace Dawkins, Barnett Cocks, David Lidderdale and Richard Barlas? The first I tried (Horace Dawkins) wasn't that promising. Barnett Cocks looks more interesting (depending on what people find interesting). We have an article on Thomas Lonsdale Webster's son: Thomas Bertram Lonsdale Webster. Carcharoth (talk) 17:09, 9 January 2019 (UTC)

In general, it probably wouldn't be possible (or worthwhile). Clerk of the Commons is a respected position, but ultimately 99% of the time a rather dull functionary role, and there won't have been much written about most of them other than routine announcements. I'd see it as akin to judges; some of them are obviously extremely notable because of their involvement with high-profile cases or decisions that set major precedents, but they don't have intrinsic notability in the way that MPs do.

To draw a parallel with the railway station thread above, elected representatives are railway stations and clerks are bus stops. It's reasonable to presume that even the dullest politician has been the subject of significant coverage during his or her election campaigns and as a spokesperson on issues affecting their constituency; the same presumption doesn't hold for clerks, and consequently if they're to have a Wikipedia article the significant coverage needs to be demonstrated and can't just be assumed. ‑ Iridescent 17:41, 11 January 2019 (UTC)

All very true. Though in practice, there may be more articles than expected. I moved on to look at Clerk of the Parliaments (House of Lords) and found we have articles on every holder of that position since 1788 (there have been 20 in total since that date). Many of the earlier ones were elected politicians or members of the Lords or otherwise titled or became a baronet, which explains some of the more obscure ones having articles. There are stubs such as the current holder of the post: Edward Ollard. The other more recent ones seem OK, if not that interesting to read (Beamish stands out for his Mastermind connection). Michael Davies (parliamentary official) has some family history and the line: "He was also the first Clerk to have relied on email and had a laptop at the table." Going further back: Henry Badeley, 1st Baron Badeley, John Shaw-Lefevre, and George Rose (politician) are the best-developed articles of the bunch, but all due mostly to their other activities/roles/titles. Such lists of holders of roles do give a definite sense of how things have changed over time as many distinguished roles become less so as institutions and society changes. Carcharoth (talk) 14:05, 12 January 2019 (UTC)

To draw a further parallel with the railway station thread above, the clerks are probably an even better example than railway stations of a group where a single list would be of more use to readers than a category full of stubs. Expanding the existing list at Clerk of the House of Commons#List of Clerks of the House of Commons such that it had a one- or two-paragraph biography of each entry would be useful to readers as it would allow them to draw their own conclusions as to how the roles and responsibilities of the job had changed over the years, while avoiding the WP:SYNTH issue we'd encounter if we tried to write an article on how the role has changed. ‑ Iridescent 18:36, 12 January 2019 (UTC)

That might work, but lists can be difficult to do that way sometimes. Do you have an example that can be pointed to as the 'right' way to do this? Changing subject completely, while trying to find another article on someone of the same name, I came across Anne Phillips (geologist) and Anne Phillips (field assistant), with a merge tag on them since August 2018. Not that long, but feels like it needs more attention somehow. Shouldn't be that difficult as the latter article (shorter and written later) only has three citations to a source not used in the earlier article. It probably does need a little bit of care, though. And the correct attributions, as always. Carcharoth (talk) 12:57, 19 January 2019 (UTC)

The "right way" would depend on the topic in question as to exactly how such a thing would be formatted. My go-to example of the kind of list I mean would be Infrastructure of the Brill Tramway#Stations, which allows the reader to skim through and compare the six largely similar buildings, while avoiding the necessary repetition of the 'background' sections of stand-alone articles. (It also means Church Siding—a lump of mud at which trains would occasionally stop to pick up milk cans—can have its existence appropriately noted, without the need to create the pointless two-line stub Wikipedia's "one station, one article" approach theoretically mandates.) List of paintings by Gustav Klimt is a good example of the same thing done with a table rather than a bullet-point approach; the paintings that are significant enough to warrant their own article get a brief summary and a link to their own article and consequently don't unduly overshadow the other entries, and there's enough information about all the entries—those with an article and those without—that readers can get an idea of the similarities and differences and of the themes Klimt painted.

In my opinion, because Wikipedia writers are so used to the "follow link, skim the article, go back to what you were reading before" workflow there's a tendency to forget that most readers don't behave like that, and either read the article in full without following links, or follow a link but don't return to the former page; as a consequence, I think we have a tendency not to include as much explanatory information on list pages as we should. Treat the entries on a list article as you would the list of nominations if you were writing an article in the local paper on a commendation or awards ceremony; at minimum, each entry on the list should contain enough information so the reader can decide "is this person/building/painting somebody about whom I'd be interested in learning more?". (I know I sound like a broken record on the topic, but the reason there's currently a 300kb thread about this up above is that I firmly believe much of the Wikipedia editor community is too firmly fixated on appearances, and forgets that our only function is to try to ensure readers find the information they want and are made aware of other information they may find useful, not to comply with arbitrary policies on what is and isn't appropriate regardless of whether it affects utility to the readers.) ‑ Iridescent 15:25, 20 January 2019 (UTC)

{adding} List of extant papal tombs is a good example of a well-designed stand-alone list. Sylvester Medal is a biographical list with what I'd consider an appropriate level of detail. My general rule when it comes to the appropriate level of detail on Wikipedia is "does the page contain enough context that someone reading a printout of it would understand it?". ‑ Iridescent 07:49, 22 January 2019 (UTC)

[1] This goes for most of the other biographical special notability guidelines as well, as they have the same issue of encouraging indiscriminateness and stand-alone articles for topics that would be better served as part of a list. And don't get me started on whoever decided "every listed building in England is notable" was a sensible idea (at the last count there were around half a million of the things).

[2] Even a top-flight high-profile women's team like Liverpool has an average gate of 724. Outside the US, women's football never took off and aside from the full internationals most of these women aren't household names in their own households, but it's considered somehow improper on Wikipedia to suggest the men's and women's games aren't equivalent in importance.

[3] This system does work well for things like railway stations, where it's a reasonable assumption that even the most obscure station will at minimum have had "New Station Opens" and "Station Closes" appearing in the local newspaper, but doesn't translate well to people.

[4] The cynic in me says that had GS gone the other way, and decided that it was more important to assume good faith even if it meant allowing problematic edits to remain live, then not only would there be a crowd baying for his blood on the grounds that he'd knowingly allowed potentially untrue statements to stand, the crowd would consist of exactly the same people.
