
Wikipedia talk:Flagged revisions/Archive 6


Weak point...

Well, there is a massive weak point to this "sighted articles are free of vandalism" statement - a case I definitely encountered within our test phase (that is still running, I suppose? Or why didn't WP/de have a poll on this?):

Fake vandalism

Not even a small group of editors is versed well enough in all kinds of sub-topics. So it is possible for a vandal to place fake vandalism - and get the fake sighted as vandalism-free. The particular case I'm referring to was a faker adding some cr*p to the German One Piece article, acting as if he were in Japan and knew what would happen at the very end of this series, which is supposed to end in several years. This edit was sighted here - and stayed for hours as "vandalism-free". I was the one who reverted the edit and had to remove the flag.

Especially due to this case I'm convinced that editors only need the "one-click-revert" feature. And everything that goes beyond this creates unnecessary overhead... --Defchris (talk) 01:49, 12 June 2008 (UTC) Besides: For how long are we "testing"? And how many unsighted articles are still in our WP/de? Last week I read something about 30% that were sighted.

So you're saying that if an improvement can not cure all problems it should not be performed?
Overhead for whom? How many hundreds of thousands of bad pages must readers see before a little review effort is justified on the part of Wikipedia? ... And why should flagging increase overhead? That a good revision is flagged means that it doesn't need to be reviewed 100x over by vandalism checkers.
German is up to >40%. --Gmaxwell (talk) 02:17, 12 June 2008 (UTC)
This is an inherent issue with having a publicly editable site, regardless of flagged revisions. Articles may contain inaccurate content, added intentionally as vandalism or unintentionally. Fact-checking is not part of normal vandalism patrol. That would be a requirement for a higher level of flagging, something tied to existing quality control processes like WP:FAC. Mr. Z-man 02:31, 12 June 2008 (UTC)
Exactly. That is why I proposed above for the first level of flagged revs to be solely about overt vandalism and other obvious issues, and not subtle things. Random89 06:39, 12 June 2008 (UTC)
I suppose it is possible that vandals who find they can't get away with obvious vandalism will start doing subtle vandalism instead. Hut 8.5 06:34, 12 June 2008 (UTC)

@Gmaxwell: It's not only because of this fake vandalism problem I'm saying that we don't need the flagged revisions.

  • The German magazine Stern compared different German articles to other encyclopedic articles from the Brockhaus - [1] (in German). And we were better in almost every aspect. So we don't need the flagged revisions because of quality issues.
  • There are still some issues with this feature: Articles can't be marked as sighted properly if there's an unsighted template within. And if this template is fully protected you need a sysop to mark it sighted. (Well the article is sighted but there's a draft "copy")
  • Many German users complain that this feature and using it is confusing - the readers and the ones that want to edit sighted articles can be confused because of flagged articles containing vandalism and a more actual version that isn't flagged. If there's no difference between flagged article and draft the editor has to look why there's still a draft.
  • If a non-editor reverts vandalism it has to be reviewed and flagged.
  • Edits are reverted because the editor wants to - vandal-free edits don't implicate this edit to be flagged. Users that don't have editor rights can't do anything about this - even if they revert this an editor's flag is already seen as "correct".
  • It'll be much more work beyond the usual research for articles if you do more than just one .
  • Everybody who's not an editor is suspected to be a vandal - but you even can't rely on the editors.

I have editor rights in the German Wikipedia - but I'm not marking any article without reading it completely and confirming, by comparing it to other versions, that the edits are free of (fake) vandalism. If I don't know anything about a topic I will flag neither an edit nor the article itself - at all. So overall, this doubles the effort and work I'm putting into Wikipedia - and it's too much. If I didn't have the editor rights my edits might be unflagged for days - and that's unacceptable.

And I didn't even complain about the "certified/surveyed" flags: This feature is asking for trouble between authors, editors and surveyors if the surveyor is well known to set his own so-called "higher expectations" on articles than the guidelines ask for. We already have users that are trolling through proposing deletion, edit wars, demanding changes to articles that need more time than our proposed seven days etc.

You guys really want more conflict potential? --Defchris (talk) 11:48, 12 June 2008 (UTC)

A couple of issues:
  • Just because the selected articles compared well with Brockhaus doesn't mean the project can't do better.
  • I don't know what you mean by "articles can't be sighted properly". If there is some issue, it should be explained and go on bugzilla.
  • What do you mean by "many readers"? How many? And what proportion? And how often is the sighted version vandalized? Especially where the current version is not and the reader noticed that. It's hard to buy that this is a common problem without some evidence.
  • I'm not sure what you're describing when you say "vandal-free edits don't implicate this edit to be flagged".
  • I don't see how everyone else is "suspected" of being a vandal. Suspected in this case means that one is seen as likely to have vandalized. That is just not the case. The point is to filter out a good portion of disruptive edits.
  • As for reverts needing to be sighted, and the workload, many of these issues could be solved by making more surveyors. Having queried toolserver, only a portion of the possible surveyors have been granted rights yet.
It will require more oversight and eyes, which seems like a good thing. It may be easy (less work) to not look over edits as much, but it also means fewer people finding and removing vandalism. This would only cause conflict for people not conforming to policy and good conduct. If bad conduct and "trolling" (I don't know how true that is either) is really a problem, then it needs to be addressed specifically, not by hindering things it happens to bleed on to. Also, no proof that this is even a widespread problem is given. I could easily go to special pages to find unflaggings and then point out some conflict in there. But there is already conflict in editing. Without any sense of scale it becomes meaningless. Aaron Schulz 21:01, 13 June 2008 (UTC)
BTW, de:One Piece clearly shows that the wish for "opening up" Wikipedia articles cannot be achieved by this feature. After several months of being semi-protected, that article was unlocked. After three weeks it returned to being semi-protected because flagged revisions - look at all the reverts in the history. Please look for "rückgängig gemacht" as well as "Revert". Defchris (talk) 17:11, 12 June 2008 (UTC)
Reverts are probably not the best criterion to measure the success of flagged revisions. An important criterion should be the number of times a member of the public sees something bad, and I'm pretty confident that flagging will win there even vs semi-protection, as even semi-protected pages can get a lot of reverts. --Gmaxwell (talk) 22:30, 13 June 2008 (UTC)
Well, it's the only really useful and comparable criterion to measure its success - there's no effect on heavily vandalised articles. If vandals/fakers look for a way to "have fun" they'll find a way - even if they have to register an account.
There's a "public-opinion poll" - please have a look at Contra #125. de:Benutzer:Lienhard Schulz explains in detail (and in German) what's wrong with these flagged revisions - much better than I could translate into English. Sorry. Defchris (talk) 01:12, 30 June 2008 (UTC)

This is an encyclopedia, not a social networking site ... last time I checked, anyway

I would like to remind everybody here that regardless of their impressions of what wikis in general should look like and how they should operate, this particular wiki – Wikipedia – is dedicated to developing a reliable, free, useful encyclopedia that readers can trust, not providing a social networking site with a fun little encyclopedia-building project off to one side. Flagged revisions appears to me to be the only possible way of making Wikipedia stable.

I would also like to state that I personally think that the stable revision of articles should be displayed by default to logged-out readers. (I've changed my position on this in proposals that I have made mainly because some part of flagged revisions is probably better than no part of flagged revisions.) I think that seeing vandalised articles is much, much more likely to deter potential contributors than edits not appearing immediately. And if we can't keep tabs on the flags of our articles, perhaps we've simply grown too big for our community?

I think that many of the opinions expressed on this page, while given in good faith, are underinformed and misled. Please consider what's best for Wikipedia's readership, instead of continually worrying about how we are going to handle it or whether we are going to like it.

Best and friendly regards, – Thomas H. Larsen 00:18, 21 June 2008 (UTC)

How is keeping the current version of the article on default making this a social networking site? According to the five pillars, part of free is that anyone can edit, so people who oppose flagged revisions on the grounds that people wouldn't see their edits immediately aren't underinformed and misled.
You do make a good point that seeing vandalism is likely to deter contributors as well, and as of now, I can't decide yet which would be better, but for me, not seeing my changes right away would have been a bigger deterrent than seeing vandalism on a page. Almost all of the IP contributions on my watchlist are fixing typos and grammar; many times it is their only contribution. I think seeing vandalism would mostly inspire a reader to edit the page to fix it, not just to shun Wikipedia forever (in most cases; I'll admit some vandalism that hits close to home could deter some users forever). I think that most people here have legitimate concerns, so please don't try to dismiss them as being "selfish".
Respectfully, - J kasd 04:33, 21 June 2008 (UTC)

This is an encyclopedia everyone can edit and this may change if we implement Flagged revisions. Why do anonymous users contribute to Wikipedia articles? They know their contributions will be marked and seen by every visitor (I know it from experience; I was a long-time anonymous contributor before I decided to get an account). Why correct a spelling error or explain a technical term if I don't know how long it will take to be able to see the correction, or whether it will be visible at all? With this model we are becoming more similar to Nupedia, the thing Jimbo Wales took us away from. This may become an encyclopedia where anyone can edit, but where most edits go unnoticed for hours at least. Yes, there is vandalism and this project will decrease the amount of it significantly, but I remember reading somewhere (maybe Wikipedia:Statistics) that vandals make up only about 3% of all anonymous edits. Is chasing off the 3% bad worth putting in jeopardy the 97% good?

A significant number of anonymous editors can and will be chased away by the fact that they have to register, then wait a few days to become autoconfirmed (in the least stringent version) and only then be able to make sure their edits will be seen by other users. Wikipedia's user base consists of 48,400,000 users, most of whom very rarely make edits. Anonymous users account for a large percentage of these edits. If we add Flagged revisions, the user base may sharply increase, but the number of edits by anonymous contributors will fall. Wikipedia will start to resemble a forum that allows all visitors to post in one section, but to post in all other sections, one must register and wait until they become autoconfirmed to enable the public to see their revisions. If I had encountered such a system when I first came to Wikipedia, I would never have registered here. Admiral Norton (talk) 11:34, 26 July 2008 (UTC)

P.S. Not to mention vandal fighting has become a style of editing here. If we kick out all the vandals, people will lose their "jobs" and possibly quit Wikipedia. Admiral Norton (talk) 11:35, 26 July 2008 (UTC)

3% is very optimistic (the actual figure is about 20%) but the point still stands - we would still be chasing off many good editors for every vandal we get rid of. And also bear in mind that it is from the ranks of these unregistered and new contributors that Wikipedia's experienced editors and administrators are drawn and we could seriously deplete our experienced user base if we scare off new contributors. Hut 8.5 11:52, 26 July 2008 (UTC)

Focus on the negative

Looking at this article in New Scientist about convincing people, these proposals make the following mistakes:

  1. The proposals are framed as trying to show the benefits of flagging. The fact is that people are more convinced by opposition.
  2. The proposals try to give strong and authoritative arguments. The fact is that such arguments can dangerously backfire. Once a strong argument is resisted, a person is even more convinced that this is a bad idea.

Instead of defending the changes, proposals should focus on what is wrong with the status quo. I think this might explain why change is so often resisted on the Wiki. Merzul (talk) 10:29, 21 June 2008 (UTC)

Shouldn't it be pretty self-evident what's wrong with the status quo? --Conti| 12:59, 21 June 2008 (UTC)
That's what many of us might think, but it is much easier to immediately think of a few things that are wrong with this proposal and then oppose it. Merzul (talk) 18:35, 21 June 2008 (UTC)
Merzul has a point. It's not obvious to many editors here: Most of them have been desensitized to how shocking or confusing vandalism can be, and few of them appreciate how much readership volume there is here.
The line of thinking is "So the article said the senator liked bestiality, but we fixed it in 15 minutes. So what?" But in those 15 minutes hundreds of readers may have seen the vandalized text, and almost none of them have any clue how Wikipedia works. Most don't have any idea that Wikipedia is open for *anyone* to edit at *any time* without *any review*. It's simply unthinkable to most people, so it's not obvious. So when they see "and he enjoys bestiality" their thoughts are along the lines of "Omg. Is that true? The article looks pretty reasonable otherwise. God what a cretin! Hm. Maybe they made a typo and intended to say that he opposes it? It seems like an important subject but they say nothing else. I'm so confused!" ... and perhaps 1/1000 of these confused members of the public send a confused email to OTRS. Some small fraction believe it and spread it. Another small fraction might go and write the senator.
And, of course, the harm and confusion of vandalism aren't just limited to BLP articles.
We simply can't expect the general public to deeply understand the sausage making of Wikipedia. They don't have time and interest for it and they have other things to worry about. All we can really expect them to understand about Wikipedia is "It's untrustworthy". I think it would be really sad if much of the value of Wikipedia were lost to the public simply because it was so confusingly untrustworthy that people ignored it completely.
Readers dramatically outnumber people who have made an edit here. If we were going to be truly democratic in governance, shouldn't we ask readers what they want here? After all ... they are the ones most impacted by the decision. I find it rather hard to believe that many readers would say no to "Would you prefer that changes to Wikipedia articles be reviewed before displaying them to the public?" ... I'd bet most would say, "You mean that isn't happening already!?@#! Hell yes! Tell me when it happens because I'm not going back there again until it does!" :) --Gmaxwell (talk) 19:33, 21 June 2008 (UTC)
Excellent. - Dan Dank55 (talk)(mistakes) 19:51, 21 June 2008 (UTC)
I logged on to OTRS looking for an example of this, and, not surprisingly, found one in a couple minutes. Someone emailed us regarding vandalism on Lou Piniella. Normally, vandalism is reverted hours before someone ends up replying to the email, but in this case it stuck for more than a day before I reverted it. We know that most things on Wikipedia should be taken with a grain of salt until checking sources and the page history for recent vandalism, but this person "thought everything on Wikipedia was true" (until now of course). While it's possible it was sarcasm (sarcasm doesn't travel well through plain text), it's far more likely, as Gmaxwell says, that he, like many people, really has no clue where content on Wikipedia really comes from. Mr. Z-man 20:20, 21 June 2008 (UTC)
It's true that we can't expect the general public to understand the processes inside WP. But one must not forget that flagged revisions themselves are one of these processes. I'm an editor at German WP; most people I know are not, but nearly all of them use it as readers. None of them has a clue of what is going on at the moment. Most of them intuitively take "sighted" as a mark of quality, meaning that sighted articles are reliable, while non-sighted ones are crap, which of course is definitely not true. Gmaxwell's example might or might not have passed the revision on de:wp; if it had, the damage done would even have been much bigger with a "sighted" mark on it. In fact, one couldn't even have blamed the person who sighted this, for it is no obvious vandalism.
Most readers are not aware of the meaning of the "sighted" mark and won't read our internal guidelines. The question is, do we want to delude them into thinking an article is reliable when we simply can't guarantee for it? Let's face it, as long as the wiki principle is at work, WP will never be completely reliable. Flagged revisions surely won't change this. In my opinion it's a very bad idea to have people think otherwise, and that's exactly what this system does, whether we like it or not. Greetings from Berlin, -- 87.123.124.145 (talk) 16:06, 28 June 2008 (UTC)
I'm not familiar enough with de.wikipedia to know what readers are expecting. I can tell you that readers of en.wikipedia.org are likely to judge the entire encyclopedia by whatever they read, and not look at the talk page or any markings; most readers don't know what the bronze star for WP:FAs means. If we find that, for some reason, readers are expecting a symbol for a flagged page to mean something it doesn't mean, the easy solution is to put the symbol on the talk page instead of in mainspace. - Dan Dank55 (talk)(mistakes) 16:51, 28 June 2008 (UTC)
Do these readers even know that anyone can edit? Otherwise, they are likely to assume all articles are reviewed anyway, like a normal encyclopedia. And I don't see why people would assume non-sighted articles are automatically "crap". I have no idea where that comes from. Aaron Schulz 18:36, 28 June 2008 (UTC)
I'm amazed how complicated the examples and assumptions are that people are considering here in order to justify or to reject the proposal. The Sighted Versions proposal aims at fighting vandalism, and it's plainly obvious that vandalism is here, and that it's detrimental to the encyclopaedia. If you need an explicit example (and read a little German), have a look at this. That's the kind of stuff we don't want to see here, not even for 9 minutes. And flagged revisions prevented it from being shown to the public. --B. Wolterding (talk) 18:56, 28 June 2008 (UTC)
"Whoever needs help for your 9- to 12-year-old child on the subject of sex, call..." Yes, that's a reason for Flagged Revisions. - Dan Dank55 (talk)(mistakes) 19:16, 28 June 2008 (UTC)
Those versions containing personal information have to be removed through deletion by an oversighter, not just reverted by "simple users" or hidden as a centerfold between two flagged revisions. Defchris (talk) 01:10, 30 June 2008 (UTC)

@Gmaxwell (indentation problem): Actually, the general public takes a large part in making Wikipedia and they will definitely be scared off by this proposal. Even if we lose only 5% of the anonymous contributors, we lose more edits than the vandalism we lost! Yes, the article probably (not definitely!) won't say "The senator lives in Florida and enjoys bestiality", but will it still say that the senator lives in Florida? We won't make a stand against the opponents of Wikipedia here. Because of occasional vandalism that passes these magic lines and gets flagged as a good edit, people will still say "It's untrustworthy." Even so, plenty of readers do cooperate in Wikipedia. If only a third of all the 1,260 million edits is made by anonymous contributors, that still means that from these hundred edits, one or two will dare to revert the vandalism or at least make a contribution in another part of the article, a contribution that will be lost when a vandal fighter comes and reverts the full version to match the quality version. Most established editors, in fact most editors here, started as anonymous contributors who got interested in Wikipedia. They didn't hear about it on TV one day and decide to get an account. Creating this rule opposes WP:BITE. Newcomers will be bitten as their good faith edits get lost in a queue or actually appear on the public version only days or weeks later (depending on the number of article visitors). This is the main issue about the flagged revisions. Admiral Norton (talk) 12:06, 26 July 2008 (UTC)

Similar proposal

I thought of a similar idea to this, which I would be interested to know people's views on. The basic idea is that Wikipedia has two versions - 'stable' and 'current'. Any viewer of the site would be able to choose between them, either on a page by page basis (similar to going to the 'talk' page of an article), or globally using a cookie.

The way I was thinking it could be done is with a timestamped queue. When someone makes a change, it goes automatically into the 'current' version of wikipedia, but is put into a queue for automatic insertion into 'stable' only after a certain period of time has elapsed. If no-one else makes an edit during that period, then it goes into stable automatically. If someone does, then the timestamp is re-set and the article goes to the back of the queue. The stable page could display a line at the top saying how many days it was lagging the current version, to give viewers an idea of whether to look at the current version for recent changes.
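A minimal sketch of the queue described above, in Python, purely for illustration. The class name `TwoVersionWiki`, its methods, and the use of plain numbers as timestamps are all assumptions made for this example; none of it is MediaWiki or FlaggedRevs code.

```python
from dataclasses import dataclass, field


@dataclass
class TwoVersionWiki:
    """Hypothetical model of the 'stable'/'current' proposal (not real MediaWiki code)."""

    delay: float  # quiet period an edit must survive before reaching 'stable'
    current: dict = field(default_factory=dict)        # page -> latest text
    stable: dict = field(default_factory=dict)         # page -> promoted text
    last_edit: dict = field(default_factory=dict)      # page -> time of newest pending edit
    pending_since: dict = field(default_factory=dict)  # page -> time of oldest pending edit

    def edit(self, page, text, now):
        # Every edit goes straight into 'current' and re-sets the page's timestamp,
        # which sends the page to the back of the queue.
        self.current[page] = text
        self.last_edit[page] = now
        self.pending_since.setdefault(page, now)

    def promote(self, now):
        # Copy to 'stable' every page that nobody has edited for at least `delay`.
        promoted = []
        for page, stamp in list(self.last_edit.items()):
            if now - stamp >= self.delay:
                self.stable[page] = self.current[page]
                del self.last_edit[page]
                del self.pending_since[page]
                promoted.append(page)
        return promoted

    def lag(self, page, now):
        # How long 'stable' has been behind 'current' (0 if they match).
        start = self.pending_since.get(page)
        return 0 if start is None else now - start


# Timestamps are in days here; the delay is two weeks.
wiki = TwoVersionWiki(delay=14)
wiki.edit("One Piece", "first edit", now=0)
wiki.edit("One Piece", "second edit", now=10)  # timer starts over
early = wiki.promote(now=14)                   # nothing is old enough yet
late = wiki.promote(now=24)                    # 14 quiet days have passed
```

In this sketch a page that is edited more often than once per `delay` never gets promoted, because each edit re-sets its timestamp.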

One advantage of this idea is that it doesn't involve creating two classes of users - only on the principle that if no-one cares enough to correct an edit within two weeks, then it should be allowed through. Also viewers would have the choice of which version of the site they preferred to use.

A possible disadvantage is that it might not work so well for frequently edited pages - articles could remain permanently in stable with no way for them to reach the front of the queue. This could be avoided if the queueing program was clever enough to connect the timestamps with the particular phrase that was edited and allow one part of the article to be updated while another was held in the queue; however I can see that this would be hard to code. --Lofty00 (talk) 11:43, 18 July 2008 (UTC)

http://www.stablepedia.org functions somewhat along those lines, although it's a filter applied to Wikipedia content via an external site. Seems to be down right now; I'm not sure if that's a temporary glitch or if it died at some point when I wasn't looking.--Father Goose (talk) 17:43, 18 July 2008 (UTC)
It has been down for a few days now, so is probably dead. --Lofty00 (talk) 11:23, 23 July 2008 (UTC)

German update?

The statistics page seems to clearly establish that every page can be flagged in a reasonable amount of time. Based on the trend it looks like they'll have it complete somewhere around four months after they started. That figure might be higher or lower on enWikipedia depending on the flaggers:articles ratio chosen, but based on this data almost any of the proposed 'flagger selection' criteria should let us flag everything within a year.

The story on changes to previously flagged pages is not quite as good, but still ok. They seem to clear the backlog every four days or so, for an average of about 40 hours between change and reflagging. That seems too high to me, but hopefully it will start to come down when they are no longer concentrating on getting everything flagged for the first time and/or spend less time dealing with vandals. Ideally, new changes would be rejected (if vandalism) or re-flagged within minutes.

Unfortunately, what that page can't tell us is whether the system is actually WORKING or not. Specifically, does anyone have statistics or general anecdotal statements on the two real issues here;

1: Vandalism. Have there been any notable changes in vandalism since the system was implemented? Has the AMOUNT of vandalism decreased? Have vandals shifted to targeting primarily the pages not yet flagged? Has there been a marked increase in 'subtle' vandalism (i.e. adding of facts and figures which the average user won't know are false)? Have there been many cases of users flagging things so fast that they missed obvious vandalism and marked it as ok? Has the number of COMPLAINTS about vandalism decreased (suggesting that vandalism is being SEEN by fewer people)? Et cetera.

2: User retention. Has there been a decrease in the number of anonymous edits being made? If so, is that decrease consistent with any decrease in vandalism committed by anons or is it a decrease in useful changes? What about new account creation? Has it increased, decreased, or stayed about the same? Has there been any evidence of larger than normal numbers of previously active users leaving? Has the rate of new page creation, featured article creation, or edits in general decreased significantly?

I think these are the questions we really need answers to. Basically, if there has been any significant reduction in the impact of vandalism and no significant reduction in user contributions then flagged revisions are indisputably a success and should be implemented here. If vandalism hasn't been slowed down much at all then there is no point in implementing this feature, and if user contributions are down then it may do more harm than good.

The stats page demonstrates convincingly that flagged revisions CAN be done, but not whether it SHOULD be done. It seems like with more than 60% of the pages flagged there could be a general sense of how things are heading on these issues by now. Does anyone have an idea how this stands currently? --CBD 13:45, 23 July 2008 (UTC)

(Replying only to a very small part of your post, and leaving the rest to other people:) In particular, there are two major factors to consider:
  • has there been a decrease (or increase) in anonymous contribution due to the implementation of flagged revisions on the German-language Wikipedia? and
  • has the German public seen significantly less (or more) vandalism and received content of significantly higher (or lower) reliability due to flagged revisions?
I think that we want to consider whether the public has got better content, not whether the project has received less vandalism.
In addition, I would like to see feedback from the German non-editing public on the feature, but I have no idea how to go about acquiring it. – Thomas H. Larsen 00:15, 26 July 2008 (UTC)
The flagged revisions system currently implemented on the German Wikipedia isn't really geared towards "better content"... just a quick verification that the page is not obviously vandalized. Over time that might lead to 'better content', but content improvements inherently take longer than vandal fighting. --CBD 07:49, 26 July 2008 (UTC)
Yeah, there's a distinct difference between "Sighted Versions" and "Quality Versions".--Father Goose (talk) 08:33, 26 July 2008 (UTC)
A few facts from the de-WP:
  1. First and foremost: As this adventure was begun without any planning, there are no criteria for what would constitute a success, there are no benchmarks, not even the slightest idea what kind of and how much effort is worth which result. From the point of view of quality control, the implementation as it happened on de-WP is a disaster.
  2. From the outside: Feedback from outsiders overwhelmingly states that the public does not understand the purpose and scope of the flagged revisions. The readers believe a flagged revision has undergone quality control regarding the content. No one I spoke to understood that the checking was purely formal and done by editors who have no knowledge about the topic of the article.
  3. From inside: The backlog has essentially been constant over time! The predicted date for the initial flagging of every article is about 50 days in the future - it was after the first peak in the beginning and it is now. So the enthusiasm for initial flagging has fallen about the same rate with the progress of flagging. The result is stagnation.
  4. One week ago, the implementation was switched so that non logged-in readers see the latest version now, no matter if it is flagged or not. So the very purpose of the flagged versions - that readers will always see a version that is supposed to be good - is lost. This was made without consultation of anyone - as was the initial implementation - with the stated intent to try this type of implementation in reality. I believe this decision was detrimental to the motivation of many editors, who don't see any purpose in the whole feature this way. At least it is this way for me.
  5. Personally I don't engage in first line (RC) vandal fighting on de-WP - though I've spent a few hours with it over time to get a feeling. I work in the second line and regularly check a huge watch list that tries to cover whole areas of (my) topics. On this second line and after now almost three months I see no effect of flagged revisions. The first line of defense was always very efficient on de-WP (frankly: significantly better than here on en-WP), so blatant vandalism was almost non-existent on my watch list before and is on the same level now. And flagged versions are worthless in supporting my type of quality control. Because 'flagged' means only that edits aren't blatant vandalism, the flagging editors have neither the expertise nor the time to check whether the edits improve the article, so that still is my job in the second line for my topics.
  6. I probably don't have to elaborate that there have been the expected flaws. A few people violated the trust placed in them by flagging without appropriate checking for preexisting vandalism. There still (after almost three months) are open questions about what is blatant vandalism and how thorough the checking of new edits should be. We had our share of conspiracy theories around the flagged versions. And so on. That should be expected and can't be points pro or con the concept and/or its implementation.
Preliminary conclusion: The missing benchmarks are a fatal flaw. Without them, honest evaluation of the flagged versions as success or failure is impossible. A feature of this magnitude must not be implemented without benchmarks and a fixed trial phase. Those rules were violated on de-WP and motivation suffered. Personally, I was skeptical but open-minded in the beginning and have been constructive in discussions during the development phase. In my view: The outcome so far is not worth the effort. Theoretically it might be different in projects where the first line of vandal fighting is less efficient than it always was on de-WP. But I advise such projects that they should have clear rules for a trial period and think about benchmarks before they try it. --h-stt !? 11:55, 26 July 2008 (UTC)
I believe this is proof enough that implementing Flagged revisions was a major error on de.wikipedia. Should we follow their example? Admiral Norton (talk) 13:15, 26 July 2008 (UTC)
I suppose that's a subjective judgment, but there's more than one way to, er, skin an encyclopedia (well, there is, just go look at Special:Preferences). I think the primary purpose of flagging, at least on en, should be to keep the general public from randomly seeing "JOSH IS GAY" splattered across articles. Running around trying to flag every edit is counterproductive, in my view, since it doesn't really change anything; you're still scrutinizing every edit as it comes in to see if it seems malicious. It seems more prudent to flag articles that are in good shape, so if someone comes in and tears up the place, the damage isn't shown to everyone. When the article develops more, flag a new version. I don't think it should be considered necessary, or even desirable, to flag a stable revision for every article.
Judging by the test wiki, you can put text in the "flagged revision" box, so I don't think explaining should be too hard. Just put something like, "You are viewing a version of this page thought to be free of obvious problems. However, Wikipedia makes no guarantee of its validity. See disclaimer." --Slowking Man (talk) 14:14, 26 July 2008 (UTC)
Actually, I don't think we should try to 'explain' revision flagging to the general public at all. If you say anything along the lines of, 'we think this version of the page is ok' and it turns out that it DOES have some mistake or vandalism on it then it looks almost like Wikimedia has endorsed it. Instead, we could say nothing at all or just, 'Click here for the most recent version of this page'. As to the bit about the Deutsch experiment having recently switched to always showing the most recent version to all users... did anyone give any sort of explanation WHY? I'd have to agree that's just about completely useless. The whole point of the 'quick vandalism check' level of flagging is to keep vandalized pages out of the public's sight during the brief interval it takes to correct them. Also, having benchmarks for success was suggested by many people (here on enWiki anyway) long before this went live. Just seems slipshod implementation all around. Very discouraging. --CBD 15:23, 26 July 2008 (UTC)
You didn't understand what h-stt wanted to say, Slowking Man. Most articles on de.wikipedia haven't been flagged at all yet. I know that because I paid a visit to DE yesterday and found "Artikel/Entwurf" versions on exactly eight articles and I'm sure I ran across at least 10 other articles. The regime you are proposing is exactly the thing that happened anyway against all attempts to keep vandalism under control, and it happened only on some articles. Turning Flagged revisions into an obvious-vandalism-free page version would be taking away ClueBot's job and not doing it significantly better, and that's what they were able to partially do at DE. To be able to sight versions on DE is similar to being able to roll edits back on EN. They are undermanned and yet they have problems with irresponsible people getting sighting access. Implementing such a feature on EN and giving rights to, say, autoconfirmed users (we have a larger article base and fewer active editors, so we need better coverage), would be a total disaster and a vandal heaven. Not only would they escape the RC patrol, but they would be able to mark their edits as legitimate, bringing doubt to those who would otherwise revert them in the wink of an eye. This is definitely not the way I'd like a free encyclopedia to go. Admiral Norton (talk) 21:03, 26 July 2008 (UTC)
According to the statistics 63.58% of pages are tagged. You cite 8 out of 18 (~44%), but obviously there is plenty of room for that sort of variation in such a limited sample. I also suspect that there are some types of articles which will take longer to flag for the first time... anything particularly long or technical for instance. What's your source for the claim that we have fewer active editors? I've always seen it cited very much the other way around... that our editor to page ratio is significantly higher than that on the German Wikipedia. The primary difference between Cluebot and flagged revisions is that the general public SEES vandalism before Cluebot reverts it... with quickly reviewed flagged revisions they wouldn't. There also isn't any reason that we would take Cluebot offline. As to vandals with flagging rights... the ability to edit semi-protected pages being limited to autoconfirmed users successfully reduces vandalism. Ergo, if revision flagging were limited to auto-confirmed users it could be expected to produce a similar reduction in vandalism SEEN by non logged in users across every page. Stricter limits would make viewed vandalism even less common and might even reach a point where a vandal would have to make so many positive edits before being able to flag anything that it'd be a net positive even after having to clean up their mess. However, stricter limits might also discourage contributions.
I'm not sure that we can draw many conclusions from the German test. They've changed their requirements for getting the access a few times, a lot of people apparently have the ability to hand it out to anyone they want (which is just a terrible idea), and it is now set up to not hide vandalism at all. They've demonstrated that all pages can be flagged (even with the chaos they DO continue to make progress, and getting over 60% in this timeframe before slowing down inherently shows that 100% is possible) and that they can keep everything flagged within a few days of last change... probably less if they were concentrating on that. It just isn't clear what impact the flagging is having... and apparently there won't be any way to determine that. --CBD 23:51, 26 July 2008 (UTC)
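As a rough illustration of the sampling point made above (8 flagged out of 18 visited, against the reported 63.58%), here is a minimal Python sketch. It assumes the 18 articles were an independent random sample, which is an assumption made only for illustration and is unlikely to match how the articles were actually picked; the function name `binom_cdf` is hypothetical.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

P_FLAGGED = 0.6358  # share of pages flagged, per the statistics cited above
N_SAMPLED = 18      # articles visited
N_FLAGGED = 8       # of which carried a flagged version

# Expected number of flagged articles in a random sample of this size.
expected = N_SAMPLED * P_FLAGGED
# Chance of seeing 8 or fewer flagged articles if the 63.58% figure holds.
tail = binom_cdf(N_FLAGGED, N_SAMPLED, P_FLAGGED)

print(f"expected flagged: {expected:.1f}")
print(f"P(8 or fewer flagged): {tail:.3f}")
```

Under that assumption a count as low as 8 is uncommon but not implausible, which fits the remark that a sample this small leaves plenty of room for variation.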
I guess they had a reason to change their access requirements multiple times. They probably rotated through the possible requirements because none of the schemes seemed to work efficiently enough, and since we probably didn't learn much from them, I doubt implementing this would be a good idea. They took down the flagged version display because vandals apparently found a way to get around the system and inflict more damage by flagging vandalized versions. Even if they weren't giving away the rights to everyone, I'm sure there would still be enough vandals to do damage. As for the sources, per Special:Statistics, the average EN editor edits 31.82 times, while per de:Spezial:Statistik the average DE editor edits 86.17 times in their account lifetime. The de.wikipedia definitely has a higher share of active editors than we do. The chances are much higher you will meet someone on different articles twice in one day, yet this hasn't helped them judge who is a benevolent editor and who has the wrong intentions. You are talking about stricter limits being better, but the actual problem is that the stricter the limits get, the fewer flaggers we get and the more strain is put on them. And, per #3 of h-stt's report, we can see that they can't manage even with their loose flagger requirements and that they still have a significant backlog. And the problem with backlogged pages is that vandals can still vandalize them as if the flagging system were not used. 37% is a very big number of pages that are virtually left out of control. Finally, I have to agree with you, it is unknown whether flagging actually reduces vandalism at all and we're not here to buy a pig in a poke. Admiral Norton (talk) 21:01, 27 July 2008 (UTC)
It looks like you got those numbers by dividing total number of edits by total number of users. However, since that (in each case) includes edits to all the non-article namespaces and DOESN'T include IP edits in the total user count it isn't really an accurate reflection of 'edits per account lifetime'... which wouldn't be a good indicator of "active editors" anyway. Misstating the stats another way I could say that since we have 'only' 3.17 times as many articles, but 12.79 times as many USERS as the German Wikipedia we have a much greater proportion of people available to mark things. The problem with both sets of stats is that they don't report CURRENT activity at all.
As to how strict the restrictions should be. No, I wasn't arguing for stricter limits, just describing the effects of such. German Wikipedia seems to have adopted a weird amalgam of strict/loose limits... only about 3700 users have the access so far, but it was apparently given out to anyone that any admin felt was worthy. I'd think that you would want more users with the access, but restrict it such that the positive edits they need to make before getting it outweigh the damage a vandal could do after getting it. They apparently wanted to build in an 'admin review', but probably should have then restricted it to admins reviewing a bot or system generated list of candidates who had passed a certain editing threshold. In any case, the progress they made in such a short time and repeated clearing of the 'backlog' of unreviewed edits clearly shows it is possible to flag all articles and keep them flagged within an average of a couple of days after the last edit - despite the chaos around the implementation. I'd think that would need to be brought down to a couple of minutes, but that doesn't seem impossible either. Just isn't clear what impact any of it could have on vandalism... which can't be measured with the current setup. --CBD 11:17, 28 July 2008 (UTC)
Hi, I think your conclusions have become more negative than I intended. The flagging as such is going quite smoothly on de-WP. The rate of about 65% of all articles flagged at least once and more than 3600 users with the flagging-tag seems quite impressive at first glance. And the problems mentioned under #6 were minor and have been resolved. So I hope you can concentrate on the big questions around the flagged revisions. I believe benchmarks are the most important open question. A test run is useless if no one knows how to evaluate the result. And you need to look into the motivation issues - internal and from outside. As the backlog of initial flagging has been constant for quite some time now, the motivation of editors to systematically wade through articles has fallen significantly. And the recent switch to showing the most recent version in all cases seems to be even more detrimental to their motivation. The third important aspect is the effect of flagged revisions on anons and non logged-in users. Will they continue to edit at the same rate as before if their edits aren't visible to the public before they have been flagged - which essentially means that we don't trust them to be overwhelmingly constructive? All of my ideas are preliminary - but so far I believe those are the issues you have to look at before you try flagged revisions on en-WP. --h-stt !? 13:50, 28 July 2008 (UTC)

Use of flagged revisions suspended on German Wikipedia

In the middle of H-stt's comments in the previous section is what seems to be a significant change which shouldn't be overlooked. In his item "4." he says that a week ago the German WP turned off the prime effect of flagged revisions -- the hiding of unflagged edits from readers. Since almost all article readers arrive unlogged in, they now are back to viewing the latest, possibly vandalized, revision, just as before flagged revisions. In this state the sighted flag has no significant effect; it simply turns on or off obscure markings.

I bring this up so that people are aware of this change, which in essence seems to suspend the effect of flagged revisions on de-WP. We're still in the dark as to why this action was taken and as to what might be done in the future. -R. S. Shaw (talk) 19:02, 27 July 2008 (UTC)

According to this remark, the upcoming vote on German Wikipedia about the use of Flagged Revisions will include the option "use sighted versions, but show the most recent version by default"; that's why they're trying it for a while.
That being said, I don't think it's a reasonable thing to do, since it needlessly takes away much of the system's virtues, and discourages users from setting the flags. --B. Wolterding (talk) 11:25, 28 July 2008 (UTC)
Ah, so there is a method to their madness after all. They keep changing things to show how various different implementations would work, or not work as the case may be. Obviously, changing horses in midstream muddies the waters (and mixes the metaphors), but they may be able to get a general feel for which methods are easiest to maintain and provide the most benefit. --CBD 11:37, 28 July 2008 (UTC)
I can't really tell what they're up to, but if they are really trialing out different approaches, I commend them. People on en.wiki seem to resist "let's see if it works this way"-type changes tooth and nail. (Maybe de.wiki does too, and these experiments are an exception.)--Father Goose (talk) 18:50, 28 July 2008 (UTC)
Well, this "trying out" usually means something has gone wrong and DE is trying to revert it, but won't admit it in order to preserve their dignity. I don't mean to be rude or unhelpful, but the "let's see if it works this way" approach is very similar to "let's set our clothes on fire and see if something happens". The EN wikipedians are afraid of change and I think they have a good reason. Not only is it unclear if this experiment worked, but it is definitely true that it deterred non-patrollers. Give me any other viable reason why they would kill the system if you're not satisfied with this one. Admiral Norton (talk) 19:55, 28 July 2008 (UTC)

Hints from a German Wikipedian: The decision to show the unflagged versions was made here: de:Wikipedia Diskussion:Meinungsbilder/Weiterführung der gesichteten Versionen#Start des Meinungsbildes am 13. Juli (last paragraph). A Meinungsbild (vote) to stop flagged versions was about to start. The start of this Meinungsbild was delayed via the concession of showing the latest version to non logged-in readers. --Stefanwege 18:46, 30 July 2008 (UTC) —Preceding unsigned comment added by 141.63.56.202 (talk)

Just to keep you up-to-date: there is a vote now about maintaining flagged versions on de.wikipedia. There's a lot of discussion going on about whether the voting is fair, but I guess it will decide the future of flagged versions on the German Wikipedia: here is the link. Option 1 is to end flagged versions completely, option 2 is to use sighted versions but show the most recent version by default, and option 3 is to use the original intent (show the last sighted version). The vote will end on 1 September. --84.153.3.25 (talk) 23:35, 6 August 2008 (UTC)
Looks like 306 for no revision flagging (option 1), 171 for flagging but no display change (option 2), 613 for displaying the most recently flagged version to IP users (option 3), 28 abstaining, and 113 saying the equivalent of 'voting is evil'. No overwhelming consensus, but a clear majority of opinion. Poll isn't closed yet, but the ratios seem unlikely to change much in the final week / 25% of the poll. One thing which stands out is a lot more entries marked 'not entitled to vote' in the 'no flagged revisions' section. They apparently have some fairly high 'voting' requirements as some of those excluded people had a couple hundred edits over two months or so. Anyway, 2:1 for option 3 over 1 and roughly 4:3 for 3 over 1+2. --CBD 12:55, 22 August 2008 (UTC)
Yes, they seem to give more weight to polls than we do, and they require 200 main space edits for editors to vote, in order to reduce the sockpuppet problem. Anyway, there's one point which I find very interesting: Option 2 ("use flagged revisions, but show the most recent revision per default"), which was intended as a kind of compromise, is remarkably unpopular. This seems consistent to me: The advantages of flagged revisions, in particular against vandalism, become effective only when flagged revisions are shown by default. Since our proposal Wikipedia:Flagged revisions/Sighted versions is quite similar to that Option 2, there might be reasons to change it. --B. Wolterding (talk) 20:37, 22 August 2008 (UTC)
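The ratios drawn from the interim tallies can be recomputed directly. A minimal Python sketch using only the three option counts quoted above; leaving out the abstentions and the 'voting is evil' entries is a choice made here for illustration, and the dictionary labels are hypothetical.

```python
# Interim tallies as quoted in the thread (22 August 2008).
votes = {
    "option 1 (no flagging)": 306,
    "option 2 (flag, show latest)": 171,
    "option 3 (show last sighted)": 613,
}

total = sum(votes.values())
# Share of each option among votes cast for one of the three options.
shares = {name: count / total for name, count in votes.items()}

opt1, opt2, opt3 = votes.values()
ratio_3_vs_1 = opt3 / opt1            # option 3 against option 1 alone
ratio_3_vs_rest = opt3 / (opt1 + opt2)  # option 3 against options 1 and 2 combined

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"option 3 : option 1       = {ratio_3_vs_1:.2f} : 1")
print(f"option 3 : options 1 + 2  = {ratio_3_vs_rest:.2f} : 1")
```

On these counts option 3 holds an outright majority of the three-way vote, while option 2 trails both alternatives.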

Thoughts on the future of vandalbots under flagged revisions

I was thinking about how current vandalbots could best be integrated with a revision flagging system. It seems to me that they could continue running as they are now with just two additions:

  1. If the version of an article reverted FROM by the bot was flagged as ok then the bot should spit out the diff link of the edit reverted and the name of the person who flagged it to a list somewhere for review. This would help identify false positives by the bot and vandalism by people with flagging permission... who would then quickly lose it.
  2. If the version of an article reverted TO by the bot was flagged as ok then the bot should be able to mark the new (exact copy) of that version as flagged also. This would prevent other users from having to review and flag every edit made by a vandalbot.

Thoughts? Overall it seems like the two anti-vandalism methodologies could actually complement each other very well. --CBD 11:28, 28 July 2008 (UTC)
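The two additions proposed above can be sketched in a few lines of Python. This is only an illustration of the decision logic: the `Revision` and `BotOutcome` types, the `handle_bot_revert` function and the "anti-vandal bot" label are all hypothetical, and none of it reflects the actual FlaggedRevs extension or any existing bot's code.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Revision:
    rev_id: int
    flagged_by: Optional[str] = None  # username of the flagger, None if unflagged

@dataclass
class BotOutcome:
    # (revision id, flagger) pairs that a human should look at again
    review_queue: List[Tuple[int, str]] = field(default_factory=list)
    # ids of bot-created revisions that inherited a flag
    auto_flagged: List[int] = field(default_factory=list)

def handle_bot_revert(reverted_from: Revision, reverted_to: Revision,
                      new_rev: Revision, outcome: BotOutcome) -> None:
    """Apply the two proposed rules after the bot has made a revert."""
    # Rule 1: the bot reverted a revision somebody had flagged as ok.
    # Either the bot hit a false positive or the flagger approved vandalism,
    # so list the case for review.
    if reverted_from.flagged_by is not None:
        outcome.review_queue.append((reverted_from.rev_id, reverted_from.flagged_by))
    # Rule 2: the bot restored an exact copy of a flagged revision,
    # so the new revision can carry the flag without another human review.
    if reverted_to.flagged_by is not None:
        new_rev.flagged_by = "anti-vandal bot"
        outcome.auto_flagged.append(new_rev.rev_id)

outcome = BotOutcome()
handle_bot_revert(Revision(102, flagged_by="SomeFlagger"),
                  Revision(101, flagged_by="AnotherFlagger"),
                  Revision(103), outcome)
print(outcome)
```

Keeping the two rules independent means a single revert can both inherit a flag and raise a review entry, which is the case where a flagged good version was followed by a flagged bad one.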

"Vandalbot" generally means a bot that carries out vandalism - I hope that's not what you meant! Hut 8.5 11:48, 29 July 2008 (UTC)
Ok, make that 'Anti-vandalbot'... though I've seen 'vandalbot' used either as 'bot which commits vandalism' or 'bot which hunts vandals'. I intended the latter meaning. --CBD 13:04, 29 July 2008 (UTC)

German editing statistics

I have generated some statistics of edits to the German Wikipedia from the most recent database dump. They do show that since flagged revisions was introduced on 6 May the number of edits from unregistered users has declined, and there has been no comparable decline in edits by registered users (so this is not just people going on holiday). The number of edits also picked up again in July, which suggests it is correlated with the decision to show the newest version and not the sighted version by default. For context I have also produced graphs of edits going back to 2005. (The black lines are 14-day moving averages.)

--Hut 8.5 11:57, 29 July 2008 (UTC)
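For anyone wanting to reproduce the smoothing, a 14-day moving average over daily edit counts can be computed as in the Python sketch below. Whether the black lines in the graphs use a trailing or a centred window is not stated, so the trailing window here is an assumption, and the sample counts are made up purely for illustration.

```python
from typing import List, Sequence

def moving_average(values: Sequence[float], window: int = 14) -> List[float]:
    """Trailing moving average; the first result covers values[0:window]."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just slid out of the window.
            running -= values[i - window]
        if i >= window - 1:
            averages.append(running / window)
    return averages

# Made-up daily edit counts, only to show the call.
daily_edits = [100, 120, 90, 110, 130, 95, 105, 115]
print(moving_average(daily_edits, window=4))
```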

Very interesting results. I'm wondering though what percentage of IP edits are vandalism before and after the change. I'm not sure how you'd get a good estimate of that though. (Also, could you throw up a marker on May 6? A little tough to find otherwise. ;-) ) --Falcorian (talk) 13:11, 29 July 2008 (UTC)
I've marked the dot for May 6 in red. Hut 8.5 20:51, 29 July 2008 (UTC)
Actually, based on those graphs I'd say that flagged revisions have had no discernible impact on editing (by IPs or logged in users). Yes, there was a down and then up trend... but both directions began BEFORE the events you seem to attribute them to. IP edits had been decreasing fairly steadily since around February 2007 and the uptick in IP edits shown began several weeks before they went back to displaying the most recent version. Someone could even claim from this data that flagged revisions had slowed and then reversed a trend of decreasing IP edits... but given that the fluctuations involved (in either direction) are in line with past deviations it seems more likely that this is just standard variation and flagged revisions had no real impact. --CBD 13:22, 29 July 2008 (UTC)
That was about what I concluded from the graph. That's why I'm interested to see vandalism percentages (although I'm sure we won't be able to calculate them), because I imagine they've decreased, in which case this looks like quite a success. --Falcorian (talk) 13:54, 29 July 2008 (UTC)
As an aside, there is a clear decline of IP edits in May/June 2007, I wonder what the reason was? --B. Wolterding (talk) 15:46, 29 July 2008 (UTC)

Two more graphs, attempting to analyse vandalism levels before and after flagged revisions. The first is a graph of vandalism blocks, defined as a block where the reason includes the string "vandal" (the German for "vandalism" is "vandalismus"). The second is a graph of reverts (the standard edit summary for a rollback is "Änderungen von [user] (Beiträge) rückgängig gemacht und letzte Version von [user] wiederhergestellt", so I've defined a revert as an edit where the edit summary contains the string "wiederhergestellt"). Neither shows a big drop due to flagged revisions. (If anyone has a better search string for either of these statistics I am prepared to run the analysis again.)

--Hut 8.5 20:51, 29 July 2008 (UTC)
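The two search-string definitions above amount to simple substring tests. A minimal Python sketch follows; the function names are hypothetical, and matching "vandal" case-insensitively is an assumption made here because German capitalises "Vandalismus".

```python
def is_vandalism_block(block_reason: str) -> bool:
    """A block counts as a vandalism block if its reason mentions 'vandal'."""
    # Case-insensitive match is an assumption, not something stated in the thread.
    return "vandal" in block_reason.lower()

def is_revert(edit_summary: str) -> bool:
    """An edit counts as a revert if its summary contains 'wiederhergestellt'."""
    return "wiederhergestellt" in edit_summary

# The standard rollback summary quoted above, with placeholder user names.
rollback = ("Änderungen von [user] (Beiträge) rückgängig gemacht "
            "und letzte Version von [user] wiederhergestellt")

print(is_revert(rollback))
print(is_vandalism_block("Vandalismus"))
print(is_revert("Tippfehler korrigiert"))
```

A heuristic like this misses reverts made by hand with a custom summary and counts any unrelated edit that happens to use the word, so the resulting graphs are best read as trends rather than exact counts.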

This is a tough one because flagged revisions doesn't do anything directly to 'stop' vandalism. Indeed, if semi and/or fully protected pages are unprotected I would actually expect to have MORE vandalism being committed. The intended benefit of flagged revisions is to hide the vandalism from the general reading public (un logged-in users) until it can be corrected. The only 'statistic' I can think of for measuring that would be to count the number of vandalism reports received from IP users / people e-mailing in. If flagged revisions were working you would expect to see a sharp decline in such reports. Just as much vandalism being committed, reverted, and blocked for, but not SEEN by the vast majority of readers. In theory, flagged revisions might also serve to discourage vandals, but most of the drive by random one time vandals WILL still see their vandalism when they hit 'save page'. Thus, any 'vandalism deterrent' effect would likely require a long gestation time for the general public to learn and understand that vandalism they commit will generally be seen only by themselves and the one person who fixes it. --CBD 22:36, 29 July 2008 (UTC)
Thanks for the new plots! The physicist in me is jumping for joy over the data! CBD though brings up a good point, which is that my metric is flawed (vandalism)... I'm not sure exactly then what one would want to look at. I'll have to give it some thought. Maybe something like "number of edits reverted where the revert returned it to a sighted version" but you'd need to normalize to the average number of vandal edits pre sighting. That way you'd expect ~1 before it takes effect, and then >1 if vandalism increases and <1 if it decreases. Of course, this still doesn't really tell you what you'd like to know, because in the end the number isn't important, it's what people 'see' that is, and in theory they'd see clean pages regardless of whether vandalism increases or not. Ah well, it's also nice that you marked the day sighted started. Cheers! --Falcorian (talk) 23:31, 29 July 2008 (UTC)
Maybe we do not need to look at anything, just accept that Flagged revision is a slight step away from the wiki philosophy, but anyone CAN still edit and the people actually READING what we write will have a better more correct experience, which to me is what we are really after, i.e. a good encyclopedia is the number one goal, not something that everyone can edit at any point. --Stefan talk 23:50, 29 July 2008 (UTC)
While I think that's probably the right philosophy, the scientist in me wants to play with the data. ;-) --Falcorian (talk) 01:06, 30 July 2008 (UTC)
The comment was not directed at you, it was a general comment, and I agree statistics is very fun and I love the graphs. --Stefan talk 05:13, 30 July 2008 (UTC)
The editing statistics are now here in case you want to do something with them. Hut 8.5 20:56, 30 July 2008 (UTC)
Oh boy! gnuplot here I come! --Falcorian (talk) 01:15, 31 July 2008 (UTC)
Just a notice: page protection is still in place on de-WP, so vandalism wasn't supposed to be on the rise. Admiral Norton (talk) 21:31, 30 July 2008 (UTC)
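The normalised revert metric suggested earlier in this thread (reverts divided by the average pre-sighting level, so that roughly 1 means "no change") could be computed along the following lines. This is a sketch only: the function name is hypothetical and the daily counts are invented for illustration, not taken from the dump.

```python
from typing import List, Sequence

def normalised_revert_rate(daily_reverts: Sequence[float],
                           baseline_daily_reverts: Sequence[float]) -> List[float]:
    """Divide each day's revert count by the mean of the pre-sighting baseline."""
    if not baseline_daily_reverts:
        raise ValueError("baseline must not be empty")
    baseline_mean = sum(baseline_daily_reverts) / len(baseline_daily_reverts)
    if baseline_mean == 0:
        raise ValueError("baseline mean must be non-zero")
    return [count / baseline_mean for count in daily_reverts]

# Invented numbers: reverts per day before and after sighting started.
before = [200, 220, 180, 200]
after = [210, 190, 150]
print(normalised_revert_rate(after, before))
```

Values above 1 would point to more reverted edits than before sighting, values below 1 to fewer, with the caveat already raised that this counts vandalism committed rather than vandalism seen.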

Very recently, reader feedback has become possible with flaggedrevs, though it is not enabled. This can track and graph various dimensions of reader perception over time. Also, stats people (like Erik) could run regexps to determine reverted edits and then use the DB to find the number of reviews that corresponded to the reverted revisions. This could track a sample of the views of pages, and how many were of bad states. I already have an old JS script that does some of these things (it instead tracks how long the page was vandalized). Aaron Schulz 01:52, 30 July 2008 (UTC)

Why do we want to scientifically prove that readers prefer an article about the International Olympic Committee that actually talks about it compared to one that does not, or to see how long pages are vandalised or how many people see vandalised versions? Some vandalism is fixed within seconds, like my example, and some not so fast; I fixed one that was wrong for more than 6 months a few weeks ago.
Vandalism exists, we are getting better and better at handling it, but we are not perfect. Flagged revisions will obviously make the readers' perception of Wikipedia better, and it will obviously make it harder or less rewarding for IP editors to contribute and possibly not give us as many new contributors, since it might take many days before their contributions are shown to all. It is a simple choice: do we want that or not? There is no need for elaborate statistics stating that readers prefer good content to vandalism, or that vandalised pages are only seen in 0.0024% of the total page views; does it matter if it is 0.12% or 0.0000098%? What is our target, at what value do we accept flagged revisions and when do we not? To me it is simple: flagged revisions will make fewer vandalised pages visible to the readers, at the expense of some new editors, and that is worth it. With statistics we can prove that this is good and bad, but does it matter? --Stefan talk 05:13, 30 July 2008 (UTC)
The time that readers are exposed to vandalised versions is important in deciding whether to use the feature or not. If it turns out to be absurdly low then the huge effort of using flagged revisions clearly isn't worthwhile. There are plenty of other areas which badly need more editors' time. --Hut 8.5 20:56, 30 July 2008 (UTC)
There used to be dozens of people who spent hours every day doing nothing but reverting vandalism. That has clearly decreased with the introduction of highly functional anti-vandal bots, but it is still a significant drain. People still spend a lot of time doing recent changes and new pages patrol. Will 'flagged revisions' really represent a "huge" effort above and beyond what these people are already doing? I'd always thought it would be incorporated into that work. If recent changes patrollers flag updates they might even cut down on repeat reviews of the same page and theoretically decrease the amount of effort being expended.
I'd agree that anti-vandal bots have greatly reduced the need for flagged revisions, but there are still benefits to be had and the effort involved may be no more than we are expending now. The 'article of the day' is viewed by multiple people every second. When it is vandalized for even a few seconds we hear about it... and the view of Wikipedia as an unreliable vandal magnet is perpetuated. Put in flagged revisions and that problem should drop to near zero... just the occasional page which gets flagged by mistake, the rare vandals who are dedicated enough to put in days/weeks/whatever of positive contributions in exchange for being able to flag five minutes worth of disruption 'ok', and the logged in users who generally know how Wikipedia works and just revert it. --CBD 21:45, 30 July 2008 (UTC)
Probably not. We already have some sort of new pages patrolling function, so flagged revisions wouldn't do any good there. As for the recent changes, incorporating flagged revisions wouldn't do any good there, too, because in the time it takes a page to be flagged, it has already disappeared from recent changes, which makes them redundant. Implementing flagged revisions in an external application style would just create another Huggle/Twinkle/whatever where patrollers have to check every page, not only those they suspect to be vandalized. Flagged revisions have no obvious benefits, but they do obviously increase the workload. Admiral Norton (talk) 17:22, 31 July 2008 (UTC)
Flagged revisions will represent a considerable step up from RC patrol. If an edit isn't vandalism then at the moment the patroller will spend a fraction of a second examining it, and many don't check every edit (they use tools which filter edits based on whether they contain common vandalism terms). Under flagged revisions, the article will need to be checked by the patroller, and the guidelines will probably ask that the article be checked for spam, BLP violations etc in addition to obvious vandalism. Since the backlog on the German Wikipedia is about a week long and vandalism levels have remained about the same, we would need two processes - RC patrol to check current edits, and another group to clear out the backlog. Furthermore, we would find it harder to do this than the German Wikipedia, as 0.05% of their registered users are admins (0.02% here) and unregistered editors make 15% of their edits (30% here), so the ratio of users generating edits which need to be patrolled to users doing the patrolling will be higher. New page patrol is a good analogy for flagged revisions - unpatrolled pages lose their "unpatrolled" status after a month, and we have articles which are hours away from doing exactly that, which doesn't speak volumes for the effectiveness of patrolling.--Hut 8.5 17:45, 1 August 2008 (UTC)
So basically we can conclude that we will not get anywhere with statistics, we just disagree, but I see some hope. Maybe I can agree that flagged revisions will not have that much impact on obvious vandalism, but when I do vandal patrol I do not dare to revert changes where the editor changes things that might be correct, i.e. the spelling of a name, the age of a person, the size of an animal and so on, since it takes so long to check these facts. On pages that I care more about and know more about I will do these checks when possibly bad edits are made, but sometimes I just do not have the time to do these checks; I say to myself that I will do it later, but I forget. Flagged revisions would be a great help in these cases, if I can have a watchlist of flagged pages, and just go through them one by one to make the latest version flagged. This would for sure improve the quality of at least a very small part of Wikipedia, and I'm sure there are many, many more editors like me that would benefit from this. This can be implemented without changing what version is displayed, which seems (I think) to be the biggest reason to oppose flagged revisions? It would also not increase the workload on anyone since this now is a tool only to check for more 'sneaky' vandalism, i.e. it does not need to be used. It would also give us some experience with flagged versions, and we can use this to test it out so that we can have a better discussion about whether it really is a huge effort or not. Right now none of us really knows, we just think, and one of us is obviously wrong and I'm fine to find out that it is me.
So my proposal is to turn on flagged revisions, but not change what page is displayed. The purpose would be to find less obvious vandalism more easily and to test the system to see if we like it or not. --Stefan talk 00:57, 4 August 2008 (UTC)
Hut 8.5, please define your limits for 'absurdly low' and 'huge effort'. Because if we can not set any limits, measurement is pretty useless, right? --Stefan talk 23:49, 30 July 2008 (UTC)
There's no point setting arbitrary standards, particularly as we know little about the extent of the problem. For one example, though, I investigated the article Caesar cipher, which probably isn't top of anyone's reading list but a fair number of people probably search for and would certainly get flagged under flagged revisions (it's an FA). For the months of May and June 2008, the article spent 920 seconds in a vandalised state. During the same period, 38,253 people viewed the article. This implies that six or seven people saw a vandalised version of this article. Most pages don't get anywhere near that sort of traffic. Is it really worth implementing flagged revisions to protect the handful of people that saw a vandalised version of this article? (This argument doesn't apply to high profile pages, but I have no problem with flagged revisions being used on these.) Hut 8.5 17:45, 1 August 2008 (UTC)
I agree, we will get nowhere using statistics. Just for fun I did the same exercise on a page that I just reverted for vandalism, a much more general page: Whale shark, not FA but GA, with about twice your example page's views, 43,102 in July. In my example the page was in a vandalised state for 20,940 seconds, or 5 hours and 49 minutes, i.e. statistically about 345 people saw a vandalised version of this article. And no, I did not check what type of vandalism it was; I just counted reverts, excluding 6+ hours that were reverted for bad external links. Anything can be proven with statistics, so where do we go from here? You state that you agree with flagging for pages with high page views? So let's start with that: flag all FAs and everything with more than 250,000 views per month? --Stefan talk 01:16, 5 August 2008 (UTC)
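Both back-of-the-envelope figures above follow from the same proportion: the share of the period spent vandalised, multiplied by the page views in that period. The sketch below reproduces that arithmetic. It assumes traffic is spread evenly over the period (neither poster states this, but both estimates imply it), and the function name is purely illustrative.

```python
SECONDS_PER_DAY = 86_400


def expected_vandalised_views(vandalised_seconds, page_views, period_days):
    """Estimate how many views landed on a vandalised revision.

    Assumes views are spread uniformly over the period, which real
    traffic is not; treat the result as a rough order of magnitude.
    """
    period_seconds = period_days * SECONDS_PER_DAY
    return page_views * vandalised_seconds / period_seconds


# Caesar cipher: 920 s vandalised, 38,253 views over May and June 2008 (61 days).
caesar = expected_vandalised_views(920, 38_253, 31 + 30)

# Whale shark: 20,940 s vandalised, 43,102 views in July 2008 (31 days).
whale = expected_vandalised_views(20_940, 43_102, 31)

print(f"Caesar cipher: about {caesar:.1f} views")
print(f"Whale shark:   about {whale:.0f} views")
```

Under these assumptions the Caesar cipher figure comes out between six and seven, and the Whale shark figure lands in the mid-300s, slightly below the "about 345" quoted; the exact value depends on how many days are counted in the period.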
I wouldn't say stats are useless. I did write the history stats JS code a while back, which uses a huge amount of edit summary heuristics (reverts tend to have summaries) to guess the amount of time vandalized for a page. It's the high profile pages that have the worst problems usually. Low key pages are typically very unlikely to be in a vandalized state. Aaron Schulz 16:05, 5 August 2008 (UTC)
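As a rough illustration of the kind of edit-summary heuristic described above, the sketch below treats any edit whose summary looks like a revert as the end of a vandalised interval that began with the edit immediately before it. This is not the history stats script itself: the pattern, the function name and the one-edit-back assumption are all hypothetical, and a real tool would need many more rules.

```python
import re
from datetime import datetime

# Hypothetical pattern: summaries that usually indicate a revert.
REVERT_PATTERN = re.compile(r"\b(revert(ed|ing)?|rv[vt]?|undid|undo)\b", re.IGNORECASE)


def estimate_vandalised_seconds(revisions):
    """Guess the total time a page spent vandalised.

    `revisions` is a list of (timestamp, edit_summary) tuples, oldest first.
    Assumption: the edit just before a revert is the bad one, so the page
    was vandalised from that edit until the revert.
    """
    total = 0.0
    for previous, current in zip(revisions, revisions[1:]):
        if REVERT_PATTERN.search(current[1]):
            total += (current[0] - previous[0]).total_seconds()
    return total


# Made-up history for illustration only.
history = [
    (datetime(2008, 7, 1, 12, 0, 0), "expand habitat section"),
    (datetime(2008, 7, 1, 13, 0, 0), ""),
    (datetime(2008, 7, 1, 13, 5, 0), "Reverted edits by 192.0.2.1 to last version"),
]
print(estimate_vandalised_seconds(history))
```

A heuristic like this undercounts vandalism that is fixed by hand without a revert-style summary and overcounts good-faith edits that get reverted, which is one reason the resulting numbers are estimates.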
I don't personally think FA is a good indicator of how high-profile an article is (flagging Plano Senior High School wouldn't do much good), but I would be prepared to support a scheme whereby all articles with over 250000 views per month are flagged. (In fact, since that's only about 300 pages, 100000 views a month might be a better benchmark.) Hut 8.5 18:09, 13 August 2008 (UTC)

Compromise solution?

Would it not be possible to present anybody opening Wikipedia with an initial screen asking whether they wish to see raw or checked articles? --PL (talk) 10:32, 21 October 2008 (UTC)

Let's just try this

While I understand the concerns that people are expressing about the implementation of flagged revisions on the English-language Wikipedia, I feel that we, as a community, should be less reluctant to try out new things that can be reversed as necessary. The German Wikipedia experiment, while valuable, does not necessarily provide results that can be directly superimposed on the English Wikipedia – the de.wp. community is very different in many ways from the en.wp. community.

Therefore, I propose that we try out an implementation of FlaggedRevs on the English-language Wikipedia – multiple implementations, if necessary – and see how the community, and the public, react. If the reaction – in all of its multiple aspects – is positive, then the feature should be left enabled; if not, then the feature should be disabled. And it's just as simple as commenting, or uncommenting, a line in LocalSettings.php.

Let's at least try it, folks ... – Thomas H. Larsen 09:31, 6 August 2008 (UTC)

I think there are still a lot of people interested in trying this on en.wp. However, it seems unlikely that anything is going to be done until the results of the experiments on the German site are in. I'm particularly interested to see whether they can make it to 100% of articles tagged at least once (at 67.28% now) and whether that will then allow them to decrease the average time before changes are re-tagged down to six hours or less (at 3.2 days currently).
Yes, things may work differently here, but if the system does or does not work there then that is a fairly strong indication of how things might go here without adjustments to the implementation. --CBD 10:59, 8 August 2008 (UTC)
I'm all for introducing flagged revisions on en.wp. However, I think "trying it out" is quite a big deal - as seen on de.wp, it requires quite a bit of work for the "sighters", and trying several implementations may confuse people rather than help. Actually, without having any scientifically accurate numbers, it seems to me that quite a few editors were discouraged from flagging revisions when de.wp switched back to showing the unsighted versions by default; the backlog has much increased since then. So, I think the best way to go is to wait a bit longer for the German results, at least until their final decision is made regarding the use of flagged revisions; in the meantime to update our proposal /Sighted versions to reflect any lessons learned from the test on de.wp; and then to start a smaller-scale introduction on en.wp in a well-prepared manner. --B. Wolterding (talk) 11:16, 8 August 2008 (UTC)

4th proposal added

See Wikipedia:Flagged revisions/Checkpoints and grading. There were some things I like and didn't like about each of the three previous proposals, so I took a spin at a new synthesis. -- Beland (talk) 17:47, 13 August 2008 (UTC)

Update

The poll on the German Wikipedia has closed with 708 in favor of continuing flagged revisions with IP users seeing the most recently 'sighted' version (Option 3) vs 362 preferring no flagged revisions (Option 1) or 197 preferring all users to see the most recent edit (Option 2).

Apparently there is a standard of 50% approval required to 'win' the vote. On the talk page some are objecting that the 362 plus 197 plus 33 abstaining plus 129 saying 'voting is evil' totals up to 721, and therefore 708 falls short of 50%. Others are saying that the 33, 129, or both are not counted towards the total and thus 708 passes 50%. I tried to point out that quite a few of those 33 + 129 stated that they preferred Option 3, but felt that the poll was improperly biased against it... which would obviously make counting those as opposed to Option 3 incorrect. In any case, from what I could gather it seemed likely that Option 3 had been or would be 'officially' ratified.
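The dispute comes down to which groups go into the denominator. A small sketch, using only the counts quoted above, shows the approval share for Option 3 under each reading; the function and dictionary names are illustrative, not anything used in the poll itself.

```python
IN_FAVOUR = 708                                  # Option 3
OPPOSING = {"option_1": 362, "option_2": 197}    # always counted
DISPUTED = {"abstain": 33, "voting_is_evil": 129}


def approval_share(count_abstain, count_voting_is_evil):
    """Share of Option 3 votes, optionally counting the disputed groups."""
    denominator = IN_FAVOUR + sum(OPPOSING.values())
    if count_abstain:
        denominator += DISPUTED["abstain"]
    if count_voting_is_evil:
        denominator += DISPUTED["voting_is_evil"]
    return IN_FAVOUR / denominator


for abstain in (False, True):
    for evil in (False, True):
        share = approval_share(abstain, evil)
        print(f"abstain={abstain!s:5} voting_is_evil={evil!s:5} -> {share:.1%}")
```

On these numbers Option 3 falls below 50% only when both disputed groups are counted in the total; counting either group alone still leaves it above the threshold.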

Meanwhile, the stats pages have been updated with a few new options and they've now got 73.52% of all articles flagged at least once. The listed completion time for flagging all pages is down to 42 days, but that is based on average progress over the life of the feature. At the current rate of progress it'd actually take about 80 days. Over 18,000 flags are set per day, but many of those are reflagging pages after changes. The past few days there seems to have been a concentration on reducing the number of pages with unflagged revisions... likely in anticipation of the feature causing IPs to see flagged versions being reactivated. Focusing on keeping pages up to date will likely further slow progress towards getting all pages flagged. From what I've seen they might actually be better off leaving the most recent edit displayed until they've got all pages flagged. Keeping pages perpetually flagged is time consuming, but reviewing all changes since the last flagging, even if it was a couple of weeks ago, is usually easy. --CBD 20:00, 1 September 2008 (UTC)

Proposal: Sighted revisions for specific articles

We don't have a consensus at this time for generalized flagged or sighted revisions, but it would be interesting to enable sighted revisions for certain articles. It would require a consensus on the talk page, after which an administrator can enable sighted revisions on that page; this means that IPs will only see the latest sighted version. Users in the group surveyors can then flag a revision as sighted. (The group surveyor includes administrators and can be assigned by administrators and removed in case of abuse.) This has been proposed as a remedy in a current request for arbitration here. Cenarium Talk 15:21, 17 September 2008 (UTC)
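The display rule in this proposal can be sketched in a few lines. This is only an illustration of the logic as described above, not FlaggedRevs code: the class, function and group names are hypothetical, and two points are assumptions the proposal does not spell out (logged-in users see the latest revision, and a page with no sighted revision yet falls back to the latest one).

```python
from dataclasses import dataclass


@dataclass
class Revision:
    rev_id: int
    sighted: bool = False


# Per the proposal, the surveyor group includes administrators.
SIGHTING_GROUPS = {"surveyor", "administrator"}


def can_sight(user_groups):
    """True if a user in these groups may flag a revision as sighted."""
    return bool(SIGHTING_GROUPS & set(user_groups))


def revision_to_show(history, sighting_enabled, is_logged_in):
    """Pick the revision a reader sees; `history` is ordered oldest first."""
    latest = history[-1]
    # Assumption: only IP readers are held back to the sighted version.
    if not sighting_enabled or is_logged_in:
        return latest
    for rev in reversed(history):
        if rev.sighted:
            return rev
    # Assumption: nothing sighted yet, so show the latest revision.
    return latest


history = [Revision(1, sighted=True), Revision(2, sighted=True), Revision(3)]
print(revision_to_show(history, sighting_enabled=True, is_logged_in=False).rev_id)
```

With sighting enabled on the page, an IP reader in this example is shown revision 2 until a surveyor sights revision 3.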

I would be prepared to support this. The only way that flagged revisions could actually function to increase the openness of Wikipedia towards unregistered and new users is if it was introduced as an alternative to page protection on a page-by-page basis (with the criteria being no more loose than our current semi-protection criteria). Hut 8.5 17:21, 17 September 2008 (UTC)
I created a detailed proposal here, any comments are welcome. Cenarium Talk 10:59, 18 September 2008 (UTC)
I have updated my proposal, what do you think ? Cenarium Talk 16:10, 18 September 2008 (UTC)
This involves the ability for administrators to restrict the sighting of an article to a user subgroup of surveyor, say established surveyor (the name may be changed). This would be very helpful in cases of disputes, because the group of surveyors, charged essentially to prevent vandalism from being visible, is too large, while established surveyors are trusted enough for their respect of our core policies and known not to edit war. Cenarium Talk 17:25, 18 September 2008 (UTC)
Seems reasonable. I think it would be good to have flagged revs here. But it is a form of protection, so admins should enable it consistently with WP:PROTECT -Steve Sanbeg (talk) 18:07, 18 September 2008 (UTC)
So this essentially becomes sighting in lieu of semi-protection, restricted sighting in lieu of full-protection. It sure seems like a win-win to me. Jclemens (talk) 18:11, 18 September 2008 (UTC)
Sometimes we'll still need semi-protection, because the level of vandalism is too high and we would have to revert a thousand times a day (for some politicians, for example), and adding sighted revisions restricted to established surveyors will help to avoid edit wars, non-consensual additions, etc. Cenarium Talk 10:24, 19 September 2008 (UTC)
Well, I'm not particularly happy with this as a long-term solution. However, I do think it could be a reasonable short term solution, although I'm very much opposed to allowing only administrators to set it. (Flagging is not a form of protection, which restricts who can edit; it's a form of stability, which controls which versions of articles are displayed by default to the public.) Why not simply enable it, and allow certain users to set it on pages as they see fit (with strong recommendations against setting it on articles that are already fairly stable)? – Thomas H. Larsen 04:41, 19 September 2008 (UTC)
Agree that it's not really a form of protection, but rather of control over the content. If we allow more users to enable it, the number of pages with sighted revisions enabled will strongly increase, and we may not have enough surveyors to check the edits, so we'll have huge backlogs. In the future, we may allow "established surveyors" to enable it also. The sight protection restricts the ability to sight a page with sighted revisions enabled to "established surveyors" or admins, and may work like the usual edit protection and move protection ([sight=established surveyor], etc). Cenarium Talk 10:19, 19 September 2008 (UTC)

Please don't use the word 'sighting'. It's far too vague and muddies the water about what is being proposed. 'Sighters' are screeners. That is their function and they must be stamped on firmly if they start trying to do anything else, such as acting as moderators, which is precisely what has been happening under the current German system (see below) -- and which, of course, is totally unacceptable. --PL (talk) 10:21, 17 October 2008 (UTC)

Straw poll

I've created a straw poll on flagged revisions to acquire some Public Opinion (TM). Everybody is welcome to participate at Wikipedia talk:Flagged revisions/straw poll. – Thomas H. Larsen 23:57, 19 September 2008 (UTC)

Straw poll statistics

Here are the statistics from the straw poll.

  • ~27 people argued that flagged revisions should be enabled over all articles.
  • ~10 people argued that flagged revisions should be enabled only over some articles, decided on a per-article basis.
  • ~14 people argued that flagged revisions should be enabled only over some articles, determined on a categorial basis.
  • ~2 people argued that flagged revisions should be enabled on any article at flaggers' discretion.
  • ~17 people argued that flagged revisions should not be enabled at all.
  • No people admitted that they didn't know the right course of action.
  • ~3 people claimed that they didn't really mind.

These are interesting statistics. They indicate that about 53 people are in favour of enabling flagged revisions on the English-language Wikipedia in some form, while 17 people oppose the introduction of the feature. Only 3 people don't have a polarised position on the issue.
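For anyone who wants to check the totals, the approximate counts in the list above can be summed directly. The grouping into "in favour in some form" versus "opposed" is an interpretation of the list, and the dictionary keys are shorthand labels rather than the poll's actual option names.

```python
# Approximate counts as listed in the straw poll statistics.
straw_poll = {
    "all articles": 27,
    "per-article basis": 10,
    "categorial basis": 14,
    "flaggers' discretion": 2,
    "not at all": 17,
    "don't know": 0,
    "don't mind": 3,
}

# Interpretation: the first four options all enable flagged revisions somewhere.
in_favour_options = ("all articles", "per-article basis",
                     "categorial basis", "flaggers' discretion")

in_favour = sum(straw_poll[option] for option in in_favour_options)
opposed = straw_poll["not at all"]
total = sum(straw_poll.values())

print(f"In favour in some form: {in_favour} of {total} ({in_favour / total:.0%})")
print(f"Opposed: {opposed} of {total} ({opposed / total:.0%})")
```

Grouped this way, roughly 53 of about 73 participants favoured enabling flagged revisions in some form, though they split across four different implementations.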

Those arguing for flagged revisions to be enabled did so mainly from the point of view of reliability, while those who opposed the feature generally expressed concerns over deterring anonymous editors.

So, there we are. Interpret these statistics how you will. I hope that they'll provide a good benchmark for further discussion. – Thomas H. Larsen 23:00, 10 October 2008 (UTC)

I don't think you read the don't mind opinions, which were all (including me) in favour of flagged revisions but did not mind which option for turning them on took place. So that is really another 3 in favour. Davewild (talk) 08:07, 11 October 2008 (UTC)
It seems there is considerable weight of consensus behind implementing FlaggedRevisions in some fashion. I think now we need to decide on one, possibly compromising possibly not, implementation of the system, work out all the values and parameters, before submitting to one final straw poll to demonstrate a clear consensus to the developers. That development step in the middle should definitely not be a poll - a detailed discussion is needed to find the best solution for everyone. Happymelon 10:50, 11 October 2008 (UTC)
I think I spotted sample size bias. Can 70 people's views represent whole Wikipedia? Probably not. OhanaUnited Talk page 18:43, 11 October 2008 (UTC)
It represents everyone who bothered to comment over a long while in which the proposal was advertised to all users via WP:CENT. Is there a better way? ffm 18:52, 11 October 2008 (UTC)
Ideally, we should try to get the views of the people who will be most affected by flagged revisions - new and unregistered editors. We could probably add a notice to a mediawiki namespace to get their attention. Hut 8.5 19:15, 11 October 2008 (UTC)
Any news on the DE statistics yet ? according to this, [2], 2 bots are doing the flagged revisions. Mion (talk) 20:17, 11 October 2008 (UTC)
In other words, it represents the opinion of established users who are usually involved to some degree in the Wikipedia decision-making process, which is quite a small group of people (maybe a few tens of thousands) considering the number of Wikipedia editors (over 7,000,000). Admiral Norton (talk) 20:31, 11 October 2008 (UTC)
It's the only data we have. Besides, if we do what Happy-Melon proposes, it does not matter that much if the sample is not representative because it is not used to make any real decisions. -- Jitse Niesen (talk) 20:23, 12 October 2008 (UTC)
Really, it's a question of due diligence, and, IMO, there's two ways we can go about it: One, we can have yet another poll, this time on specific implementations of flags (with the risk that we won't get anywhere because our standards for acceptance are too high), or two, we can give it a "test run" and promise to revisit it X months down the line. I'm for the latter, mostly because I interpret the poll numbers as an indication that this is not as big a deal as, say, notability. Nifboy (talk) 23:03, 12 October 2008 (UTC)
Unless you establish benchmarks and criteria before a test run, you will never be able to revert the decision. That is because there will always be users who embrace the concept, invest lots of time and will do anything to promote it as a success, even if the result is a total failure by any objective standard. This is what happened on de-WP. The poll that declared the feature permanent was one about "feel good" and self-affirmation, not about any measurable success. --h-stt !? 13:03, 13 October 2008 (UTC)

On another note, we'll still have much work before proposing sighted revisions for discussion to the full community (otherwise it'll be a big waste of time). How we will structure the discussion pages, in particular. Please see this page for a beginning. Cenarium Talk 12:41, 14 October 2008 (UTC)

Doesn't say much to me, I'm afraid! I really think you should be careful about embarking on anything that can't be spelt out in everyday words of one syllable! It could be rather like agreeing to involuntary devivificatory reification! Please start out by stating very simply in virtual kiddie language (and avoiding the word 'sighted' -- gesichtet means 'examined' or 'screened') exactly what you are proposing, precisely what results you expect it to have and how it would be better than what happens now. If this can't be done, and doesn't accord with the actual current German experience of it, it shouldn't be attempted. --PL (talk) 15:59, 14 October 2008 (UTC)
The description of sighted revisions is here. How to implement it is the question. There's definitely a strong opposition to implement it on all articles, as seen in the discussion and the poll results: 43 users oppose and 23 support. So we won't enable it like on dewiki, at all. I will try to clarify. Cenarium Talk 16:36, 14 October 2008 (UTC)
Please be careful when interpreting statistics from as complicated a poll as this. While the numbers you quote are technically correct, they are no more 'true' than the claim that, since 100% of contributors knew how to proceed, we collectively know how to proceed! All the statistics can objectively 'prove' is that 83% of people know that you can use statistics to prove anything! Happymelon 20:10, 14 October 2008 (UTC)
I share your point of view on polling and stats. This was only given as an indication, and as a mirror to the above data. The discussion, as I said, shows that there is strong opposition to enabling flagged revisions everywhere. Cenarium Talk 17:40, 15 October 2008 (UTC)

The way forward is to recognise that there are a number of substantial 'blocs' of support for various proposals, which need to be consolidated. We have about half a dozen proposals there; clearly we can only 'implement' one (counting "no FlaggedRevisions" as a possible 'implementation'). The only way we can hold a poll which demonstrates objective support for one proposal is to first distill the large number of possible options down into just two; inevitably, that means "no FlaggedRevs" and "FlaggedRevs configured like so". A poll against those two issues is entirely unequivocal, that's what we should be aiming at. While a "trial run" or "test period" is a nice idea, it simply isn't going to happen - the developers simply will not activate FlaggedRevs on en.wiki without a crystal clear consensus from this community. It is also wrong to assume that you can just run a succession of polls to narrow the options down: voting really is evil when it replaces active discussion. This middle section should primarily consist of a civilised discussion of the merits of the various proposals, and probably ending up with a nice compromise at the end. Polls are to break deadlocks, they should otherwise be avoided. I think we've just broken the latest deadlock, let's lay off the voting until we really need it. Happymelon 20:10, 14 October 2008 (UTC)

I wasn't aware of this straw poll. I would have joined in if I had known about it because I have strong opinions on the issue (I'm for flagged revisions on all articles). Why not draft the major proposals, announce it in the Signpost, and ask the developers to place an announcement banner for a few days across the top of the Wiki? That should generate better participation. Cla68 (talk) 06:36, 15 October 2008 (UTC)
We can create such notices ourselves by adding them to MediaWiki:Sitenotice and/or MediaWiki:Anonnotice. Before we do that, however, we need to come up with something coherent to say. Happymelon 12:15, 15 October 2008 (UTC)

But before you do, learn from the German experience, since it’s the only experience we have to go on, or you won't learn anything at all:

In my recent encounter with it, first of all one of the Sichter deleted a clause that had been there for months, if not years -- apparently because, not knowing the subject, he didn't realise its significance -- then accused me of starting an edit war when I reinstated it explaining that there was no problem. When I substituted a different version of it to take account of his objections, a second Sichter deleted it again and proposed that I write an entirely separate section. When I protested, he then substituted a clause of his own invention, which a third Sichter then deleted for entirely different reasons. All these interventions were flagged as ‘Sichtungen’. Their Sichter, in other words, are already arrogating to themselves the role of low-grade Admins.

Thus, if you introduce a flagged system, you are going to have physically to prevent Sichter from acting as ordinary contributors -- which may well mean that you won't get any Sichter at all. Merely stipulating that they must take off their hats as Sichter when they write as contributors won’t work because, as you know perfectly well, human beings are human beings. As a result, you will simply produce an army of potential little Hitlers, and the system (as I told the Germans to their intense annoyance) will soon become fascistic rather than democratic. It's inevitable.

Is that what you want? Because I assure you that that's what you'll get!

Don't say I didn't warn you! --PL (talk) 09:19, 15 October 2008 (UTC)

What you describe is the Iron Law of Oligarchy, and one of the most important criteria to use in a discussion of policy is how to delay the certain. Z gin der 2008-10-15T14:25Z (UTC)
This is, I think, another reason to develop another level of flagged revisions: 'confirmed' revisions, or something similar. If we enable sighted revisions on disputed or otherwise contentious articles, it will be a real mess. However, if it is possible to set up confirmed revisions, such that the last one is displayed to IPs and only admins or certain trusted users can confirm revisions, then it will save us a lot of trouble and allow us to improve our control of disputes altogether. Cenarium Talk 17:58, 15 October 2008 (UTC)
"Trusted users"? And how do we determine who is best-suited to this next level of bureaucracy? In addition, admins have enough to do, and are not inherently subject-matter experts: they may be experts on policies and guidelines, but the subject area is a whole other thing.
So, what is driving this desire to impose some sort of rigid order on WP? (And rigid order is what it will become. Seems to be the exact opposite of Jimmy Wales' original idea and ideal.)
Also, I'd suggest that reading and digesting PL's comments would be a good idea. Humans are humans seems to be his biggest point, and this flagging idea reminds me of handing the keys to the henhouse to the fox and the keys to the pasture to the wolves. •Jim62sch•dissera! 21:00, 15 October 2008 (UTC)
This is not about having subject-matter experts. Our current editing process should not be changed, and a type of flagged revisions must have a determined objective.
  • Sighted revisions should be used only to prevent vandalism and clear violations of our core policies (BLP violations, spam, etc), and any edit that is not vandalism or a violation must be sighted.
  • Confirmed revisions should be used only as a temporary help, to control disputes on certain articles (as an alternative to protection). As such, only admins (who spend a lot of time on disputes) should be able to "confirm" revisions, or certain 'trusted' users chosen because of their experience in dispute resolution, neutrality, etc. What I think doesn't work on dewiki is that only one type of flagging is used, and for a variety of purposes. Thus it creates difficulties in case of a content dispute. That's why we need confirmed revisions: they will force the consensus-based process to run smoothly, with revisions being confirmed when they are unrelated to the dispute or consensual. There are no, and (I hope) there never will be, "subject-matter experts" on enwiki with the last word. Everything works on consensus and WP:BOLD: any user can edit, that's all, and in case of dispute, the matter is settled on the talk page. Cenarium Talk 10:21, 16 October 2008 (UTC)

"Trusted users"? Quite! Who is supposed to decide that? Quis custodiet custodes ipsos?

As for "rigid order", you can understand why that appeals to the Germans -- but will it appeal to you?

To answer this, first, establish what is wrong with the present system: second, explain clearly in words of one or two syllables what you are trying to achieve and how you propose to do it: third, say exactly how the result is expected to be better than the present system: and fourth, explain what evidence there is for this in the only experience we have of it so far -- namely the German system. In short, is the present German system better than the original one, or not? [see separate section below]

If not (and in practice, as I have explained, it's far worse!) -- forget it! --PL (talk) 10:04, 16 October 2008 (UTC)

In reply to the first part of your comment, see my response above. Admins, who I think are 'trusted', already frequently have to implement changes on protected pages when there is a consensus. This is old business. We could have another consensus-based process to choose users able to confirm revisions, say, moderators, who are required to have a certain experience and neutrality in disputes. Any admin can remove moderator (and of course surveyor) rights in case of abuse. Now, you may ask: who guards the admins? It's the arbitration committee. And who guards ARBCOM? Good question; Jimbo, on occasion. But they are elected and highly trusted users. Maybe we could discuss the German system in the thread below; I don't know how it evolved, and I am proposing something entirely different. (In any case, I don't like the concept of "draft" articles.) Cenarium Talk 10:45, 16 October 2008 (UTC)

It's what I think you would call a 'project' -- i.e. a not-quite-yet article. The Germans (see separate section below) have then taken that as an excuse for their Sichter (not 'sighters', but 'screeners') to act as moderators and generally act as local bullies in charge of everybody else. This cannot be allowed. --PL (talk) 15:53, 16 October 2008 (UTC)

It's what I say: they use sighted revisions for different things, so it creates problems. We won't have these problems. Just to clarify:
  • Any user who hasn't vandalized or recently violated editing policies like WP:BLP, WP:V, WP:ATP, WP:NPOV and WP:NOR, and essential behavioral guidelines (WP:NPA, WP:NLT, WP:SOCK, etc), provided the user's contributions are significant enough to judge fairly, will be granted surveyor rights. We may also have an automatic assignment.
  • Any revision that contains no vandalism and no clear violation of policies will be sighted.
  • Moderators should be elected in a sort of mini-RFA, to make sure they are trusted users and won't abuse their rights
  • Confirmed revisions are only used in articles where sighted revisions cause problems, essentially in disputes.
Sighted revs should never be used in disputes, since it's not the role of a surveyor to handle disputes, we have confirmed revs and trusted mods for that. If a sighter acts like a 'bully', the rights will be removed. Cenarium Talk 13:19, 17 October 2008 (UTC)

So the only essential change from present practice is the election of moderators? If not, please choose a clearer and simpler vocabulary. All this nerdspeak is hurting my head... --PL (talk) 15:23, 17 October 2008 (UTC)

The German implementation (discussion #21685)

Teased apart from the section above; apologies if the structure and flow is consequently quite poor. Happymelon 12:13, 15 October 2008 (UTC)
Having had recent, not very pleasant experience of the current German system, I would offer the following (hopefully jargon-free) observations:
1. Ordinary punters, however expert, are not allowed to post or correct finished articles directly, but only drafts or 'projects'.
2. Instead, all articles must be 'gesichtet' (checked) by a 'Sichter' (basically, an ordinary Joe who has been co-opted, but who doesn't necessarily know anything about the subject in question and may not even be particularly well-educated). This introduction of 'order' and 'discipline' into the process naturally appeals particularly to the German mind, but might not to yours!
3. This would continue to be the case if the third stage of the German system were implemented and the article in question then had to be checked for its factual accuracy by an expert on the subject in question -- which appears to be the real, ultimate purpose of the change so as to give Wikipedia 'authority'.
4. Just who the expert was would presumably have to be decided by the ‘Sichter’, and if the experts disagreed (as they often do) he or she would presumably have to choose one as being ‘right’ and ignore all the rest as being ‘wrong’ -- and that would probably be the one who is most insistent that he is an ‘expert’ (i.e. almost by definition an ignoramus).
5. No expert that I know of would be prepared effectively to seek the permission of non-experts under such a system.
6. The result would in many cases be just 'checked nonsense', with the apparent authority of the 'Sichter' behind it, but in fact reduced authority for Wikipedia.
7. Meanwhile, all the usual spontaneity and life is drained out of the system, with any interchange of views being constantly bogged down by the intervention of the Sichter (plural), who already increasingly arrogate to themselves the function of Admins, even pronouncing on what they think should be in the article and what shouldn't (give certain people any kind of authority and you know what happens…).
8. Thus, in my view, if you adopt such a system you will merely be making a rod for your own backs, and will probably regret it for the rest of your days! Because of voting inertia, it'll be good-bye to the wonderful, chaotic concept of the 'free encyclopedia' for ever! On past experience, certainly I shan't be contributing to it! --PL (talk) 10:08, 14 October 2008 (UTC)
This is appalling. Flagged revisions were developed to deal with vandalism and egregious violations of our policies, not so that a few users can rule on the content of articles. The key principle is that if an edit is not vandalism or a clear violation, it must be sighted. I expect that the kind of behavior described above will be frowned upon and lead to immediate removal of sighting rights. This also highlights the need for strong guidelines, strict requirements and safeguards for sighted revisions, prior to implementation. Cenarium Talk 12:41, 14 October 2008 (UTC)
Note that FlaggedRevs can technically be used for almost anything. It really depends on local policies (within the confines of what the foundation will allow), though I'll admit I'm not really familiar with the German policies. But it's certainly possible to have a level of flagging that goes beyond "vandalism-free", and I believe the original, original proposals, back before the extension was even completed, had such a system in mind, where there was a lower level of flagging for most articles, then an upper level for articles like FAs. Though I personally would take one dissatisfied person's explanation of the system with a grain of salt. Mr. Z-man 22:08, 14 October 2008 (UTC)
I think we need to be realistic about what we can do with sighted revisions, and respect the wiki spirit. If we are given a flagged revisions level, it should have a specific objective and be restricted to it. As I said below, sighted revisions should be used for vandalism and clear violations (and not for dealing with disputes, accuracy checks, etc). Confirmed revisions should be used for disputes. Quality versions seem to be unnecessary and redundant to existing processes. Reliable versions: highly unrealistic and anti-wiki. A judgment call by some user that an article is "reliable" (so that the rev shown to IPs is the latest reliable one) is a frankly ridiculous idea, cannot be efficiently controlled, and is purely against the wiki spirit, which is based on collaboration (WP:OWN, etc). If the reliability of an article were judged in a specific debate, I wouldn't be so hostile to it. But it would add bureaucracy and considerably reduce the essential reactivity of Wikipedia (fix the article Barack Obama to an old version and expect its traffic to drop dramatically). While the system of sighted revs or confirmed revs will be very reactive (especially on articles related to current events), the process to check the reliability of an article would be too time-consuming if done well. Thus, I believe the 'old way', i.e. WP:BRD, is still the best option in the matter of reliability/accuracy checks, and the only one in accordance with the wiki spirit (the problem is always the same and flagged revisions won't change it: some articles are unnoticed or remain unwatched longer than usual, and 'bad' content can stay there). Regarding 'experts', we know it doesn't work on Wikipedia (remember?), and editors should have equal rights on content. Disputes are not resolved by a ruling of some expert but by consensus.
The problems described by PL seem to stem from the fact that sighted revisions (or whatever their name) are used both for vandalism/violation checks and for "accuracy checks" in a non-controlled way. Having a clearly defined objective for sighted revs, confirmed revs, and possibly other flagged rev types would solve the issue. Cenarium Talk 19:34, 16 October 2008 (UTC)
Looks like a bad idea to me. Unless the Sichter is an expert in the specific field or fields related to the article, such "over-sight" is utterly pointless. Seems like trying to create a peer-review system without real peers. Quite honestly, such a system cannot fail to degenerate into the German system whereby the Sichter becomes der Richter in a WP version of a Volksgericht. •Jim62sch•dissera! 20:04, 14 October 2008 (UTC)

I don't think that yet another extensive discussion of the merits and demerits of the German implementation are appropriate for this section (althoguh of course they are always valuable in their own place). They should be split to a separate section. Happymelon 20:10, 14 October 2008 (UTC)

He who chooses to ignore both history and present processes of a similar nature is doomed to repeat the mistakes of both. Hence, an awareness of a situation that is very, very much the same is beneficial. •Jim62sch•dissera! 20:36, 14 October 2008 (UTC)
Despite my horrific mangling of the word "although", the meaning of my post above is (IMO) quite clear: the German implementation of FlaggedRevisions offers great opportunity to learn, and its successes and failures should influence any implementation here. My point, however, is that such discussion is not appropriate in this particular thread, which discusses the data gathered from the straw poll above and procedural notes on future development. The original post by PL is entirely unrelated to the preceding discussion, and should have been made in a separate section (where its comments could have been more easily considered). As that did not happen, we now have the unenviable task of separating two interwoven strands of unrelated discussion. Again, I believe that this latest discussion on the German implementation should be carefully split to its own section. Do you disagree? Happymelon 21:31, 14 October 2008 (UTC)

I'm afraid I do. The whole point about bringing in the German experience is not the German experience itself, but the lessons that it teaches us regarding the risks of introducing a similar system here. It would be silly to ignore what is, in fact, the only experience of it we have in favour of a mere technical discussion that pays no attention to human tendencies and motivations or the actual results of such a system as they can be observed in practice (see previous section). There is almost as much danger of theory becoming disconnected from reality as there currently is in the world’s banking system! So I propose that this discussion be unsplit again: splitting it merely allows the technocrats here to close their eyes to reality. --PL (talk) 16:10, 15 October 2008 (UTC)

You're entirely missing the point of the section above, which is essentially a discussion about how best to have a discussion! It is, essentially, ludicrous, but it nonetheless must be had. The sooner we can decide that and get on with the actual discussion, the better. You raise some extremely interesting points about the German implementation which need to be addressed. But by mixing those points in with the "technical discussion" above (and you are quite correct to label it that), you risk those points being lost in that discussion, which will fairly rapidly close and move onto other things. It is disingenuous to think that by splitting I am trying to 'choose' one discussion over the other; rather, I'm trying to make it easier to continue both discussions to their natural conclusion. The discussion above can conclude pretty rapidly as soon as we decide how to proceed; the points about the German implementation are of no relevance to that technical discussion, so placing them there just confuses the issues. Those points, however, are of great relevance to the FlaggedRevisions discussion as a whole and deserve to be considered fully and clearly; putting them in their own section is the best way to achieve that. It's silly to think that only one thread on a page can be active at any one time, because that is simply not the case. Do you see what I'm trying to get at - splitting the threads is to make it easier to consider the points you raise, not harder. What's wrong with that? Happymelon 17:05, 15 October 2008 (UTC)

OK. But what's the point of having a technical discussion about how to do it if you already know from the results of its application in Germany that it’s something you wouldn't want to do? Eventually the discussion will merely acquire a momentum of its own that will pitch you into it whether you want it to or not, simply because it's too late to stop.

So -- first, establish what is wrong with the present system: second, explain clearly in words of one or two syllables what you are trying to achieve and how you propose to do it: third, say exactly how the result is expected to be better than the present system: and fourth, explain what evidence there is for this in the only experience we have of it so far -- namely the German system. In short, is the present German system better than the original one, or not?

If not (and in practice, as I have explained, it's far worse!) -- forget it!

--PL (talk) 10:22, 16 October 2008 (UTC)

You are presenting as a fait accompli the fact that the FlaggedRevisions system is failing so catastrophically on the German Wikipedia that its successful implementation here is impossible. That is your opinion, to which you are entitled, and to the fact of which we must give due thought and diligence. To discount the results and issues of the German implementation would be unacceptable. However, even more unacceptable is to suspend every discussion on the page every time an editor presents a new argument against FlaggedRevisions as a whole. It is simply not correct to state that we "already know... that it's something [we] wouldn't want to do"; the only person in a position to know that is you. I'm grateful to you for sharing your concerns here, and we need to be very sure that they are addressed in the forthcoming discussion about how best to implement FlaggedRevisions. If in the course of that discussion we realise that, learning from the lessons of the German implementation, there really is no way at all to successfully implement FlaggedRevs here, then of course the project will have to be abandoned. But refusing to have or allow that discussion to proceed until editor X is satisfied that their personal concerns (however well-grounded) are resolved is not the way to establish the consensus that is what we so desperately need. The solution is to work together and get as many people to contribute to moving the discussion forward as possible. I think you may be erroneously assuming that the discussion will lead inevitably to some form of FlaggedRevisions; if the concerns you raise are really insurmountable, then the result will instead be "no way". (also)Happymelon 18:15, 16 October 2008 (UTC)
What are you referring to as the original one, and what has changed in the current system? Do you mean before flagged revs, and now? I looked at the current German TFA, [3]. First I see that it's fully protected. I see there is a separation between the Artikel and the "Entwurf" (draft) when the latest rev is not sighted, though it seems to be purely visual (there was no such separation in the test wiki). Cenarium Talk 17:58, 16 October 2008 (UTC)
Full-protection without cause? Beyond that, I'm not sure what you're trying to say. •Jim62sch•dissera! 19:19, 16 October 2008 (UTC)
Cause seems to be TFA, but we're not to judge other wikis' habits. There's not much to say on this. I proposed solutions to the existing problems on dewiki; this is what we need to focus on: how to successfully implement flagged revs, and use the Germans' experience. The latter remark was only about appearance. Cenarium Talk 19:43, 16 October 2008 (UTC)
I think what's missing here is any clear definition of need, a clear statement of potential benefits, a clear statement of drawbacks, etc., etc., etc. •Jim62sch•dissera! 19:55, 16 October 2008 (UTC)

And I would urge strongly that we abandon the term 'sighting', which is far too vague and doesn't say what is supposed to be going on. The German term means 'examining' or 'screening', and the latter is presumably exactly what is being proposed. 'Sighters' are in fact screeners, and use of the term will make it quite obvious should they start exceeding their terms of reference and acting as moderators, which is precisely the problem with the current German system. Articles should be articles, not 'projects', and freely submittable. If what is being proposed is a supplementary army of official screeners who then flag those articles as having been screened, then please say so. But in that case isn't this exactly what happens already, albeit unofficially? --PL (talk) 10:13, 17 October 2008 (UTC)

And the separation sighted revs/confirmed revs and surveyors/moderators will precisely solve this problem. Sighted revs-> vandalism and clear violations, confirmed revs-> disputes. The other part of your comment is a common opposition to flagged revs, which has been advanced as a reason to use sighted revs only on articles with a real need for them (high visibility, subject to vandalism, repeated blp violations, etc). Cenarium Talk 13:26, 17 October 2008 (UTC)

Has anybody addressed the question of why we need to decide how best to implement FlaggedRevisions? Shouldn't the first question be whether, not how? --PL (talk) 15:17, 17 October 2008 (UTC)

The poll showed that the majority of people support flagged revisions being enabled, with the argumentation that it is essential to maintain reliability of content. Those in the minority claim that the feature deters anonymous editors and thus violates the project's spirit of openness. Nevertheless, reliability is more important than openness since it gives content value, and those advocating for this side seem to have a stronger argument. Thus, you might say that we are going to enable flagged revisions at some stage in the future, and there's no point putting this off unless somebody can give us hard statistics that flagged revisions is actually causing a drop in quality on the German Wikipedia. – Thomas H. Larsen 22:59, 17 October 2008 (UTC)
Misinterpretation. I support enabling sighted revisions in certain specified cases, but I am also concerned that, as you say, it deters anonymous editors, and also new users. I am not the only one with this view, as you can see in the discussions. We need to find a balance between both needs. I see that there's much talk, but no effort to find solutions to known problems. Could someone address my points and comment on my proposals? So far, the discussion hasn't been constructive. Cenarium Talk 00:32, 18 October 2008 (UTC)

The German philosophies are and have always been different from English philosophies. The German and English Wikipedia communities have different processes and have had for a long time. There are some things that are very controversial on the English Wikipedia that would never be tolerated for a day on the German. The Germans as a whole are more into organization, and English speakers more into libertarianism. Both philosophies have their advantages, and which one works best may have something to do with geography and resources (Iceland has a very fragile environment and one of the most conservative governments in the world). We do not have to use flagged revisions and I hope we do not, but if we do so, then only very weakly. Z gin der 2008-10-17T23:22Z (UTC)

The above account of the practice of flagged revisions on de-WP is entirely flawed by a total disregard for the purpose and use of flagging. I'm certainly not a fan of them - and would like to see them abolished today rather than tomorrow - but this report is ridiculous and must be corrected: Flagging on de-WP certifies that a version is not blatantly vandalized. Nothing less - nothing more. Flagging is not an issue in edit wars about content. So if PL claims that the flag was set on versions representing both sides of an edit war, this is not a flaw - the flagging is working as intended. Flagged versions are not the lesser brother of semi-protection, and if PL - or anyone else here - tries to sell them as such, this is simply a misrepresentation. --h-stt !? 10:32, 18 October 2008 (UTC)

My opinion is that surveyors shouldn't take a position in an edit war, and should sight reverts as long as they are not vandalism or clear violations. If there's a problem with sighted revisions, then we can remove rights or use confirmed revisions. PL has pointed out that it could be abused by surveyors trying to enforce their views on an article. Having witnessed several users using WP:AIV and WP:RFPP to gain an advantage in a dispute by labeling edits by IPs as "vandalism", I would not be surprised if that is indeed the case. How does the German Wikipedia handle this kind of situation? Cenarium Talk 14:38, 18 October 2008 (UTC)
On de-WP every experienced contributor is a "surveyor". And edits under IP or new accounts are usually flagged within minutes, if the change is not blatant vandalism. Edit wars and the resulting versions are flagged just like every other edit, and that is a feature, not a bug. So no one enforces any version in an edit war onto the other side by flagging or not flagging. We don't want the flagged revisions to be loaded with other issues. They are about vandalism, not about content, edit wars or any other issue. --h-stt !? 20:46, 18 October 2008 (UTC)

Case in point: The German article on Nostradamus had an inappropriate link to a novel (one that has often been inserted and then deleted again in the past) quite correctly deleted by another user yesterday. This was marked as ungesichtet (unsighted). Today it has been reinserted by a Sichter calling himself ‘Andante’ and duly marked gesichtet. I have therefore deleted it again on principle and reminded him or her that altering the content of the article has nothing to do with ‘sighting’ it (do we have to use this virtually meaningless term?). I have a fair idea of what is going to happen next… :( --PL (talk) 10:56, 26 October 2008 (UTC)

Next day: As expected, the 'sighter' in question has now threatened to have me blocked if I dare to re-delete the title that he has now reinstated -- were it not for the fact that another 'sighter' has done it for me. What on earth does all this have to do with 'sighting', I wonder? --PL (talk) 11:17, 27 October 2008 (UTC)

I don't see why you connect that with the "gesichtete Versionen". Please deal with it like every other edit war: discuss it on the talk page. State why you deem the book in question to be unsuited for the article. I didn't check it over there, but your description here sounds like you are fighting an edit war - and this would be a valid reason to ban you. But it is not connected whatsoever with the Gesichtete Versionen. --h-stt !? 15:05, 27 October 2008 (UTC)

I only did one reversion, explaining why in the edit summary (I was then accused of not giving reasons)! --PL (talk) 16:07, 27 October 2008 (UTC)

Intro

I realise we're still discussing how to have a discussion, but has anyone noticed that the intro section on Wikipedia:Flagged revisions is vague, poorly written and clichéd? I'm sure this is a work in progress, but it certainly doesn't bolster any confidence. •Jim62sch•dissera! 19:14, 16 October 2008 (UTC)

Garbage in, garbage out as usual, I'm afraid! --PL (talk) 10:15, 17 October 2008 (UTC)

Clear answers required

It seems to me that, before any decisions can possibly be taken, clear answers are essential to the following questions, free from acronyms, technical terms or obscure intra-Wikipedia references:

1. Why are the changes needed?

2. Exactly what changes are being proposed?

3. Exactly what results are they expected to achieve on the basis of actual evidence?

4. What checks on their operation are proposed?

5. What action will be taken if the hoped-for results fail to materialise?

6. What differences will ordinary punters notice, and how will they affect their ability to contribute?

Would somebody please answer these in simple, down-to-earth language that even non-technicians such as myself can understand? Feel free to intercalate (now there's a word!) your answers between the questions. --PL (talk) 16:07, 18 October 2008 (UTC)

I recommend that question 2 be tackled first; the answers to the other questions are likely to follow from the resolution to that question. Happymelon 16:34, 18 October 2008 (UTC)
Saving answers I wrote down a while ago:
  1. To prevent vandalism and blatant violations from appearing on high visibility pages, and better handle disputed and contentious articles.
  2. A type of flagged revisions known as "sighted revisions" (for vandalism/blatant violations), and a type of flagged revisions known as "confirmed revisions" (for disputes/contentious articles). Their use is subject to guidelines to be determined.
  3. IPs will see the latest flagged revisions (confirmed revs have priority over sighted revs), so won't see so much vandalism. It'll allow article development in disputes, while protection prevents it.
  4. Logs will record flaggings and de-flaggings for verification. Non reviewed edits will be listed in special pages. Any user can access logs and any surveyor can access the special pages.
  5. It will be improved so that hoped-for results materialise; its use may also be restricted. It's almost certain though that vandalism and other blatant violations won't appear on articles with sighted revisions, so we know it'll have immediate effects.
  6. 'Punters' will see less vandalism and fewer blatant violations. On certain articles, editors will see a notice like "Edits to this article may be slightly delayed to public view in order to deal with vandalism and blatant violations of our policies.". IPs will see the latest sighted (or confirmed) rev; users won't notice differences. Cenarium Talk 00:24, 19 October 2008 (UTC)
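To make answers 3 and 6 concrete, here is a rough sketch of the display rule as I read it. This is only an illustration, not how the FlaggedRevs extension is actually written: the names `Revision` and `revision_shown` are hypothetical, "confirmed has priority" is read as "the latest confirmed rev wins even if a newer sighted rev exists", and falling back to the latest rev when nothing is flagged is my own assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Revision:
    """Hypothetical stand-in for one page revision."""
    rev_id: int
    flag: Optional[str] = None  # None, "sighted", or "confirmed"


def revision_shown(history: Sequence[Revision], logged_in: bool) -> Revision:
    """Pick the revision a reader sees; history is ordered oldest to newest."""
    if not history:
        raise ValueError("page has no revisions")
    latest = history[-1]
    # Logged-in users notice no difference: they always get the latest rev.
    if logged_in:
        return latest
    # IPs get the latest confirmed rev if any, otherwise the latest sighted rev.
    for wanted in ("confirmed", "sighted"):
        for rev in reversed(history):
            if rev.flag == wanted:
                return rev
    # Assumption: a page with no flagged revs simply shows the latest rev.
    return latest


history = [
    Revision(1, "sighted"),
    Revision(2, "confirmed"),
    Revision(3, "sighted"),
    Revision(4),  # newest edit, not yet reviewed
]
print(revision_shown(history, logged_in=False).rev_id)
print(revision_shown(history, logged_in=True).rev_id)
```

Under this reading, an IP viewing the sample history gets revision 2 and a logged-in user gets revision 4. Whether a newer sighted rev should override an older confirmed one is exactly the kind of detail the guidelines would have to settle.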

Thanks. Is vandalism really such a problem? I get the impression that constantly correcting it is what makes some people's day!

If people really insist on going ahead, though, what would really help now would be a numbered set of simple instructions for new editors, entitled 'How to edit and what to expect when you do'. --PL (talk) 10:13, 19 October 2008 (UTC)

Vandalism is really a problem for the public. They will not trust Wikipedia if 1 in 100 or 1000 articles have obvious problems. An encyclopaedia that you can trust only most of the time is not very useful. Of course, we (experienced editors) know tricks we use when we gauge the reliability of an article, but most readers have never edited Wikipedia and have only a very vague idea how it works. Indeed, correcting vandalism is what makes some people's day, but that is not really the purpose of Wikipedia.
Yes, we will have to write a simple document. However, there are a couple of competing proposals doing the rounds. I imagine that people are hesitant to put much time in writing such a simple document as long as we're still discussing the various proposals. -- Jitse Niesen (talk) 11:06, 19 October 2008 (UTC)