User talk:Citation bot/Archive 29
This is an archive of past discussions with User:Citation bot. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Convert to cite biorxiv
- What should happen
- [1]
- We can't proceed until
- Feedback from maintainers
Google Books
- Status
- {{notabug}}
- Reported by
- Arminden (talk) 19:03, 16 November 2021 (UTC)
- We can't proceed until
- Feedback from maintainers
It's about this edit.
A book is listed there, as a one-asterisk item. Beneath it are 2 entries from that very book, listed as double-asterisk subsets. The book itself is provided with all details available, but the subsets only need the title & page number, with a quote in one case. There were already a couple of superfluous details in the first subset, Qasr Bardawil (R14), which details triggered the bot. By superfluous details I mean the "cite book" template as such, and the URL: the URL wasn't needed, as it didn't refer to the concrete page, and once the URL is removed, the template also becomes unnecessary. (The thing is, a few years ago I think I could access the page via the URL and now I cannot, which makes the URL superfluous, even if it did contain the page number.)
The bot made 2 types of mistakes. One might not be fixable, but the second one can and should be:
1) It added three unneeded details to the subset (ISBN, author, date — all 3 already contained in one-asterisk line above);
and the worst problem,
2) the author was added three times! in various forms (same person, same name, once correct, once with middle name, once with title "Professor").
Also, the "date" is not needed anyway, the "year" is enough for the vast majority of books (minor issue).
Regarding the page URL: Google Books seems to have prohibited access since July 2019. Maybe you know if this depends on where one tries to access it from? There have been discussions, but it's hard to guess how the Google Books access algorithm works. In any case, I have tried multiple country endings, such as .co.uk, .de, .fr, also at different times, because varying the endings often leads to different sets of pages being shown. In this case it didn't. Thanks, Arminden (talk) 19:03, 16 November 2021 (UTC)
- I would say there's something wrong with your internet. Google books access works fine for me. It's just a limited preview and you can't actually read page 117 that is being cited. So if it was me, I would delete the URL. But that's not something you should expect the bot to handle. The best probably is to leave the ISBN parameter properly in place so that anyone can follow that link to all kinds of book sources that match up with all kinds of possibilities around the world. — Chris Capoccia 💬 02:17, 17 November 2021 (UTC)
- Google Books access works differently for different people depending on the whim of Google, and inability to access it does not indicate anything wrong about the poster's internet. —David Eppstein (talk) 08:14, 17 November 2021 (UTC)
- but URL in question is only to the general book page and not to any preview of any particular page. should work everywhere. Regardless, far better link is just the system wikipedia uses for ISBNs. Click on the ISBN and find some access you have through a library or whatever. — Chris Capoccia 💬 19:17, 17 November 2021 (UTC)
- More than the networking, if the user experiences different results with the Google Books URLs by using a different TLD, the most likely explanation is cookies. Google may have different results depending on whether you're logged in, whether you've visited before etc. There's no reason to think the URL normalisation performed by the bot makes the result worse for the average user (or for any user at all), while there is reason to think it makes the experience more predictable. Nemo 22:15, 21 November 2021 (UTC)
Bot introducing reference error
- Status
- Kind of {{notabug}} since it reveals underlying problems
- Reported by
- DuncanHill (talk) 15:49, 3 December 2021 (UTC)
- What happens
- The bot broke a Harv/sfn reference, causing the article to appear in Category:Harv and Sfn no-target errors
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Metavariable&diff=1058376475&oldid=1003650539
- We can't proceed until
- Feedback from maintainers
- Not a bug as described. 26 June 1973? Corrupt Google books metadata? I think so. The Google facsimile lists three publication/printing dates: 1971, 1973, 1996. The {{harvnb}} template has been malformed since it was added with this edit. The bot hasn't 'broken' anything but has revealed deficiencies in Google book metadata and the malformed {{harvnb}} template. Not a bug.
- For books, year of publication is the traditionally correct value for |date=; no need for greater precision. So, if anything is wrong with this bot edit, it is that the bot used day-precision when it should have used year-precision for |date=.
- —Trappist the monk (talk) 16:05, 3 December 2021 (UTC)
- Well, to me an automatic tool that breaks things is buggy. If a human did it we'd try to stop them. The bot did it again here. The bot DID break things, despite your denial. It broke the link between the reference and the source. DuncanHill (talk) 12:51, 4 December 2021 (UTC)
- The bot traded an invisible error (missing date), for a visible error (broken link). The net result is both errors getting fixed. That's a good thing. Headbomb {t · c · p · b} 16:03, 4 December 2021 (UTC)
- Only if I happen to be both online and bothered. Fixing broken Harv/sfn refs is an extremely minority pastime. DuncanHill (talk) 16:07, 4 December 2021 (UTC)
- The number of times I have come across a broken such reference that never pointed to anything is shocking. The editor copied from another page or just filled the sfn and never added the actual reference. AManWithNoPlan (talk) 16:16, 4 December 2021 (UTC)
- Oh I agree, and to be fair the bot leaves a mess that can be fixed, unlike the sort of thing you mention. DuncanHill (talk) 16:22, 4 December 2021 (UTC)
Three cite parameters (|access-date=, |archive-date=, and |archive-url=) have a canonical form with the dash, but an undashed form is also supported: |accessdate=, |archivedate=, and |archiveurl=.
There is a bug in @InternetArchiveBot (aka IAbot) which causes it to ignore the undashed forms, and in some cases to add duplicate parameters. This triggers a visible error message, and categorises the page in Category:CS1 errors: redundant parameter.
I reported this bug nearly three months ago at phab:T291704. However, the bug has not even been triaged, and no fix has happened. Two weeks ago, @Cyberpower678 confirmed[2] that the bug was a priority, but that a solution was not on hand.
To avoid filling up Category:CS1 errors: redundant parameter, I wrote a wee script which I run before invoking IAbot: User:BrownHairedGirl/CiteParamDashes.js. I also incorporate the fix in any AWB jobs I do. Several other editors do similar fixes as part of their routines.
Meanwhile, a lot of pages still use the undashed form, and some cleanup tools such as WP:Reflinks still add the undashed form.
Please can this fix be added to Citation bot, as a minor task? i.e. to be done only if Citation bot makes a substantive change. BrownHairedGirl (talk) • (contribs) 19:20, 6 December 2021 (UTC)
- I tried doing accessdate fixes alongside more substantial fixes when doing archive link fixing, and it still had no consensus for accessdate; someone asked me to stop changing accessdate to access-date, so I did. IMHO, the solution is to fix IAbot, not put a bandaid around the problem. Rlink2 (talk) 00:10, 7 December 2021 (UTC)
- @Rlink2: obviously, IAbot should be fixed. But since that isn't happening any time soon, we need workarounds, and this is a simple one with zero adverse effects.
- What is the problem with converting parameters to their canonical form when it is done as part of a non-minor edit? BrownHairedGirl (talk) • (contribs) 01:12, 7 December 2021 (UTC)
- Pitchforks arise and get the bot blocked. AManWithNoPlan (talk) 01:44, 7 December 2021 (UTC)
- I understand your caution, @AManWithNoPlan. But it's depressing that simple problem-solving measures so often get derailed by pitchforks. BrownHairedGirl (talk) • (contribs) 02:05, 7 December 2021 (UTC)
{{wontfix}}
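A minimal sketch of the parameter normalisation requested in this thread, written in Python purely for illustration; it is not the bot's code and not the CiteParamDashes.js script mentioned above. The function name `canonicalise_params` is hypothetical, and the regular expression assumes the three undashed names only ever appear as citation parameters, which may not hold for every template on a page.

```python
import re

# Undashed aliases and their canonical dashed forms, as listed in the request above.
CANONICAL = {
    "accessdate": "access-date",
    "archivedate": "archive-date",
    "archiveurl": "archive-url",
}

# Match "|accessdate=" style parameter names, preserving the spacing around them.
_PARAM = re.compile(r"(\|\s*)(accessdate|archivedate|archiveurl)(\s*=)")


def canonicalise_params(wikitext: str) -> str:
    """Rewrite undashed parameter names to their dashed canonical form.

    Hypothetical helper: it does not check which template the parameter
    belongs to, so a real tool would need to restrict it to CS1/CS2 templates.
    """
    return _PARAM.sub(
        lambda m: m.group(1) + CANONICAL[m.group(2)] + m.group(3),
        wikitext,
    )
```

Already-dashed parameters are left untouched, so running the function twice gives the same result as running it once.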
Financial Post and National Post
- Status
- {{fixed}}
- Reported by
- Anomalocaris (talk) 09:37, 7 December 2021 (UTC)
- What happens
- wrongly changed |website=National Post to |newspaper=Nationalpost (parameter name change OK but parameter value change is wrong, viz: National Post); also in the same manner wrongly changed Financial Post to Financialpost; also changed |agency=Reuters to |work=Reuters: others may disagree but I believe that Reuters is fundamentally a news organization that happens to have a website with the same name, and therefore it should be |publisher=
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=COVID-19_pandemic_in_Canada&diff=1058772276&oldid=1058326903
- We can't proceed until
- Feedback from maintainers
|agency=Reuters → |work=Reuters is correct when |work= in {{cite news}} is missing or empty.
—Trappist the monk (talk) 12:15, 7 December 2021 (UTC)
- the financialpost and nationalpost items are fixed in the source code as special cases of poor zotero data. Will be deployed soon. AManWithNoPlan (talk) 16:37, 7 December 2021 (UTC)
Bot removed colons from reference title wrongly
Emoticon faces need the colons for eyes, which are included in many titles. I undid this revision. ~ AntisocialRyan (talk) 17:25, 10 December 2021 (UTC)
- Comments added to stop bot. AManWithNoPlan (talk) 19:49, 10 December 2021 (UTC)
- {{fixed}} on the page and made some code changes to greatly reduce this. AManWithNoPlan (talk) 17:17, 11 December 2021 (UTC)
- Thank you!!! I appreciate it. Newest bot revision seems to be done much better. ~ AntisocialRyan (talk) 06:12, 12 December 2021 (UTC)
abc.net.au: No website=
- Status
- {{fixed}}
- Reported by
- BrownHairedGirl (talk) • (contribs) 15:39, 11 December 2021 (UTC)
- What happens
- Article with two refs to abc.net.au; one bare, one partially filled. In each case the bot correctly added title and date parameters, but did not add a |website= parameter
- What should happen
- Bot should add |website=[[Australian Broadcasting Corporation]]
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Edward_William_Cole&diff=prev&oldid=1059776418
- We can't proceed until
- Feedback from maintainers
Thanks for the prompt fix, AManWithNoPlan. This bot is hugely powerful, and does great work. Your constant improvements and very prompt bug fixes make using it a pleasure. --BrownHairedGirl (talk) • (contribs) 21:28, 11 December 2021 (UTC)
- I agree 100% Rlink2 (talk) 21:33, 11 December 2021 (UTC)
- On it being huge, in the last couple months I have only had to fix one crash. I found that myself, so it never appears here. Keep the reports coming. AManWithNoPlan (talk) 22:55, 11 December 2021 (UTC)
Bot still being abused by Abductive
Okay, I'm done. BrownHairedGirl, if you believe Abductive is in the wrong, either ask the community via RFC or take it to WP:ANI. Enough is enough. --Izno (talk) 01:22, 14 September 2021 (UTC)
@Abductive continues to game the bot's queueing system by making large numbers of single-page requests while the bot is processing a batch job requested by Abductive. This has the effect of denying other editors access to the bot, because two of the bot's four channels are in use by Abductive. Both the batch jobs and the single-page requests are poorly chosen. See the bot's most recent 1,500 edits, spanning 9 hours. The following observations are all based on that set;
I am posting this just to put it on record, without any hope of Abductive ceasing to abuse their access to the bot. --BrownHairedGirl (talk) • (contribs) 14:40, 12 September 2021 (UTC)
"Above average results" not at 35% on random categories unrelated to a topic or citation-related cleanup category. My runs get 85-90% edit rates. Headbomb {t · c · p · b} 19:56, 12 September 2021 (UTC)
Abductive again. What to do?
Today Abductive has had the bot run yet another low-return speculative trawl, this time on Category:Edible fungi.
Category:Edible fungi has 511 articles, conveniently close to the limit of 550 for categories. (Abductive has a long history of selecting categories whose size is close to the limit.)
As can be seen from these 700 bot edits (search the page for
The bot worked its way through this set from some time before 15:54 UTC to some time after 20:16 UTC. This usually is the busiest period of the day for the bot, and today was no exception.
The result of the bot trawling through Abductive's set, and deciding that in 78% of cases it had nothing to do, is that the bot was mostly unavailable to other editors for about four-and-a-half hours, because there was no slot available.
Note too that this follows Abductive's claims yesterday that their category-based batches were fine, because they assert that by claimed careful selection they exceed a claimed 30% average for category jobs. That argument about other categories is bogus, because other editors achieve much higher edit rates by using other selection methods ... and the claim is also bogus, because the choice of this category was so poor.
I have now documented incidents like this repeatedly for about a month. AManWithNoPlan has twice reduced the size limit on category batches to curtail Abductive's abuses, and in the discussions I have seen nobody supporting Abductive's use of the bot, but about six editors denouncing this abuse of the bot. At least one editor has described it as a form of denial-of-service. I cannot know whether DOS is Abductive's intention, but the effect is DOS; and since this has been repeatedly pointed out to Abductive, their disruptive conduct is at best reckless.
So I think it's time to ask Abductive to stop running any batch jobs on Citation bot, however they are constructed. Any thoughts? --BrownHairedGirl (talk) • (contribs) 21:38, 13 September 2021 (UTC)
Why no new jobs?
Since about 15:35, I have been trying intermittently to start a new batch job, but without success.
I have analysed the bot's last 500 edits, and after removing the edits for AManWithNoPlan's batch job, I am left with only 5 bot edits since 15:00:
- 17:28, 13 December 2021 (+165) Marinobacterium. Add: authors 1-5. | Use this bot. Report bugs. | Suggested by Abductive | #UCB_toolbar
- 17:14, 13 December 2021 (+4) Uchchhishta. Alter: isbn. Upgrade ISBN10 to ISBN13. | Use this bot. Report bugs. | Suggested by Abductive | Category:Sanskrit words and phrases | #UCB_Category 2/283
- 15:07, 13 December 2021 (+74) Amoeboid movement. Add: doi-access, bibcode, pmc, pmid. | Use this bot. Report bugs. | Suggested by Chris Capoccia | #UCB_toolbar
- 15:03, 13 December 2021 (+226) User:Chris Capoccia/sandbox. Alter: template type. Add: isbn, pmid, pages, volume, year, series, title, authors 1-1. Formatted dashes. | Use this bot. Report bugs. | Suggested by Chris Capoccia | #UCB_toolbar
- 15:01, 13 December 2021 (+161) User:Chris Capoccia/sandbox. Alter: template type. Add: isbn, pages, year, title, chapter, authors 1-1. Formatted dashes. | Use this bot. Report bugs. | Suggested by Chris Capoccia | #UCB_toolbar
It's as if the bot had only one channel running consistently, with a second channel opening up occasionally.
What's going on? Is some big batch job being run which is making no edits? Or is the bot stuck in an infinite loop? BrownHairedGirl (talk) • (contribs) 17:55, 13 December 2021 (UTC)
- I have figured out which page causes the problem. I will look into it. I was able to reboot the bot again to clear the queue. AManWithNoPlan (talk) 18:19, 13 December 2021 (UTC)
- Many thanks, @AManWithNoPlan. I was able to start my batch job immediately. BrownHairedGirl (talk) • (contribs) 18:43, 13 December 2021 (UTC)
- {{fixed}}, not a bug, but a whole class of DOIs from Nature that no longer work right. AManWithNoPlan (talk) 23:16, 13 December 2021 (UTC)
thestatesman.com
- Status
- {{fixed}}
- Reported by
- BrownHairedGirl (talk) • (contribs) 20:04, 13 December 2021 (UTC)
- What happens
- bot fills bare ref, but doesn't add the newspaper title:
{{Cite web|url=https://www.thestatesman.com/supplements/north/revival-of-tripuras-ancient-literature-1502902441.html|title = Revival of Tripura's ancient literature|date = 22 June 2020}}
- What should happen
{{Cite news |url=https://www.thestatesman.com/supplements/north/revival-of-tripuras-ancient-literature-1502902441.html|title = Revival of Tripura's ancient literature|date = 22 June 2020 |newspaper=[[The Statesman (India)|The Statesman]]}}
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Kokborok&diff=prev&oldid=1060155704
- We can't proceed until
- Feedback from maintainers
Thanks for yet another prompt fix! --BrownHairedGirl (talk) • (contribs) 00:19, 14 December 2021 (UTC)
Bot Rebooted - sorry for any big jobs lost
AManWithNoPlan (talk) 13:39, 13 December 2021 (UTC)
- Underlying problem {{fixed}} AManWithNoPlan (talk) 14:28, 14 December 2021 (UTC)
Daily Sabah is not a person
- Status
- {{fixed}}
- Reported by
- BrownHairedGirl (talk) • (contribs) 00:45, 15 December 2021 (UTC)
- What happens
- bot filled a ref to https://www.dailysabah.com/turkey/2014/12/12/turkey-may-sue-greece-over-1996-downing-of-fighter-jet, correctly adding |newspaper=Daily Sabah ... but also (wrongly) adding |last1=Sabah|first1=Daily
- What should happen
- bot should have added |newspaper=[[Daily Sabah]], linking the title
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=1996_Turkish_F-16_shootdown&diff=1060353594&oldid=1060353347
- We can't proceed until
- Feedback from maintainers
- Thanks for another prompt fix. --BrownHairedGirl (talk) • (contribs) 02:15, 15 December 2021 (UTC)
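One way to catch this class of error is to compare the proposed author name with the publication title before adding it. The sketch below is an illustrative assumption in Python, not the fix that was actually deployed; `author_is_publication` is a hypothetical name, and the wikilink handling covers only plain and piped links.

```python
def author_is_publication(first: str, last: str, work: str) -> bool:
    """Return True when "first last" is just the publication title.

    Hypothetical check: a caller would skip adding |last1=/|first1=
    whenever this returns True.
    """
    def norm(text: str) -> str:
        # Drop wikilink brackets and keep the displayed half of a piped link.
        text = text.replace("[[", "").replace("]]", "").split("|")[-1]
        # Case-insensitive comparison with whitespace collapsed.
        return " ".join(text.casefold().split())

    name = norm(f"{first} {last}")
    return bool(name) and name == norm(work)
```

For the report above, `author_is_publication("Daily", "Sabah", "[[Daily Sabah]]")` is true, so the author fields would be suppressed while |newspaper= is kept.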
The Washington Post is a newspaper
- Status
- {{fixed}}
- Reported by
- BrownHairedGirl (talk) • (contribs) 16:53, 15 December 2021 (UTC)
- What happens
{{cite web|url=https://www.washingtonpost.com/wp-dyn/content/article/2010/06/28/AR2010062802134.html|title=Supreme Court affirms fundamental right to bear arms}}
→{{cite web|url=https://www.washingtonpost.com/wp-dyn/content/article/2010/06/28/AR2010062802134.html|title=Supreme Court affirms fundamental right to bear arms|website=[[The Washington Post]]}}
- What should happen
{{cite web|url=https://www.washingtonpost.com/wp-dyn/content/article/2010/06/28/AR2010062802134.html|title=Supreme Court affirms fundamental right to bear arms}}
→{{cite news|url=https://www.washingtonpost.com/wp-dyn/content/article/2010/06/28/AR2010062802134.html|title=Supreme Court affirms fundamental right to bear arms|newspaper=[[The Washington Post]]}}
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Right_to_keep_and_bear_arms_in_the_United_States&diff=prev&oldid=1060397725
- We can't proceed until
- Feedback from maintainers
The Washington Post is still a newspaper
[4]. --BrownHairedGirl (talk) • (contribs) 09:46, 16 December 2021 (UTC)
- Too many code paths. That is now {{fixed}}. AManWithNoPlan (talk) 18:51, 16 December 2021 (UTC)
When expanding doi in cite journal transforms to cite book and adds URL, needs to use chapter-url
- Status
- {{fixed}} more
- Reported by
- — Chris Capoccia 💬 17:05, 16 December 2021 (UTC)
- What happens
- expanding DOIs like 10.14264/uql.2014.48 converts from cite journal to cite book and adds URL but does not use correct parameter
- What should happen
- when switching from cite journal to cite book, need to use chapter-url parameter
- Relevant diffs/links
- diff
- We can't proceed until
- Feedback from maintainers
That can be super hard to figure out. AManWithNoPlan (talk) 18:52, 16 December 2021 (UTC)
- I will see what I can do to make it more likely to guess right. AManWithNoPlan (talk) 18:57, 16 December 2021 (UTC)
- I added some more brains to the code. Will never be 100% right, but will catch this one and some other ones. AManWithNoPlan (talk) 19:28, 16 December 2021 (UTC)
- thanks. I don't know all the difficulties of the process. just seemed like from this kind of situation in the diff where expanding a blank cite journal doi into cite book should be more clear to identify and handle differently. — Chris Capoccia 💬 20:53, 16 December 2021 (UTC)
Wikipedia page
Can you make a Wikipedia for “RichBoi Streeter” 2601:7C1:100:2DA0:34FC:A0BD:7AFE:D9A7 (talk) 19:08, 19 December 2021 (UTC)
- If it meets the article guidelines, there's no reason why you can't do so yourself. However, this is not the place to discuss articles. Rlink2 (talk)
Seite nicht gefunden
- Status
- Fixed, and grabbed a couple more languages
- Reported by
- Jonatan Svensson Glad (talk) 08:02, 21 December 2021 (UTC)
- What happens
- Bot added |title=Kunstakademie Heimbach - Seite nicht gefunden
- What should happen
- Seite nicht gefunden should be treated as a bad title (means "site not found")
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Charles_Fazzino&diff=prev&oldid=1061365708
- We can't proceed until
- Feedback from maintainers
Technically, "page not found" (HTTP 404), but same effect.
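A bad-title check of this kind can be sketched as a phrase list matched case-insensitively against the fetched title. This is a Python illustration under stated assumptions, not the bot's implementation: `BAD_TITLE_PHRASES` and `is_bad_title` are hypothetical names, and the list holds only the two phrases mentioned in this report, whereas the real fix is said to cover more languages.

```python
# Phrases that indicate an error page rather than a real article title.
# Only the two phrases named in this thread are included here.
BAD_TITLE_PHRASES = (
    "page not found",
    "seite nicht gefunden",  # German, from the report above
)


def is_bad_title(title: str) -> bool:
    """Return True if the title looks like an HTTP 404 error page."""
    lowered = title.casefold()
    return any(phrase in lowered for phrase in BAD_TITLE_PHRASES)
```

A substring match is used because the phrase is typically appended to the site name, as in "Kunstakademie Heimbach - Seite nicht gefunden".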
Bot reverted itself?
- Status
- Not a bug, and Fixed by hand sadly
- Reported by
- Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 02:37, 20 December 2021 (UTC)
- What happens
- The bot ran through the article once (as part of a category run initiated by me), and added |doi-broken-date=20 December 2021 to a ref. Ten minutes later, without any intervening edits having been made to the article, the bot ran through the article again (this time by AManWithNoPlan using the edit-window gadget), and removed the parameter it had just added. No other changes were made as part of either edit.
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=William_Pitt_the_Younger&diff=next&oldid=1061156216
- We can't proceed until
- Feedback from maintainers
doi.org is having issues right now. Thankfully, for over a year I have been reporting/fixing all broken DOIs, so I noticed this and I am undoing the bot's marking as dead. It's less than one in a thousand DOIs. AManWithNoPlan (talk) 13:50, 20 December 2021 (UTC)
UK
Why does the bot do this[5] when "co.uk" is in the resources? I have to remember to take it out. GBFEE (talk) 19:09, 9 December 2021 (UTC)
- Not a bug: canonical form. AManWithNoPlan (talk) 14:35, 21 December 2021 (UTC)
"Error: Citations request failed" is displayed and the citation gadget I cannot be used
I would like some advice. --SilverMatsu (talk) 04:30, 11 December 2021 (UTC)
- @SilverMatsu:, on past performance, the most likely reason is as already complained about at #No new jobs (and #Proposed new rule for using Citation bot) - the service is full with batch jobs which obviously are just as important as we mere mortals who just want to verify that an article we've edited is now 'clean'. Sigh. All you can do is avoid 18:00 UTC to about 06:00 UTC, which is the maximum overlap of European and American editors being online. Maybe the batch runners might accept this curfew too? --John Maynard Friedman (talk) 12:43, 11 December 2021 (UTC)
- The advice is simply try again later. If you were in the edit window, simply preview the page again, and click the citations button again. Headbomb {t · c · p · b} 13:05, 11 December 2021 (UTC)
- it can also occur with invalid wiki-coded pages. The bot is pretty good about not doing its job when the code is messed up. AManWithNoPlan (talk) 14:22, 11 December 2021 (UTC)
- Thank you for the advice. It's not that there was a problem with the device, it just seems that there are too many requests, so next time I will use the gadget when there are few requests. Also, it seems that the gadget does not work even if there is a problem with the code, so I will enter doi or ISBN in the template first and write in the Edit summary that the gadget did not work. --SilverMatsu (talk) 17:10, 11 December 2021 (UTC)
- @John Maynard Friedman: in the last few weeks, the bot queue has rarely been full of batch jobs. Sometimes it gets overloaded by Abductive still stacking up multiple single-page requests, but usually it's just 2 or 4 of the 4 channels being used by batches. In the last week or so, I have found that a single page request is usually processed within a few minutes.
- The idea of a curfew isn't workable, because a batch at or near the max size of 2200 pages usually takes 12–25 hours to complete. I did one recently which took ~36 hours. BrownHairedGirl (talk) • (contribs) 22:40, 12 December 2021 (UTC)
- Non-gadget requests can get cached and even if the web browser gives up too soon, the edit is often still done. The gadget cannot do that, since it requires a constant web connection. AManWithNoPlan (talk) 23:01, 12 December 2021 (UTC)
- TBH, I haven't tried for a while. My expectation is that it will fail nine times out of ten so I've stopped bothering. Sometimes I was agreeably surprised, but failure was the usual option. --John Maynard Friedman (talk) 23:41, 12 December 2021 (UTC)
- @John Maynard Friedman: Have you tried running the bot on individual pages using the toolbar link rather than the edit-window gadget? IIRC, toolbar requests get cached and run later, unlike gadget requests. Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 21:09, 17 December 2021 (UTC)
- I only ever run it against a single page at a time, using the edit-window gadget. Tbf, I used it yesterday or the day before and it worked ok. I haven't investigated the toolbar option, where is that documented? --John Maynard Friedman (talk) 23:24, 17 December 2021 (UTC)
- It is on the left-hand side of Wikipedia pages: "Expand citations". See https://en.wikipedia.org/wiki/User:Citation_bot/use AManWithNoPlan (talk) 00:27, 18 December 2021 (UTC)
- And failed gadget runs might waste the bot's resources and contribute to its overloaded status. This is because the bot might run fully and then find that the web browser has given up. AManWithNoPlan (talk) 21:24, 17 December 2021 (UTC)
- Not a bug of the bot, but of the gadget, which is a different code base with its own talk page, etc. AManWithNoPlan (talk) 14:36, 21 December 2021 (UTC)
webarchive.nla.gov.au
- Status
- Not a bug, and other bots/people are dealing with this
- Reported by
- BrownHairedGirl (talk) • (contribs) 00:42, 14 December 2021 (UTC)
- What happens
- nothing
- What should happen
- bot should extract the original URL from the archived URL, like this
Old cite: <ref>{{cite web |url=https://webarchive.nla.gov.au/awa/20080222222437/http://pandora.nla.gov.au/pan/23790/20041220-0000/issue764.pdf |title=The ARIA Report, issue 764 |date=18 October 2004 |website=pandora.nla.gov.au}}</ref>
Fixed cite: <ref>{{cite web |url=http://pandora.nla.gov.au/pan/23790/20041220-0000/issue764.pdf |title=The ARIA Report, issue 764 |date=18 October 2004 |website=pandora.nla.gov.au |url-status=dead |archive-url=https://webarchive.nla.gov.au/awa/20080222222437/http://pandora.nla.gov.au/pan/23790/20041220-0000/issue764.pdf |archive-date=2008-02-22}}</ref>
The bot can extract the original URL from a web.archive.org URL, so it should also be able to do it for webarchive.nla.gov.au
- Replication instructions
- test at User:BrownHairedGirl/sandbox102 (the ref is from the article Hilary Duff (album))
- We can't proceed until
- Feedback from maintainers
- While updating the notes on the Australian Web Archive in List of web archives on Wikipedia, I followed all the example links and found that some archive URLs that previously started with pandora.nla.gov.au redirected to https://webarchive.nla.gov.au/awa/<timestamp>/<old pandora url>, whereas others redirected to https://webarchive.nla.gov.au/awa/<timestamp>/<original url> without the pandora part. A bit more digging and experimentation might reveal whether there is a pattern to which archived URLs still need pandora. ClaudineChionh (talk – contribs) 08:49, 14 December 2021 (UTC)
- Thanks, @ClaudineChionh. It seems that the pattern-matching here may be a wee bit more complex than I thought. BrownHairedGirl (talk) • (contribs) 13:05, 14 December 2021 (UTC)
- You are not the first person to suggest this. AManWithNoPlan (talk) 21:24, 17 December 2021 (UTC)
- An attempt at a taxonomy at WP:WEBARCHIVES.
- Pandora is considered a web archive URL in the sense of belonging in the |archive-url= field - except in certain cases, like above where there is no source URL at the end, it's not a web archive URL thus belongs in the |url=.
- IABot does not recognize the new form and rolls back fixes eg. [6] - I've opened a bug report but can not say how long it would be (days? months? years?). In the mean time the only way to keep IABot off webarchive.nla.gov.au is to add a {{cbignore}} .. example.
- I'm developing a WaybackMedic task to fix every instance of Pandora (6,000 pages) including adding the cbignore. It is time-consuming, with many issues. Meanwhile, old Pandora links will continue to be added by IABot, until the Phab is resolved, when I can update the cached links in the IABot database to the new form. -- GreenC 05:24, 19 December 2021 (UTC)
- Thank you for your efforts GreenC! I will also ask at AWNB whether anyone has contacts at the NLA or any further insight into the changes. ClaudineChionh (talk – contribs) 10:27, 19 December 2021 (UTC)
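The URL shapes described in this thread can be illustrated with a short Python sketch. It is not Citation bot's actual code (the bot runs as PHP); the function name `extract_original_url`, the regular expressions, and the rule of returning nothing when the tail is itself an archive address are all assumptions made for illustration. The two-letter Wayback modifier (such as `id_`) is likewise handled on a best-effort basis.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Assumed shapes, taken from the thread:
#   https://web.archive.org/web/<timestamp>/<original url>
#   https://webarchive.nla.gov.au/awa/<timestamp>/<original url or old pandora url>
WAYBACK_RE = re.compile(
    r"^https?://web\.archive\.org/web/\d{1,14}(?:[a-z]{2}_)?/(?P<tail>.+)$",
    re.IGNORECASE,
)
AWA_RE = re.compile(
    r"^https?://webarchive\.nla\.gov\.au/awa/\d{1,14}/(?P<tail>.+)$",
    re.IGNORECASE,
)

# Hosts that are archives themselves; a tail on one of these is not a source URL.
ARCHIVE_HOSTS = {"pandora.nla.gov.au", "webarchive.nla.gov.au", "web.archive.org"}


def _host(url: str) -> str:
    """Return the lower-cased host of a URL, tolerating a missing scheme."""
    if "://" not in url:
        url = "http://" + url
    return (urlsplit(url).hostname or "").lower()


def extract_original_url(archive_url: str) -> Optional[str]:
    """Return the embedded source URL, or None if it cannot be recovered."""
    for pattern in (WAYBACK_RE, AWA_RE):
        match = pattern.match(archive_url.strip())
        if match:
            tail = match.group("tail")
            # Hypothetical rule: a tail that still points at an archive
            # (e.g. an old pandora address) carries no usable source URL.
            if _host(tail) in ARCHIVE_HOSTS:
                return None
            return tail
    return None
```

Under these assumptions, an AWA address whose tail is an old pandora address yields `None`, which corresponds to the case above where the link belongs in |url= rather than |archive-url=.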
One editor, two simultaneous batch jobs
Yet again, @Abductive is hogging the bot by running two batch jobs simultaneously.
The latest bot contribs shows that the bot is simultaneously processing both:
- Category:Gambling terminology (63 pages), with an edit rate so far of 32/61 pages, i.e. 53%
- Category:Papermaking (88 pages), with an edit rate of 17/88 pages, i.e. 19%
More low-return speculative trawling, using half the bot's capacity, delaying jobs targeted at bot-fixable issues, and locking out single-request jobs. This is more WP:DE. --BrownHairedGirl (talk) • (contribs) 15:19, 29 August 2021 (UTC)
- I requested the first one, and nothing happened for a very long time. I entered the second one, and nothing happened for a very long time. I have entered more, and still nothing has happened. All of a sudden, both started running. Abductive (reasoning) 15:27, 29 August 2021 (UTC)
- I am going to have to ask you to stop complaining about my so-called speculative trawling. Using the bot to clean up a category's citations is a legitimate use of the bot, and it is not a problem. Please strike your accusation of disruptive editing. Abductive (reasoning) 15:27, 29 August 2021 (UTC)
- On the contrary, your long pattern of using the bot disruptively has been well-documented. This is just the latest round of the saga. I will stop complaining about your disruption when you stop the disruption.
- Why were you entering a second category before the first one had even started processing? Are you trying to stash up batch jobs in some sort of private queue? --BrownHairedGirl (talk) • (contribs) 15:59, 29 August 2021 (UTC)
- You are accusing me of acting in bad faith and of being disruptive, when in fact it is some sort of error with the bot. Please desist. Abductive (reasoning) 16:09, 29 August 2021 (UTC)
- Please stop being disruptive, and please stop WP:GAMING the bot's queuing system.
- The error in the bot's functionality was simply that it failed to block your gaming of the queue, a practice which you have already acknowledged above. If you were not gaming the queue, that bot functionality would be irrelevant. --BrownHairedGirl (talk) • (contribs) 16:13, 29 August 2021 (UTC)
As I said above, please find another place to argue. Thanks. --Izno (talk) 16:38, 29 August 2021 (UTC)
- Izno, as i replied above, surely this is the place to raise misuse of the bot? --BrownHairedGirl (talk) • (contribs) 17:41, 29 August 2021 (UTC)
- BrownHairedGirl, what most of us (in my opinion) are getting at is that you have problems with one specific user's actions, and this discussion is only between the two of you. While it is about how the bot is used, it seems to have long passed anything to do with the edits of the bot, what operators should do, etc. I think originally starting a discussion here was a good idea, as it should have been done, and that led to most people reconsidering the use of the bot; however, this has turned into a back-and-forth situation that nobody here can help. It's not about the bot anymore; it reads as a conflict between two people.
- Now apparently this specific section shows a technical problem: "One editor, two simultaneous batch jobs". It is great to report that, as it should not be possible. This is exactly the place to report technical problems.
- But instead of just sticking to a detail about the problem, it instantly devolved into back-and-forth accusations.
- Summary: It was a good place to start a discussion about bot usage due to limited resources, which should also be fixed, but it is now just a conflict between two people. Suggestion: If you find technical issues, just report them and wait for the operator to respond; if it's also combined with what you see as "misuse", it seems we have passed this venue and it should be discussed elsewhere. Redalert2fan (talk) 13:34, 30 August 2021 (UTC)
- @Redalert2fan: the only impact of Abductive's systematic misuse of the bot is to impede other editors' use of the bot, so that belongs on the page where editors are discussing how to use the bot.
- See the section below User talk:Citation bot/Archive 27#Citation bot, where Fade258 posted[7] about the bot not having processed a page after several hours of waiting. That happened because an earlier round of Abductive's gaming of the bot's queuing system had blocked new jobs from starting. Fade258's post didn't do a great job of explaining the problem they had encountered, but this was the right place to raise the no-response-from-bot problem. Editors such as Fade258 would be much less likely to find the bot overloaded if we didn't have one editor systematically occupying 2 of the 4 entries. --BrownHairedGirl (talk) • (contribs) 18:06, 30 August 2021 (UTC)
- I am very sorry that the bot somehow allowed me to make more than one run at a time. But I did not "game" anything, and your continued accusations amount to incivility. Abductive (reasoning) 21:58, 30 August 2021 (UTC)
- My accusations are both a) evidenced, and b) supported by your own assertions. The technical issue is that the bot failed to block your request for a second batch before the first one had even started; but that technical issue arose only because of your lack of restraint. --BrownHairedGirl (talk) • (contribs) 23:02, 30 August 2021 (UTC)
- You asked if this is the place, and I explained why I think it is not. I will leave it at that. I have no problem if you have issues with the edits of Abductive, but if you have to fight me as well over a question that you asked, and take another hit at Abductive like that, I'm no longer interested in explaining. You picked one specific section of my comment out just to be uncivil towards someone else. I urge you to reconsider as well; it is looking to me increasingly like the bot should only do what you want.
- And this is just the comment from me I think should not be posted on this page, but I have to do. Redalert2fan (talk) 22:40, 30 August 2021 (UTC)
- @Redalert2fan: my reply to you is not a "fight", and it was in no way uncivil.
- And please AGF. I do not in any way believe that the bot should only do what I want; my objections are to misuses which I have documented thoroughly. --BrownHairedGirl (talk) • (contribs) 22:57, 30 August 2021 (UTC)
- And as I said previously, I do not agree with the term misuse being used, and now it is thrown out like a buzzword all the time, but I specifically left out my own view on this above this time because I already know 0 progress is going to be made.
- If I'm honest, whenever I want to use the bot it's not Abductive that is blocking out people anymore; you are taking up one of the slots all the time as well, claiming that your edits are the most important. I think that everyone should be allowed to use this tool if they think they can improve something with it, and not keep being targeted on their edits. "AGF" huh, how about applying this to everyone that uses the bot? You didn't do that above with Abductive either. Redalert2fan (talk) 23:12, 30 August 2021 (UTC)
- @Redalert2fan: I do not use the word "misuse" lightly. I do use it in this case, because I have clear evidence, documented above, that Abductive has been repeatedly misusing the bot in two ways: a) repeatedly flooding it with lists of articles which lead to very low rates of change; b) repeatedly gaming the bot's queuing system to get the bot to simultaneously run two batch jobs for Abductive. Before you dismiss my complaints, please read the evidence, which is spread over several weeks in threads above.
- Your assertion that I am "claiming that your [BHG's] edits are the most important" is demonstrably false. At no point have I made or implied any such claim. On the contrary, I have repeatedly stressed the importance of getting the bot to do actual edits rather than processing pages which need no changes, and I have explicitly stated that I regard the academic journal refs targeted by Headbomb as a more important task.[8] So please strike that slur.
- Yes, I am usually taking up one of the bot's slots. But
- I take up one of the bot's slots, whereas Abductive has been systematically flooding the bot with two simultaneous tasks (see evidence here[9])
- Unlike Abductive, I do not set the bot off on large unproductive speculative trawls. Note that jobs which I ask the bot to do are highly visible because the bot has work to do on most pages; Abductive's low-return speculative trawls leave less trace because the bot's contribs list does not log pages where the bot makes no edits. Note also that Abductive actually defends their long history of wasting the bot's time on huge unproductive trawls: see e.g. [10] "Bots exist to do tedious editing tasks. Your notion that editors have to do the tedious work before giving the bot a task is contrary to the purpose of bots." Those unproductive, speculative trawls by Abductive repeatedly tie up 1/4 of the bot's capacity for days on end, because Abductive will not restrict their use of the bot to batches which actually achieve something. Yet Redalert2fan somehow thinks it is very uncivil of me to point this out.
- I remind you that WP:AGF explicitly says "this guideline does not require that editors continue to assume good faith in the presence of obvious evidence to the contrary". In the case of Abductive, I have documented that "obvious evidence to the contrary".
- If you have concerns that too much of the bot's time is taken up by targeted batch jobs of large sets of pages with identified bot-fixable problems, such as those suggested by me, Headbomb, or Josve05a, then please open a discussion on that issue. I think it would be helpful to have such a meta-discussion about how to balance the bot between batch tasks and individual jobs. But please stop allowing your concerns about that to make false accusations about my motives and goals ... and please stop conflating bot overload with efforts by Abductive to use the bot unproductively and to actively game its limits. --BrownHairedGirl (talk) • (contribs) 00:21, 31 August 2021 (UTC)
- I do not feel this is productive in any way nor do I think this is the right place for this as I said before. I'm not interested in clogging this page up further either. I will leave it here. Sorry if this is not satisfactory, but I feel it is for the best. Redalert2fan (talk) 10:42, 31 August 2021 (UTC)
I've been following this with a little interest because this is on my watchlist. I've not followed everything. BHG grabs my attention due to the working of bare-URL identification bot runs. But it is reasonable to discuss whether a bot startup can be modified with an algorithm to avoid unfair usage; or perhaps documentation if further. I'm not in the greatest of standing at the moment, but Wbm1058 has a few hours left on my watchlist and I couldn't help but notice the comment (I think it's been there for a long time). ": Secret to winning the Race Against the Machine: become an expert bot programmer, and hope that the bots don't learn to program themselves. HAL?". That comment's been there for a long time, and got me to dreaming about becoming an "expert bot programmer" (I've got lost from when input ceased to be from 80 column punch cards so a bit too late in life methinks). Wbm1058 seems to know something about bots and might be able to comment/mediate here? Thank you. Djm-leighpark (talk) 23:28, 30 August 2021 (UTC)
- A couple of thoughts. Is Abductive a primary maintainer of the bot, in which case they could reasonably request an exemption for two bots if it was only for low maintenance? The other might be a lightweight new control/monitor bot to monitor the runners and queuers, and if user A was running two or more bots and another user B had a scheduled run holding for some time, the control bot would chop one of user A's bots and reschedule it to start from the place it left off. Actually this would likely be beyond anyone's capability to do in a reasonable time period. Sorry I come up with stupid questions. Thank you. Djm-leighpark (talk) 01:34, 31 August 2021 (UTC)
- Someone who knows how to, and has permission to do so, needs to increase the total number of PHP processes from 2*2 to something bigger. AManWithNoPlan (talk) 01:47, 31 August 2021 (UTC)
- Not to state the obvious, but shouldn't there be a place where we can find someone like that? Redalert2fan (talk) 12:00, 31 August 2021 (UTC)
- Multiple efforts have been made. I personally do not plan to try again. Not worth wasting my time. AManWithNoPlan (talk) 12:57, 31 August 2021 (UTC)
- @AManWithNoPlan: If you have a moment, please could you tell me whether my understanding is correct?
- AIUI:
- the bot processes one page at a time, and its throughput is constrained by the time it takes the bot's core to parse each page, invoke zotero etc, then process the results and save any changes.
- there is only one instance of that core functionality
- the order in which pages are processed involves taking pages one at a time from each of the 4 PHP processes in turn (however such turns are allocated)
- So AIUI adding more PHP processes would allow the bot's time to be shared between more users, but would not increase overall throughput. Is that correct? --BrownHairedGirl (talk) • (contribs) 22:10, 31 August 2021 (UTC)
- All PHP jobs are separate with no shared processing. We are way below our memory and CPU limits of the tool server with 4 processes. The URL expander is run by Wikipedia, so that might be a bottleneck, but only for that step. AManWithNoPlan (talk) 01:15, 1 September 2021 (UTC)
@Legoktm: per this discussion is this process a violation of Toolforge Rule #6: "Do not provide direct access to Cloud Services resources to unauthenticated users"? Or has this been Toolforge admin vetted? wbm1058 (talk) 15:53, 1 September 2021 (UTC)
- Everyone is authenticated. Headbomb {t · c · p · b} 15:55, 1 September 2021 (UTC)
- Perhaps, but I'm hearing complaints that the bot "allows for anyone to make arbitrary queries". – wbm1058 (talk) 16:00, 1 September 2021 (UTC)
- Where at? Headbomb {t · c · p · b} 16:09, 1 September 2021 (UTC)
- BrownHairedGirl, you, and maybe others have complained that Abductive makes arbitrary queries. Essentially you seem to be complaining that Abductive launches "DoS". wbm1058 (talk) 16:14, 1 September 2021 (UTC)
- "DDoS" on Citation Bot, maybe, not DDoS against Toolforge. Headbomb {t · c · p · b} 16:23, 1 September 2021 (UTC)
- His requests did fix stuff; they are not vandalism or something like that. The concern from some was/is that, due to the limited capacity the bot has, they were not optimal: out of large requests only a very small number of pages were edited, and for the rest there was nothing to do, or only minor fixes were made. Currently it is better to prioritise what the bot is being used for: pages where there is some certainty ahead of time that large or more important fixes will be done, and to prevent the bot from spending time on checking pages where there is nothing to fix. BrownHairedGirl seems to have developed a list of pages that fit these criteria.
- Now if the capacity was larger there wouldn't be much of a problem with those requests from Abductive, since they did contain some pages that were fixed.
- It is not a case of someone just filling up the queue with the same pages 24/7 nor knowingly feeding pages that have nothing to do to specifically block others from using the bot. Even though some like me feel that anyone should be able to request a page or list they want to have checked, it currently is just a case of how we all use this limited resource together in the most efficient way. Redalert2fan (talk) 16:29, 1 September 2021 (UTC)
- @Redalert2fan: my complaint about Abductive is that they:
- routinely fill one of the bot's 4 channels with large sets of pages where the bot has very little to do. This waste of the bot's resources could be avoided by testing samples of these sets before processing the whole lot, but even though I showed Abductive how to do that, they strenuously resist the principle of preparation work.
- repeatedly fill a second channel of the bot's 4 channels by piling up hundreds of individual page requests while the bot is already processing a batch of theirs.
This works because the bot won't allow a user to run two batches simultaneously, but it does allow a user to submit an individual job while the batch is being processed. That's good bot design, because it allows an editor to submit a batch and not be locked out from checking an individual page that they have been working on ... but Abductive has been deliberately gaming this system by submitting hundreds of individual pages while their batch is running.
- @Wbm1058: I am not sure whether Abductive's use of the bot counts as DDoS. I don't think that Abductive's goal is to deprive others of the use of the bot, just that Abductive shows i) a reckless disregard for how the effect of their actions is to deny others use of the bot; ii) no particular interest in any type of article or any topic area, just a desire to feed lots of pages to the bot.
- I agree with Redalert2fan that this is about "how we all use this limited resource together in the most efficient way". I think we need some guidelines on how to do that, and I will try to draft something. --BrownHairedGirl (talk) • (contribs) 04:46, 6 September 2021 (UTC)
- In an ideal world, the bot would allow a maximum of ONE batch job at a time and would have a queue for submission of such batch jobs. It is immensely frustrating to 'ordinary' editors who, having spent a lot of time on an extensive clean-up and wanting to get the citations checked, get no response or (eventually) a failure. Batch jobs are by definition not needed in real time and, until there is such a queuing mechanism, may I suggest that the guideline states an absolute ban on submitting more than one run before the preceding one has finished (subject to a 48 hour [72 hour?] timeout). If there is any way to encapsulate the rest of my frustration in the guideline, I would be most grateful. --John Maynard Friedman (talk) 10:39, 6 September 2021 (UTC)
- @John Maynard Friedman: I share your frustration, which is why I have repeatedly raised this issue.
- However, I think your suggested remedy is far too extreme.
- I watch the bot contribs very closely, because I follow behind it cleaning up as many as I can of the pages where it seems not to have fixed any bare URLs.
- It is quite rare for all 4 channels to be simultaneously in use by batch jobs, and so long as one channel is free then individual requests such as yours will be processed within a few minutes ... unless Abductive has gamed the system by stashing up dozens or hundreds of individual page requests. There is no need to limit the bot to one batch job at a time, and any attempt to do so would seriously and unnecessarily impede the bot's capacity to process batches.
- I think that if there is some throttling of batch jobs, a distinction should be made between batches which have been selected because they concentrate bot-fixable problems (such as my sets of articles with bare URLs, or Headbomb's sets of articles citing scholarly journals) and what I call "speculative trawls", i.e. sets of articles selected on the chance that the bot may find something.
- The speculative trawls by Abductive were a serious drain on the bot when the category limit was 4,400 pages and Abductive was stashing up requests for categories which fell just under that limit, often with very low returns. In response to that abuse, the size limit for categories was cut twice, initially to 1,100 and then to 550.
- Abductive continues to run a lot of speculative trawls based purely on the size of the category. For example, this morning (see 2000 bot edits) they have fed the bot first Category:Articles with specifically marked weasel-worded phrases from August 2021 (497 pages) and then Category:Articles with specifically marked weasel-worded phrases from July 2021 (460 pages). Neither category concentrates pages by topic, so there is no topic-of-interest involved, and both categories are based on an attribute which the bot cannot fix. The only reason I can see for selecting them is that they fall just under the bot's category size limit.
- The bot has not had many other jobs this morning, so Abductive's speculative trawls haven't significantly delayed anything else. But at most other times, this use of the bot does create bottlenecks.
- We don't need to hobble the bot's throughput of targeted batch jobs just to restrain one editor who shows reckless disregard for the effect on others. --BrownHairedGirl (talk) • (contribs) 13:19, 6 September 2021 (UTC)
- Instead of uselessly complaining about perfectly legitimate uses of the bot, concerned senior editors should go to the Village Pump and request that this be fixed in a way that the bot operator has suggested, but which is outside his control. Abductive (reasoning) 23:17, 7 September 2021 (UTC)
- Sigh.
- Gaming the queuing system to deprive other editors of prompt processing of their requests is not a legitimate use of the bot.
- Repeatedly wasting the bot's time on low-return speculative trawls is not a legitimate use of the bot.
- Regardless of what improvements might be made to the bot, Wikipedia is a collaborative project, and editors should not systematically disrupt the work of others. --BrownHairedGirl (talk) • (contribs) 06:19, 8 September 2021 (UTC)
- Repeatedly complaining here will accomplish nothing. The jobs I request from the bot have been fixing lots of high-readership articles, which is one of the metrics I use to select more runs. I have been refraining from starting runs when there are already three big runs going, but at some point new editors are going to discover the bot and then there will be times when there are four large runs going. It would be advisable for you to use your energy to cajole the technical folks at the Village Pump into doing something. Abductive (reasoning) 18:38, 8 September 2021 (UTC)
- For goodness sake. Category:Articles with specifically marked weasel-worded phrases from August 2021 are collections of articles by a cleanup tag, not by readership. --BrownHairedGirl (talk) • (contribs) 20:17, 8 September 2021 (UTC)
- True, but if you look at the articles in those categories, they tend to be ones that attract readers, attract users who add questionable material to them, and then attract editors who add the weasel word tags. And the bot is correcting a reasonably high percentage of them. Win-win. Abductive (reasoning) 20:23, 8 September 2021 (UTC)
- According to the pageviews tool, the 492 pages in that category have a combined average pageviews of 413,000 per day, or 839 per page per day, and there are 217 articles in that category with less than 100 views per day, and 71 with less than 10 views per day. That's hardly "high-readership", so the conclusion that articles with weasel word tags "attract readers" isn't true. * Pppery * it has begun... 00:52, 9 September 2021 (UTC)
- Wikipedia gets about 255 million pageviews a day. There are 6,373,026 articles on Wikipedia. (492/6,373,026)*255,000,000 = 19,686, which is 21 times less than 413,000. So you are wrong by more than an order of magnitude. A simple glance at the articles in the category would reveal that there are a lot of high-interest articles in there. For example, the bot just fixed Deborah Birx. Heard of her, right? Abductive (reasoning) 08:30, 9 September 2021 (UTC)
- Cherrypicking one example out of dozens of sets of hundreds of articles is no help. And averages obscure the fact that, as Pppery has shown, a significant proportion of these articles have low page views.
- If you want to set the bot to work on highly-viewed articles, then use a methodology which selects only highly-viewed articles, then work through it systematically, avoiding duplicates in your list and articles which the bot has already processed in the last few months. --BrownHairedGirl (talk) • (contribs) 01:04, 11 September 2021 (UTC)
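The arithmetic exchanged above can be re-run from the quoted figures alone. The Python sketch below is only a check of the numbers as stated in the thread (492 pages, 413,000 combined daily views, about 255 million site-wide daily views, 6,373,026 articles); nothing is re-measured, and the variable and function names are made up for illustration.

```python
# Figures as quoted in the thread; none of them are re-measured here.
CATEGORY_PAGES = 492
CATEGORY_VIEWS_PER_DAY = 413_000
SITE_VIEWS_PER_DAY = 255_000_000
SITE_ARTICLES = 6_373_026
PAGES_UNDER_100_VIEWS = 217
PAGES_UNDER_10_VIEWS = 71


def mean_views_per_page(total_views: float, pages: int) -> float:
    """Average daily views per page in the category."""
    return total_views / pages


def expected_views_at_site_average(pages: int) -> float:
    """Daily views the category would get if every page were exactly average."""
    return pages / SITE_ARTICLES * SITE_VIEWS_PER_DAY


mean = mean_views_per_page(CATEGORY_VIEWS_PER_DAY, CATEGORY_PAGES)
expected = expected_views_at_site_average(CATEGORY_PAGES)
ratio = CATEGORY_VIEWS_PER_DAY / expected
share_under_100 = PAGES_UNDER_100_VIEWS / CATEGORY_PAGES
share_under_10 = PAGES_UNDER_10_VIEWS / CATEGORY_PAGES

print(f"mean views per page per day: {mean:.0f}")
print(f"expected views at site average: {expected:.0f}")
print(f"category total vs site average: {ratio:.1f}x")
print(f"pages under 100 views/day: {share_under_100:.0%}")
print(f"pages under 10 views/day: {share_under_10:.0%}")
```

Both sets of figures are internally consistent: the category total is well above the site-wide average, while a large share of its pages still sit below 100 views per day. A mean and a distribution answer different questions, which is the point of disagreement in this exchange.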
- My activities are helping build the encyclopedia. Removing duplicates is not the best idea, since the bot often takes more than one pass to make all possible corrections. My offer still stands; if you like, place a list of articles in User:Abductive/CS1 errors: bare URL, let me know, and I will run the bot on them. Otherwise you should not concern yourself with other users' legitimate uses of the bot. Abductive (reasoning) 03:06, 12 September 2021 (UTC)
- Arguably, your activities are hurting the encyclopedia more than helping, by acting as a denial of service attack on Citation bot that prevents others from using it more effectively. —David Eppstein (talk) 04:11, 12 September 2021 (UTC)
- That presupposes that other large runs are somehow more helpful, which they are not. Also, please note that I make an effort to avoid running the bot when there are three other large jobs running, while other users do not. Abductive (reasoning) 04:41, 12 September 2021 (UTC)
- Well, that summarizes the dispute, doesn't it? In your own mind, your searches are better than anyone else's, despite all evidence to the contrary, and so you think you deserve more access to the bot than anyone else, and will game the system to take that access because you think you deserve it. —David Eppstein (talk) 05:49, 12 September 2021 (UTC)
- That is incorrect. I have an equal right, and I use the bot judiciously. I do not game the system, and I consider such accusations to be uncivil. Abductive (reasoning) 05:57, 12 September 2021 (UTC)
- "I have an equal right". Not when your use of the bot is disruptive, low-efficiency, and prevents others from using it more effectively. A run on a random non-citation-related maintenance category is not as "helpful" as targeted cleanup runs, especially when it makes running the bot on individual articles crash or run extremely slowly because of a lack of resources. Headbomb {t · c · p · b} 13:34, 12 September 2021 (UTC)
- @Abductive: I have posted[11] below at #Bot_still_being_abused_by_Abductive an analysis of the bot's latest 1500 edits.
- It shows that:
- Your batch requests continue to be poorly chosen;
- You continue to game the queueing system by flooding the bot with single-page requests while the batches are being processed, amounting to 73% of single-page requests in that period. I estimate the rate of your single-page requests to be one every 3 minutes over 9 hours.
- Your choice of single-page requests has no evident basis, and includes a significant proportion of pages (6 of the sampled 30) which wasted the bot's time by closely following a previous edit by the bot.
- So:
- Your claim to "use the bot judiciously" is demonstrably false.
- Your claim that you "do not game the system" is demonstrably false.
- Your labelling of @David Eppstein's complaint as "uncivil" is bogus, because David's complaint was civilly-worded and is demonstrably true.
- I also note your offer above[12] to process batches selected by me. That amounts to another attempt to game the queueing system, in this case by inviting me to effectively use two of the bot's 4 channels. I explained this before, so I am surprised to see you suggesting it again. --BrownHairedGirl (talk) • (contribs) 15:13, 12 September 2021 (UTC)
- That is incorrect. I have an equal right, and I use the bot judiciously. I do not game the system, and I consider such accusations to be uncivil. Abductive (reasoning) 05:57, 12 September 2021 (UTC)
- Well, that summarizes the dispute, doesn't it? In your own mind, your searches are better than anyone else's, despite all evidence to the contrary, and so you think you deserve more access to the bot than anyone else, and will game the system to take that access because you think you deserve it. —David Eppstein (talk) 05:49, 12 September 2021 (UTC)
- That presupposes that other large runs are somehow more helpful, which they are not. Also, please note that I make an effort to avoid running the bot when there are three other large jobs runnning, while other users do not. Abductive (reasoning) 04:41, 12 September 2021 (UTC)
- Arguably, your activities are hurting the encyclopedia more than helping, by acting as a denial of service attack on Citation bot that prevents others from using it more effectively. —David Eppstein (talk) 04:11, 12 September 2021 (UTC)
- My activities are helping build the encyclopedia. Removing duplicates is not the best idea, since the bot often takes more than one pass to make all possible corrections. My offer still stands; if you like, place a list of articles in User:Abductive/CS1 errors: bare URL, let me know, and I will run the bot on them. Otherwise you should not concern yourself with other users' legitimate uses of the bot. Abductive (reasoning) 03:06, 12 September 2021 (UTC)
- Wikipedia gets about 255 million pageviews a day. There are 6,373,026 articles on Wikipedia (492/6,373,026)*255,000,000 = 19,686, which is 21 times less than 413,000. So you are wrong by more than an order of magnitude. A simple glance at the articles in the category would reveal that there are a lot of high-interest articles in there. For example, the bot just fixed Deborah Birx. Heard of her, right? Abductive (reasoning) 08:30, 9 September 2021 (UTC)
- According to the pageviews tool, the 492 pages in that category have a combined average pageviews of 413,000 per day, or 839 per page per day, and there are 217 articles in that category with less than 100 views per day, and 71 with less than 10 views per day. That's hardly
- True, but if you look at the articles in those categories, they tend to be ones that attract readers, attract users who add questionable material to them, and then attract editors who add the weasel word tags. And the bot is correcting a reasonably high percentage of them. Win-win. Abductive (reasoning) 20:23, 8 September 2021 (UTC)
- For goodness sake. Category:Articles with specifically marked weasel-worded phrases from August 2021 are collections of articles by a cleanup tag, not by readership. --BrownHairedGirl (talk) • (contribs) 20:17, 8 September 2021 (UTC)
- Repeatedly complaining here will accomplish nothing. The jobs I request from the bot have been fixing lots of high-readership articles, which is one of the metrics I use to select more runs. I have been refraining from starting runs when there are already three big runs going, but at some point new editors are going to discover the bot and then there will be times when there are four large runs going. It would be advisable for you to use your energy to cajole the technical folks at the Village Pump into doing something. Abductive (reasoning) 18:38, 8 September 2021 (UTC)
- Instead of uselessly complaining about prefectly legitimate uses of the bot, concerned senior editors should go to the Village Pump and request that this be fixed in a way that the bot operator has suggested, but which is outside his control. Abductive (reasoning) 23:17, 7 September 2021 (UTC)
- In an ideal world, the bot would allow a maximum of ONE batch job at a time and would have a queue for submission of such batch jobs. It is immensely frustrating to 'ordinary' editors who, having spent a lot of time on an extensive cleanup and wanting to get the citations checked, get no response or (eventually) a failure. Batch jobs are by definition not needed in real time and, until there is such a queuing mechanism, may I suggest that the guideline states an absolute ban on submitting more than one run before the preceding one has finished (subject to a 48 hour [72 hour?] timeout). If there is any way to encapsulate the rest of my frustration in the guideline, I would be most grateful. --John Maynard Friedman (talk) 10:39, 6 September 2021 (UTC)
- @Redalert2fan: my complaint about Abductive is that they:
- BrownHairedGirl, you, and maybe others have complained that Abductive makes arbitrary queries. Essentially you seem to be complaining that Abductive launches "DoS". wbm1058 (talk) 16:14, 1 September 2021 (UTC)
- Where at? Headbomb {t · c · p · b} 16:09, 1 September 2021 (UTC)
- Perhaps, but I'm hearing complaints that the bot "allows for anyone to make arbitrary queries". – wbm1058 (talk) 16:00, 1 September 2021 (UTC)
- @Wbm1058, I'm not super familiar with Citation bot, which Cloud Services resources do you think are being directly provided? If it just allows users to trigger a bot that runs on some pages I think that's fine since that's not directly letting users e.g. make SQL queries or execute bash commands. Please let me know if I missed something. Legoktm (talk) 17:08, 1 September 2021 (UTC)
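For anyone wanting to re-check the pageview comparison made earlier in this thread, here is a minimal sketch of the arithmetic. All four input figures are simply the ones quoted by the participants (255 million daily pageviews, 6,373,026 articles, 492 pages in the category, 413,000 combined daily views); the function and variable names are illustrative only.

```python
# Figures as quoted in the thread (September 2021); none are measured here.
DAILY_PAGEVIEWS = 255_000_000    # sitewide pageviews per day
TOTAL_ARTICLES = 6_373_026       # articles on Wikipedia
CATEGORY_SIZE = 492              # pages in the cleanup category
CATEGORY_DAILY_VIEWS = 413_000   # combined daily views, per the pageviews tool


def expected_views(pages: int, total_articles: int, daily_pageviews: int) -> float:
    """Daily views that `pages` articles would get if every article were equally popular."""
    return pages / total_articles * daily_pageviews


baseline = expected_views(CATEGORY_SIZE, TOTAL_ARTICLES, DAILY_PAGEVIEWS)
ratio = CATEGORY_DAILY_VIEWS / baseline          # how far above the baseline the category is
per_page = CATEGORY_DAILY_VIEWS / CATEGORY_SIZE  # average daily views per page

print(round(baseline), round(ratio), round(per_page))  # prints: 19686 21 839
```

The baseline assumes views are spread evenly over all articles, which is the same simplification the comment itself makes.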
No new jobs
For the last 35 minutes, the bot has been editing only individual page requests by Abductive, which as usual are stacked high. See the latest bot contribs.
No edits have been made on behalf of any other editor since my batch ended at 21:24 etc. The bot has given no response either to my batch request, or to an individual page request I did as a test.
What's going on? Why is everyone else apparently locked out? --BrownHairedGirl (talk) • (contribs) 22:16, 24 September 2021 (UTC)
- OK, the most recent contribs show that the bot let FMSky back in at 22:40, and me at 22:44. But it's weird that for an hour, only one editor got anything done. --BrownHairedGirl (talk) • (contribs) 22:52, 24 September 2021 (UTC)
- The bot saved up a bunch of my requests, then disgorged them all at once. Sorry about that. Abductive (reasoning) 23:02, 24 September 2021 (UTC)
- @Abductive: Had you also submitted category requests? --BrownHairedGirl (talk) • (contribs) 23:28, 24 September 2021 (UTC)
- Not at that time. Abductive (reasoning) 00:42, 25 September 2021 (UTC)
Abductive, a good start would be not to submit "a bunch of my jobs". Do one, wait until it has finished before you submit another. Given how this bot shares a general resource, I assume that you are causing widespread collateral damage. Your runs are making the tool unusable to normal editors. I have stopped using this tool to fill out citations in articles as I edit them because I now expect failure to be the only option. --John Maynard Friedman (talk) 08:23, 25 September 2021 (UTC)
- @John Maynard Friedman: Give it a try, the bot will get to the job eventually. Also, as can be seen here, the bot has inexplicable delays in spite of high or low apparent usage. Sometimes, the bot looks like it has only one or two or three channels in use, but won't make edits for a considerable period. At other times, it looks like there are four channels in use, but the request is processed in a matter of minutes. Abductive (reasoning) 00:03, 26 September 2021 (UTC)
- Do you really believe that I am so stupid as to report experiences that don't actually happen, that it is just incompetence? I'm not talking about me giving up on a single article request because apparently nothing has happened for ten minutes. I mean a request that ends after N minutes with an explicit failure message. Repeatedly.
- Stop trying to find other people to blame. Your current use of multiple concurrent batch runs is a denial of service attack. Your behaviour is unambiguously WP: disruptive. Go find something useful to do with your life. There are thousands of articles that need personal hands-on editing. --John Maynard Friedman (talk) 08:05, 26 September 2021 (UTC)
- If I was to blame, how do you explain the fact that I experience the same thing all the time? When I try to use the bot, and I have no outstanding requests, it's still hit or miss if I get my request done in a timely manner, or in a few minutes, or never. I don't run multiple batch jobs, and in fact that's impossible. Abductive (reasoning) 09:46, 26 September 2021 (UTC)
- You wrote above
The bot saved up a bunch of my requests, then disgorged them all at once.
You should not have had "a bunch of jobs". If you know how to write a batch run then you definitely know by now not to submit another job until the previous one has finished or 24 hours have elapsed without a response. What is "a timely manner" for a background batch run, as opposed to a single editor who has worked on a single page and needs to verify the citations or collect the remaining metadata from the DOI? Answer: a week v. ten minutes max.
- No, you are not the only cause of this problem: I see other bot runs doing marginal value activities like inserting {{reflist talk}} into talk pages [if editors of that page cared, they'd have done it already]; fixing lint errors on signatures and so on. All well-meaning folk who fail, before they do it, to work out the cost/benefit ratio - which is what BrownHairedGirl has been banging on about. But yours is the most annoying because of the effect it has on one-shot use of this tool. You know that the server has limited capacity. Maybe it is not just you but it is your head that is above the parapet so don't complain when it gets shot. --John Maynard Friedman (talk)
- I'm pretty sure that the average user wants the bot to correct more than it does, but there are endless complaints from article owners about so-called "cosmetic" edits, so these have been stifled. Abductive (reasoning) 11:30, 26 September 2021 (UTC)
- No. The average editor (the average user is a reader) objects to resources being wasted on edits that have no visible effect. If an article is being edited anyway, such changes can be made en passant: the objection is not to the change but to pettifogging changes that need to be verified. This has nothing to do with WP:OWN but everything to do with maintaining a reader-focussed approach. --John Maynard Friedman (talk) 12:08, 26 September 2021 (UTC)
- @John Maynard Friedman: see my proposal below, at #Proposed new rule for using Citation bot. --BrownHairedGirl (talk) • (contribs) 15:04, 27 September 2021 (UTC)
Proposed new rule for using Citation bot
In the last week, there have been several prolonged periods when access to this bot has been severely impeded by the actions of two editors: @Abductive and @Whoop whoop pull up. Both editors have repeatedly flooded the bot with series of category requests made alongside long series of individual page requests.
The effect of this practice is to allow one editor to lock up two of the bot's four channels for significant lengths of time. Even when some channels appear not to be in use, the vast queues of requests from these two editors prevent the bot from processing requests by others.
Whatever these editors' intentions, the repeated effect of their actions is similar to a denial-of-service attack.
Most of this disruption could be avoided by a simple rule, which I set out below. It may be that this could be implemented by technical restrictions, but I do not want to make assumptions about either the technical possibilities or the willingness of the maintainers to volunteer their time to code any changes. --BrownHairedGirl (talk) • (contribs) 14:36, 27 September 2021 (UTC)
Proposed new rule: One request at a time
When an editor requests that the bot process either a batch of articles or a single article, they must not make another request until that article or batch has been processed. An exception is permitted where an editor has within the last 24 hours edited an article to add or modify a reference; in that case they may make a single page request for that article before other requests have finished processing.
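Purely as an illustration of the rule's logic (it is proposed above as a rule for editors, not as a bot feature), the check could be sketched as follows. The function name, its parameters and the idea of tracking a "last reference edit" timestamp are all hypothetical and do not describe anything in Citation bot. The sketch also does not track whether the editor has already used their one exception request.

```python
from datetime import datetime, timedelta
from typing import Optional

EXCEPTION_WINDOW = timedelta(hours=24)  # the 24-hour window named in the proposed rule


def may_submit(has_pending_request: bool,
               is_single_page: bool,
               last_ref_edit: Optional[datetime],
               now: datetime) -> bool:
    """Return True if a new request would be allowed under the proposed rule.

    has_pending_request: the editor already has a batch or page still being processed
    is_single_page:      the new request is for one article rather than a batch
    last_ref_edit:       when the editor last added or modified a reference on
                         that article, or None if they have not
    """
    if not has_pending_request:
        return True   # nothing outstanding: any request is fine
    if not is_single_page or last_ref_edit is None:
        return False  # the exception only covers single pages the editor has worked on
    return timedelta(0) <= now - last_ref_edit <= EXCEPTION_WINDOW
```

For example, an editor with a batch still running who added a bare DOI reference two hours ago could still request that one page, while a second batch would have to wait.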
Discussion and survey of Proposed new rule
- Add your comments and/or support/oppose here
- Support, as proposer. --BrownHairedGirl (talk) • (contribs) 14:37, 27 September 2021 (UTC)
- Oppose - Please stop trying to WP:OWN Citation bot, @BrownHairedGirl. If you want a citation improver all to yourself, go get your own copy of Citation bot - just like other heavily-used bots have multiple duplicate instances to spread out their tasks between. Also, I'd like to note that you tie up Citation bot essentially 24/7 with enormous pageLinked jobs of thousands of articles, whereas I run category and/or individual jobs for much smaller fractions of the day, and most of my category runs (don't know about Abductive's) are smaller categories taking little time to run. Additionally, whether or not an editor has added or removed one or more reference(s) from an article has no bearing on whether the article has other citation errors that need to be fixed.
Striking in light of my concerns being addressed by BrownHairedGirl's suggestions below, but Comment that the fact that this is causing problems is a sign of Citation bot being severely overloaded, illustrating the urgent need for additional duplicate instances of Citation bot and/or additional processing channels per instance of Citation bot. Whoop whoop pull up Bitching Betty ⚧ Averted crashes 15:07, 27 September 2021 (UTC)
- @Whoop whoop pull up: asking editors to stop simultaneously using two of the bot's four channels is about sharing resources collaboratively. It is the complete opposite of WP:OWNership.
Also, your description of Abductive's usage is false: over the last few months I have documented on this page many cases where Abductive has tied up the bot for ages with low return category requests. --BrownHairedGirl (talk) • (contribs)
- Insufficient capacity on Citation bot's part marks a problem with the bot (which should be taken up with its developer, or by creating additional instances of Citation bot to spread out the load, as per accepted practice for other highly-used bots), not with the editors using it to improve the encyclopedia. Whoop whoop pull up Bitching Betty ⚧ Averted crashes 15:23, 27 September 2021 (UTC)
- The developers are volunteers who have kindly donated their time and skills to create the bot. Like any tool, it has limitations, including limited capacity, and while those limitations exist editors should collaborate to share the resource effectively. Locking up two of the four channels is disruptive to that collaboration ... in other words, it is avoidably disrupting other editors who are also "using it to improve the encyclopedia". If you want to process lots of articles, submit a single batch request. --BrownHairedGirl (talk) • (contribs) 15:30, 27 September 2021 (UTC)
- Submitting a single huge batch request (as you routinely do) would take exactly the same number of channel-hours as submitting many individual requests, the only differences being whether the job(s) lock one channel continuously or several channels intermittently - additionally, large batch jobs have the disadvantage of requiring a great deal of manual typing to assemble a pipe-separated list of article names and then paste the whole shebang into the box on the Citation bot console, while submitting individual jobs via the toolbar is quite literally point-and-click. I reiterate my earlier suggestion for having multiple duplicate instances of Citation bot to multiply its capacity. Whoop whoop pull up Bitching Betty ⚧ Averted crashes 15:39, 27 September 2021 (UTC)
- The purpose of this proposal is to restrict editors to one bot channel at a time, so that others are not locked out of the bot.
More instances of the bot or more channels would of course be great, but for now we have one bot with only four channels, and this proposal addresses the problem that some editors are using it in ways which avoidably disrupt others. Making a list of pages is easy, and using a text editor's search-and-replace function to add pipes is also easy. --BrownHairedGirl (talk) • (contribs) 15:52, 27 September 2021 (UTC)
- How would one use search-and-replace to pipe-separate the list, given that most methods of exporting a list of article titles separate the individual page titles either with line breaks (which text editors generally don't support search or replace functions on) or with spaces (which would require huge amounts of manual work to keep the search-and-replace from splitting multiword titles)? Whoop whoop pull up Bitching Betty ⚧ Averted crashes 16:21, 27 September 2021 (UTC)
- Any text editor which supports regex will do it very simply, by replacing \n. For example, on Windows there is Notepad++; on Linux try Kate. There are lots of alternatives for every platform. --BrownHairedGirl (talk) • (contribs) 16:31, 27 September 2021 (UTC)
- Will look into that, thanx! :-)
Update: Kate is already exceeding expectations. Whoop whoop pull up Bitching Betty ⚧ Averted crashes 17:44, 27 September 2021 (UTC)
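The newline-to-pipe replacement discussed just above can also be done outside a text editor. The following is a minimal sketch, assuming the exported list has one article title per line; the function name is made up for illustration and the two titles are only sample input.

```python
def to_pipe_list(exported_titles: str) -> str:
    """Join a one-title-per-line export into a single pipe-separated string.

    Splitting on line breaks (not on spaces) keeps multiword titles intact,
    and blank lines are dropped so no empty entries appear between pipes.
    """
    titles = [line.strip() for line in exported_titles.splitlines()]
    return "|".join(title for title in titles if title)


print(to_pipe_list("Deborah Birx\n\nQasr Bardawil\n"))  # prints: Deborah Birx|Qasr Bardawil
```

The resulting string is in the pipe-separated form that the thread describes pasting into the Citation bot console.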
- In response to Whoop's comment "whether the article has other citation errors that need to be fixed": I agree, but that comment misses the point.
This is not about whether the article needs bot attention, but whether it needs attention now. The purpose of that exception is to allow an editor to use the bot interactively, e.g. by adding a ref which consists only of an isbn or doi, and immediately trying to have the bot fill that ref. That interactive usage is hugely helpful when adding refs, which is why I propose it as an exception to the one-job-at-a-time rule. Other pages which are not currently being edited can wait until other tasks have finished. --BrownHairedGirl (talk) • (contribs) 15:45, 27 September 2021 (UTC)
- Is that really a good use for Citation bot, given that the editor could easily take the time to fill in their new reference themselves without consuming any of the limited Citation bot resources that so concern you? Whoop whoop pull up Bitching Betty ⚧ Averted crashes 15:56, 27 September 2021 (UTC)
- Sometimes, it is a very good use of the bot. For example, if a book or journal paper has multiple authors and editors, the bot can take a ref which consists of only an isbn or doi and fill those out quickly, instead of taking lots of that editor's time. See Wikipedia:Wikipedia Signpost/2022-08-01/Tips and tricks. --BrownHairedGirl (talk) • (contribs) 16:08, 27 September 2021 (UTC)
- And that is exactly one of the reasons that I [used to] use the tool. I strongly support the proposal.
- For as long as (a) the resource is limited and (b) there is no mechanism for batch runs to have background priority, this proposal is essential. Airy handwaving that the service needs to improve is just fatuous. To accuse BHG of WP:OWN displays remarkable chutzpah - or just plain hypocrisy. As I have already said above, hogging all channels with batch requests, especially for 'nice to have' changes, is unambiguously WP:DISRUPTIVE (insidious type). BHG's proposal offers a reasonable compromise between types of use. Let's work towards consensus rather than have to refer the dispute upwards. --John Maynard Friedman (talk) 17:53, 27 September 2021 (UTC)
- Support I'm failing to see a good reason not to support this. * Pppery * it has begun... 17:55, 27 September 2021 (UTC)
- Comment. Many thanks to @Whoop whoop pull up for withdrawing their oppose. I have a lot of experience in preparing batch jobs, so if Whoops or anyone else wants assistance, I will try to help. --BrownHairedGirl (talk) • (contribs) 18:59, 27 September 2021 (UTC)
- I have a number of concerns that I hope will be addressed. First, if a user runs a category, the rate of edits will be low, and the rate of substantive edits will be even lower. A category that by some rare confluence of circumstances has never been run may get a return rate in the 75% range. A random collection of articles gets a return in the 50% range, but only about 10% of these will be substantive. A run of a category of stubs gets a low rate of return (perhaps 20-25%), but goes very quickly as many stubs have zero, one or two refs. A second run of a category that has recently been run gets an edit rate around 12%, and includes edits that the bot should have made the first time around. Long articles take a long time for the bot to run, but more often find at least one edit to make; so running a bunch of them will look like a good use of the bot on a percent basis, but not actually be something to be emulated. As an example of this, just choosing to run articles that contain any set of search terms will find longer articles, the more the search terms narrow down the list, the longer the articles and the more likely are edits. Second, experienced users of the bot have been making efforts to find articles that need bot attention, or remove articles from their runs that have recently been run. I myself have a spreadsheet with weeks of bot activity on it which I use. But how can we expect all users to do this? Third (and related), users will discover the bot, and run sub-optimal categories and batch jobs. I fear that unless they are allowed some leeway at first and then gently guided towards this rule, rancor will continue. The sub-optimal runs can be used to ascertain bot performance on the average category. Similarly, if a user wants to test a method of discovering articles that the bot will make edits to, they may occasionally fail to reach a threshold of edits that will please intensive users of the bot. 
Fourth, as User:Whoop whoop pull up notes, it would be a shame if the rule makes it seem like the problem of demand is solved, and progress towards technical solutions stalls. Fifth (and related), even if everybody follows the rule as proposed, it will only take four runs to lock out single-article users. This will happen a lot, and without, for example, an instance of the bot for them, there will continue to be complaints and rancor ("You big users have banded together at the little guy's expense!") So, I think the rule would be a good temporary measure, if other big users agree among themselves to refrain from running the bot if there are three medium or large runs already going, and agree to refrain from complaining about mid-return runs that otherwise abide by the bot limits and the rule. Abductive (reasoning) 20:33, 27 September 2021 (UTC)
- Most of Abductive's over-long and poorly-laid-out comment has nothing to do with this proposal. This proposal is not about Abductive's disruption-by-low-return-speculative-trawl; this proposal is about Abductive's disruption-by-lots-of-individual-requests-alongside-a-batch.
Also, this proposal will neither delay nor hasten any improvements to the bot's capacity. This is a proposal to remove one form of disruption to the bot as currently configured.
Finally, of course new users of the bot should be treated gently. It seems rather disingenuous of Abductive to try to invoke new users to deflect attention away from the fact that Abductive has been using the bot disruptively for months. --BrownHairedGirl (talk) • (contribs) 21:36, 27 September 2021 (UTC)
- Support on batch requests, individual articles are pretty irrelevant. That said, simply removing batch request privilege from Abductive would also be sufficient in solving the issues. Headbomb {t · c · p · b} 20:51, 27 September 2021 (UTC)
- Of course that will not work, as it does not solve the problem of demand and single users will continue to get locked out. I propose that the rule be amended to state that no user will be prevented from using the bot based on arbitrary notions of correct usage, and that users should never start a job when there are three jobs running. Abductive (reasoning) 20:56, 27 September 2021 (UTC)
- Question - how would one tell how many jobs are currently running? The Citation bot console doesn't provide this information. Whoop whoop pull up Bitching Betty ⚧ Averted crashes 22:31, 27 September 2021 (UTC)
- I despair of the comment about "arbitrary notions of correct usage". There is nothing at all arbitrary about deploring tying up the bot for ages with low-return-batch-jobs. There is nothing at all arbitrary about deploring the practice of one editor systematically tying up two of the bot's channels. --BrownHairedGirl (talk) • (contribs) 21:40, 27 September 2021 (UTC)
- More flooding. Another example of why I proposed this rule.
This evening, we seem to have another instance of the bot being locked by a flood of individual requests. For the last 40 minutes, the bot has made no edits except (mostly large) individual pages requested by Abductive: see recent bot contribs. I have had no response to either a batch request made over 30 minutes ago, or an individual page request made ~15 minutes ago as a test.
I can live with a delay to my batch job, but it is highly disruptive to anyone trying to use the bot to fill in refs on an article which they are improving. --BrownHairedGirl (talk) • (contribs) 21:07, 30 September 2021 (UTC)
- It is odd how the bot seems to be unresponsive to requests, even if it appears that channels are open, then do a bunch at once. But over the last few days, there have been long periods where big users (not me), have monopolized all the channels without checking to see if there are already three big jobs running. So intense demand remains a problem. Abductive (reasoning) 21:51, 30 September 2021 (UTC)
- Long periods of unresponsiveness without results to show for it could potentially be due to the bot chewing through a long run of articles that it was fed but that didn't actually need any fixes; in that situation, the bot would be making no edits for a potentially-considerable length of time, thus seeming to be stalled. Whoop whoop pull up Bitching Betty ⚧ Averted crashes 01:27, 3 October 2021 (UTC)
- Oppose. Creation of new rules will only serve to make life more miserable for everyone. I've checked the edits in the periods flagged above and I want to thank Abductive and Whoop whoop pull up for triggering so many useful edits. If we start making value judgements I might even say BHG is the one DOS'ing useful edits by requesting many (mostly minor and aesthetic) edits. But that's really not the case: it's a pity that any editor should be forced to wait 30 minutes for their request to be processed. The best solution is for the large requesters to work on ways to increase the ratio of submitted titles which actually end up producing an edit, to avoid wasting bot cycles on no-ops: for this I welcome cooperation between users and I encourage everyone to take advantage of BHG's offer to help with list preparation, as well as my continued willingness to help with SQL queries. Nemo 22:29, 7 October 2021 (UTC)
- @Nemo bis: I do not request any "minor and aesthetic edits". All my batch jobs are to clear up WP:Bare URLs, on articles which I have laboriously identified as having one or more bare URLs. BrownHairedGirl (talk) • (contribs) 15:12, 24 October 2021 (UTC)
- That's indeed valuable work for the bot, thank you! Still, often those bare URLs will fail to expand and the edits can appear to be pointless. So personally I wouldn't feel comfortable trying to apply such proposed rules. Nemo 21:14, 5 November 2021 (UTC)
- @Nemo bis: the bot should never make pointless edits. Any breaches of WP:COSMETICBOT should be fixed.
- But I am really puzzled that you choose to criticise me, whilst praising Abductive. My Bare URL cleanup runs have averaged an edit rate of over 50%, which is way ahead of the rates achieved by Abductive. Most runs remove all the bare URLs on about 30% of the pages, and some bare URLs on more pages. Again, that is a significantly higher substantive change rate than achieved by Abductive, as documented repeatedly on this page. BrownHairedGirl (talk) • (contribs) 11:32, 13 November 2021 (UTC)
- Support any rule that potentially frees up capacity for individual page requests. Tired of getting 502 and other errors because the queue is all filled up with batch jobs. Impossible for regular editors to use the bot to improve citations by automatically adding identifiers and other details or to complete transformation of a chapter DOI into a full book citation because the bot is consistently overloaded by jobs for many hours through the day. — Chris Capoccia 💬 14:19, 20 October 2021 (UTC)
- Support per nom, my earlier comments about lack of a mechanism to let individual requests preempt batch runs, and as explained by Chris Capoccia. --John Maynard Friedman (talk) 15:57, 20 October 2021 (UTC)
Caps: Part II/III/IV...
- What should happen
- [13]
- We can't proceed until
- Feedback from maintainers
Covering Part Ii/Iii/Iv/Vi/Vii/Viii/Ix/Xi/Xii/Xiii/Xiv/Xv/Xvi/Xvii/Xiii ... Headbomb {t · c · p · b} 20:58, 4 December 2021 (UTC)
Link removal
- Status
- Not a bug
- Reported by
- DavidMCEddy (talk) 16:25, 21 December 2021 (UTC)
- What happens
- URLs deleted. It says, "Removed proxy/dead URL that duplicated identifier." I saw "url" and "archive-url" deleted, both of which worked for me when I tested them. I don't know what they are alleged to duplicate. Maybe "doi"? I don't know what that is. In any event, I'm vehemently opposed to removing either URL, because special knowledge that I don't have is required to access the article via the mysterious identifier that is allegedly duplicated by the URLs that were deleted. Removing URLs like this makes it more difficult for others like me to find the article. Also, in cases like this where both "url" and "archive-url" are given, I think it degrades the utility of the reference if either is deleted. As long as the main "url" is good, the "archive-url" is redundant. However, at some time in the future the "url" may cease to function, at which point the "archive-url" becomes more valuable. What's the harm in retaining multiple references / links like this? I think it's done routinely in Wikidata, and I think it's valuable there as well.
- What should happen
- Don't delete URLs.
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Phylogenetic_Assignment_of_Named_Global_Outbreak_Lineages&diff=next&oldid=1057419861
The PMC link is guaranteed to always be free. URLs often are not, and they often change, unlike DOIs and PMC. AManWithNoPlan (talk) 16:59, 21 December 2021 (UTC)
Caps: Drug Des Deliv [again]
- What happens
- [14]
- What should happen
- Drug Des Deliv
- We can't proceed until
- Feedback from maintainers
There's been a regression or something. Headbomb {t · c · p · b} 00:27, 23 December 2021 (UTC)
Mistakenly marking DOI as broken
- Status
- Fixed
- Reported by
- Jo-Jo Eumerus (talk) 18:23, 22 December 2021 (UTC)
- What happens
- DOI is marked as broken
- What should happen
- This DOI still works
dx.doi.org is having some issues right now. So, I am going back and fixing the few that pop up. AManWithNoPlan (talk) 21:07, 22 December 2021 (UTC)
- Is this the same issue? Jo-Jo Eumerus (talk) 21:15, 22 December 2021 (UTC)
- Looks like the bot might be readding the bad content. Jo-Jo Eumerus (talk) 09:43, 23 December 2021 (UTC)
- some changes made to the bot to double check these before committing to it. AManWithNoPlan (talk) 14:44, 23 December 2021 (UTC)