Wikipedia:Bots/Requests for approval
All editors are encouraged to participate in the requests below – your comments are appreciated more than you may think!
New to bots on Wikipedia? Read these primers!
- Approval process – How these discussions work
- Overview/Policy – What bots are/What they can (or can't) do
- Dictionary – Explains bot-related jargon
To run a bot on the English Wikipedia, you must first get it approved. Follow the instructions below to add a request. If you are not familiar with programming, consider asking someone else to run a bot for you.
Instructions for bot operators
Bot-related archives
Bot Name | Status | Created | Last editor | Date/Time | Last BAG editor | Date/Time |
---|---|---|---|---|---|---|
JJPMaster (bot) (T|C|B|F) | Open | 2025-01-02, 22:20:26 | DreamRimmer | 2025-01-03, 13:25:55 | Never edited by BAG | n/a |
Tom.Bot 8 (T|C|B|F) | On hold | 2024-12-27, 09:33:39 | Primefac | 2025-01-01, 13:25:52 | Primefac | 2025-01-01, 13:25:52 |
MacaroniPizzaHotDog Bot (T|C|B|F) | On hold: User response needed! | 2024-10-28, 20:59:48 | Primefac | 2025-01-01, 14:01:46 | Primefac | 2025-01-01, 14:01:46 |
RustyBot 2 (T|C|B|F) | On hold | 2024-09-15, 15:17:54 | Rusty Cat | 2025-01-02, 04:19:44 | Primefac | 2025-01-01, 14:02:29 |
PonoRoboT 2 (T|C|B|F) | On hold: User response needed! | 2024-07-20, 23:38:17 | Primefac | 2025-01-01, 14:03:48 | Primefac | 2025-01-01, 14:03:48 |
GalaxyBot 2 (T|C|B|F) | In trial | 2025-01-02, 19:18:44 | Extraordinary Writ | 2025-01-05, 22:26:30 | Primefac | 2025-01-03, 09:58:46 |
BunnysBot 2 (T|C|B|F) | In trial | 2024-11-23, 12:59:57 | Primefac | 2025-01-01, 13:56:52 | Primefac | 2025-01-01, 13:56:52 |
C1MM-bot 3 (T|C|B|F) | In trial | 2024-12-12, 04:42:12 | MPGuy2824 | 2025-01-02, 05:07:14 | Primefac | 2025-01-01, 13:34:44 |
KiranBOT 14 (T|C|B|F) | In trial | 2024-12-26, 23:47:23 | Primefac | 2025-01-01, 13:30:16 | Primefac | 2025-01-01, 13:30:16 |
CFA (bot) (T|C|B|F) | In trial | 2024-12-31, 05:00:34 | Primefac | 2025-01-01, 13:24:09 | Primefac | 2025-01-01, 13:24:09 |
CanonNiBot 1 (T|C|B|F) | In trial | 2024-12-17, 12:50:01 | Primefac | 2024-12-23, 12:35:47 | Primefac | 2024-12-23, 12:35:47 |
Ow0castBot (T|C|B|F) | In trial | 2024-11-14, 01:51:38 | Usernamekiran | 2024-12-05, 00:18:38 | Primefac | 2024-12-01, 20:39:29 |
Platybot (T|C|B|F) | In trial: User response needed! | 2024-07-08, 08:52:05 | Primefac | 2024-12-23, 12:43:22 | Primefac | 2024-12-23, 12:43:22 |
KiranBOT 10 (T|C|B|F) | On hold | 2024-09-07, 13:04:48 | Xaosflux | 2025-01-01, 18:01:09 | Xaosflux | 2025-01-01, 18:01:09 |
SodiumBot 2 (T|C|B|F) | In trial: User response needed! | 2024-07-16, 20:03:26 | Sohom Datta | 2024-12-26, 14:22:10 | Primefac | 2024-12-23, 12:44:24 |
AussieBot 1 (T|C|B|F) | Extended trial: User response needed! | 2023-03-22, 01:57:36 | Hawkeye7 | 2024-12-23, 20:12:37 | Primefac | 2024-12-23, 12:46:59 |
KiranBOT 12 (T|C|B|F) | Trial complete | 2024-09-24, 15:59:32 | Usernamekiran | 2025-01-05, 04:05:15 | Primefac | 2025-01-01, 13:19:45 |
BunnysBot 4 (T|C|B|F) | Trial complete | 2024-12-14, 15:54:28 | Bunnypranav | 2025-01-03, 12:14:46 | Primefac | 2025-01-01, 13:32:36 |
Current requests for approval
Operator: JJPMaster (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 22:20, Thursday, January 2, 2025 (UTC)
Automatic, Supervised, or Manual: automatic
Programming language(s): Python
Source code available: https://github.com/JJPMaster/jjpmaster-bot-enwp-t1 (GPLv3)
Function overview: Updates User:JJPMaster/Editnotice requests
Links to relevant discussions (where appropriate):
Edit period(s): Continuous
Estimated number of pages affected: 1
Exclusion compliant (Yes/No): No
Already has a bot flag (Yes/No): No
Function details: Every time a new request to modify or create an editnotice is made, the bot adds an entry to User:JJPMaster/Editnotice requests. While this falls under WP:EXEMPTBOT, I am asking for the bot to be flagged for the sake of avoiding rate-limiting.
Discussion
- As per https://gerrit.wikimedia.org/r/plugins/gitiles/operations/mediawiki-config/+/refs/heads/master/wmf-config/InitialiseSettings.php#3137, your bot can currently perform 90 edits per minute without a bot flag. Will it exceed this limit? – DreamRimmer (talk) 06:20, 3 January 2025 (UTC)
- @DreamRimmer: I wasn't really concerned about limits on the actual number of edits (I doubt it will exceed that), but more the fact that it attempts to make an edit every time the Template talk namespace is edited. I know from AWB experience that null edits are considered edits for the sake of API limits, so I was concerned about that being a problem here. JJPMaster (she/they) 06:25, 3 January 2025 (UTC)
- You can add a condition to check if there are any edit requests before making an API request to edit the page, ensuring that the page is only edited when there are edit requests (i.e., when editnotices is not empty). This is just a suggestion; I have no problem with this BRFA :)
    if editnotices:
        annotated_list = ''.join(f'* [[{j}]]\n' for j in editnotices) if len(editnotices) > 0 else '\n* \'\'None\'\''
        print(editnotices)
        page.text = "Current editnotice edit requests:\n" + annotated_list
        page.save(f"Bot: Updating editnotice request list ({len(editnotices)} requests)")
    else:
        print("no edit requests found")
– DreamRimmer (talk) 09:13, 3 January 2025 (UTC)
- @DreamRimmer: This has been implemented, although I didn't use your specific code to do it. See commit, diff. JJPMaster (she/they) 13:06, 3 January 2025 (UTC)
- Looks good! You are smart. Thanks for doing this. – DreamRimmer (talk) 13:25, 3 January 2025 (UTC)
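The condition suggested in this thread (only save when there is something to report) can be sketched without any wiki library. This is a minimal illustration, not the code in the operator's repository: `build_request_list` and `needs_save` are hypothetical helper names, and the extra comparison against the current page text is an assumption added here to avoid null edits.

```python
def build_request_list(editnotices):
    """Render the wikitext for the request list page."""
    header = "Current editnotice edit requests:\n"
    if not editnotices:
        return header + "* ''None''\n"
    return header + "".join(f"* [[{title}]]\n" for title in editnotices)


def needs_save(current_text, editnotices):
    """Return (should_save, new_text).

    The save is skipped when there are no requests (the condition suggested
    in the thread) or when the rendered text is unchanged (an assumption
    added in this sketch).
    """
    new_text = build_request_list(editnotices)
    if not editnotices:
        return False, new_text
    return new_text != current_text, new_text


if __name__ == "__main__":
    should_save, text = needs_save("", ["Template talk:Example"])
    print(should_save)
    print(text, end="")
```

A caller would pass the page's current text and the list of open requests, then call its library's save method only when `should_save` is true. Note that with the thread's condition alone, the page is not rewritten when the last request is closed.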
Operator: Tom.Reding (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 09:33, Friday, December 27, 2024 (UTC)
Automatic, Supervised, or Manual: automatic
Programming language(s): C#
Source code available:
Function overview: Process pages in Category:Pages using WikiProject banner shell with unknown parameters (649,386)
Links to relevant discussions (where appropriate): Template talk:WikiProject banner shell#December update
Edit period(s): OTR
Estimated number of pages affected: ~900,000
Exclusion compliant (Yes/No): Yes
Already has a bot flag (Yes/No): Yes
Function details: Migrate the |living= parameter to |blp= for {{WikiProject banner shell}} in Category:Pages using WikiProject banner shell with unknown parameters (649,386) (currently @ 874,604), and piggyback WikiProject-template standardizations.
Discussion
- Not sure if non-bureaucrat, non-bot experts are permitted to opine, but that's a huge number of articles on BLPs that have lacked the standard BLP notice at the top of the talk page since 14 Dec -- which provides a lot of links to important policies and, I think, forms an important part of education of new editors and possibly also people who have a Wikipedia article about themselves; not a good situation to leave the articles in. ETA: Wikipedia:Administrators' noticeboard/Incidents#Creating the need to make 400,000 unnecessary edits is another discussion about the template change that led to the current problem. Espresso Addict (talk) 13:17, 27 December 2024 (UTC)
- Tom, make sure the bot is using the tag "Talk banner shell conversion" when doing this task. --Gonnym (talk) 14:07, 27 December 2024 (UTC)
- I'd like to know how to do that, but I'm not sure how/if it's possible via the AWB module. ~ Tom.Reding (talk ⋅dgaf) 14:29, 27 December 2024 (UTC)
Needs wider discussion. Given the ANI complaint linked above, I think the nearly 1 million edits proposed here need wider discussion than one template's talk page with discussions only a handful of people seem to have participated in. Template talk:WikiProject banner shell/Archive 11#Why we should choose between blp or living, for example, had only three people involved. Anomie⚔ 16:13, 27 December 2024 (UTC)
- I'm more than happy to have the template change that has caused this problem undone, but I don't think we should sit around talking about the best way forward for months, as all the BLPs in the nearly a million talk pages affected are currently lacking any obvious link to BLP policy. Espresso Addict (talk) 05:01, 28 December 2024 (UTC)
- If this bot is going to be approved, there needs to be consensus, probably on one of the Village pump pages. Reverting the problematic edit until that discussion can happen would probably be a good thing for the reasons you note, but that isn't something that can be decided here alone either. Anomie⚔ 16:05, 28 December 2024 (UTC)
- On hold, pending resolution of the above. Primefac (talk) 13:25, 1 January 2025 (UTC)
Operator: MacaroniPizzaHotDog (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 20:59, Monday, October 28, 2024 (UTC)
Function overview: Performs sentiment analysis on pending AfC submissions, leaves AfC comments where appropriate.
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python
Source code available: No
Links to relevant discussions (where appropriate):
Edit period(s): Daily
Estimated number of pages affected: 250
Namespace(s): Draft
Exclusion compliant (Yes/No): No
Function details: Detects non-neutral language (i.e., overly positive, negative, or subjective) in pending AfC submissions (retrieved by querying the MediaWiki API) using TextBlob. Adds AfC comments with mwclient where appropriate.
Discussion
Has this idea been discussed somewhere before? * Pppery * it has begun... 00:05, 29 October 2024 (UTC)
- Comment: This seems like it would be a WP:CONTEXTBOT problem. What if someone were writing a draft article about Michael Jordan that contained text like "Jordan is often referred to as the greatest basketball player of all time", with references to multiple reliable sources? Would that draft be tagged in some way? – Jonesey95 (talk) 15:12, 29 October 2024 (UTC)
- Yes, that is a good point. I can make it so it checks sentence by sentence, and looks at the references. The real problem is finding out if those references actually say that, and if they are reliable. MacaroniPizzaHotDog (talk • contributions) 15:16, 29 October 2024 (UTC)
- Or I could eliminate the polarity check and focus entirely on objectivity. MacaroniPizzaHotDog (talk • contribs) 18:17, 29 October 2024 (UTC)
Needs wider discussion. At the very least, make sure WT:AFC actually wants this. Primefac (talk) 15:46, 30 October 2024 (UTC)
- On hold until this is done. Primefac (talk) 15:46, 30 October 2024 (UTC)
- A user has requested the attention of the operator. Once the operator has seen this message and replied, please deactivate this tag. (user notified) Any movement on this? Primefac (talk) 14:01, 1 January 2025 (UTC)
For any discussion to occur on whether this is useful, it would be beneficial to see the comments this would produce. @MacaroniPizzaHotDog I would suggest setting up the bot to initially post the AfC comments in a page in userspace for demonstration. Perhaps a table-like format with the draft name and comment. Do also include entries for drafts for which no comment is generated (to check for false negatives). Once you have 300 or so entries, we can review it and get the feedback of the AFC project as well. – SD0001 (talk) 16:08, 14 November 2024 (UTC)
- Note: This bot appears to have edited since this BRFA was filed. Bots may not edit outside their own or their operator's userspace unless approved or approved for trial. AnomieBOT⚡ 23:10, 14 November 2024 (UTC)
- Oh I really messed up. Sorry. I accidentally had it edit outside of its userspace. See, the page variable was being used to store the page for mwclient. But it was overwritten to the last page in the for loop. Oops. Sorry. MacaroniPizzaHotDog (talk • contribs) 23:19, 14 November 2024 (UTC)
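The bug described in the last comment, where a variable holding the intended save target is overwritten by a loop variable of the same name, is easy to reproduce because a Python `for` loop does not create a new scope. The sketch below uses plain strings in place of mwclient page objects; `REPORT_PAGE`, `pick_target_buggy`, and `pick_target_fixed` are hypothetical names, and this is not the operator's code.

```python
REPORT_PAGE = "User:ExampleBot/Report"  # hypothetical userspace report page


def pick_target_buggy(drafts):
    """Reuses the name `page` for the loop variable, losing the target."""
    page = REPORT_PAGE  # intended save target
    for page in drafts:  # rebinding `page` overwrites the target
        pass  # per-draft analysis would happen here
    return page  # the last draft if there were any drafts


def pick_target_fixed(drafts):
    """Uses distinct names so the save target survives the loop."""
    report_page = REPORT_PAGE
    for draft in drafts:
        pass  # per-draft analysis would happen here
    return report_page


if __name__ == "__main__":
    drafts = ["Draft:Alpha", "Draft:Beta"]
    print(pick_target_buggy(drafts))
    print(pick_target_fixed(drafts))
```

Giving the save target its own name, or checking its namespace just before saving, is one way to keep a demonstration run inside userspace.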
Operator: Rusty Cat (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 15:17, Sunday, September 15, 2024 (UTC)
Function overview: Categorize and create redirects to year pages (AD and BC).
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python (pywikibot)
Source code available: Will provide if needed
Links to relevant discussions (where appropriate): Wikipedia:Bot requests/Archive 86#Articles about years: redirects and categories
Edit period(s): one time run
Estimated number of pages affected: about 1000-2000 year pages, so assuming we have to create 3 redirects for each, maximum 6000
Namespace(s): Main
Exclusion compliant (Yes/No): Yes
Function details: For each number 1-2000, the bot will operate on the pages "AD number" and "number BC".
- On AD pages, the bot will append Category:Years AD to the page if it does not already have it.
- The bot will create redirects "ADyear", "year AD", and "yearAD" to AD pages, and "BCyear", "BC year", and "yearBC" to the BC pages.
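The redirect naming scheme in the function details can be sketched as a small pure function. This is an illustration only: `redirect_titles` is a hypothetical name, and it assumes every AD year page lives at "AD number" and every BC year page at "number BC", which may not hold for every year.

```python
def redirect_titles(year, era):
    """Return (target, redirects) for one year page.

    Follows the scheme in the function details: three redirect titles per
    year page, with the target title depending on the era.
    """
    if era == "AD":
        target = f"AD {year}"
        redirects = [f"AD{year}", f"{year} AD", f"{year}AD"]
    elif era == "BC":
        target = f"{year} BC"
        redirects = [f"BC{year}", f"BC {year}", f"{year}BC"]
    else:
        raise ValueError(f"unknown era: {era!r}")
    return target, redirects


if __name__ == "__main__":
    for era in ("AD", "BC"):
        target, redirects = redirect_titles(98, era)
        print(target, "<-", ", ".join(redirects))
```

A real run would also need to check that each redirect title does not already exist before creating it.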
Discussion
- Support as requester. Note that the AD year articles are, in the main, currently not categorised other than by number (e.g. Category:98 for AD 98). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:15, 16 September 2024 (UTC)
- @Pigsonthewing: I just checked and realized that the number categories are subcats of the Category:Years category. Does that mean that the bot does not need to put the page into the AD Years category? Rusty 🐈 14:53, 16 September 2024 (UTC)
- Ah, I'd missed that. I guess so. I'll start a separate discussion about subdividing Category:Years into BC and AD sub-cats. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:58, 16 September 2024 (UTC)
- It was suggested to use categories like Category:Years of the 19th century instead, so I'm applying those now, using Cat-a-lot. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:36, 29 September 2024 (UTC)
- Which "R from" templates, if any, will be placed on the new redirects? I'm seeing one on AD 812 and a different one on 79 AD. Is there a systematic way of using them? – Jonesey95 (talk) 10:17, 17 September 2024 (UTC)
- I'd say that {{R from year}} is what should be used here, as it states "This is a redirect from a formatted year title to the related year article."
- And "AD" isn't a disambiguator in the parenthesis sense. Rusty 🐈 14:02, 17 September 2024 (UTC)
- Separate question: I am seeing both AD 128 and 152 as year pages, but the task description says that the bot will operate only on "AD pages", or, in a separate specification, "AD number" pages. How will the bot task know the correct target for its redirects? Is there a systematic numbering method of these pages? – Jonesey95 (talk) 10:22, 17 September 2024 (UTC)
- @Jonesey95:
- I didn't know about the existence of 152 previously, thanks for bringing that to my attention.
- I believe it will not be as straightforward to find all the year pages only beginning with a number; assuming the year pages are correctly categorized, the bot should check for a subcat of Category:Years on the page, and if so, assume it is a year page.
- If the "AD number" page exists and it is not a redirect, we assume that page is the year page for that year. Otherwise, it is assumed that the year page is just the number. Rusty 🐈 13:58, 17 September 2024 (UTC)
- Is there a consensus for this task? If there is a lack of standardisation in the naming of pages, that should be taken care of first, followed by a consensus on which redirects to have (I note that 2/3 of each example given in the BOTREQ thread were redlinks). Primefac (talk) 11:42, 20 October 2024 (UTC)
- On hold pending answers to the above queries. Primefac (talk) 12:50, 10 November 2024 (UTC)
- @Primefac: Sorry for the late reply. I think that the page name standardization doesn't matter as long as we have the redirects to each page consistent (MOS:VAR?)
- I believe that the examples given in the BOTREQ are redlinks because they are what the requesting user wants to be created by the bot. Rusty 🐈 00:38, 12 November 2024 (UTC)
- Redirects may be cheap, but we're talking 2000 of them, at least. I would like to see a consensus that this is desired, rather than just something Andy thinks is necessary. Primefac (talk) 21:46, 17 November 2024 (UTC)
{{Operator assistance needed|D}}
Any movement on this? Primefac (talk) 14:02, 1 January 2025 (UTC)
- @Primefac: no, I don't think so Rusty 🐈 04:19, 2 January 2025 (UTC)
Operator: Ponor (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 23:36, Saturday, July 20, 2024 (UTC)
Function overview: WP:MASSCREATE the remaining 3200 out of 6700 Croatian naseljes (settlements), which are the third level division of the country. The bot can create stubs like Dubrava, Split-Dalmatia County. Update the existing articles with ZIP codes (new official source), and historical population data graphs (where possible, under full supervision).
Automatic, Supervised, or Manual: Automatic creation. Supervised or manual updates.
Programming language(s): Python @ PAWS
Source code available: possible
Links to relevant discussions (where appropriate): Wikipedia talk:WikiProject Croatia/Archive 5#Croatian settlement articles mass creation
Edit period(s): one time run
Estimated number of pages affected: 3200 (+2500 or so)
Namespace(s): Articles
Exclusion compliant (Yes/No): irrelevant
Function details:
- Create some 3200 articles from the list Wikipedia:WikiProject Croatia/To Do List/Missing settlements, link them with Wikidata.
- Update existing Croatian settlement articles with ZIP codes and historical population data graphs, where possible (time permitting). The same job has been completed on hrwiki for all 6700 settlements.
Discussion
Needs wider discussion. The discussion you link was between you and only one other person. Please seek consensus at WP:Village pump (proposals) or a similar venue where we can be sure many people have seen it. Anomie⚔ 01:11, 21 July 2024 (UTC)
- Hm... Thanks, though I'm not sure I wanna go through anything like Wikipedia:Village pump (proposals)/Archive 207 again. I thought the 2000+ existing Croatian settlement stubs would prove the current consensus. A few hundred stubs created by the two users I mentioned in the linked WikiProject:Croatia discussion definitely contain less information than my bot can add, and were all kept. Let me ping @Joy to see if he can help push this through... somewhere. I don't have time for endless opinionated discussions myself, I'm afraid. Ponor (talk) 01:49, 21 July 2024 (UTC)
- You linked to a failed proposal to tighten the notability guideline, but it has little relevance to this proposal, because if all these new articles look like Dubrava, Split-Dalmatia County there's no way anyone's going to propose their deletion. These are not gas pumps masquerading as villages.
- Even if we wanted to upmerge that information into list articles, those historical population graphs would just seem to be unwieldy, it would be pointless shoehorning.
- @Primefac had previously allowed Wikipedia:Bots/Requests for approval/PonoRoboT and I don't remember seeing any problems, it seemed to be a nice, straightforward improvement to the encyclopedia.
- @Anomie, is there a real difference here? IOW why would this change to these 3k settlement articles need more discussion when the previous change to analogous 3k settlement articles didn't?
- The fact that one group of 3k Croatian places has articles while another group of them doesn't is a historical fluke. If we need a discussion on making this situation consistent, the previously existing group needs to be discussed as well. But we already know they all qualify under WP:5P1 etc, so I don't quite see why this would be frowned upon according to standard processes (WP:BOLD, WP:NOTBURO). --Joy (talk) 06:32, 21 July 2024 (UTC)
- I was only reminding everyone what our notability discussions end up looking like. Since there were recent MEAT creations of these stubs, I'm thinking creating them by hand would be a waste of anyone's precious time if I can do the same thing, or better, by my bot.
- I see that, for example, Serbia has all of their 3rd level two-sentence geo stubs created since 2010 or so. That says WP:EDITCON is there, no? Ponor (talk) 10:01, 21 July 2024 (UTC)
- The real difference between Wikipedia:Bots/Requests for approval/PonoRoboT and this is that this is about creations, and the community has for many years now wanted to vet bot creations of articles before they happen. And that's regardless of whether the proposed creations would pass WP:N (part of it is that the community wants independent evaluation of that before the creations happen) or whether other articles on the topic or related topics already exist. If you want to refer to policy, WP:MASSCREATION says (emphasis added)
It is also strongly encouraged (and may be required by BAG) that community input be solicited at WP:Village pump (proposals) and the talk pages of any relevant WikiProjects.
Unless you can get another BAGger to proceed without, this is me requiring. Anomie⚔ 11:17, 21 July 2024 (UTC)
- I concur with Anomie on this one; we have an editor who, while other factors were involved, wanted to do a similar thing for 300 pages and is restricted to only making one per month. Creating ten times as many one-paragraph sub-stubs in a fraction of the time will need consensus. Yes, they aren't just gas stations, but other than "Town X has a population Y" there appears to be no more information readily available, so I would like to see a reasonable consensus to create these (and not just two editors agreeing it would be a good idea). As Anomie said, your first approved task was updating information, not creating new pages. Primefac (talk) 12:07, 21 July 2024 (UTC)
- It's a clerical difference, it's just because some editor mass-created tens of thousands of these two decades ago and happened to miss half of the Croatian settlements. But okay, let's go through the motions, I'll file a proposal when I have the time (and if no one beats me to it). --Joy (talk) 18:57, 21 July 2024 (UTC)
- @Ponor the best way to substantiate this proposal would be to make sure we show some external references on e.g. the Bureau of Statistics doing proper work (documenting existing human habitation as opposed to something weird), and illustrate the body of scholarly and other work out there on the topic of these settlements. If you have something to this effect already, please share. --Joy (talk) 19:05, 21 July 2024 (UTC)
- Sure, I'll help with everything I know, but can't take the burden of convincing everyone on the project alone atm. I'd start with the first four refs in Dubrava, Split-Dalmatia County: there are laws, one agency takes care of the division(s), the bureau uses their data. Every town and municipality have their web page listing these settlements. Most settlements have a church, school, etc. Let's continue at WikiProject Croatia, huh? Ponor (talk) 19:20, 21 July 2024 (UTC)
- I'd oppose the bot creating any more pages until Module:Croatian population data graph is translated into English and more pages become uneditable by editors unfamiliar with the language. Gonnym (talk) 11:00, 4 August 2024 (UTC)
- On hold. Please feel free to disable the {{BotOnHold}} template when consensus about the appropriateness of this task has been demonstrated. Primefac (talk) 23:48, 4 August 2024 (UTC)
- A user has requested the attention of the operator. Once the operator has seen this message and replied, please deactivate this tag. (user notified) Any movement on this? Primefac (talk) 14:03, 1 January 2025 (UTC)
Bots in a trial period
Operator: DreamRimmer (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 19:18, Thursday, January 2, 2025 (UTC)
Automatic, Supervised, or Manual: automatic
Programming language(s): Python
Source code available: User:GalaxyBot/Task2.py
Function overview: This bot will help with admin activity updates.
Links to relevant discussions (where appropriate): Wikipedia:Bots/Noticeboard#Bot not running - creator inactive
Edit period(s): Daily
Estimated number of pages affected: 4
Exclusion compliant (Yes/No): No
Already has a bot flag (Yes/No): Yes
Function details: This bot will help with admin activity updates. The affected pages are Wikipedia:List of administrators/Active, Wikipedia:List of administrators/Semi-active, Wikipedia:List of administrators/Inactive and Wikipedia:List of administrators. This task was previously handled by Rick Bot and has the same functionality. I wrote the code from scratch using the Pywikibot framework.
Discussion
Approved for trial (7 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. As per usual, with a code rewrite always best to check things are working as intended. Primefac (talk) 09:58, 3 January 2025 (UTC)
- Just a note that RickBot started working again today. Extraordinary Writ (talk) 22:24, 5 January 2025 (UTC)
Operator: Bunnypranav (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 12:59, Saturday, November 23, 2024 (UTC)
Automatic, Supervised, or Manual: automatic
Programming language(s): AutoWikiBrowser
Source code available: AWB
Function overview: Remove userpages from content categories listed at Wikipedia:Database reports/Polluted categories
Links to relevant discussions (where appropriate):
Edit period(s): Manual runs every week or so
Estimated number of pages affected: ~300 every run
Exclusion compliant (Yes/No): No
Already has a bot flag (Yes/No): Yes
Function details: Removes user pages from content categories, like birth year, etc. from the listed database report at Wikipedia:Database reports/Polluted categories. I may do my own DB updates in my user space with the open-source code. Of course, it cannot be exclusion compliant as the cat should not be in that space.
Discussion
- Since you say it is automatic, how would you determine whether a category is meant for the mainspace or userspace? If I add Category:WikiProject tagging bots to Special:RandomPage, will it then be removed from the userspace pages with the category? DatGuyTalkContribs 01:55, 28 November 2024 (UTC)
- @DatGuy While making the lists based on the database report or SQL query, I will only add content categories to it, like birth/death cats for eg. Then, userspace pages from those cats will be removed. Adding cats to the list, i.e. is it a content cat or not, will be done manually to avoid such errors. ~/Bunnypranav:<ping> 10:20, 28 November 2024 (UTC)
- I have gone through some random content categories in this database report and haven't found any user pages in it. Did you mean that 300 pages would be fixable per week when the database report is updated, or is it up to you when you are running the bot? – DreamRimmer (talk) 15:55, 29 November 2024 (UTC)
- The database report runs infrequently compared to the stuff to fix in it. I shall run the SQL query in quarry:query/87967 before every run. 300 is a bit of a high-end number; I am ready to fix as many or as few pages as are available when I do a run. ~/Bunnypranav:<ping> 16:03, 29 November 2024 (UTC)
- Eg. of how a list is made. This petscan query shows user and user talk pages in some content cats from the quarry query, I shall go through these and disable them, i.e. [[:Category:XYZ]]. ~/Bunnypranav:<ping> 03:30, 30 November 2024 (UTC)
- Almost all of the pages on this list are subpages, and the DannyS712 bot also disables categories on userspace pages. While it mainly fixes pages that have draft or AfC templates, I am sure it helps with a fair number of pages each month that are part of this report. So, I think a weekly run would work well, as there should be about 40-60 pages for your bot to fix each week. I could be wrong, though. – DreamRimmer (talk) 11:05, 30 November 2024 (UTC)
- I am not fully convinced this is necessary; the Petscan provided shows ten sandboxes, which should have the cats commented out (or placed in {{draft categories}}) but not removed outright. Are there consistently categories that are used on main user pages or user talk pages? Primefac (talk) 16:58, 9 December 2024 (UTC)
- @DatGuy, any category that should contain user pages, should be using {{Polluted category}}. The bot can then skip these. Gonnym (talk) 14:19, 27 December 2024 (UTC)
- While I will not do that automatically, I shall keep it in mind. ~/Bunnypranav:<ping> 14:37, 27 December 2024 (UTC)
- See this petscan, it shows 107 results. Unless I missed removing a non-content cat, I think this qualifies for a bot task. I generally see many year of birth cats in userpages, and other cats in sandboxes. Clarification: I shall disable all occurrences of such content cats using [[:Category:XXXXX]]. ~/Bunnypranav:<ping> 12:27, 11 December 2024 (UTC)
- The petscan query you provided is empty. – DreamRimmer (talk) 12:39, 11 December 2024 (UTC)
- Oops, does this work? https://petscan.wmcloud.org/?psid=30328826 ~/Bunnypranav:<ping> 12:45, 11 December 2024 (UTC)
- It does, yes. Still seeing a lot of sandboxes (which can be filtered out once the list is made I suppose) but I think that filtering out talk pages would also be necessary; for example Special:Diff/1263439382 would have been an inappropriate removal as it needed to be piped. The more I see this the more issues I'm seeing with context whittling down the values to something that probably should just be a manual AWB task run occasionally (that is not me declining it, and I'm still open to the idea, but I'm seeing fewer and fewer reasons to make this an automated bot process). Primefac (talk) 17:43, 16 December 2024 (UTC)
- In the first Petscan query there were 36 results in one week and in the next week there were 107 pages, so there are enough pages to address with a weekly run. There is no consistency in the categories, as new users use different categories depending on the topic of their creations, and the majority of these categories are used in user sandboxes or user subpages. Some of these pages are fixed by the DannyS712 bot, but since it only handles pages with AfC templates, most sandbox pages without AfC templates remain unfixed. However, most of the work is manual (querying the database to get content categories, finding userspace pages using those categories, and then fixing them via AWB) and the number of pages is relatively low, so I agree with you that it can be done with a main account using AWB. – DreamRimmer (talk) 18:02, 16 December 2024 (UTC)
- @Primefac I plan to change the cats just like the diff you linked above. I didn't get why sandboxes need to be filtered out, for sake of simplicity and reduction in errors, I plan to [[:Category out the links for all types of pages, talk or not.
- Also, if talk pages are edited as well, doing it with a bot prevents the New Message notification, at this point I think that a bot task is warranted for the userpages as well. ~/Bunnypranav:<ping> 15:14, 17 December 2024 (UTC)
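The disabling step described above, turning `[[Category:XYZ]]` into `[[:Category:XYZ]]` for a manually curated list of content categories, can be sketched roughly as follows. This is only an illustration, not the operator's code: the function name `disable_categories` is hypothetical, and dropping the sort key (the part after `|`) is an assumption made here so that the sort key does not become the visible link label.

```python
import re

# Matches [[Category:Name]] and [[Category:Name|sort key]], but not the
# already-disabled form [[:Category:Name]].
CATEGORY_LINK = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)


def _normalise(name):
    """Underscores become spaces; only the first letter is case-insensitive."""
    name = name.replace("_", " ").strip()
    return name[:1].upper() + name[1:]


def disable_categories(wikitext, content_categories):
    """Disable the listed content categories and leave all other ones alone."""
    wanted = {_normalise(c) for c in content_categories}

    def fix(match):
        name = _normalise(match.group(1))
        if name not in wanted:
            return match.group(0)  # not on the curated list: keep unchanged
        # Assumption: the sort key is dropped rather than kept as a label.
        return f"[[:Category:{name}]]"

    return CATEGORY_LINK.sub(fix, wikitext)
```

Because the pattern does not match links that already start with a colon, running the function twice on the same text makes no further changes.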
- {{BAG assistance needed}} ~/Bunnypranav:<ping> 14:37, 27 December 2024 (UTC)
- Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Primefac (talk) 13:56, 1 January 2025 (UTC)
Operator: C1MM (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 04:42, Thursday, December 12, 2024 (UTC)
Automatic, Supervised, or Manual: automatic
Programming language(s): Python
Source code available:
Function overview: Adds or modifies election templates in 'Results' section of Indian Lok Sabha/Assembly constituencies
Links to relevant discussions (where appropriate):
Edit period(s): One time run on a category of pages.
Estimated number of pages affected: ~4000
Exclusion compliant (Yes/No): No
Already has a bot flag (Yes/No): Yes
Function details: This bot modifies the results sections of Indian Lok Sabha/assembly constituencies. It takes the 'Results' section and, for the two most recent elections with published data, adds all candidates with vote percentages above 0.9% and removes candidates with vote percentages under 0.9%. It does not edit candidate data (i.e. hyperlinks are preserved) except to correctly capitalise candidate names written in all upper case. The 'change' parameter is only filled if no election took place between the two elections for which data is available.
Candidates are sorted by vote totals and the subsections are sorted by election years in descending order (most recent election comes first). If a 'Results' section does not exist, one is created in front of the 'References' section and the results from the two most recent elections are placed there.
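The filtering and ordering rules in the function details can be sketched as below. This is a minimal illustration under stated assumptions, not the bot's source: the `Candidate` class, `fix_name` and `select_candidates` are hypothetical names, `str.title()` is a simplification of "correct capitalisation", and the treatment of a candidate at exactly 0.9% is not specified in the request (here such a candidate is dropped).

```python
from dataclasses import dataclass

THRESHOLD_PERCENT = 0.9  # cut-off stated in the function details


@dataclass
class Candidate:
    name: str
    votes: int


def fix_name(name):
    """Re-capitalise only names written entirely in upper case."""
    return name.title() if name.isupper() else name


def select_candidates(candidates, total_votes):
    """Keep candidates above the threshold, sorted by votes (highest first)."""
    kept = [
        Candidate(fix_name(c.name), c.votes)
        for c in candidates
        if 100 * c.votes / total_votes > THRESHOLD_PERCENT
    ]
    return sorted(kept, key=lambda c: c.votes, reverse=True)
```

Special cases raised in the discussion, such as always keeping "None of the above" or keeping rows whose candidate has an article, would be extra conditions on top of this filter.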
Discussion
- What is the source of the election data being used by the bot? – DreamRimmer (talk) 14:27, 13 December 2024 (UTC)
- The ECI website: eci.gov.in (it is geoblocked for users outside India). It has reports for every Parliamentary and Assembly election in India since Independence, and the ones after 2015 are in PDF form and those after 2019 have csv files. C1MM (talk) 01:19, 14 December 2024 (UTC)
- Thanks for the response. I have used data from eci.gov.in for my bot task, and it is a good source. I tried searching for results data for recent elections, but I only found PDFs and XLSX files; I did not find any CSV files containing the full candidate results data. Perhaps I missed some steps. I will try to provide some feedback after reviewing the edits if this goes for a trial. – DreamRimmer (talk) 09:56, 14 December 2024 (UTC)
- I convert XLSX to CSV (it is second-nature to do it now for me so I forget to tell sometimes). C1MM (talk) 17:07, 14 December 2024 (UTC)
- Thanks for the response. Is the source code for this publicly available somewhere if I want to take a look at it? – DreamRimmer (talk) 09:44, 16 December 2024 (UTC)
- There might be good reasons to keep a candidate's data even if they get less than 0.9% of the vote. I'd say that if the candidate's name is wikilinked (not a red link), then the bot should not remove that row.
- Also, consider "None of the above" as a special case, and always add/keep that data when it is available. -MPGuy2824 (talk) 10:07, 14 December 2024 (UTC)
- Good point. I forgot to mention I did treat 'None of the above' as a special case, don't cut it and in fact add it in where it is not in the template. I also add 'majority' and 'turnout' and when there is no election in between the two most recent elections for which I have data I also add a 'gain' or 'hold' template.
- How do you check if a page exists and is not a disambiguation? I say this because a lot of politicians in India share names with other people (for example, Anirudh Singh), so I would rather only keep people below 0.9% of the vote if they are linked to an article which is actually about them. C1MM (talk) 13:47, 14 December 2024 (UTC)
- If you are using Pywikibot, you can use the page.BasePage class methods, such as the exists() method, to check whether a wikilinked page exists on the wiki. It returns a boolean value True if the page exists on the wiki. To check whether this page is a disambiguation page, you can use the isDisambig() method, which returns True if the page is a disambiguation page, and False otherwise. – DreamRimmer (talk) 17:07, 16 December 2024 (UTC)
- I've made the suggested changes and the pages produced look good (I haven't saved obviously). I unfortunately don't know how to run Python pywikibot source code on Wikimedia in a way that accesses files on my local machine, is this possible? C1MM (talk) 05:56, 23 December 2024 (UTC)
- Are you saying that you have stored CSV files on your local machine and want to extract the result data from them? Let me know if you need any help with the source code. – DreamRimmer (talk) 11:04, 23 December 2024 (UTC)
- I figured this problem out. I would now think a BAG member should probably come and give their opinion. C1MM (talk) 16:56, 30 December 2024 (UTC)
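The existence and disambiguation check suggested above could look roughly like this. It is a sketch, not the bot's code: `is_real_article` is a hypothetical helper, and following redirects before the disambiguation test is an added assumption. The function accepts any object that behaves like a Pywikibot page, so it can be tried without a live wiki connection. It cannot tell whether an existing article is actually about the candidate, which still needs human judgement.

```python
def is_real_article(page):
    """Return True if `page` exists and is not a disambiguation page.

    `page` is expected to behave like a pywikibot.Page, i.e. to provide
    exists(), isRedirectPage(), getRedirectTarget() and isDisambig().
    """
    if not page.exists():
        return False
    if page.isRedirectPage():
        # Assumption: judge a redirect by the page it points to.
        page = page.getRedirectTarget()
        if not page.exists():
            return False
    return not page.isDisambig()


# With Pywikibot installed and configured, usage would be along these lines:
#   import pywikibot
#   site = pywikibot.Site("en", "wikipedia")
#   page = pywikibot.Page(site, "Anirudh Singh")
#   keep_row = is_real_article(page)
```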
{{BAG assistance needed}} — Preceding unsigned comment added by C1MM (talk • contribs) 16:55, 30 December 2024 (UTC)
- Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Please do not mark these edits as minor. Primefac (talk) 13:34, 1 January 2025 (UTC)
- [1] Here are the contributions asked for. I think there are a couple of issues: I haven't actually added a source technically for these contributions and also for a certain party (Peace Party) I added the disambiguation links by mistake. I also accidentally made the replacement headings 3rd level instead of 2nd level, which I have now fixed. C1MM (talk) 03:47, 2 January 2025 (UTC)
- Please also go back and manually fix these 50 edits for the problems that you've noticed. Additionally, if you could also use the {{formatnum}} template for all the votes figures it would be great. The other parts of the edits look good. -MPGuy2824 (talk) 05:05, 2 January 2025 (UTC)
Operator: Usernamekiran (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 23:46, Thursday, December 26, 2024 (UTC)
Function overview: Remove instances of {{FFDC}} which reference files that are no longer being discussed at FfD, similar to FastilyBot 17, with new code.
Automatic, Supervised, or Manual: Automatic
Programming language(s): pywikibot
Source code available: will publish at github repo
Links to relevant discussions (where appropriate): special:permalink/1265443290#Replacing FastilyBot
Edit period(s): weekly
Estimated number of pages affected: around 2-3 per week
Namespace(s): needs to be discussed
Exclusion compliant (Yes/No): currently yes, but that can be updated.
Function details: created new code for simplicity/posterity. When listing files at FfD, editors will sometimes add {{FFDC}} to the articles that link the listed files. When FfD discussions are closed, it's common for the closing editor to miss and/or forget to remove {{FFDC}}. This proposed bot task will simply find instances of {{FFDC}} that reference closed/non-existent FfD discussions and remove them. —usernamekiran (talk) 23:46, 26 December 2024 (UTC)
Discussion
- @Explicit: what namespace should I restrict the bot to? currently, the template has been transcluded on a few article talk pages, user talk, and drafts. —usernamekiran (talk) 23:46, 26 December 2024 (UTC)
- Approved for trial (25 edits or 30 days, whichever happens first). Please provide a link to the relevant contributions and/or diffs when the trial is complete. While waiting for an answer to the above, please limit the bot to the Article namespace. Primefac (talk) 13:30, 1 January 2025 (UTC)
Operator: CFA (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 19:59, Tuesday, December 31, 2024 (UTC)
Function overview: Removes articles from Category:Wikipedia requested images of biota if they have an image
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python
Source code available: No, but it can be if necessary
Links to relevant discussions (where appropriate): Uncontroversial
Edit period(s): Weekly
Estimated number of pages affected: ~3-6k first run; likely no more than 10/week afterwards
Namespace(s): Talk
Exclusion compliant (Yes/No): Yes
Function details:
- Removes talk pages of articles with images from Category:Wikipedia requested images of biota and its subcategories
- Removes {{image requested}} or the "needs-image" banner parameter if an extant image is present in the taxobox
Discussion
Approved for trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Primefac (talk) 13:24, 1 January 2025 (UTC)
Operator: CanonNi (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 12:49, Tuesday, December 17, 2024 (UTC)
Function overview: A replacement for tasks 1, 2, 7, 8, 9, and 15 of FastilyBot (talk · contribs), whose operator has retired
Automatic, Supervised, or Manual: Automatic
Programming language(s): Rust (mwbot-rs crate)
Source code available: Will push to GitLab later
Links to relevant discussions (where appropriate): See this
Edit period(s): Daily
Estimated number of pages affected: A couple dozen every day
Namespace(s): File:
Exclusion compliant (Yes/No): Yes
Function details: Near identical functionality of the previous bot, just rewritten in a different (and better) language. All are modifying templates on File description pages, so I'm merging this into one task.
Task details (copied from WP:BOTREQ)
Discussion
- Thanks for stepping up to help! For easier review and tracking, could you please list all these tasks and their descriptions in the "Function details" section? You can use a wikitable for this. – DreamRimmer (talk) 13:51, 17 December 2024 (UTC)
- Added above. CanonNi (talk • contribs) 13:58, 17 December 2024 (UTC)
- Approved for trial (120 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Please perform 20 edits for each task. Primefac (talk) 12:35, 23 December 2024 (UTC)
Operator: Ow0cast (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 01:50, Thursday, November 14, 2024 (UTC)
Function overview: Replace external links to wikipedia with wikilinks
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python (pywikibot)
Source code available: No
Links to relevant discussions (where appropriate): I do not believe that discussions are required for this action, as this is the entire point of wikilinks
Edit period(s): Continuous
Estimated number of pages affected: 25/day at the highest.
Namespace(s): Mainspace
Exclusion compliant (Yes/No): Yes
Function details: The goal of this task is to replace "external" links to wikipedia pages with the proper wikilinks.
- Watch Special:RecentChanges for edits containing "https://[*].wikipedia.org/wiki/[*]", then replace the external link with a wikilink.
Example: "Python https://en.wikipedia.org/wiki/Python_(programming_language) is cool" → "Python is cool."
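One way to sketch the replacement is a regular expression over labelled external links. This is only an illustration under several assumptions: `to_wikilinks` is a hypothetical name, only bracketed links with a label are handled (bare URLs, URLs with a query string or a section anchor, and mobile or `www` hosts are left untouched), and links to other language editions are written with an interlanguage prefix such as `:de:`.

```python
import re
from urllib.parse import unquote

# [https://<lang>.wikipedia.org/wiki/<Title> <label>]
LINK = re.compile(
    r"\[https?://(?!www\.)([a-z\-]+)\.wikipedia\.org/wiki/"
    r"([^\s\[\]?#]+)\s+([^\[\]]+)\]"
)


def to_wikilinks(wikitext, home_lang="en"):
    """Replace labelled external Wikipedia links with wikilinks."""

    def repl(match):
        lang, raw_title, label = match.group(1), match.group(2), match.group(3).strip()
        # Decode percent-escapes and turn underscores back into spaces.
        title = unquote(raw_title).replace("_", " ")
        prefix = "" if lang == home_lang else f":{lang}:"
        # A leading colon keeps category/file links from categorising the
        # page or embedding the file.
        if not prefix and title.split(":", 1)[0].strip().lower() in {"category", "file", "image"}:
            prefix = ":"
        target = prefix + title
        return f"[[{target}]]" if target == label else f"[[{target}|{label}]]"

    return LINK.sub(repl, wikitext)
```

For instance, `[https://en.wikipedia.org/wiki/Python_(programming_language) Python] is cool` becomes `[[Python (programming language)|Python]] is cool`.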
Discussion
- Many articles contain external Wikipedia links to templates, policy pages, and discussion, usually added as comments. On average, about 20 of these kinds of links are added per day, with 95% of them as commented-out text. Replacing these links would only lead to cosmetic changes, which should be avoided per WP:COSMETICBOT, as commented-out text is not visible to readers. For the remaining 5%, using a bot isn't a good idea, as these minor edits can be easily handled by a human editor. Currently, over 62,000 pages have these types of commented-out links, and none need replacement based on your criteria. This suggests that these types of external links are fixed regularly. – DreamRimmer (talk) 14:32, 14 November 2024 (UTC)
- I do not want to pile on, but for "en.wikipedia" this task won't be very useful, as DreamRimmer explained above. However, in case the link is to some other Wikipedia, e.g. "de.wikipedia" (German) or "es.wikipedia" (Spanish), this task would be useful, but again, the occurrences are extremely low, and they are generally handled/repaired by editors as soon as they are inserted. Also, the bot operator is new (not extended confirmed), so this might get denied under WP:BOTNOTNOW. But this is actually a sound request, my first BRFA was outright silly. —usernamekiran (talk) 15:45, 14 November 2024 (UTC)
- DreamRimmer, I think CheckWiki #90 would probably be more useful for finding the number of pages affected by this; at the moment it's sitting at ~4500 pages so this probably does require some sort of intervention. Primefac (talk) 20:19, 17 November 2024 (UTC)
- @Ow0cast: Given there are around 4500 pages, this is indeed a useful task. Would you be able to program it to handle the subdomains? Similar to the example I provided above? —usernamekiran (talk) 20:25, 1 December 2024 (UTC)
- @Usernamekiran: Yes, I should be able to make it handle subdomains. /etc/owuh $ (💬 | she/her) 20:29, 1 December 2024 (UTC)
- Approved for trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Primefac (talk) 20:39, 1 December 2024 (UTC)
- Should I run it on Special:RecentChanges or the pages listed at checkwiki? /etc/owuh $ (💬 | she/her) 22:26, 1 December 2024 (UTC)
- @Ow0cast: pages listed at checkwiki would be the optimal choice. —usernamekiran (talk) 00:18, 5 December 2024 (UTC)
Operator: BilledMammal (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 08:51, Monday, July 8, 2024 (UTC)
Function overview: Adjusts templates based on provided JSON configuration files. This request is limited to Template:Cite news and Template:Cite web, and is primarily intended to correct issues where the work or publisher is linked to the wrong target.
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python
Source code available: Not currently
Links to relevant discussions (where appropriate):
Edit period(s): Initially, irregular one-off runs, with each held after significant expansions to the configuration file. Once most citations have been fixed I will open a request for continuous operation in a maintenance mode.
Estimated number of pages affected: Varies considerably based on configuration. This configuration, which applies to ten sources, will edit approximately 23,000. This configuration, which goes beyond correcting wrong links and also always inserts the correct link when one is missing, will edit approximately 450,000.
Namespace(s): Mainspace
Exclusion compliant (Yes/No): Yes
Function details: Adjusts parameters of Cite news and Cite web based on a configuration file. This configuration can be applied to any parameter, but the intent of this request is to apply it to the following:
- work
- publisher
- publication-place
- department
- agency
- url-access
It determines which change to apply based on current parameter field values. Any field or combination of fields can be used, but the intent of this request is to use the "url" field.
Adjustments can be specified as "always", "onEdit", or "never". When "always" is specified, if a change is identified as being desired for a parameter the article will be edited to implement it. When "onEdit" is specified, desirable changes are only implemented if we are already editing the page. This reduces the impact on watchlists by skipping articles that don't have high priority issues.
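The interaction between "always", "onEdit" and "never" can be illustrated with a short sketch. The function name `changes_to_apply` and the shape of its input (a mapping from parameter name to a new value and a mode) are assumptions made for this example and are not taken from the bot's source.

```python
def changes_to_apply(desired):
    """Decide which parameter changes to make on one page.

    `desired` maps a parameter name to a (new_value, mode) pair, where mode
    is "always", "onEdit" or "never". Returns a dict of changes to apply;
    an empty dict means the page is skipped.
    """
    always = {p: v for p, (v, mode) in desired.items() if mode == "always"}
    if not always:
        # Nothing with high priority: do not touch the page at all, so
        # watchlists are not disturbed by minor fixes.
        return {}
    # The page is being edited anyway, so the "onEdit" fixes are included.
    on_edit = {p: v for p, (v, mode) in desired.items() if mode == "onEdit"}
    return {**always, **on_edit}
```

With this rule, a page whose only problems are "onEdit" ones is left alone until some "always" change gives a reason to edit it.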
Configuration schema
|
---|
{ "$schema": "http://json-schema.org/draft-07/schema#", "type": "array", "items": { "type": "object", "properties": { "includes": { "type": "array", "items": { "type": "object", "properties": { "key": { "type": "string", "example": "url" }, "value": { "type": "array", "items": { "type": "string", "example": ["www.bbc.com", "www.bbc.co.uk"] } } } }, "description": "Lists conditions required to be met for this configuration to be applied to the template." }, "excludes": { "type": "array", "items": { "type": "object", "properties": { "key": { "type": "string", "example": "url" }, "value": { "type": "array", "items": { "type": "string", "example": ["www.bbc.com/sport", "www.bbc.co.uk/sport"] } } }, "description": "Lists conditions that must not be met for this configuration to be applied to the template." } }, "patternProperties": { "^[a-zA-Z0-9-]+$": { "oneOf": [ { "type": "array", "description": "Named for the parameter, and defines what will be done with it. Used when there are multiple possible configurations for the parameter.", "items": { "$ref": "#/definitions/parameter-config" } }, { "type": "object", "description": "Named for the parameter, and defines what will be done with it. Used when there is only one possible configuration for the parameter.", "$ref": "#/definitions/parameter-config" } ] } } }, "definitions": { "parameter-config": { "$schema": "http://json-schema.org/draft-07/schema#", "$id": "parameter-config", "type": "object", "properties": { "includes": { "type": "array", "items": { "type": "object", "properties": { "key": { "type": "string", "example": ["url"] }, "value": { "type": "array", "items": { "type": "string", "example": ["www.bbc.com", "www.bbc.co.uk"] } } } }, "description": "Lists conditions required to be met for this configuration to be applied to the parameter." 
}, "excludes": { "type": "array", "items": { "type": "object", "properties": { "key": { "type": "string", "example": ["url"] }, "value": { "type": "array", "items": { "type": "string", "example": ["www.bbc.com/sport", "www.bbc.co.uk/sport"] } } } }, "description": "Lists conditions that must not be met for this configuration to be applied to the parameter." }, "link": { "type": "string", "description": "Where the parameter should normally link to", "example": ["ABC News (Australia)"] }, "wikitext": { "type": "string", "description": "What the wikitext of the parameter should normally be", "example": ["ABC News"] }, "blacklist": { "type": "array", "items": { "type": "string", "example": ["ABC News (United States)", "ABC News"] }, "description": "Links that will always be removed" }, "greylist": { "type": "array", "items": { "type": "string", "example": ["Australian Broadcasting Corporation"] }, "description": "Links that will only be removed when already editing the page. Used to prevent edits that would only fix issues we consider minor." }, "whitelist": { "type": "array", "items": { "type": "string", "example": ["The Sunday Telegraph (Sydney)"] }, "description": "Links that will never be removed. Used when we believe editors may have deliberately provided a non-standard value that we wish to respect." }, "fixRedirects": { "type": "string", "enum": ["always", "onEdit", "never"], "default": "onEdit", "description": "Specifies when we will replace redirects to the provided link with the provided link." }, "fixDisplay": { "type": "string", "enum": ["always", "onEdit", "never"], "default": "onEdit", "description": "Specifies when we will replace the currently displayed text with the displayed version of the provided Wikitext." }, "fixOthers": { "type": "string", "enum": ["always", "onEdit", "never"], "default": "always", "description": "Specifies when we will replace links to pages that are neither redirects to the link nor on the provided lists." 
}, "fixMissing": { "type": "string", "enum": ["always", "onEdit", "never"], "default": "onEdit", "description": "Specifies when we will add a missing value" }, "priority": { "type": "integer", "default": 5, "description": "Provides a tie-breaker when multiple array objects meet the inclusion or exclusion criteria. Higher value is preferred. It is unspecified which configuration object is used when both have the same priority level.", "minimum": 1 } } } } } } |
What it does to these parameters depends on the configuration. For example:
"work": { "link": "ABC News (Australia)", "wikitext": "ABC News", "blacklist": ["ABC News (United States)", "ABC News"], "greylist": ["Australian Broadcasting Corporation"], "fixMissing": "onEdit", "fixRedirects": "onEdit", "fixOthers": "always" }
Will ensure that the "work" parameter only links to ABC News (Australia). When it finds a link to a source other than ABC News (Australia), its redirects, or Australian Broadcasting Corporation, it will edit the article to correct that link.
When it encounters a redirect, or Australian Broadcasting Corporation, or a missing value, it will only correct those if it is already editing the article.
If we change "fixMissing" to "always", it would edit the article to insert the value.
"agency": { "includes": [ { "key": "agency", "value": ["Reuters"] } ], "remove": "onEdit" }
Will remove the agency field when it contains "Reuters". This is used to correct when the field has been incorrectly filled with the name of the publisher or work.
"department": [ { "includes": [ { "key": "url", "value": ["reuters.com/world/"] } ], "wikitext": "World" }, { "includes": [ { "key": "url", "value": ["reuters.com/world/reuters-next/"] } ], "wikitext": "Reuters Next", "priority": 6 }, { "includes": [ { "key": "url", "value": ["reuters.com/business/"] } ], "wikitext": "Business" } ]
This fills in the department field based on the source url. If none of these are met then the department field is not filled.
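How one entry is chosen from a list such as the "department" example can be sketched as follows. The default priority of 5 and "higher value is preferred" come from the schema above; the substring test used for matching and the names `matches` and `pick_config` are assumptions for illustration only.

```python
DEFAULT_PRIORITY = 5  # default given in the configuration schema


def matches(conditions, params):
    """True if every condition is met by the template's parameters.

    Assumption: a condition is met when any of its listed values occurs as
    a substring of the named parameter.
    """
    return all(
        any(value in params.get(cond["key"], "") for value in cond["value"])
        for cond in conditions
    )


def pick_config(configs, params):
    """Return the applicable entry with the highest priority, or None."""
    candidates = [
        c for c in configs
        if matches(c.get("includes", []), params)
        and not any(matches([cond], params) for cond in c.get("excludes", []))
    ]
    if not candidates:
        return None  # no entry applies, so the parameter is not filled
    return max(candidates, key=lambda c: c.get("priority", DEFAULT_PRIORITY))
```

For a URL under `reuters.com/world/reuters-next/`, both the "World" and "Reuters Next" entries match, and the latter is chosen because its priority of 6 exceeds the default.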
The current configuration file will do the following:
- ABC News (Australia)
- Set "work" to ABC News
- Set "publisher" to Australian Broadcasting Corporation
- Remove "publication-place"
- Remove "agency" when incorrect
- The Daily Telegraph
- Set "work" to The Daily Telegraph
- Set "publisher" to Telegraph Media Group
- Set "publication-place" to "London, United Kingdom"
- Set "department" when it can be determined
- Reuters
- Set "work" to Reuters
- Set "publisher" to Thomson Reuters
- Set "publication-place" to "London, United Kingdom"
- Set "department" when it can be determined
- Remove "agency" when incorrect
- The New York Times
- Set "work" to The New York Times
- Set "url-access" to "limited"
- Remove "publisher"
- Remove "publication-place"
- BBC News
- Set "work" to BBC News
- Remove "publisher"
- Remove "publication-place"
- Set "department" when it can be determined
- BBC Sport
- Set "work" to BBC Sport
- Remove "publisher"
- Remove "publication-place"
- The Guardian
- Set "work" to The Guardian
- Remove "publisher"
- Set "publication-place" to "London, United Kingdom"
- Set "department" when it can be determined
- The Guardian (Swan Hill)
- Set "work" to The Guardian
- The Daily Telegraph (Sydney)
- Set "work" to The Daily Telegraph
- Set "publisher" to News Corp Australia
- Remove "publication-place"
- ABC News (United States)
- Set "work" to ABC News
- Set "publisher" to American Broadcasting Company
- Remove "publication-place"
The intent is that the community will expand the configuration file, increasing the number of citations it can fix.
Example of template replacements
When editing a template, to improve readability it will also apply a consistent format and naming convention. This involves converting parameters away from aliases to their primary values, and placing the parameters into the following order:
Order
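The normalisation described above (aliases converted to primary names, then parameters placed in a fixed order) might be sketched as follows. The alias table and the order list are hypothetical and for illustration only. The order loosely follows the "author(s) data, date, title, url" sequence mentioned in the discussion, and the bot's real tables may differ.

```python
# Hypothetical alias table and canonical order, for illustration only.
ALIASES = {"newspaper": "work", "accessdate": "access-date"}
ORDER = ["last", "first", "date", "title", "url", "work", "publisher", "access-date"]


def normalise(params: dict) -> dict:
    """Rename aliases to their primary names and reorder the parameters."""
    renamed = {}
    for name, value in params.items():
        primary = ALIASES.get(name, name)
        if primary in renamed and name != primary:
            continue  # a value given under the primary name takes precedence
        renamed[primary] = value
    known = [name for name in ORDER if name in renamed]
    # Parameters missing from ORDER keep their relative order at the end.
    unknown = [name for name in renamed if name not in ORDER]
    return {name: renamed[name] for name in known + unknown}


def render(template: str, params: dict) -> str:
    """Render the parameters in horizontal format."""
    body = " |".join(f"{name}={value}" for name, value in params.items())
    return "{{" + template + " |" + body + "}}"
```

Placing unknown parameters at the end is one possible design choice. It mirrors the remark in the discussion that restoring the original order would leave new fields dumped at the end.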
Discussion
- I'd prefer if this bot (and every bot) stopped short of reordering template parameters. Doing a full reorganisation on any template edited will make it much more difficult to tell what changes have been made when reviewing diffs. Folly Mox (talk) 09:23, 16 July 2024 (UTC)
- We can trust our bots that much, I'd say. And it shouldn't be much of a problem if you compare the diffs in visual diff mode, try here. In my experience, it's much easier for a bot (program) to reassemble a template in some predefined order. Having data in the order of final appearance does help with readability (BilledMammal: that'd be url?, author(s) data, date, title…).Ponor (talk) 06:48, 18 July 2024 (UTC)
- @Ponor: Currently, author(s) data, date, title, url - the full order can be seen in the final collapsed box. However, that is easy to change.
- It wouldn't be difficult to put it back in the original order (although it would result in new fields being dumped at the end), but personally I believe it is better to reorganize it, as while it makes it harder for editors using the non-visual viewer to identify the changes, it is easier for editors to parse the template going forward. BilledMammal (talk) 23:05, 18 July 2024 (UTC)
- I support putting the params in some canonical order, my only question is which one it should be. VisualEditor (TemplateData), IAbot, maybe even reFill, probably use the same one ("Full parameter set in horizontal format" from {{Cite web}}?), which is what I'd use as well. Up to you, though. Ponor (talk) 14:05, 19 July 2024 (UTC)
- I started with the full parameter set from Template:Cite news, but quickly found that "full parameter set" doesn’t actually mean "full parameter set".
- I see the two templates differ in where to put the URL; I think Cite news' method is better, as the URL is difficult to read so better to put that at the end. BilledMammal (talk) 14:11, 19 July 2024 (UTC)
- The order is probably from the order used by TemplateData as that is where ProveIt takes its order from. Gonnym (talk) 11:07, 4 August 2024 (UTC)
- I think consensus would need to be established for this at other venues. The part of the proposal regarding adding links where none exist has the potential to conflict with WP:WHENINROME. voorts (talk/contributions) 21:18, 16 August 2024 (UTC)
- That aspect doesn’t need to be enabled; exactly how this functions depends entirely on the configuration file.
- However, that aspect isn’t covered by WP:WHENINROME, which says
If all or most of the citations in an article consist of bare URLs, or otherwise fail to provide needed bibliographic data – such as the name of the source, the title of the article or web page consulted, the author (if known), the publication date (if known), and the page numbers (where relevant) – then that would not count as a "consistent citation style" and can be changed freely to insert such data.
- Emphasis mine. BilledMammal (talk) 18:24, 17 August 2024 (UTC)
- I was referring to the part of WHENINROME that states:
Editors should not attempt to change an article's established citation style, merely on the grounds of personal preference or to make it match other articles, without first seeking consensus for the change.
For example, if an article has proper citation formatting, but none of the publication titles are wikilinked, or only the first instance is, running this bot to add wikilinks to each publication parameter would run afoul of WHENINROME. In any event, given that we have a reasonable disagreement on this point, I think consensus would be needed to implement that part of the bot. voorts (talk/contributions) 18:28, 17 August 2024 (UTC)
- Ah, I misunderstood. The configuration file can be updated to not replace unlinked, but otherwise correct, source names, if such behaviour is desirable.
- With that said, I’m not sure whether the decision to Wikilink or not falls under WP:WHENINROME, as such a decision appears to go beyond referencing style and instead fall under MOS:LINK, specifically MOS:UL, which says
Proper names that are likely to be unfamiliar to readers
- which would include virtually all source names, as few have worldwide recognition - should be linked. BilledMammal (talk) 18:48, 17 August 2024 (UTC)
- I broadly construe WHENINROME to avoid referencing conflicts since the MOS is a contentious topic. voorts (talk/contributions) 19:04, 17 August 2024 (UTC)
- I don't necessarily have an issue with the rest of what the bot would do. Also, I would like to see a process for establishing consensus for what parameters should be included for each ref. For example, why doesn't The Guardian (Swan Hill) have a publication-place parameter? Why use publisher instead of publication-place for The Daily Telegraph(s)? These are things that might need to be worked out. voorts (talk/contributions) 18:31, 17 August 2024 (UTC)
- The omissions for Swan Hill Guardian are primarily because I wanted an example of a minimally completed source, to demonstrate the tool's range.
- (The Daily Telegraph actually uses both)
- The process I was planning was standard WP:CONSENSUS, with the requirement that consensus be obtained prior to changing the primary configuration file. Or do you think something more involved is needed? BilledMammal (talk) 18:48, 17 August 2024 (UTC)
I think even a rough consensus would be fine for the contents of the configuration file. I'd like to see it advertised at Wikipedia talk:Citing sources, Wikipedia talk:Manual of Style, and potentially other venues before this bot goes active. voorts (talk/contributions) 18:58, 17 August 2024 (UTC)
- Good idea; I think WP:VPR would also be a good location, although I’ll wait till BAG gives preliminary approval before taking it to the wider community. BilledMammal (talk) 19:01, 17 August 2024 (UTC)
- Apologies, have been meaning to tag this with Needs wider discussion. but have had other things to deal with; I would like to see a rough consensus that this is a desired bot task. Primefac (talk) 12:02, 22 August 2024 (UTC)
- I've opened a discussion at the Village Pump. BilledMammal (talk) 09:03, 25 August 2024 (UTC) Link expanded to include section, no other change made. Primefac (talk) 20:09, 25 August 2024 (UTC) discussion archived, link updated. Primefac (talk) 11:43, 20 October 2024 (UTC)
- Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. I do note a very weak consensus at the Pump that this will be a reasonable bot trial. For the sake of getting more eyes on this, please do not mark these edits as minor. Primefac (talk) 11:46, 20 October 2024 (UTC)
- A user has requested the attention of the operator. Once the operator has seen this message and replied, please deactivate this tag. (user notified) Any progress on this? Primefac (talk) 12:43, 23 December 2024 (UTC)
Operator: Usernamekiran (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 13:04, Saturday, September 7, 2024 (UTC)
Function overview: go through Category:Articles missing coordinates with coordinates on Wikidata, add the coordinates from wikidata to enwiki article, and remove the {{coord missing}} template
Automatic, Supervised, or Manual: automatic
Programming language(s): pywikibot
Source code available: not yet, soon on github, pywikibot script
Links to relevant discussions (where appropriate): requested at WP:BOTREQ, permalink
Edit period(s): once a month
Estimated number of pages affected: around 19,000 in the first run, then as they come in
Namespace(s): mainspace
Exclusion compliant (Yes/No): no
Function details: the bot goes through Category:Articles missing coordinates with coordinates on Wikidata. For each article, it reads the coordinates from the Wikidata item (QID) of that article and adds them to the infobox with the | coordinates = parameter. If no infobox is present, it adds them at the bottom of the article, in the appropriate location, using the {{coord}} template. If the coordinates are added successfully, the bot then removes the {{coord missing}} template. —usernamekiran (talk) 13:04, 7 September 2024 (UTC)
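A minimal sketch of the wikitext step described above, assuming the coordinates have already been fetched from Wikidata. It is not the operator's script: the function name and regular expressions are hypothetical, the infobox is located with a simple pattern rather than a wikitext parser, and "bottom of the article" is simplified to appending at the end of the text.

```python
import re

# Matches {{coord missing}} / {{coord_missing|Region}}, plus a trailing newline.
COORD_MISSING = re.compile(
    r"\{\{\s*coord[ _]missing\s*(\|[^{}]*)?\}\}\n?", re.IGNORECASE
)
# Matches the opening of an infobox up to the end of its name.
INFOBOX_START = re.compile(r"\{\{\s*Infobox[^|\n}]*", re.IGNORECASE)
# Existing {{coord|...}} template or a non-empty |coordinates= parameter.
HAS_COORD = re.compile(
    r"\{\{\s*coord\s*\||\|[ \t]*coordinates[ \t]*=[ \t]*[^\s|}]", re.IGNORECASE
)


def add_coordinates(wikitext: str, lat: float, lon: float) -> str:
    """Insert coordinates and drop {{coord missing}}; skip pages that already
    carry coordinates, to avoid duplicate entries."""
    if HAS_COORD.search(wikitext):
        return wikitext
    infobox = INFOBOX_START.search(wikitext)
    if infobox:
        coord = f"{{{{coord|{lat}|{lon}|display=inline,title}}}}"
        new_text = (
            wikitext[: infobox.end()]
            + f"\n| coordinates = {coord}"
            + wikitext[infobox.end():]
        )
    else:
        coord = f"{{{{coord|{lat}|{lon}|display=title}}}}"
        new_text = wikitext.rstrip("\n") + f"\n{coord}\n"
    return COORD_MISSING.sub("", new_text)
```

Running the function on its own output leaves the text unchanged, which guards against duplicate coordinate entries.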
Discussion
- this seems to be borderline cosmetic bot, if that's the case would it be possible to run the bot with lower edit rates like one edit per minute, or 1edit/5minutes? —usernamekiran (talk) 16:19, 8 September 2024 (UTC)
- I think this would not fall under cosmetic bot because of the third point in WP:COSMETICBOT: [.. Changes that are typically considered substantive affect something visible to readers and consumers of Wikipedia, such as...]
the "administration of the encyclopedia", such as the maintenance of hidden categories used to track maintenance backlogs (e.g. changing {{citation needed}} to {{citation needed|date=September 2016}})
—usernamekiran (talk) 15:46, 17 September 2024 (UTC)
- {{BAG assistance needed}} —usernamekiran (talk) 08:41, 5 October 2024 (UTC)
- Approved for trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. — The Earwig (talk) 15:28, 5 October 2024 (UTC)
- @The Earwig: Hello. I made around 10 edits, but there were two technical issues and one other issue. First, I accidentally ran an older version of the script, which had a problem with duplicate entries for coordinates; this has already been fixed. The second issue was the format of the coordinates. The third, non-technical issue is that this task currently does not have a consensus at Wikipedia talk:WikiProject Geographical coordinates. But I think this was discussed in the past, and not recently. First I will fix the formatting issue, and then initiate a discussion at Wikipedia talk:WikiProject Geographical coordinates. Till then, I think this BRFA should be put on hold. —usernamekiran (talk) 18:18, 6 October 2024 (UTC)
- I asked this same question on a duplicate bot request: How are you ensuring that the information you will be publishing satisfies WP:V? — xaosflux Talk 00:52, 26 November 2024 (UTC)
- @Xaosflux: Hello. I apologise, I missed your comment somehow; I saw it a couple of minutes ago. I haven't thought about it. Up until now, I was depending on Wikidata information, with the assumption that Wikidata information will be correct. If that's not enough, (either way) I will try to think of something to verify the coordinates. —usernamekiran (talk) 17:02, 1 January 2025 (UTC)
- @Usernamekiran it is possible that information on wikidata is referenced and accurate, but it is certainly possible that it is not. I don't think there is a presumption that a claim existing on wikidata equates to it being reliable and verifiable. — xaosflux Talk 18:01, 1 January 2025 (UTC)
Operator: Sohom Datta (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 20:03, Tuesday, July 16, 2024 (UTC)
Automatic, Supervised, or Manual: automatic
Programming language(s): Python
Source code available: https://github.com/sohomdatta1/npp-notifier-bot
Function overview: Notify previous reviewers of an article at AFD about the nomination
Links to relevant discussions (where appropriate): Initial discussions on NPP Discord + previous BRFAs surrounding AFD notifications
Edit period(s): Continuous
Estimated number of pages affected: 1-2 per day (guesstimate?)
Exclusion compliant (Yes/No): Yes
Already has a bot flag (Yes/No): No, on enwiki, yes, for other wikis on other tasks
Function details:
- Use the eventstream API to listen for new AfDs
- Extract page name by parsing the AfD wikitext
- Identify previous reviewers of page at AFD
- Notify said reviewers on their talk pages with a customised version of the existing AfD notification message
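The second step above (extracting the page name from the AfD wikitext) could look roughly like the sketch below. It assumes that a nomination page contains either a {{la|...}} link template or a heading of the form ===[[:Page name]]===. Both patterns are assumptions about the page layout rather than something taken from the bot's repository, and the function name is hypothetical.

```python
import re
from typing import Optional

# Assumed layouts: "{{la|Page}}" or "{{la|1=Page}}", and "===[[:Page]]===".
LA_TEMPLATE = re.compile(
    r"\{\{\s*la\s*\|\s*(?:1\s*=\s*)?([^|{}]+?)\s*\}\}", re.IGNORECASE
)
HEADING_LINK = re.compile(r"^=+\s*\[\[:?([^\]|]+)\]\]\s*=+\s*$", re.MULTILINE)


def extract_page_name(afd_wikitext: str) -> Optional[str]:
    """Return the nominated page title, or None if neither pattern matches."""
    for pattern in (LA_TEMPLATE, HEADING_LINK):
        match = pattern.search(afd_wikitext)
        if match:
            # Underscores and spaces are interchangeable in page titles.
            return match.group(1).strip().replace("_", " ")
    return None
```

Returning None rather than guessing lets the caller skip a nomination it cannot parse instead of notifying reviewers about the wrong page.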
Discussion
- I like this concept in general. I tried to make a user script that does this (User:Novem Linguae/Scripts/WatchlistAFD.js#L-89--L-105), but it doesn't work (I probably need to rewrite it to use MutationObserver). Would this bot be automatic for everyone, or opt in? Opt in may be better and easier to move forward in a BRFA. If not opt in, may want to start a poll somewhere to make sure there's some support for "on by default". –Novem Linguae (talk) 07:58, 17 July 2024 (UTC)
- I think it would be better to be on by default with the option for reviewers to disable. (t · c) buidhe 14:28, 17 July 2024 (UTC)
- Ah yes. "Opt out" might be a good way to describe this third option. –Novem Linguae (talk) 22:13, 17 July 2024 (UTC)
- Support - seems like a good idea. I've reviewed several articles that I've tagged for notability or other concerns, only to just happen to notice them by chance a few days later get AfD'ed by someone else. A bot seems like a good idea, and I can't see a downside. BastunĖġáḍβáś₮ŭŃ! 16:31, 17 July 2024 (UTC)
- This is the sort of thing that would be really good for some people (e.g., new/infrequent reviewers) and really frustrating for others (e.g., people who have reviewed tens of thousands of articles). If it does end up being opt-out, each message needs to have very clear instructions on how to opt out. It would also be worth thinking about a time limit: most people aren't going to get any value out of hearing about an article they reviewed a decade ago. Maybe a year or two would be a good threshold. Extraordinary Writ (talk) 18:48, 17 July 2024 (UTC)
- The PREVIOUS_NOTIF regex should also account for notifications left via page curation tool ("Deletion discussion about xxx"). The notification also needs to be skipped if the previous reviewer themself is nominating. In addition, I would suggest adding a delay of at least several minutes instead of acting immediately on AfD creation – as it can lead to race conditions where Twinkle/PageTriage and this bot simultaneously deliver notifications to the same user. – SD0001 (talk) 13:41, 19 July 2024 (UTC)
- {{Operator assistance needed}} Thoughts on the above comments/suggestions? Also, do you have the notice ready to go or is that still in the works? If it's ready, please link to it (or copy it here if it's hard-coded elsewhere). Primefac (talk) 12:48, 21 July 2024 (UTC)
- @Primefac I've implemented a few of the suggestions. I've reworked the code to exclude pages containing {{User:SodiumBot/NoNPPDelivery}}, which should serve as an opt-out mechanism :) I've also reworked the code to include SD0001's suggestion of adding a significant delay, by making the bot wait at least an hour, and modified the regex to account for the messages sent by PageTriage.
- With regard to Extraordinary Writ's suggestions, I have restricted the lookup to the last 3 years as well and created a draft, User:SodiumBot/ReviewerAfdNotification, which has instructions on how to opt out. Sohom (talk) 16:02, 21 July 2024 (UTC)
- Thanks, I'll leave this open for a few days for comment before going to trial. Primefac (talk) 16:07, 21 July 2024 (UTC)
- Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Please make sure this BRFA is linked in the edit summary. Primefac (talk) 23:50, 4 August 2024 (UTC)
- A user has requested the attention of the operator. Once the operator has seen this message and replied, please deactivate this tag. (user notified) Any progress on this? Primefac (talk) 12:44, 23 December 2024 (UTC)
- I had left the bot running, it hasn't picked up a single article by the looks of the logs. I'm gonna try to do some debugging on what the issue is/was. Sohom (talk) 14:22, 26 December 2024 (UTC)
- I ran across Wikipedia:Bots/Requests for approval/SDZeroBot 6 today, which is a very similar task, and uses an "opt out" strategy. This suggests that the community may be OK with having AFD notifications be on by default for a bot task like this. –Novem Linguae (talk) 07:10, 8 August 2024 (UTC)
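The safeguards raised in this discussion (a delay of at least an hour, skipping the nominator, a three-year lookback, the opt-out tag, and skipping users who were already notified) can be combined into one filter, sketched below. The data structures and function name are hypothetical. Checking the opt-out tag on the reviewer's talk page is an assumption, and the `already_notified` set stands in for the bot's regex check of earlier notifications.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

OPT_OUT_TAG = "{{User:SodiumBot/NoNPPDelivery}}"
MAX_REVIEW_AGE = timedelta(days=3 * 365)  # "last 3 years"
MIN_DELAY = timedelta(hours=1)            # "wait at least an hour"


@dataclass
class Review:
    reviewer: str
    timestamp: datetime


def reviewers_to_notify(reviews, nominator, afd_created, now,
                        talk_pages, already_notified):
    """Return the reviewers who should still receive a notification.

    talk_pages maps user name to talk page wikitext (hypothetical input);
    already_notified is the set of users another tool has already messaged.
    """
    if now - afd_created < MIN_DELAY:
        return []  # too early: give Twinkle/PageTriage time to notify first
    chosen = []
    for review in reviews:
        name = review.reviewer
        if name == nominator or name in chosen or name in already_notified:
            continue
        if afd_created - review.timestamp > MAX_REVIEW_AGE:
            continue  # review is older than the lookback window
        if OPT_OUT_TAG in talk_pages.get(name, ""):
            continue  # user has opted out
        chosen.append(name)
    return chosen
```

Evaluating the delay first means the filter can simply be re-run later, by which point any notification left by another tool will show up in `already_notified`.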
Operator: Hawkeye7 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 01:57, Wednesday, March 22, 2023 (UTC)
Function overview: Mark unassessed stub articles as stubs
Automatic, Supervised, or Manual: Automatic
Source code available: Not yet
Links to relevant discussions (where appropriate): Wikipedia:Bot requests/Archive 84#Stub assessments with ORES
Edit period(s): daily
Estimated number of pages affected: < 100 per day
Namespace(s): Talk
Exclusion compliant (Yes/No): Yes
Function details: Go through Category:Unassessed articles (only deals with articles already tagged as belonging to a project). If an unassessed article is rated as a stub by ORES, tag the article as a stub. Example
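The decision rule in the function details can be sketched as a small pure function. This is an illustration rather than the operator's implementation. The input is assumed to be the list of class values found in the talk page's project banners, with an empty string meaning unassessed, plus the predicted quality class. Doing nothing whenever any rating already exists is a conservative choice made for this sketch.

```python
from typing import List, Optional


def class_to_apply(existing_classes: List[str], predicted: str) -> Optional[str]:
    """Return "Stub" if the article should be tagged as a stub, else None.

    existing_classes: class values from the project banners ("" = unassessed).
    predicted: quality class predicted by the model, e.g. "Stub" or "Start".
    """
    if any(value.strip() for value in existing_classes):
        return None  # an assessment already exists: leave the page alone
    if predicted.strip().lower() != "stub":
        return None  # only stub predictions are acted on
    return "Stub"
```

For example, `class_to_apply(["", ""], "Stub")` returns `"Stub"`, while `class_to_apply(["Start", ""], "Stub")` returns `None`.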
Discussion
- Note: This bot appears to have edited since this BRFA was filed. Bots may not edit outside their own or their operator's userspace unless approved or approved for trial. AnomieBOT⚡ 00:10, 28 March 2023 (UTC)
- ^. Also, may potentially be a CONTEXTBOT; see Wikipedia:Stub:
There is no set size at which an article stops being a stub.
EpicPupper (talk) 23:04, 30 March 2023 (UTC)
- The Bot run only affects unassessed articles rated as stubs by mw:ORES. The ORES ratings for stubs are very reliable (some false negatives – which wouldn't be touched under this proposal – but no false positives). Hawkeye7 (discuss) 00:03, 31 March 2023 (UTC)
- Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Sounds reasonable as ORES is usually good for assessing stub articles as such. – SD0001 (talk) 11:41, 1 April 2023 (UTC)
- Bot run with 50 edits. No problems reported. Diffs: [2]. Hawkeye7 (discuss) 00:42, 18 April 2023 (UTC)
- Comment: Some behavior I found interesting is that the bot is reverting start-class classifications already assigned by a human editor, and overriding those with stub-class. [3] and [4] EggRoll97 (talk) 03:28, 18 May 2023 (UTC)
- This should not be happening. Frostly (talk) 03:58, 18 May 2023 (UTC)
- The question is: what should be happening? The articles were flagged because some of the projects were not assessed. Should the Bot (1) assess the unassessed ones as stubs and ignore the assessed ones or (2) align the unassessed ones with the ones that are assessed? Hawkeye7 (discuss) 04:21, 18 May 2023 (UTC)
- Per recent consensus assessments should be for an entire article, not per WikiProject. The bot should amend the template to use the article wide code. If several projects have different assessments for an article it should leave it alone. Frostly (talk) 05:03, 18 May 2023 (UTC)
- @Hawkeye7: Courtesy ping, I've manually fixed up the edits where the bot replaced an assessment by a human editor. 6 edits total to be fixed out of 52 total edits. EggRoll97 (talk) 07:16, 18 May 2023 (UTC)
- Bot has been amended. Hawkeye7 (discuss) 04:51, 19 May 2023 (UTC)
- {{BAG assistance needed}} This has been waiting for over 2 months since the end of the trial, and over 4 months since the creation of the request. Given the concerns expressed that the bot operator has since fixed, an extended trial may be a good idea here. EggRoll97 (talk) 05:19, 8 August 2023 (UTC)
- My apologies. I have been very busy. Should I run the new Bot again with a few more edits? Hawkeye7 (discuss) 18:57, 15 October 2023 (UTC)
- Approved for extended trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. – SD0001 (talk) 19:10, 15 October 2023 (UTC)
- Thank you. Hawkeye7 (discuss) 22:33, 15 October 2023 (UTC)
{{Operator assistance needed}} It has been more than a month since the last post, is this trial still ongoing? Primefac (talk) 13:26, 31 December 2023 (UTC)
- Yes. I wrote the bot using my C# API, and due to a necessary upgrade here, my dotnet environment got ahead of the one on the grid. I could neither build locally and run on the grid nor build on the grid. (I could have run the trial locally but would not have been able to deploy to production.) There is currently a push to move bots onto Kubernetes containers, but there was no dotnet build pack available. The heroes on Toolforge have now provided one for dotnet, and I will be testing it when I return from vacation next week. If all goes well I will finally be able to deploy the bot and run the trial at last. See phab:T311466 for details. Hawkeye7 (discuss) 22:54, 31 December 2023 (UTC)
- A user has requested the attention of the operator. Once the operator has seen this message and replied, please deactivate this tag. (user notified) Primefac (talk) 20:10, 18 February 2024 (UTC)
- Work was done in January and some changes made on Toolforge. Will resume the trial run when I get a chance. Hawkeye7 (discuss) 23:33, 18 February 2024 (UTC)
- @Hawkeye7: any update on this? If it's a bit of a medium-term item and not actively worked on, are you happy to mark this BRFA as withdrawn for the time being? ProcrastinatingReader (talk) 10:54, 29 September 2024 (UTC)
- My technical problems have been resolved. A new trial run will be conducted this week. Hawkeye7 (discuss) 19:26, 29 September 2024 (UTC)
- [5][6][7][8][9][10] etc Hawkeye7 (discuss) 03:15, 2 October 2024 (UTC)
- One important change: Liftwing is being used instead of ORES now. Hawkeye7 (discuss) 03:25, 2 October 2024 (UTC)
- A user has requested the attention of the operator. Once the operator has seen this message and replied, please deactivate this tag. (user notified) Courtesy ping to make sure this is still proceeding. Primefac (talk) 12:46, 23 December 2024 (UTC)
- The trial run was successful. The problems with the new Packbuild environment were resolved. I can run some more trials but would prefer permission to put the job into production. Hawkeye7 (discuss) 20:12, 23 December 2024 (UTC)
Bots that have completed the trial period
Operator: Usernamekiran (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 15:59, Tuesday, September 24, 2024 (UTC)
Function overview: update Accelerated Mobile Pages/AMP links to normal links
Automatic, Supervised, or Manual: automatic
Programming language(s): pywikibot
Source code available: github repo
Links to relevant discussions (where appropriate): requested at BOTREQ around 1.5 years ago: Wikipedia:Bot requests/Archive 84#Accelerated Mobile Pages link eradicator needed, and village pump: Wikipedia:Village_pump_(technical)/Archive_202#Accelerated_Mobile_Pages_links, recently requested at BOTREQ a few days ago: special:permalink/1247505851.
Edit period(s): either weekly or monthly
Requested edit rate: 1 edit per 50 seconds.
Estimated number of pages affected: around 8,000 for now, but the estimation is high, around thousands of pages. later as they come in.
Namespace(s): main/article
Exclusion compliant (Yes/No): yes (for now), if required, that can be changed later
Function details: using extensive regex patterns, the bot looks for AMP links. It avoids false matches on the general string "amp" in domains, e.g. yamaha-amplifiers.com. After finding and updating a link, the bot checks whether the new/updated link is working: if it gets a 200 response code, the bot updates the link in the article. Otherwise, the bot adds the article title and the (non-updated) link to a log file (this can be saved to a log page as well). —usernamekiran (talk) 15:59, 24 September 2024 (UTC)
- addendum: I should have included this already, but I forgot. In the BOTREQ, and other discussions, an open source "amputatorbot" github was discussed. This bot has a lot of irrelevant functions for wikipedia. The only relevant feature is to remove AMP links. But for this, the amputatorbot utilises a database for storing a list of ~200k AMP links, and another list of canonical links of these AMP links. Maintaining this database, and the never-ending list of links for Wikipedia is not feasible. The program I created utilises comprehensive regex patterns. It also handles the archived links gracefully. —usernamekiran (talk) 17:50, 28 September 2024 (UTC)
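As an illustration of the rule-based approach, the sketch below rewrites three common AMP URL shapes (an amp. subdomain, an /amp/ path segment, and an amp query parameter) using the standard library's URL parser instead of raw regexes. It is not the bot's actual pattern set, and the function name is hypothetical. The follow-up check that the rewritten URL answers with a 200 response is left out because it needs network access.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def deamp(url: str) -> str:
    """Return the URL with common AMP markers removed (illustrative rules only)."""
    parts = urlsplit(url)

    # Rule 1: amp.domain.tld -> www.domain.tld
    host = parts.netloc
    if host.lower().startswith("amp."):
        host = "www." + host[4:]

    # Rule 2: drop path segments that are exactly "amp". Names such as
    # "amp-guide" or hosts such as yamaha-amplifiers.com are not touched.
    segments = [seg for seg in parts.path.split("/") if seg.lower() != "amp"]
    path = "/".join(segments)

    # Rule 3: drop the "amp" query parameter and keep the others.
    # Note: urlencode may normalise the encoding of the remaining parameters.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    query = urlencode([(key, value) for key, value in pairs if key.lower() != "amp"])

    return urlunsplit((parts.scheme, host, path, query, parts.fragment))
```

Matching whole path segments and whole parameter names, rather than the substring "amp", is what keeps domains like yamaha-amplifiers.com from being rewritten.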
Discussion
Maintaining this database, and the never-ending list of links for Wikipedia is not feasible
But you wouldn't have to maintain this database right, if the authors of that GitHub repo already do, or have made it available?
The program I created utilises comprehensive regex patterns. It also handles the archived links gracefully.
Would you mind providing those patterns here for evaluation?
Aside from that, happy for this to go to trial. @GreenC: any comments on this, and does this fall into the scope of your bot? ProcrastinatingReader (talk) 10:40, 29 September 2024 (UTC)
- I will soon post the link to github, and reasoning for avoiding the database method. —usernamekiran (talk) 13:21, 29 September 2024 (UTC)
- @ProcrastinatingReader: Hi. Yes, the author at GitHub has made it available, but I think the database has not been updated in 4 years, I am not sure though. I also could not find the database itself. If we utilise the database, the bot would not process the "unknown" AMP links that are not in the database. In that case we will have to use the method that we are currently using. Also, the general process would be more resource intensive I think, i.e. "1: search for the AMP links in articles 2: if an AMP link is found in the article, look for it in the database 3: find the corresponding canonical link 4: replace in the article". Even if the database is being maintained, we will have to keep it updated, and we will have to add our new findings to the database. I think this simpler approach would be better. KiranBOT at github, AmputatorBot readme at github. Kindly let me know what you think. —usernamekiran (talk) 19:50, 29 September 2024 (UTC)
- PS: I notified GreenC on their talkpage. Also, in the script, I added more comments than I usually do, and the script was created over the days/in parts, so the commenting might feel a little odd. —usernamekiran (talk) 19:54, 29 September 2024 (UTC)
- This sounds like a good idea. I ran into AMP URLs with the Times of India domains, and made many conversions. It seemed site specific. Like m.timesofindia.com became timesofindia.indiatimes.com and "(amp_articleshow|amp_videoshow|amp_etphotostory|amp_ottmoviereview|amp_etc..)" had the "amp_" part removed. Anyway, I'll watchlist this page and feel free to ping me for input once test edits are made. -- GreenC 23:42, 29 September 2024 (UTC)
- @ProcrastinatingReader: if there are no further questions/doubts, is a trial in order? I am sure about one issue related to https, but I think we should discuss it after the trial. —usernamekiran (talk) 15:16, 2 October 2024 (UTC)
- {{BAG assistance needed}} —usernamekiran (talk) 08:42, 5 October 2024 (UTC)
- Reviewing the code, you're applying a set of rules (amp.domain.tld → www.domain.tld, /amp/ → /, ?amp=true&... → ?...) and then checking the URL responds with 200 to a HEAD request. That seems good for most cases, but there are going to be some instances where the site uses an unusual AMP URL mapping and responds with 200 to all/most/some invalid requests, especially considering we are following redirects (but not updating the URL to the followed redirect). It also will not work for the example edit from the BOTREQ? I don't know how to solve this issue without some way of checking the redirected page actually contains some of the content we are looking for, or access to a database of checked mappings. Maybe the frequency of mistakes will be low enough for this to not be a problem? I am unsure. Any thoughts from others? — The Earwig (talk) 16:10, 5 October 2024 (UTC)
- These are good points. Soft-404s and soft-redirects are the biggest (but not only) issues with URL changes. With soft-404s, you first process the links without committing changes, log redirect URLs, see which redirect URLs are repeating, manually inspect them to see if they are a soft-404; then process the links again with a trap added to treat the identified soft-404s as a dead link. Not all repeating redirects are soft-404s but many will be, you have to do the discovery work. For soft-redirects, it requires foreknowledge based on manual inspections, like the Times of India example above. URL changes are difficult for these reasons, and others mentioned in WP:LINKROT#Glossary. -- GreenC 17:53, 5 October 2024 (UTC)
- @GreenC any suggestions on logic/algorithm? I will try to implement them. I dont mind further work to perfect the program —usernamekiran (talk) 20:32, 6 October 2024 (UTC)
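For readers following along, the three generic rules The Earwig summarises above can be sketched as below. This is a minimal illustration using only the standard library, not KiranBOT's actual code; the function name `strip_amp` is hypothetical, and a cleaned URL still has to be verified separately because of the soft-404 problem just described.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_amp(url: str) -> str:
    """Apply the three generic rules: amp. host, /amp/ segment, amp= query."""
    parts = urlsplit(url)

    # Rule 1: amp.domain.tld -> www.domain.tld
    host = parts.netloc
    if host.lower().startswith("amp."):
        host = "www." + host[len("amp."):]

    # Rule 2: drop path segments that are exactly "amp"
    segments = [s for s in parts.path.split("/") if s.lower() != "amp"]
    path = "/".join(segments)

    # Rule 3: drop the "amp" query parameter, keep the others in order.
    # Note: urlencode may normalise the escaping of the remaining values.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != "amp"
    ]
    query = urlencode(kept)

    return urlunsplit((parts.scheme, host, path, query, parts.fragment))
```

Because the rules are purely textual, they will produce a plausible-looking but wrong URL for any site with an unusual AMP mapping, which is the concern raised in this thread.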
- @GreenC, ProcrastinatingReader, and The Earwig: I updated the code, and tested it on a few types of links (that I could think of), as listed in this version of the page, diff of the fix. Kindly suggest me more types/formats of AMP links, and any suggestions/updates to the code. —usernamekiran (talk) 02:49, 31 October 2024 (UTC)
- I see you log failed cases. If not already, also log successes (old url -> new url), in case you need to reverse some later (new url -> old url).
- One way to avoid the problems noted by The Earwig is simply skip URLs with 301/302 headers. Most soft-404s are redirect URLs. With the exception of http->https, those are OK. You can always go back and revisit them later. One way to do this is log the URL "sink" (the final URL in the redirect chain), then script the logs to see if any sinks are repeating.
- -- GreenC 04:19, 31 October 2024 (UTC)
- okay, I will try that. —usernamekiran (talk) 17:41, 11 November 2024 (UTC)
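GreenC's suggestion of logging each redirect "sink" and then scripting the logs for repeats might look something like the sketch below. It assumes the bot has already recorded (original URL, sink URL) pairs during a dry run; the function names are hypothetical, and a repeated sink is only a candidate soft-404 that still needs manual inspection.

```python
from collections import Counter


def is_https_upgrade(original: str, sink: str) -> bool:
    """True when the redirect only changes http:// to https://."""
    return (
        original.startswith("http://")
        and sink == "https://" + original[len("http://"):]
    )


def repeated_sinks(pairs, min_count=2):
    """Return (sink, count) for sinks reached from several distinct URLs.

    pairs: iterable of (original_url, sink_url) tuples from the dry-run log.
    """
    counts = Counter()
    for original, sink in set(pairs):  # ignore duplicate log entries
        if sink == original or is_https_upgrade(original, sink):
            continue  # no redirect, or a harmless http -> https upgrade
        counts[sink] += 1
    return [(sink, n) for sink, n in counts.most_common() if n >= min_count]
```

Sinks that many different URLs collapse into (a site's front page, for example) are the ones worth checking by hand before adding a trap for them.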
- {{BAG assistance needed}} I made a few changes/additions to the program. In summary: 1) if original URL works, but cleaned url fails, saving is skipped 2) if AMP url, and cleaned url both return non-200, cleaned url is saved 3) if the cleaned url results in a redirect (301, or 302), and the final url after redirection differs from the original AMP url's final destination, saving is skipped. All the events are logged accordingly. I think we are good for a 50 edit trial. courtesy ping @GreenC: —usernamekiran (talk) 05:51, 16 November 2024 (UTC)
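The three rules listed above can be written down as a small decision function, sketched here under stated assumptions: the `Probe` type and `should_save` name are hypothetical, "works" is taken to mean a 200 status on the first response, and the network requests themselves are left out. It is an illustration of the described logic, not the bot's code.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    """Result of requesting one URL (hypothetical structure)."""
    status: int      # status code of the first response
    final_url: str   # URL reached after following any redirects


def should_save(amp: Probe, cleaned: Probe) -> bool:
    """Decide whether the cleaned URL may replace the AMP URL."""
    # Rule 3: a redirecting cleaned URL is accepted only if it ends up
    # at the same destination as the original AMP URL.
    if cleaned.status in (301, 302):
        return cleaned.final_url == amp.final_url
    if cleaned.status == 200:
        return True
    # Cleaned URL failed. Rule 1: skip if the original still works.
    if amp.status == 200:
        return False
    # Rule 2: both fail, so the cleaned URL is saved anyway.
    return True
```

Rule 2 is the contentious branch: it rewrites a URL that cannot be verified either way.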
- Just noting this has been seen; I'll give GreenC a few days to respond but otherwise I'll chuck this to trial if there is no response (or a favourable response). Primefac (talk) 20:39, 17 November 2024 (UTC)
- Hi. Given the large number of pages affected, and in case there is some issue — then potential of breaking references —essentially breaking WP:V, I don't want to take any chances. So no hurries on my side either. —usernamekiran (talk) 13:23, 20 November 2024 (UTC)
- I think it would be easier to error check if you were able to make 10 edits on live pages. If those go well, then 10 more. And so on, going through the results manually verifying, and refactoring edge cases as they arise, before moving to the next set. We should know by 50 edits total how things are. In that sense, if you were approved for 50 trial edits. User:Primefac. -- GreenC 17:11, 20 November 2024 (UTC)
- yes, I was thinking the same. I tested the program on Charles III, and few other pages, but I'm still doubtful about various possibilities. Even if approved, I'm thinking to go very slow for the first few runs, and only after thorough scrutiny I will run it normally, with 1 edit per 5 seconds. —usernamekiran (talk) 10:22, 21 November 2024 (UTC)
- Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Please follow the time frame set out by GreenC - you do not necessarily have to tag this with {{BotTrialComplete}} after each grouping of 10 (that would get a little silly) but post the results of each group here so that others may review. For the sake of expanded viewing, please do not mark the edits as minor. Primefac (talk) 11:36, 21 November 2024 (UTC)
- Trial complete. 54 edits. I apologise, I somehow missed the "don't mark edits as minor" instruction, but I manually checked each edit soon after saving the page, and reverted the problematic edits immediately. I also miscalculated my previous edit count, and thought I had 15 left (when only 10 were left), so I accidentally almost performed 55 edits. In the earlier edits, there were a few minor issues, which I resolved. In the final run, marked as BRFA 12.7, there was only one issue: when there was a web.archive URL in question, the bot was sending HEAD requests to bare/non-archive URLs. That resulted in two incorrect updates: 1, and 2. I have resolved this issue. In this edit, the old/amp URL is not functional, but the updated/cleaned URL works. Requesting another trial for 50 edits. courtesy ping @GreenC, ProcrastinatingReader, and The Earwig:. Also, should we create some log page on Wikipedia to document failures/skips, and sinks (on separate pages)? —usernamekiran (talk) 18:36, 13 December 2024 (UTC)
- I checked every edit. Observations:
- In Islamic State line 480, there is a mangling problem, though oddly the URL still works in this case, it should not happen.
- In Afghanistan in first edit, broken archive URL.
- In Oasis (band) in first edit, removed only some of the amp
- In Kamala Harris in first edit, broken archive URL
- In Islamic State in first edit, broken archive URL
- In Argentina in first edit, broken archive URL
- In FC Barcelona in diff, a couple broken archive URLs
- In FC Barcelona in diff, another broken archive URL
- In Syria in diff, added extraneous curly brackets to each citation
- In Charles III first edit, broke the primary URL
- In Islamic State in diff, broken archive URL
- In Anime in diff, broken archive URL(s)
- In Bill Clinton in diff, broken archive URL
- In Kanye West in diff, broken primary and archive URLs
- In Lil Wayne in first edit, both the new and old primary URL are 404. There is no way to safely migrate the URL in that scenario.
- In Lebanon in line #198, the primary and archive URL are mangled
- In Nancy Pelosi in diff, broken archive URL
- In Charles III in diff, mangled URLs
- Suggestions:
- Before anything further, please manually repair the diffs listed above. Please confirm.
- When using insource search it will tend to sort the biggest articles first. This means the bot's early edits, the most error prone, will also be in the highest profile articles, often with the most changes. For this reason I always shuf the list first, to randomize the order of processing, mixing big and small articles randomly.
- Skip all archive URLs. They are complex and error prone. When modifying an archive URL, the WaybackMachine returns a completely different snapshot. It might not exist at all, or contain different content. Without manually verifying the new archive URL, or access to the WM APIs and tools, you will be lucky to get a working archive URL. There is no reason to remove AMP data from archive URLs; it does not matter.
- Manually verify every newly modified URL is working, during the testing period.
- -- GreenC 19:56, 13 December 2024 (UTC)
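Two of the suggestions above, shuffling the work list and skipping archive URLs, are simple enough to sketch. The following is an illustration only: the helper names are hypothetical, and the archive check covers only Wayback Machine hosts (archive.org and web.archive.org), which is an assumption; other archive services would need their own patterns.

```python
import random
import re

# Matches URLs whose host is the Wayback Machine (assumption: only
# archive.org / web.archive.org are treated as archive URLs here).
_ARCHIVE_URL = re.compile(r"^https?://(web\.)?archive\.org/", re.IGNORECASE)


def is_archive_url(url: str) -> bool:
    """True for Wayback Machine URLs, which the bot should leave alone."""
    return bool(_ARCHIVE_URL.match(url))


def processing_order(titles, seed=None):
    """Return a shuffled copy of the page list, leaving the input intact.

    Search results tend to list the biggest articles first; shuffling
    mixes large and small articles in the early, most error-prone edits.
    """
    shuffled = list(titles)
    random.Random(seed).shuffle(shuffled)
    return shuffled
```

Passing a fixed `seed` makes a run repeatable, which can help when comparing a dry run against the live edits that follow it.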
- Thanks for doing the work here, and agree with these suggestions. This is too high of an error rate to proceed without changes. I'm particularly confused/concerned about what happened on Syria with the extra curly braces. — The Earwig (talk) 21:52, 13 December 2024 (UTC)
- @GreenC and The Earwig: I have addressed almost all of the issues that arose before the trial "12.7". That includes the issue with extra curly brackets that The Earwig pointed out; it has been taken care of. The WaybackMachine/archive is difficult. Regarding Lil Wayne, I had specifically coded the program to update the URL if both URLs end up in 404. I am not sure what you meant by Lebanon/line 198; I could not find any difference around line 198, or nearby. Even after the approval/trial period, I will set a cap on max edits, and I will be checking every edit until I am fully confident that it is okay to run unsupervised. I should have mentioned this when I posted "trial finished": I have included one more functionality (in the edits with summary including 12.7): when the program finds AMP characteristics in a URL, it fetches the HTML of that particular page and looks for AMP attributes; only if they are present is the URL repaired. I have also added the functionality to look for the canonical/non-AMP URL on the page itself. Only if it is not found does the program try to repair the URL manually, and then it tests the repaired URL. Should I update the code to skip updating the URL if both old and new are 404? I can keep on working/improving the program with dry runs if you'd like. —usernamekiran (talk) 17:16, 14 December 2024 (UTC)
- Can you confirm when you repair the errors listed above? That would mean manually editing each of those articles and fixing the errors the bot introduced during the trial edit. -- GreenC 20:11, 14 December 2024 (UTC)
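The page-inspection step usernamekiran describes (confirm the page is really AMP, then prefer the canonical URL the page itself declares) could be sketched with the standard library as below. The names `inspect_amp_page` and `_AmpInspector` are hypothetical, fetching the HTML is left out, and the check is simplified to the `amp` (or lightning-bolt) attribute on the `html` element and the first `link rel="canonical"`.

```python
from html.parser import HTMLParser


class _AmpInspector(HTMLParser):
    """Collects the AMP marker and the first canonical link of a page."""

    def __init__(self):
        super().__init__()
        self.is_amp = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "html":
            # AMP pages mark the root element with "amp" or the bolt sign.
            if "amp" in attr_map or "\u26a1" in attr_map:
                self.is_amp = True
        elif tag == "link" and self.canonical is None:
            rel = (attr_map.get("rel") or "").lower().split()
            if "canonical" in rel and attr_map.get("href"):
                self.canonical = attr_map["href"]


def inspect_amp_page(html_text: str):
    """Return (is_amp, canonical_url_or_None) for already-fetched HTML."""
    parser = _AmpInspector()
    parser.feed(html_text)
    return parser.is_amp, parser.canonical
```

Using the page's own canonical link avoids guessing the mapping, which is how a URL such as an opaque numeric AMP path can be replaced by a descriptive one that no textual rule could derive.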
- Since you are using Pywikibot and this is a complex task, you can make things more controlled by using pywikibot.showDiff for trials. This way you can review the diffs before saving any changes. Additionally, if this trial is extended, you could use the input() function to create an AWB-like experience. This allows you to confirm whether to save changes, which helps prevent mistakes during actual edits. While a dry run is usually the best approach, I prefer this method for similar tasks.
if changes_made:
    print(f"Changes made to page: {page.title()}")
    # showDiff prints the diff itself and returns None, so it is not wrapped in print()
    pywikibot.showDiff(original_text, updated_text)
    response = input("Save? (y/n): ")
    if response.lower() == "y":
        page.text = updated_text
        page.save(summary="removed AMP tracking from URLs [[Wikipedia:Bots/Requests for approval/KiranBOT 12|BRFA 12.1]]", minor=True, bot=True)
        # your code...
    else:
        print(f"Skipping {page.title()}")
        # your code...
- Also, since the botflag argument is deprecated, you should use bot=True to mark the edit as a bot edit. – DreamRimmer (talk) 14:47, 16 December 2024 (UTC)
- @GreenC: Hi. I was under the impression that I had checked all the diffs, and repaired them. Today I fixed a few of them, and I will fix the remaining ones after 30ish hours. During the next runs, I will mostly save the updated page text to my computer, and manually test the "show changes" through browser. This gives better control/understanding. When performing actual edits, I will add a delay of five minutes between each edit, that way I would be able to test the URLs in real time. @DreamRimmer: thanks, but commenting out the page save operation and saving the updated text to file is the better option; you can see the relevant code from line 199 to 209. It's very old code though, the current program is drastically different than that one. —usernamekiran (talk) 17:52, 17 December 2024 (UTC)
- Approved for extended trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Primefac (talk) 13:19, 1 January 2025 (UTC)
- noting I have seen the approval for extended trial. I am currently working on the program, and testing it with dry runs. I will run it on live wikipedia after I have fixed all the occurring issues. —usernamekiran (talk) 02:04, 2 January 2025 (UTC)
- Trial complete. 50 edits + 1 in userspace. I kept the edit rate at 1 edit per 110 seconds. I checked each edit/URL manually. No issues found. The bot updated URLs like https://amp.abc.net.au/article/104478594 to https://www.abc.net.au/news/2024-10-18/king-charles-queen-camilla-arrive-australia-sydney-tour-royal in special:diff/1267328621. The bot now completely ignores web.archive URLs, and saves/updates the URL only if the repaired URL works. Courtesy ping to @GreenC and The Earwig: —usernamekiran (talk) 03:59, 5 January 2025 (UTC)
- also, even after the approval, I will set the bot to edit only 50 pages per day, and I will manually check the edits till I am fully confident. —usernamekiran (talk) 04:05, 5 January 2025 (UTC)
Operator: Bunnypranav (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 15:54, Saturday, December 14, 2024 (UTC)
Automatic, Supervised, or Manual: automatic
Programming language(s): AWB
Source code available: AWB
Function overview: Use AWB auto tagging
Links to relevant discussions (where appropriate):
Edit period(s): Weekly runs
Estimated number of pages affected: 50-100 each run
Exclusion compliant (Yes/No): Yes
Already has a bot flag (Yes/No): Yes
Function details: Firstly, I am already approved to use Auto Tagging and GenFixes with the first bot task. That was mainly based upon CW errors, so I have decided to get explicit approval on running tasks primarily based on Auto Tagging. This is also similar to BattyBot's first task.
Specifics:
- Removal of {{orphan}} from Database reports/Orphans with incoming links, of which I run a userpage report at User:Bunnypranav/DeOrphan using same code.
- Change {{No footnotes}} to {{More footnotes needed}} from articles with at least one footnote. Run against articles in the monthly subcategories of Category:Articles lacking sources. Will also add {{Reflist}} and heading with GenFixes for articles lacking one.
- Merge supported templates into {{Multiple issues}}. Run against articles that have {{Multiple issues}}.
Appropriate skip options for cosmetic, no changes, only whitespace changes will be applied.
Discussion
- Question: Will the bot know the difference between content and tracking categories, or hidden and visible categories? For example, if an article has the category Category:CS1 maint: url-status as its only category and has the {{Uncategorized}} template, what will it do? (Hint: The article should remain tagged as Uncategorized.) — Preceding unsigned comment added by Jonesey95 (talk • contribs) 05:06, 17 December 2024 (UTC)
- @Jonesey95: As this is just using AWB autotagger and I did not write any code, I am not sure of the exact technicalities. As per my understanding, such tracking cats should be ignored by AWB. I am open for any testing with specific pages if you can provide them. ~/Bunnypranav:<ping> 12:30, 17 December 2024 (UTC)
- I looked for sample articles, but I guess I'm confused. Why are you proposing to run a category-related task on Category:Articles lacking sources? I don't see any relevant articles.
- I see only 36 articles that have the Uncategorized template. I don't think that batch needs a bot run. – Jonesey95 (talk) 16:17, 17 December 2024 (UTC)
- @Jonesey95: Thanks for your observation. The ref cat was added by mistake. Anyway, after some searching of my own, I have removed that sub task and added a new one for new pages tagging. — Preceding unsigned comment added by Bunnypranav (talk • contribs) 15:57, 19 December 2024 (UTC)
- Small update, I have changed the task list again ~/Bunnypranav:<ping> 12:49, 23 December 2024 (UTC)
- Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Primefac (talk) 13:32, 1 January 2025 (UTC)
- Trial complete. 50 edits No errors that I can see. ~/Bunnypranav:<ping> 15:46, 1 January 2025 (UTC)
- @Bunnypranav: Is this also meant to do the General Fix that changes {{Unreferenced}} to {{More citations needed}} if there's at least one source? Came across this bot while keeping an eye on Category:All articles lacking sources. ARandomName123 (talk)Ping me! 00:54, 3 January 2025 (UTC)
- @ARandomName123 Yes, that is also being done using the Auto Tagging feature of AWB. See Special:Diff/1266637064, an edit during this trial. ~/Bunnypranav:<ping> 12:14, 3 January 2025 (UTC)
Approved requests
Bots that have been approved for operations after a successful BRFA will be listed here for informational purposes. No other approval action is required for these bots. Recently approved requests can be found here (edit), while old requests can be found in the archives.
- PrimeBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 46) Approved 12:22, 3 January 2025 (UTC) (bot has flag)
- MolecularBot (BRFA · contribs · actions log · block log · flag log · user rights) Approved 12:30, 19 December 2024 (UTC) (bot has flag)
- BunnysBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 3) Approved 17:42, 16 December 2024 (UTC) (bot has flag)
- GalaxyBot (BRFA · contribs · actions log · block log · flag log · user rights) Approved 15:56, 16 December 2024 (UTC) (bot has flag)
- DreamRimmer bot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 3) Approved 16:49, 9 December 2024 (UTC) (bot has flag)
- BunnysBot (BRFA · contribs · actions log · block log · flag log · user rights) Approved 16:49, 9 December 2024 (UTC) (bot has flag)
- DatBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 12) Approved 20:44, 1 December 2024 (UTC) (bot has flag)
- DreamRimmer bot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Approved 20:44, 1 December 2024 (UTC) (bot has flag)
- TNTBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 6) Approved 21:25, 19 November 2024 (UTC) (bot has flag)
- BaranBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 8) Approved 16:12, 30 October 2024 (UTC) (bot has flag)
- KiranBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 13) Approved 17:08, 20 October 2024 (UTC) (bot has flag)
- BaranBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 7) Approved 11:55, 20 October 2024 (UTC) (bot has flag)
- Monkbot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 20) Approved 11:55, 20 October 2024 (UTC) (bot has flag)
- KiranBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 11) Approved 17:24, 13 October 2024 (UTC) (bot has flag)
- Qwerfjkl (bot) (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 31) Approved 17:24, 13 October 2024 (UTC) (bot has flag)
- Leaderbot (BRFA · contribs · actions log · block log · flag log · user rights) Approved 22:09, 17 October 2024 (UTC) (bot to run unflagged)
- DreamRimmer bot (BRFA · contribs · actions log · block log · flag log · user rights) Approved 16:59, 4 October 2024 (UTC) (bot has flag)
- BaranBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 4) Approved 11:57, 10 September 2024 (UTC) (bot has flag)
- BaranBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 5) Approved 15:53, 9 September 2024 (UTC) (bot has flag)
- Protection Helper Bot (BRFA · contribs · actions log · block log · flag log · user rights) Approved 13:59, 8 September 2024 (UTC) (bot has flag)
- KiranBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 9) Approved 17:21, 1 September 2024 (UTC) (bot has flag)
- Platybot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Approved 17:21, 1 September 2024 (UTC) (bot has flag)
- BaranBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Approved 12:02, 11 August 2024 (UTC) (bot has flag)
- HooptyBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Approved 00:01, 5 August 2024 (UTC) (bot to run unflagged)
- ChristieBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 3) Approved 23:42, 4 August 2024 (UTC) (bot has flag)
- C1MM-bot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Approved 23:26, 4 August 2024 (UTC) (bot has flag)
- HBC AIV helperbot14 (BRFA · contribs · actions log · block log · flag log · user rights) Approved 13:24, 27 July 2024 (UTC) (bot has flag)
- The Sky Bot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 3) Approved 10:58, 24 July 2024 (UTC) (bot has flag)
- IznoBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 4) Approved 12:58, 21 July 2024 (UTC) (bot has flag)
- AdminStatsBot 2 (BRFA · contribs · actions log · block log · flag log · user rights) Approved 12:41, 21 July 2024 (UTC) (bot has flag)
Denied requests
Bots that have been denied for operations will be listed here for informational purposes for at least 7 days before being archived. No other action is required for these bots. Older requests can be found in the Archive.
- MolecularBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 3) Bot denied 13:26, 1 January 2025 (UTC)
- Raph65BOT (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 00:37, 23 December 2024 (UTC)
- Silksam bot (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 12:54, 2 December 2024 (UTC)
- MdWikiBot (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 12:04, 3 August 2024 (UTC)
- Arjunaraocbot (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 07:35, 23 March 2024 (UTC)
- UrbanBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Bot denied 14:23, 12 October 2023 (UTC)
- Aesthetic Bot (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 19:53, 9 September 2023 (UTC)
- Dušan Kreheľ (bot) (BRFA · contribs · actions log · block log · flag log · user rights) (Task: V) Bot denied 11:24, 25 July 2023 (UTC)
- UrbanBot (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 12:43, 18 July 2023 (UTC)
- pumi (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 11:46, 10 July 2023 (UTC)
- DYKToolsAdminBot (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 11:39, 1 April 2023 (UTC)
- KiranBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 5) Bot denied 07:53, 4 February 2023 (UTC)
- PuggleBot (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 12:03, 11 January 2023 (UTC)
- Dušan Kreheľ (bot) (BRFA · contribs · actions log · block log · flag log · user rights) (Task: IV) Bot denied 13:04, 29 September 2022 (UTC)
Expired/withdrawn requests
These requests have either expired, as information required by the operator was not provided, or been withdrawn. These tasks are not authorized to run, but such lack of authorization does not necessarily follow from a finding as to merit. A bot that, having been approved for testing, was not tested by an editor, or one for which the results of testing were not posted, for example, would appear here. Bot requests should not be placed here if there is an active discussion ongoing above. Operators whose requests have expired may reactivate their requests at any time. The following list shows recent requests (if any) that have expired, listed here for informational purposes for at least 7 days before being archived. Older requests can be found in the respective archives: Expired, Withdrawn.
- DannyS712 bot III (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 74) Expired 12:47, 23 December 2024 (UTC)
- JJPMachine (BRFA · contribs · actions log · block log · flag log · user rights) Withdrawn by operator 04:28, 26 November 2024 (UTC)
- FrostlySnowman (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 10) Withdrawn by operator 04:41, 4 November 2024 (UTC)
- BaranBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 6) Withdrawn by operator 16:29, 30 October 2024 (UTC)
- CapsuleBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Expired 22:58, 11 October 2024 (UTC)
- StradBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Withdrawn by operator 22:53, 11 October 2024 (UTC)
- PrimeBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 39) Withdrawn by operator 12:21, 29 September 2024 (UTC)
- BattyBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 81) Withdrawn by operator 15:48, 26 August 2024 (UTC)
- Dušan Kreheľ (bot) (BRFA · contribs · actions log · block log · flag log · user rights) (Task: VII) Expired 15:41, 27 June 2024 (UTC)
- Dušan Kreheľ (bot) (BRFA · contribs · actions log · block log · flag log · user rights) (Task: VIII) Expired 15:41, 27 June 2024 (UTC)
- PearBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 14) Expired 00:23, 15 June 2024 (UTC)
- PearBOT II (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 13) Expired 07:35, 23 March 2024 (UTC)
- VulpesBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 3) Expired 21:04, 10 March 2024 (UTC)
- DYKNomCheck (BRFA · contribs · actions log · block log · flag log · user rights) Withdrawn by operator 19:20, 10 March 2024 (UTC)
- BattyBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 78) Expired 13:13, 20 February 2024 (UTC)