Wikipedia:Bots/Requests for approval/Lonjers french region rename bot
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: Lonjers
Time filed: 03:36, Thursday, January 7, 2016 (UTC)
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python
Source code available: https://github.com/utilitarianexe/wiki_france_region_rename
Function overview: Removes unused region and department parameters from the French commune infoboxes in French commune articles.
Links to relevant discussions (where appropriate): https://en.wikipedia.org/wiki/Wikipedia:Bot_requests#New_French_regions_on_1_January https://en.wikipedia.org/wiki/User_talk:AHeneen#help_with_info_box_renaming https://en.wikipedia.org/wiki/Module_talk:Wikidata#Suggested_test_case:_New_French_Regions
Edit period(s): one time
Estimated number of pages affected: 30,000
Exclusion compliant: Yes
Already has a bot flag (Yes/No):
Function details: A fairly simple find-and-replace task. The bot first gets the list of all French commune articles on the English Wikipedia. It then searches for the French commune infobox in each article in the list, finds the region and department parameters in that infobox, and removes them. The region and department are currently calculated from the INSEE code, so the additional non-functional region and department parameters are confusing.
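The removal step described in the function details could look roughly like the following. This is a minimal sketch, not the code from the linked repository: the function names are hypothetical, and it assumes each infobox parameter sits on its own line, so a page that puts several parameters on one line would need different handling.

```python
import re

# Assumption: one parameter per line, e.g. "|region = Rhône-Alpes".
PARAM_LINE = re.compile(
    r"^[ \t]*\|[ \t]*(?:region|department)[ \t]*=[^\n]*\n", re.MULTILINE
)
INFOBOX_START = re.compile(r"\{\{\s*Infobox French commune\b", re.IGNORECASE)


def find_infobox_span(wikitext):
    """Return (start, end) of the infobox, balancing nested {{ }} pairs."""
    match = INFOBOX_START.search(wikitext)
    if match is None:
        return None
    depth = 0
    i = match.start()
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:
                return match.start(), i
        else:
            i += 1
    return None  # unbalanced braces: leave the page alone


def remove_unused_params(wikitext):
    """Drop region/department lines inside the infobox only."""
    span = find_infobox_span(wikitext)
    if span is None:
        return wikitext  # no infobox found, nothing to edit
    start, end = span
    cleaned = PARAM_LINE.sub("", wikitext[start:end])
    return wikitext[:start] + cleaned + wikitext[end:]
```

Restricting the substitution to the infobox span means a line such as "|region = ..." in some other template on the page is left untouched.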
Discussion
Hi, User:AHeneen brought your proposal to my attention. You do not need to change the regions in the commune infoboxes because that is done automatically, using the INSEE code. See for instance Largentière: the infobox contains the line "|region = Rhône-Alpes", but this is ignored because the infobox uses the first two numbers of the INSEE code ("07132") to determine it is in the (new) region Auvergne-Rhône-Alpes. I have already updated the regions in all the relevant department, arrondissement and canton articles (infobox and article text). What still needs to be done is change the regions in the article text for the communes. Maybe a bot can help there. But be careful, because not all references to an old region should be changed, for instance Alsace may refer to the traditional region, not the former administrative region. Markussep Talk 08:36, 13 January 2016 (UTC)[reply]
- Context-sensitive changes are very tricky for bots. We can try to come up with a restrictive replacement, maybe, but (and I'm sorry to say this) a manually-assisted AWB job might be the way to go here. Needs a closer look to see how these articles are structured. — Earwig talk 08:44, 13 January 2016 (UTC)[reply]
- Thanks for the responses. Yikes that must have taken some time to edit all those manually. But better in the end because you fixed it in the article text too. I was avoiding that because of the context problems. User:The Earwig I think you could still edit the text in all the commune articles if you restricted to just the line at the top of the articles. Nearly all of them have the form "in the *** region" at the top of the article and you could just skip the ones that don't exactly match that in the first sentence. I do agree though that changing it anywhere else would require manual checking. Let me know if you think that is a good idea to try and I will modify my code for that task. Kinda just trying to find a way to still use the code that I wrote. But if you don't think it is a good idea that is ok too still good practice. Lonjers (talk) 21:03, 13 January 2016 (UTC)[reply]
- There's an important lesson to be learned here about work being spent on bots that later turn out to be unnecessary. It happens, although for your first task it's a bit unfortunate. We can give what you are proposing a shot. I am thinking of some additional conditions, like skipping articles that already include the new region name (which have likely been migrated already) or have "was" in the same sentence as the region-to-replace, but it's still tricky to get right. You might want to start by going through the relevant articles and building a list of which ones the bot would definitely change, so we can get a sense of the number of edits and do some spot-checks. — Earwig talk 22:29, 13 January 2016 (UTC)[reply]
- Did some checking and actually not very many of the articles match a standard template. In general it seems most of the articles don't even include the region in the text. The French Wikipedia versions of the articles usually do, but those seem to already be updated. I guess we should close this request for now. Still looking for little programming tasks to do on Wikipedia if you have any suggestions. Lonjers (talk) 23:19, 16 January 2016 (UTC)[reply]
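For reference, the restricted first-sentence replacement discussed in this thread, with the skip conditions Earwig suggested, might be sketched as below. The function names are hypothetical, the sentence splitter is deliberately crude, and the sketch assumes the region name appears as plain text rather than inside a wikilink.

```python
import re


def first_sentence(text):
    """Crude split: everything up to the first full stop followed by whitespace."""
    match = re.search(r"\.(\s|$)", text)
    return text if match is None else text[:match.start() + 1]


def propose_lead_edit(text, old_region, new_region):
    """Return the edited text, or None when the article should be skipped."""
    if new_region in text:
        return None  # probably migrated already
    lead = first_sentence(text)
    phrase = f"in the {old_region} region"
    if phrase not in lead:
        return None  # lead does not match the expected form exactly
    if re.search(r"\bwas\b", lead):
        return None  # past tense may refer to the former region on purpose
    new_lead = lead.replace(phrase, f"in the {new_region} region", 1)
    return new_lead + text[len(lead):]
```

Collecting the titles for which this returns a non-None value would give the list of definite changes that could be spot-checked before any edit is made.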
- I did not know that the template ignored the Region parameter and used the INSEE number. Sorry for your wasted effort @Lonjers:. Changing the article links within the prose is a huge task and cannot be easily done with a bot, as mentioned above. Also, the region names are only temporary for a few months. The regional governments must choose a new name by 1 July and the national government then has until October to recognize or reject the new region names. Except for Normandy, all of the new region names in the prose of the articles must be changed again when the official name is approved. AHeneen (talk) 03:32, 14 January 2016 (UTC)[reply]
- No worries Lonjers (talk) 23:19, 16 January 2016 (UTC)[reply]
@Lonjers: Following comments above, do you wish to proceed with this BRFA in any manner? — HELLKNOWZ ▎TALK 15:55, 17 January 2016 (UTC)[reply]
- Let's wait to see how the discussion below with Rich goes. I probably do not want to proceed. But I will send you an update when I know for sure. Lonjers (talk) 22:44, 19 January 2016 (UTC)[reply]
- @Hellknowz: So I think my plan now is to use this to just remove the unused region name parameter from the articles. Should be simple to update the code to work like this. Let me know if you think this is a good idea. Sorry for being so long getting back to you. Lonjers (talk) 22:37, 25 January 2016 (UTC)[reply]
- Could you update the function details to the exact latest spec that you want to run so we know what the bot intends to do? — HELLKNOWZ ▎TALK 23:03, 25 January 2016 (UTC)[reply]
- Done will update the code today too. Lonjers (talk) 21:19, 27 January 2016 (UTC)[reply]
- Code is now updated to just do this small task. @Hellknowz: Lonjers (talk) 06:11, 29 January 2016 (UTC)[reply]
- Let me just say that the region hack is just that: a hack. Once the new names are finalised updating the infoboxen would be a good idea.
- Hmmm, can you explain why it would be a better solution to use a region name explicitly? Seemingly the template editors made the choice to change it to work this way for a good reason, as the template code looks pretty intense. Would be happy to use this to remove the now unused region parameters. But if there is a good reason let me know and I will try to contact the people who made the template and we can change the template to use the explicit region. Lonjers (talk) 22:44, 19 January 2016 (UTC)[reply]
- Let's suppose, for example, that someone, wittingly or unwittingly changes the INSEE. A random editor seeing the wrong region would be at a loss to fix it. A better solution might be to calculate the region and compare it with the given region, adding the article to a hidden tracking category if they don't match. It would also be better to encapsulate the region calculation in a reusable manner, such as {{French region name from INSEE code}}. All the best: Rich Farmbrough, 22:45, 20 January 2016 (UTC).[reply]
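The compare-and-track idea above can be illustrated outside the template language. The following is only a sketch in Python, not template code: the function names and status strings are hypothetical, and the lookup table is deliberately partial (three departments), keyed on the first two characters of the INSEE code as described earlier in this discussion. Overseas departments, which use longer prefixes, are not handled.

```python
# Partial, illustrative table: department prefix -> region name.
REGION_BY_DEPARTMENT = {
    "07": "Auvergne-Rhône-Alpes",  # Ardèche
    "14": "Normandy",              # Calvados
    "18": "Centre-Val de Loire",   # Cher
}


def region_from_insee(insee):
    """Look up the region from the first two characters of the INSEE code."""
    return REGION_BY_DEPARTMENT.get(insee[:2])


def check_region(insee, given_region):
    """Compare the stored region with the computed one.

    Returns "match", "mismatch" (candidate for a tracking category),
    or "unknown" when the prefix is not in the table.
    """
    computed = region_from_insee(insee)
    if computed is None:
        return "unknown"
    if given_region.strip() == computed:
        return "match"
    return "mismatch"
```

For the Largentière example mentioned above, `check_region("07132", "Rhône-Alpes")` reports a mismatch, which is the kind of page a hidden tracking category would collect.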
- It certainly confused me how the template works. I think that removing the region param from the current pages would be a good first step. And then changing the template to include the new template you mentioned would make things much cleaner. That template could then be used in other places as well. Lonjers (talk) 22:37, 25 January 2016 (UTC)[reply]
- I also think it would be a perfect pilot task to fix the "Centre" to "Centre-Val de Loire" (in the French commune infobox) now, so maybe continue this BRFA on that basis?
- All the best: Rich Farmbrough, 21:20, 17 January 2016 (UTC).[reply]
- So it is actually not obvious to me how "Centre" gets in there instead of "Centre-Val de Loire". It is not something in the markup of each commune article in the region. It is somehow being generated incorrectly by the template. I am going to look into fixing this in the template. Or if we do decide to explicitly add the region to each page's markup, the template should pull it from there, I think. @Rich Farmbrough: Lonjers (talk) 22:44, 19 January 2016 (UTC)[reply]
- The infobox should show "Centre-Val de Loire", not Centre. It does for the communes I checked, except the caption of the detailed map, that's corrected now (may take some time / null-edit to show). For the new regions, these maps are not available yet, and maybe we should change them for department maps instead (when available). Indeed the parameter fields "department" and "region" in the infoboxes are not used anymore, so they can be removed (but don't have to). Markussep Talk 09:19, 20 January 2016 (UTC)[reply]
Updated task
What are the "appropriate info boxes" -- you have to give the exact list. Is it just {{Infobox French commune}}? Are redirects included? Should you also remove |department= per "The fields "region" and "department" are no longer used."? Please also leave a message with Wikipedia talk:WikiProject France to make sure no one has unforeseen objections. — HELLKNOWZ ▎TALK 13:27, 29 January 2016 (UTC)[reply]
- The list of articles to edit is not found by looking for all articles with the template. It is based on Lists of communes of France. The bot only modifies them if they have Infobox French commune. All the articles I have seen use that one, at least on the English Wikipedia. Some of the other redirects don't seem to be used, at least in the few hundred I have looked through. But if they do come up when I go through all of them I will edit those too. I was not planning on editing the commune articles on the French and Japanese Wikipedias. Those contain the other redirects. But I guess it would make sense to do those too. I just was not sure how to ask permission to also edit those, as I don't know the languages. I do plan on editing out the department parameter too. I edited the code and request for this. Going to leave a message on the project page now. Let me know if this makes sense. Lonjers (talk) 02:39, 31 January 2016 (UTC)[reply]
Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Please use a descriptive edit summary telling what is happening, link to infobox for information and link to this BRFA or some bot description page about the task. I suggest you randomize the page list before running, so you don't end up running on a set of very similar articles and we have a higher chance to see anything unforeseen. (Also for clarity, this is for English Wikipedia only, we don't really deal with any other languages.) As an additional task, I would suggest making a report of what the fields contained versus what the output actually was via the INSEE template, but that's totally up to you if no one objects if the different unused values are removed. — HELLKNOWZ ▎TALK 19:16, 31 January 2016 (UTC)[reply]
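The suggested randomization of the page list before the trial could be done along these lines. This is a minimal sketch with a hypothetical function name; the default limit of 50 matches the trial size above, and the optional seed is only there so a run can be reproduced.

```python
import random


def pick_trial_pages(titles, limit=50, seed=None):
    """Return up to `limit` distinct titles chosen at random."""
    rng = random.Random(seed)
    unique = list(dict.fromkeys(titles))  # drop duplicates, keep order
    return rng.sample(unique, min(limit, len(unique)))
```

Sampling from the whole list, rather than taking the first 50 entries, avoids running the trial on a block of neighbouring communes that all share the same infobox layout.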
- Trial complete. Completed the test run and manually checked all the articles it operated on. The randomization is a great idea so I did it that way. The bot never incorrectly edited an article, but occasionally it runs across one that the regex does not match and halts. For example, Saint Barthélemy has no commune infobox because it is no longer a commune. When I run the bot on more articles I will manually handle these cases. From some spot checking, nearly all the articles have region parameters that are wrong (unless the region did not change). But I plan on removing them even if they are correct, as the main point of doing this is to reduce confusion for future editors (not quite as lofty as my original goal with this). Lonjers (talk) 04:48, 1 February 2016 (UTC)[reply]
- Oh let me know what the next steps are to actually use this. I think I will need the edit speed limit in pywikibot lowered or else it is going to take forever Lonjers (talk) 01:21, 2 February 2016 (UTC)[reply]
- I'll leave this open for a bit so anyone from WT:WP/F or watching BRFA stuff or articles edited can comment if needed. I'll review the edits later (if someone doesn't beat me to it.) — HELLKNOWZ ▎TALK 01:59, 2 February 2016 (UTC)[reply]
Approved. Edits look good. Consensus for task appears established. No concerns received during trials. Approved for a one-off task. (If deprecated parameter removal is a generic task the OP wants, a new BRFA should be filed that can be processed quicker.) Please use a descriptive summary, and make sure the bot's talk page is available for anyone with issues (pre-create it). — HELLKNOWZ ▎TALK 15:17, 8 February 2016 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.