User:HBC archive builderbot
This user account is a bot operated by HighInBC (talk). It is used to make repetitive automated or semi-automated edits that would be extremely tedious to do manually, in accordance with the bot policy. This bot does not yet have the approval of the community, or approval has been withdrawn or expired, and therefore shouldn't be making edits that appear to be unassisted except in the operator's or its own user and user talk space. Administrators: if this bot is making edits that appear to be unassisted to pages not in the operator's or its own userspace, please block it.
This bot is pending approval and is thus inactive. Contact User:HighInBC for questions. HighInBC (Need help? Ask me) 21:23, 2 February 2007 (UTC)
Purpose
This bot is designed to go through the revision history of pages such as WP:RFCN and automatically detect the removal of sections, and add a link to the last occurrence of that section in an archive. This will provide an archive of all past and future names discussed on the board.
See /sandbox for an example of what my output will look like once approved.
Technical
This bot is written in Perl. It uses the Algorithm::Diff module to compare each revision with the next. If it detects that a header was removed and nothing was added, it treats the edit as the archiving of a discussion. It uses the revision number, the edit summary, the name of the user who made the edit, and the contents of the heading to make an archive entry.
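As a rough illustration of the check described above, here is a minimal sketch in Python rather than the bot's Perl. It uses Python's difflib in place of Algorithm::Diff, and the function names, the heading pattern, and the format of the archive entry are all assumptions made for the example, not the bot's actual code.

```python
import difflib
import re

# Assumed heading syntax: a line such as "== Some name ==" (any level >= 2).
HEADING = re.compile(r"^==+\s*(.*?)\s*==+\s*$")


def removed_headings(old_text, new_text):
    """Return the headings removed between two revisions.

    Returns an empty list if the edit added or changed any line, because
    such an edit is not treated as a pure archiving edit.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)

    removed = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            return []  # something was added, so this is not an archiving edit
        if tag == "delete":
            removed.extend(old_lines[i1:i2])

    headings = []
    for line in removed:
        match = HEADING.match(line)
        if match:
            headings.append(match.group(1))
    return headings


def archive_entry(rev_id, user, summary, heading):
    """Build one archive line (hypothetical format) from the four fields.

    rev_id should be the last revision that still contained the section.
    """
    return "* [[Special:PermanentLink/%s#%s|%s]], removed by %s (%s)" % (
        rev_id, heading, heading, user, summary)
```

For example, comparing a revision that contains the sections "Alice" and "Bob" with a later one that contains only "Bob" yields `["Alice"]`, while any edit that also adds a line yields an empty list and is skipped.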
The actual revision history is gathered using Special:Export and a caching system I wrote that ensures only new revisions are downloaded. The first run will take 10-15 minutes to populate the cache; subsequent runs will take only moments, as they load only the new revisions.
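The caching idea can be sketched roughly as follows, again in Python rather than Perl. The JSON cache file, the function names, and the `fetch_revisions_after` callback are assumptions for illustration; the callback stands in for whatever code actually requests revisions from Special:Export, and this is not the bot's real cache format.

```python
import json
import os


def load_cache(path):
    """Load the revision cache from disk, or return an empty cache."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def update_cache(path, fetch_revisions_after):
    """Download only revisions newer than the newest cached one.

    fetch_revisions_after(last_id) is a hypothetical callback that returns
    a list of dicts, each with at least an integer "id" key, for every
    revision whose id is greater than last_id.
    Returns the number of revisions added to the cache.
    """
    cache = load_cache(path)
    # JSON object keys are strings, so convert back to int to find the newest.
    last_id = max((int(key) for key in cache), default=0)

    new_revisions = fetch_revisions_after(last_id)
    for revision in new_revisions:
        cache[str(revision["id"])] = revision

    with open(path, "w", encoding="utf-8") as fh:
        json.dump(cache, fh)
    return len(new_revisions)
```

With this shape, the first call has an empty cache and fetches everything, while later calls pass the newest cached revision id to the callback and store only what comes back.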
In testing I found that the diff module could analyze over 2600 diffs in less than 3 seconds, which is very fast.
The program will most likely run twice daily.