FlaschBot1

Joined 6 September 2024
Revision as of 17:38, 11 October 2024 by Fl.schmitt (talk | contribs) (+flag parameter)

Signalment

  • Operator: User:Fl.schmitt
  • Tasks: FlaschBot1 will work with Commons file pages lacking the {{Information}} template. FlaschBot1 isn't expected to work autonomously; instead, it assists in creating the text patterns needed to categorize the unstructured text content of file pages. To do so, it collects the file page texts and applies a set of regex patterns that categorize the content. The bot operator's task is to check the analysis result and to adapt the regex patterns until the complete input set can be edited automatically. Commons files that don't fit the patterns will need manual intervention.
  • Operation: supervised, semi-automatic
  • When: intermittently
  • Maximum edit rate: 4-5 file page edits per minute
  • Language: Python
  • Source code:
  • Project management on Phabricator - feel free to add tasks / issues / bug reports there!
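
The pattern-based categorization described under "Tasks" could look roughly like the following. This is only a minimal sketch: the pattern set, the field names and the helper names `classify_page_text` and `needs_manual_editing` are hypothetical and are not taken from the bot's source code.

```python
import re

# Hypothetical patterns for illustration only; the bot's real pattern
# set is adapted by the operator for each batch of file pages.
PATTERNS = {
    "description": re.compile(r"^\s*Description\s*[:=]\s*(?P<value>.+)$", re.IGNORECASE),
    "date": re.compile(r"^\s*Date\s*[:=]\s*(?P<value>.+)$", re.IGNORECASE),
    "author": re.compile(r"^\s*(?:Author|Photographer)\s*[:=]\s*(?P<value>.+)$", re.IGNORECASE),
    "source": re.compile(r"^\s*Source\s*[:=]\s*(?P<value>.+)$", re.IGNORECASE),
}


def classify_page_text(text):
    """Return (fields, unmatched_lines) for the text of one file page."""
    fields = {}
    unmatched = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines carry no information
        for name, pattern in PATTERNS.items():
            match = pattern.match(line)
            if match:
                fields[name] = match.group("value").strip()
                break
        else:
            # No pattern matched this line: keep it for the operator's review.
            unmatched.append(line.strip())
    return fields, unmatched


def needs_manual_editing(unmatched):
    """A page with any unclassified line does not fit the patterns yet."""
    return bool(unmatched)
```

In this sketch, the operator would extend `PATTERNS` and rerun the classification until no page in the batch reports unmatched lines, or set the remaining pages aside for manual editing.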

Details and Limitations


FlaschBot1 works in three consecutive modes:

  1. Analysis: the bot analyses a batch of related file pages, for example pages belonging to the same topic or uploaded by / attributed to the same person. Analysis consists of collecting the unstructured file description components, classifying them, and identifying files that need manual editing. The analysis is run manually, multiple times, until the description components of all files in the batch are identified and classified.
    The analysis result is stored persistently and reused by the other two modes.
  2. Simulation: based on the analysis result, the bot prepares the planned changes to the file pages and stores them locally for review. Simulation mode doesn't make any changes to the live content of Commons; it only creates SQL and plain-text output that can be checked for mistakes.
  3. Execution: after the proposed changes have been reviewed, the bot is run in execution mode to modify the live Commons content.
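
The three modes above can be sketched as three small functions that hand their results to each other through local files. This is an illustration under stated assumptions, not the bot's actual implementation: the function names, the JSON file format (the bot itself produces SQL and plain-text output), the generated template text and the `classify` and `save_page` callbacks are all hypothetical. The pause between edits is derived from the stated maximum edit rate.

```python
import json
import time
from pathlib import Path

# Assumption: use the lower bound of the stated 4-5 edits per minute.
MAX_EDITS_PER_MINUTE = 4


def analyse(pages, classify, result_path):
    """Mode 1: classify each page text and persist the result locally."""
    result = {}
    for title, text in pages.items():
        fields, unmatched = classify(text)
        result[title] = {"fields": fields, "unmatched": unmatched}
    Path(result_path).write_text(json.dumps(result, indent=2), encoding="utf-8")
    return result


def simulate(result_path, plan_path):
    """Mode 2: build the planned edits locally; nothing is written to the wiki."""
    result = json.loads(Path(result_path).read_text(encoding="utf-8"))
    plan = []
    for title, entry in result.items():
        if entry["unmatched"]:
            continue  # page does not fit the patterns: left for manual editing
        params = "\n".join(f"|{k}={v}" for k, v in sorted(entry["fields"].items()))
        plan.append({"title": title, "new_text": "{{Information\n" + params + "\n}}"})
    Path(plan_path).write_text(json.dumps(plan, indent=2), encoding="utf-8")
    return plan


def execute(plan_path, save_page, sleep=time.sleep):
    """Mode 3: apply the reviewed plan, pausing between consecutive edits."""
    plan = json.loads(Path(plan_path).read_text(encoding="utf-8"))
    delay = 60 / MAX_EDITS_PER_MINUTE  # seconds between two edits
    for index, edit in enumerate(plan):
        if index:
            sleep(delay)
        # save_page is a hypothetical callback that performs the live edit.
        save_page(edit["title"], edit["new_text"])
    return len(plan)
```

Keeping the live edit behind a callback means that analysis and simulation can be rerun as often as needed without touching Commons; only `execute` ever calls `save_page`.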