- Training models
- Tigrinya Wikipedia ti (see T308143#8827377)
- Turkmen Wikipedia tk
- Tagalog Wikipedia tl
- Tswana Wikipedia tn
- Tongan Wikipedia to
- Tok Pisin Wikipedia tpi
- Turkish Wikipedia tr
- Tsonga Wikipedia ts
- Tatar Wikipedia tt
- Twi Wikipedia tw
- Tahitian Wikipedia ty
- Tuvinian Wikipedia tyv
- Udmurt Wikipedia udm
- Uyghur Wikipedia ug
- Urdu Wikipedia ur (see T308143#8827377)
- Uzbek Wikipedia uz
- Venda Wikipedia ve
- Venetian Wikipedia vec
- Veps Wikipedia vep
- West Flemish Wikipedia vls
- Volapük Wikipedia vo
- Models verification
- Publish Datasets
- Populate the excluded section titles
- Deploy back-end
- Check how the model works on the wikis
- In Search, use hasrecommendation:link to find articles
- Test them on https://api.wikimedia.org/service/linkrecommendation/apidocs/#/default/get_v1_linkrecommendations__project___domain___page_title_
- Inform communities
- Deploy front-end
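For the "Test them on" step, a small helper can build the request URL for one article. This is a minimal sketch: the function name `linkrec_url` is hypothetical, and the path layout (`/v1/linkrecommendations/{project}/{domain}/{page_title}`) is inferred from the anchor of the apidocs link above, so it should be checked against the apidocs page before relying on it.

```shell
#!/bin/sh
# Sketch: build the link recommendation request URL for one article.
# The path layout is inferred from the apidocs anchor (assumption),
# not confirmed against the service itself.
linkrec_url() (
    project=$1   # e.g. "wikipedia"
    domain=$2    # language code, e.g. "tr"
    # MediaWiki titles use underscores in place of spaces. Other characters
    # are not percent-encoded here, so non-ASCII titles need extra handling.
    title=$(printf '%s' "$3" | tr ' ' '_')
    printf 'https://api.wikimedia.org/service/linkrecommendation/v1/linkrecommendations/%s/%s/%s\n' \
        "$project" "$domain" "$title"
)

# Example (needs network access):
#   curl -s "$(linkrec_url wikipedia tr 'Ankara')"
```

Articles to try can be found first with the `hasrecommendation:link` search keyword on the target wiki.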
Description
Details
- Due Date: Nov 22 2023, 5:00 PM
Status | Subtype | Assigned | Task |
---|---|---|---|
Open | | lbowmaker | T307881 Scaling of link suggestions service |
Open | | Trizek-WMF | T304110 [EPIC] Deploy "add a link" to all Wikipedias |
Resolved | | Sgs | T308143 Deploy "add a link" to 17th round of wikis |
Event Timeline
I moved Tumbuka Wikipedia (tum) to an earlier batch as they are interested in the feature.
Model evaluation has been completed and below are the backtesting results:
Wiki | Precision@0.5 | Recall@0.5 |
---|---|---|
tiwiki | 0.54 | 0.50 |
tkwiki | 0.74 | 0.27 |
tlwiki | 0.81 | 0.52 |
tnwiki | 0.90 | 0.74 |
towiki | 0.94 | 0.73 |
tpiwiki | 0.80 | 0.69 |
trwiki | 0.75 | 0.35 |
tswiki | 0.89 | 0.59 |
ttwiki | 0.93 | 0.38 |
twwiki | 0.80 | 0.61 |
tywiki | 0.97 | 0.85 |
tyvwiki | 0.78 | 0.39 |
udmwiki | 0.83 | 0.33 |
ugwiki | 0.88 | 0.53 |
urwiki | 0.62 | 0.23 |
uzwiki | 0.80 | 0.30 |
vewiki | 0.99 | 0.93 |
vecwiki | 0.96 | 0.75 |
vepwiki | 0.87 | 0.38 |
vlswiki | 0.84 | 0.55 |
vowiki | 0.98 | 0.43 |
CCing @MGerlach, in case he would like to add comments on the backtesting evaluation.
The conclusion from the backtesting results is that most of the languages look fine, except:
- tiwiki (0.54), tkwiki (0.74) and urwiki (0.62) have a precision below the recommended minimum (0.75)
Talked to @MGerlach about these results and agreed that tkwiki should be published but tiwiki and urwiki shouldn't.
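The precision check can be reproduced from the table above. The following is a minimal sketch, not the evaluation tooling itself: the function name `below_precision` is hypothetical, and the figures are copied by hand from the backtesting results.

```shell
#!/bin/sh
# Backtesting results copied from the table above:
# wiki, Precision@0.5, Recall@0.5
results='tiwiki 0.54 0.50
tkwiki 0.74 0.27
tlwiki 0.81 0.52
tnwiki 0.90 0.74
towiki 0.94 0.73
tpiwiki 0.80 0.69
trwiki 0.75 0.35
tswiki 0.89 0.59
ttwiki 0.93 0.38
twwiki 0.80 0.61
tywiki 0.97 0.85
tyvwiki 0.78 0.39
udmwiki 0.83 0.33
ugwiki 0.88 0.53
urwiki 0.62 0.23
uzwiki 0.80 0.30
vewiki 0.99 0.93
vecwiki 0.96 0.75
vepwiki 0.87 0.38
vlswiki 0.84 0.55
vowiki 0.98 0.43'

# Print every wiki whose precision is strictly below the given minimum.
# "+ 0" forces a numeric comparison in awk.
below_precision() {
    printf '%s\n' "$results" | awk -v min="$1" '$2 + 0 < min + 0 { print $1, $2 }'
}

below_precision 0.75
```

With the 0.75 minimum this lists tiwiki, tkwiki and urwiki. trwiki sits exactly at 0.75 and is not flagged because the comparison is strict.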
@kostajh, we published datasets for the 19 of 21 models that passed the evaluation in this round.
I ran this script for adding the link-recommendation task type and populating the excluded sections entries:
PHAB=T308143
for WIKI in tkwiki tlwiki tnwiki towiki tpiwiki trwiki tswiki ttwiki twwiki tywiki tyvwiki udmwiki ugwiki uzwiki vewiki vecwiki vepwiki vlswiki vowiki; do
    ORIGIN=`mwscript getConfiguration.php $WIKI --settings 'wgCanonicalServer' --format json | jq --raw-output '.wgCanonicalServer'`
    mwscript extensions/GrowthExperiments/maintenance/changeWikiConfig.php $WIKI \
        --page MediaWiki:NewcomerTasks.json \
        --create-only \
        --json \
        --summary "Growth features configuration boilerplate ([[phab:$PHAB]])" \
        link-recommendation \
        '{ "type": "link-recommendation", "group": "easy" }'
    jq "select(.wiki==\"$WIKI\" and .probability > 0.25) | .section" wiki_sections.jsonl \
        | jq --slurp --compact-output "unique" \
        | mwscript extensions/GrowthExperiments/maintenance/changeWikiConfig.php $WIKI \
            --page MediaWiki:NewcomerTasks.json \
            --json \
            --summary "machine-generated configuration for excluding sections from link recommendations ([[phab:$PHAB]]), feel free to improve" \
            link-recommendation.excludedSections \
            "`cat`"
    echo "$ORIGIN/wiki/MediaWiki:NewcomerTasks.json"
    echo "$ORIGIN/w/index.php?title=MediaWiki:NewcomerTasks.json&diff=next"
    echo "Press <Enter> to continue"
    read # give time for manual verification
done
Note that the script didn't populate excludedSections for towiki and tywiki because these wikis were not present in wiki_sections.jsonl; see T345562. It also didn't populate excluded sections for vowiki because the probabilities of its entries in wiki_sections.jsonl were too low (none above the 0.25 cutoff used by the script).
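Gaps like these could be spotted before running the script. The sketch below is one way to do that, assuming the same record fields the script reads (`.wiki`, `.section`, `.probability`); the function name `wikis_without_exclusions` is hypothetical, and `jq` must be installed.

```shell
#!/bin/sh
# Sketch: list the wikis that would end up with no excluded sections,
# either because they have no records in the JSONL file or because none
# of their records exceed the probability cutoff.
#   $1: path to the JSONL file (e.g. wiki_sections.jsonl)
#   $2: probability cutoff (the script above uses 0.25)
#   remaining arguments: wiki database names to check
wikis_without_exclusions() (
    file=$1
    cutoff=$2
    shift 2
    for wiki in "$@"; do
        # Same selection as the script, then count the unique sections.
        # With no matching records, --slurp yields [] and the count is 0.
        count=$(jq --arg wiki "$wiki" --argjson min "$cutoff" \
                'select(.wiki == $wiki and .probability > $min) | .section' "$file" \
            | jq --slurp 'unique | length')
        if [ "$count" -eq 0 ]; then
            echo "$wiki"
        fi
    done
)

# Example:
#   wikis_without_exclusions wiki_sections.jsonl 0.25 towiki tywiki vowiki
```

The check cannot tell a wiki that is missing from the file apart from one whose probabilities are all too low; both are simply reported as having nothing to exclude.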
Change 974169 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):
[operations/mediawiki-config@master] GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis
Change 974169 merged by jenkins-bot:
[operations/mediawiki-config@master] GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis
Mentioned in SAL (#wikimedia-operations) [2023-11-15T14:39:11Z] <awight@deploy2002> Started scap: Backport for [[gerrit:974169|GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis (T308142 T308143)]]
Mentioned in SAL (#wikimedia-operations) [2023-11-15T14:41:55Z] <awight@deploy2002> sgimeno and awight: Backport for [[gerrit:974169|GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis (T308142 T308143)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
Mentioned in SAL (#wikimedia-operations) [2023-11-15T14:47:27Z] <awight@deploy2002> Finished scap: Backport for [[gerrit:974169|GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis (T308142 T308143)]] (duration: 08m 16s)
Change 976804 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):
[operations/mediawiki-config@master] GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis
Change 976804 merged by jenkins-bot:
[operations/mediawiki-config@master] GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis
Mentioned in SAL (#wikimedia-operations) [2023-11-27T08:09:42Z] <taavi@deploy2002> Started scap: Backport for [[gerrit:976804|GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis (T308142 T308143)]]
Mentioned in SAL (#wikimedia-operations) [2023-11-27T08:18:49Z] <taavi@deploy2002> taavi and sgimeno: Backport for [[gerrit:976804|GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis (T308142 T308143)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
Mentioned in SAL (#wikimedia-operations) [2023-11-27T08:29:36Z] <taavi@deploy2002> Finished scap: Backport for [[gerrit:976804|GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis (T308142 T308143)]] (duration: 19m 54s)
Checked selected wikis from the list - all works as expected; leaving in the Test in Production column to monitor it during this week.