- Training models
- Samoan Wikipedia sm
- Shona Wikipedia sn (see T308142#8804657)
- Somali Wikipedia so
- Albanian Wikipedia sq
- Serbian Wikipedia sr
- Sranan Tongo Wikipedia srn
- Swati Wikipedia ss
- Southern Sotho Wikipedia st
- Saterland Frisian Wikipedia stq
- Sundanese Wikipedia su
- Silesian Wikipedia szl
- Sakizaya Wikipedia szy (see T308142#8804657)
- Tamil Wikipedia ta
- Tulu Wikipedia tcy
- Telugu Wikipedia te
- Tetum Wikipedia tet
- Tajik Wikipedia tg
- Thai Wikipedia th
- Models verification
- Publish Datasets
- Populate the excluded section titles
- Deploy back-end
- Check how the model works on the wikis
- In Search, use hasrecommendation:link to find articles
- Test them on https://api.wikimedia.org/service/linkrecommendation/apidocs/#/default/get_v1_linkrecommendations__project___domain___page_title_
- Inform communities
- Deploy front-end
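The two manual checks in the list above (the search keyword and the API test) can be scripted as URL builders. This is only a sketch: the API path layout is an assumption inferred from the apidocs anchor (`get_v1_linkrecommendations__project___domain___page_title_`), and the function names `linkrec_url` and `search_url` are hypothetical helpers, not part of any existing tool.

```shell
#!/usr/bin/env bash
# Build URLs for spot-checking link recommendations on a newly enabled wiki.

# URL of the link recommendation API for one article.
# $1 = language subdomain (e.g. "sm"), $2 = page title with underscores.
# Assumption: path is /v1/linkrecommendations/{project}/{domain}/{page_title}.
linkrec_url() {
  local domain="$1" title="$2"
  printf 'https://api.wikimedia.org/service/linkrecommendation/v1/linkrecommendations/wikipedia/%s/%s\n' \
    "$domain" "$title"
}

# URL of an on-wiki search for articles matching hasrecommendation:link.
# The colon is percent-encoded as %3A.
search_url() {
  local domain="$1"
  printf 'https://%s.wikipedia.org/w/index.php?search=hasrecommendation%%3Alink\n' "$domain"
}
```

For example, `curl -s "$(linkrec_url sm Samoa)"` would request recommendations for a page titled `Samoa` on smwiki (the title is only an illustration). Titles containing non-ASCII characters or spaces would need percent-encoding first, which this sketch does not do.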
Description
Details
- Due Date: Nov 22 2023, 5:00 PM
Status | Assigned | Task
---|---|---
Open | lbowmaker | T307881 Scaling of link suggestions service
Open | Trizek-WMF | T304110 [EPIC] Deploy "add a link" to all Wikipedias
Resolved | Sgs | T308142 Deploy "add a link" to 16th round of wikis
Event Timeline
Model evaluation has been completed and below are the backtesting results:
Wiki | Precision@0.5 | Recall@0.5
---|---|---
smwiki | 0.87 | 0.68
snwiki | 0.64 | 0.16
sowiki | 0.71 | 0.39
sqwiki | 0.89 | 0.58
srwiki | 0.90 | 0.47
srnwiki | 0.98 | 0.77
sswiki | 0.92 | 0.38
stwiki | 0.99 | 0.82
stqwiki | 0.90 | 0.72
suwiki | 0.98 | 0.81
swwiki | 0.88 | 0.63
szlwiki | 0.96 | 0.81
szywiki | 0.65 | 0.32
tawiki | 0.72 | 0.01
tcywiki | 0.88 | 0.11
tewiki | 0.79 | 0.13
tetwiki | 0.84 | 0.69
tgwiki | 0.90 | 0.61
thwiki | 0.72 | 0.21
CCing @MGerlach, in case he would like to add comments on the backtesting evaluation.
The conclusion from the backtesting results is that most of the languages look fine, with these exceptions:
- snwiki (0.64), sowiki (0.71), szywiki (0.65), tawiki (0.72), and thwiki (0.72) have a precision below the recommended 0.75.
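The threshold check can be reproduced from the table with a short filter. This is a minimal sketch, not the evaluation pipeline itself: the precision values are copied from the backtesting table in this task, and `below_threshold` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Precision@0.5 per wiki, copied from the backtesting table.
results='smwiki 0.87
snwiki 0.64
sowiki 0.71
sqwiki 0.89
srwiki 0.90
srnwiki 0.98
sswiki 0.92
stwiki 0.99
stqwiki 0.90
suwiki 0.98
swwiki 0.88
szlwiki 0.96
szywiki 0.65
tawiki 0.72
tcywiki 0.88
tewiki 0.79
tetwiki 0.84
tgwiki 0.90
thwiki 0.72'

# Read "wiki precision" pairs on stdin and print the wikis whose
# precision is strictly below the threshold given as $1.
below_threshold() {
  local threshold="$1"
  awk -v t="$threshold" '$2 + 0 < t + 0 { print $1 }'
}

printf '%s\n' "$results" | below_threshold 0.75
```

The filter only flags candidates for review. Whether a flagged model is still published remains a manual decision.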
Talked to @MGerlach about these results and agreed that sowiki, tawiki, thwiki should be published but snwiki, szywiki shouldn't.
@kostajh, we published datasets for the 17 of 19 models that passed the evaluation in this round.
I ran this script for adding the link-recommendation task type and populating the excluded sections entries:
    PHAB=T308142
    for WIKI in smwiki sowiki sqwiki srwiki srnwiki sswiki stwiki stqwiki suwiki szlwiki tawiki tcywiki tewiki tetwiki tgwiki thwiki; do
      ORIGIN=`mwscript getConfiguration.php $WIKI --settings 'wgCanonicalServer' --format json | jq --raw-output '.wgCanonicalServer'`
      mwscript extensions/GrowthExperiments/maintenance/changeWikiConfig.php $WIKI \
        --page MediaWiki:NewcomerTasks.json \
        --create-only \
        --json \
        --summary "Growth features configuration boilerplate ([[phab:$PHAB]])" \
        link-recommendation \
        '{ "type": "link-recommendation", "group": "easy" }'
      jq "select(.wiki==\"$WIKI\" and .probability > 0.25) | .section" wiki_sections.jsonl \
        | jq --slurp --compact-output "unique" \
        | mwscript extensions/GrowthExperiments/maintenance/changeWikiConfig.php $WIKI \
          --page MediaWiki:NewcomerTasks.json \
          --json \
          --summary "machine-generated configuration for excluding sections from link recommendations ([[phab:$PHAB]]), feel free to improve" \
          link-recommendation.excludedSections \
          "`cat`"
      echo "$ORIGIN/wiki/MediaWiki:NewcomerTasks.json"
      echo "$ORIGIN/w/index.php?title=MediaWiki:NewcomerTasks.json&diff=next"
      echo "Press <Enter> to continue"
      read # give time for manual verification
    done
Note that the script didn't populate excludedSections for stqwiki because that wiki has no entries in wiki_sections.jsonl; see T345562.
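A gap like the stqwiki one could be detected before running the script by checking each wiki against the JSONL file. This is a sketch under assumptions: the `wiki` field name is taken from the jq filter in the script, `missing_wikis` is a hypothetical helper, and plain `grep` is used only to keep the example dependency-free. On a host with `jq`, parsing the file properly would be more robust.

```shell
#!/usr/bin/env bash
# Print each wiki from the argument list that has no line in the JSONL file.
# $1 = path to the JSONL file, remaining args = wiki names to check.
missing_wikis() {
  local file="$1"
  shift
  local wiki
  for wiki in "$@"; do
    # Match "wiki": "<name>" with optional whitespace around the colon.
    if ! grep -Eq "\"wiki\"[[:space:]]*:[[:space:]]*\"$wiki\"" "$file"; then
      printf '%s\n' "$wiki"
    fi
  done
}
```

Running `missing_wikis wiki_sections.jsonl smwiki sowiki stqwiki` would print only the wikis that have no section data, which would then need their excludedSections handled separately.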
Change 974169 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):
[operations/mediawiki-config@master] GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis
Change 974169 merged by jenkins-bot:
[operations/mediawiki-config@master] GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis
Mentioned in SAL (#wikimedia-operations) [2023-11-15T14:39:11Z] <awight@deploy2002> Started scap: Backport for [[gerrit:974169|GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis (T308142 T308143)]]
Mentioned in SAL (#wikimedia-operations) [2023-11-15T14:41:55Z] <awight@deploy2002> sgimeno and awight: Backport for [[gerrit:974169|GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis (T308142 T308143)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
Mentioned in SAL (#wikimedia-operations) [2023-11-15T14:47:27Z] <awight@deploy2002> Finished scap: Backport for [[gerrit:974169|GrowthExperiments: enable AddLink backend for 16,17th rounds of wikis (T308142 T308143)]] (duration: 08m 16s)
Change 976804 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):
[operations/mediawiki-config@master] GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis
Change 976804 merged by jenkins-bot:
[operations/mediawiki-config@master] GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis
Mentioned in SAL (#wikimedia-operations) [2023-11-27T08:09:42Z] <taavi@deploy2002> Started scap: Backport for [[gerrit:976804|GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis (T308142 T308143)]]
Mentioned in SAL (#wikimedia-operations) [2023-11-27T08:18:49Z] <taavi@deploy2002> taavi and sgimeno: Backport for [[gerrit:976804|GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis (T308142 T308143)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
Mentioned in SAL (#wikimedia-operations) [2023-11-27T08:29:36Z] <taavi@deploy2002> Finished scap: Backport for [[gerrit:976804|GrowthExperiments: enable AddLink frontend for 16,17th rounds of wikis (T308142 T308143)]] (duration: 19m 54s)
Checked selected wikis from the list - leaving in the Test in Production column for monitoring during this week.