Wikidata:Requests for permissions/Bot/ADSBot English Statement
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Not done: this was not finished during Outreachy, and it is not clear who could take it on. Feel free to re-open if you want to work more on this. Thanks. Mike Peel (talk) 18:46, 24 September 2022 (UTC)
ADSEnglishBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Feliciss (talk • contribs • logs)
Task/s: Adding missing statements and statement-related properties to existing scholarly articles on Wikidata from the ADS database. Part of Outreachy Round 24.
Code: values_from_ads_to_paper_on_wiki.py
Function details: --Feliciss (talk) 11:24, 15 July 2022 (UTC)
- Find all ADS bibcodes on Wikidata through the Wikidata Query Service.
- Use each ADS bibcode as the key in the ADS bibcode field to find the paper in the ADS database.
- Add the missing statements and statement-related properties to the scholarly articles on Wikidata.
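The three steps above could be sketched roughly as follows. This is a minimal illustration, not the code in values_from_ads_to_paper_on_wiki.py: the function names, the field list and the placeholder bibcode are assumptions made for the example, and no network request is sent. It assumes the public Wikidata Query Service endpoint and the ADS search API, which expects an API token in an Authorization header.

```python
from typing import Dict, List, Set

# Public endpoints (assumed here; a real run also needs an ADS API token).
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
ADS_SEARCH_ENDPOINT = "https://api.adsabs.harvard.edu/v1/search/query"


def bibcode_sparql(limit: int) -> str:
    """Step 1: SPARQL selecting items that have an ADS bibcode (P819)."""
    return (
        "SELECT ?item ?bibcode WHERE { ?item wdt:P819 ?bibcode . } "
        f"LIMIT {int(limit)}"
    )


def ads_query_params(bibcode: str, fields: List[str]) -> Dict[str, str]:
    """Step 2: query parameters that look up one paper by its bibcode."""
    return {"q": f'bibcode:"{bibcode}"', "fl": ",".join(fields)}


def missing_fields(ads_record: Dict[str, object],
                   existing: Set[str]) -> Dict[str, object]:
    """Step 3: keep only non-empty ADS values the Wikidata item lacks."""
    return {
        key: value
        for key, value in ads_record.items()
        if key not in existing and value not in (None, "", [])
    }


if __name__ == "__main__":
    # Hypothetical placeholder values, not a real paper.
    print(bibcode_sparql(5))
    print(ads_query_params("0000tmp..0000....0X", ["title", "doi"]))
    print(missing_fields({"title": ["T"], "doi": ["d"]}, {"title"}))
```

Mapping each ADS field to a Wikidata property, and writing the statements, is left out on purpose, since that mapping is described in the item linked in the notes rather than here.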
Notes:
- This bot request is marked TBD. A document on how to continue it is available on Wikimedia Phabricator: https://phabricator.wikimedia.org/T316089. Anyone is welcome to contribute to this bot request.
- For those curious about which statements will be added to Wikidata from the ADS database, this item lists them: https://www.wikidata.org/wiki/Q112684896
- The original idea comes from Pathway 1 in a draft diagram on Wikimedia Phabricator, if anyone is interested: https://phab.wmfusercontent.org/file/data/lnlj5477majaglrd4eas/PHID-FILE-gidyiuwdukmtjap42zgi/Approach_to_Surnames_%282%29.png
- The bot runs regularly, in case a new article with an ADS bibcode is added to Wikidata and that article is missing information.
- The preceding unsigned comment was added by Feliciss (talk • contribs) at 11:36, 15 July 2022 (UTC).
- Hi Feliciss - this all sounds reasonable. I'm curious why you're starting from surname lists rather than from the actual article database itself? ArthurPSmith (talk) 16:14, 15 July 2022 (UTC)
- Hi. To answer your question, let me give an outline and explain some limitations of the ADS database.
- TL;DR: using author surnames from Wikidata as a criterion to search for articles in the ADS database and add them to Wikidata is a long-term solution, because new surnames are always being added to Wikidata.
- Firstly, as you may know, I'm working in the Outreachy internship program this summer, for about 2 months. Handling many or all article databases would be extremely difficult within this internship, so I chose the ADS database as my demo.
- Secondly, there is only one way to import the ADS article database into Wikidata: searching for relevant articles in ADS with two filters that exist for every paper in the database, author names and article titles (or the ADS bibcode).
- However, author names and article titles have some limitations:
- - The title isn't always meaningful. It may contain special characters, which makes it hard to process.
- - Author full names are also tricky, as an author's first names are sometimes given only as initials.
- So, in conclusion, surnames are constant, unambiguous, unchangeable and eternal (in most cases). They are ready to use and have had positive feedback from Wikidata, as I'm using the surname lists from Wikidata, which makes the process auditable.
- You may also wonder why I didn't use ADS bibcodes from the ADS database to match articles on Wikidata and add the missing statements. At the moment I don't think that is a long-term solution, as I don't know how many articles with an ADS bibcode are added to the ADS database in a short time.
- PS: this is a project related to author names. If you're interested in learning more about my strategy for handling author names, see this wiki page: https://www.wikidata.org/wiki/User:Feliciss/StrategyOfNames Feliciss (talk) 05:37, 18 July 2022 (UTC)
- Support. Feliciss is an Outreachy intern, being mentored by User:Mike Peel and me. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:48, 17 July 2022 (UTC)
- Support Huh, I thought I'd responded here, but it seems lost. Anyway, I still don't really understand the surnames restriction, but the data looks fine so let's get this going. ArthurPSmith (talk) 16:19, 25 July 2022 (UTC)
- Hi @ArthurPSmith. Sorry for the late update. It's fine to remove the surname restriction from this bot request, as there's a direct way to use the ADS bibcode, instead of surnames, to find existing scholarly articles on Wikidata and add information to them.
- I find all articles that have an ADS bibcode with this query, and the count query for those items currently returns 750373 articles, to which I'll add statements and properties.
- I'll get back to working on this bot request after I finish running Wikidata:Requests for permissions/Bot/ADSBot English Paper on Toolforge, if there is time remaining in the Outreachy internship. Feliciss (talk) 09:32, 26 July 2022 (UTC)
- And here's the query for all of the values of P819 that I can put into the ADS search engine. Feliciss (talk) 09:47, 26 July 2022 (UTC)
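For readers without access to the linked queries, a count query and a way to work through the bibcodes in manageable batches might look like the sketch below. The queries referred to in the comments above are not reproduced here: the SPARQL string and the helper names `parse_count` and `batches` are hypothetical, the batch size is arbitrary, and the sample response only imitates the standard SPARQL JSON results layout.

```python
from typing import Dict, Iterator, List

# Counts items that carry an ADS bibcode (P819); an assumed equivalent
# of the count query mentioned above, not a copy of it.
COUNT_SPARQL = (
    "SELECT (COUNT(?item) AS ?count) WHERE { ?item wdt:P819 ?bibcode . }"
)


def parse_count(result: Dict) -> int:
    """Read the single ?count binding from a SPARQL JSON result."""
    binding = result["results"]["bindings"][0]
    return int(binding["count"]["value"])


def batches(bibcodes: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` bibcodes."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(bibcodes), size):
        yield bibcodes[start:start + size]


if __name__ == "__main__":
    # Sample response shaped like WDQS output; the figure is the one
    # quoted in the discussion.
    sample = {"results": {"bindings": [{"count": {"value": "750373"}}]}}
    print(parse_count(sample))
    print(list(batches(["a", "b", "c"], 2)))
```

Working in batches keeps each run small, which also fits a limited test run before the bot is used at full scale.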
- Please register the bot and let it perform a test run of 50-250 edits. Lymantria (talk) 10:41, 27 July 2022 (UTC)
- Hi Lymantria. This is a WIP task, and I expect to start working on it after I've finished running https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/ADSBot_English_Paper on Toolforge. Thanks. Feliciss (talk) 08:03, 5 August 2022 (UTC)
- I'm changing the description to TBD, as I don't think I will finish this in the last week of my internship. I'll document what I want to achieve with this bot request and look for someone in the Wikidata community to hand it over to. Feliciss (talk) 10:25, 19 August 2022 (UTC)
- The document is now available on Phabricator: https://phabricator.wikimedia.org/T316089. Feliciss (talk) 10:02, 24 August 2022 (UTC)