Wikidata:Requests for permissions/Bot/AlepfuBot
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Approved --Ymblanter (talk) 05:36, 24 September 2014 (UTC)
AlepfuBot
Operator: Alepfu
Task/s: Update drug items with the statement drug action altered by (P769); see Warfarin (Q407431) for an example use.
Code: alepfubot.py
Function details: We are a team of researchers from the University of Pittsburgh and the Medical University of Vienna who want to develop an automated process for improving the medical content on Wikipedia. To that end, we want to develop a Wikidata bot that updates drug items with the statement significant drug interaction (P769); see (RS)-warfarin (Q407431) for an example use. We have acquired several datasets concerning drug-drug interactions, e.g. from the Office of the National Coordinator for Health Information Technology (ONC). Each dataset has four columns: "object-drug", "object-Drugbank-ID", "precipitant-drug", "precipitant-Drugbank-ID". The bot would perform the following steps:
- Download the datasets from a given HTTP resource
- Parse the dataset files
- For every object and precipitant, take its DrugBank ID and look it up in Wikidata
- If items for both drugs are found via their DrugBank IDs, add the statement drug action altered by (P769) to the object item, with the precipitant item as the value.
--Alepfu (talk) 13:10, 9 August 2014 (UTC)
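The four steps listed above could be sketched roughly as follows. This is a minimal illustration, not the code in alepfubot.py: it assumes the dataset files are CSV with a header row using the four column names from the request, and it takes the DrugBank ID lookup as a plain callable so that the logic can be followed without any Wikidata access. The names `download`, `parse_dataset`, `plan_statements` and `Interaction`, as well as the `DBX000n` and `Q_...` identifiers in the sample, are hypothetical placeholders.

```python
import csv
import io
import urllib.request
from typing import Callable, Dict, Iterable, Iterator, NamedTuple, Optional


class Interaction(NamedTuple):
    """One planned P769 statement: set on object_qid, with precipitant_qid as value."""
    object_qid: str
    precipitant_qid: str


def download(url: str) -> str:
    """Step 1: fetch a dataset file from an HTTP resource (UTF-8 is an assumption)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def parse_dataset(text: str) -> Iterator[Dict[str, str]]:
    """Step 2: parse the file (CSV with a header row is an assumption)."""
    return csv.DictReader(io.StringIO(text))


def plan_statements(
    rows: Iterable[Dict[str, str]],
    lookup: Callable[[str], Optional[str]],
) -> Iterator[Interaction]:
    """Steps 3 and 4: resolve both DrugBank IDs, keep only fully resolved pairs."""
    for row in rows:
        object_qid = lookup(row["object-Drugbank-ID"])
        precipitant_qid = lookup(row["precipitant-Drugbank-ID"])
        if object_qid and precipitant_qid:
            yield Interaction(object_qid, precipitant_qid)


# Hypothetical sample data: placeholders, not real DrugBank or Wikidata IDs.
SAMPLE = (
    "object-drug,object-Drugbank-ID,precipitant-drug,precipitant-Drugbank-ID\n"
    "drug A,DBX0001,drug B,DBX0002\n"
    "drug A,DBX0001,drug C,DBX0003\n"
)
KNOWN = {"DBX0001": "Q_OBJECT", "DBX0002": "Q_PRECIPITANT"}  # DBX0003 has no item

if __name__ == "__main__":
    for planned in plan_statements(parse_dataset(SAMPLE), KNOWN.get):
        print(planned)
```

In a real run, `lookup` would search Wikidata for the item holding the given DrugBank ID (P715), and each planned pair would then be written to the object item through the bot framework. Rows where either drug has no item are skipped, which matches the fourth step.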
- Sounds good. But there are only 2070 Wikidata items with a DrugBank ID (P715). Do you have any plans to import these IDs first? --Pasleim (talk) 13:37, 11 August 2014 (UTC)
- I think the 2070 DrugBank IDs already in Wikidata will be enough for a first version of the bot; at the moment our datasets have interactions for about 250 drugs. --Alepfu (talk) 08:34, 12 August 2014 (UTC)
- Notified participants of WikiProject Medicine: Tobias1984, Doc James, Bluerasberry, Gambo7, Daniel Mietchen, Andrew Su, Andrux, Pavel Dušek, Mvolz, User:Jtuom, Chris Mungall, ChristianKl, Gstupp, Sintakso, علاء, Adert, CFCF, Jtuom, Drchriswilliams, Okkn, CAPTAIN RAJU, LeadSongDog, Ozzie10aaaa, Marsupium, Netha Hussain, Abhijeet Safai, Seppi333, Shani Evenstein, Csisc, TiagoLubiana, ZI Jony, Antoine2711, JustScienceJS, Scossin, Josegustavomartins, Zeromonk, The Anome, Kasyap, JMagalhães, Ameer Fauri, CorraleH. --Tobias1984 (talk) 10:38, 12 August 2014 (UTC)
- Support - Great project. The bot should be able to set a source for the data. Test edits are missing. Is the bot based on Pywikibot? --Tobias1984 (talk) 10:49, 12 August 2014 (UTC)
- The bot will add sources to every statement it generates. What exactly do you mean by "test edits"? Yes, the bot will be based on Pywikibot. --Alepfu (talk) 12:17, 12 August 2014 (UTC)
- @Alepfu, Andrew Su: - Test edits mean that the bot should make about 50 edits to show that it functions correctly. I created an item for the source, Q17505343, so you could just use stated in (P248) = Q17505343 as the source. --Tobias1984 (talk) 10:12, 13 August 2014 (UTC)
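For illustration, a statement carrying such a reference could look like the following in Wikibase's JSON serialization. This is a sketch and not the bot's actual code: the helper names `item_snak`, `sourced_statement` and `first_batch` are made up here, `Q0` is a deliberate placeholder for the precipitant item, and the cap of 50 simply mirrors the number of test edits suggested above.

```python
from itertools import islice
from typing import Dict, Iterable, List


def item_snak(prop: str, qid: str) -> Dict:
    """Build a value snak whose value is a Wikidata item, e.g. ("P248", "Q17505343")."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {"entity-type": "item", "numeric-id": int(qid[1:])},
        },
    }


def sourced_statement(prop: str, value_qid: str, source_qid: str) -> Dict:
    """Build a statement with one reference: stated in (P248) = source_qid."""
    return {
        "type": "statement",
        "rank": "normal",
        "mainsnak": item_snak(prop, value_qid),
        "references": [{"snaks": {"P248": [item_snak("P248", source_qid)]}}],
    }


def first_batch(statements: Iterable[Dict], limit: int = 50) -> List[Dict]:
    """Keep only the first `limit` statements, e.g. for a supervised test run."""
    return list(islice(statements, limit))


if __name__ == "__main__":
    # "Q0" is a placeholder for the precipitant item, not a real Wikidata item.
    example = sourced_statement("P769", "Q0", "Q17505343")
    print(example["references"])
```

Keeping the reference inside the statement itself means every P769 claim can be traced to the item describing its dataset, which is the property reviewers in this discussion ask for.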
- Comment In principle I'd be highly supportive of more bot-imported biomedical data. I'm just not entirely clear what the source of the data is. For example, for (RS)-warfarin (Q407431), do you propose to import all the entries in this table? Cheers, Andrew Su (talk) 20:56, 12 August 2014 (UTC)
- The sources of the data are lists of interactions that have been extracted from publications, like [1]; these cover only the most important interactions. --Alepfu (talk) 07:28, 13 August 2014 (UTC)
- @Alepfu: Personally, I would like to see a full list of sources that this bot would be ingesting and using to populate Wikidata. For example, in the article you link to above, are you speaking specifically about Table 2? Are there other sources you imagine expanding to? Cheers, Andrew Su (talk) 17:07, 14 August 2014 (UTC)
- Like User:Andrew Su, I am concerned about tracking source data. Will you in the Wikidata interface also be able to associate the interactions you list with a citation to some source which confirms the existence of an interaction? The standard on English Wikipedia is to not make health assertions without a reference to a reliable source. You provided a list of the datasets but I am unable to understand this information. Can you present it in another way? Sources on Wikipedia are very important to our community, and it is difficult to give support if we cannot evaluate the quality of the original dataset. Blue Rasberry (talk) 15:23, 19 August 2014 (UTC)
- @Bluerasberry: I'm not quite sure what you mean. We plan to add a source reference to every statement we set; the reference will be another item about a publication, like Q17505343, which Tobias1984 created for our dataset. --Alepfu (talk) 08:37, 20 August 2014 (UTC)
- @Alepfu: Support. If the intent is to make a citation database in Wikidata and to link every assertion made about health interactions to a Wikidata item representing an academic publication, then I am very happy with your plans. I have always wanted Wikidata to work in this way. Blue Rasberry (talk) 20:44, 21 August 2014 (UTC)
- We are waiting here for test edits. --Ymblanter (talk) 16:17, 22 August 2014 (UTC)
- I like the idea of cataloging known drug interactions through Wikidata and of creating Wikidata items for every relevant source. Would appreciate more info on the source of those interactions, though, as well as some test edits. --Daniel Mietchen (talk) 07:37, 27 August 2014 (UTC)
- Sorry, I am busy at the moment; I think test edits will come in about 2 weeks. --Alepfu (talk) 08:57, 1 September 2014 (UTC)
- Notified participants of WikiProject Medicine: I have made some test edits. --Alepfu (talk) 15:41, 23 September 2014 (UTC)
- Support Everything looks good. --Tobias1984 (talk) 15:45, 23 September 2014 (UTC)
- Support I'm curious about the way you structured the citation/source (i.e. rather than just linking out to a PubMed ID). I think this has been discussed before on the list, but I'm not sure whether consensus was reached. It seems like a perfect bot task to take a source = PMID and turn the PMID into what you have there. If you already have the internals of such a bot, perhaps you could make it open source and share it with the community? --Genewiki123 (talk) 17:20, 23 September 2014 (UTC)