Wikidata:Requests for permissions/Bot/Emijrpbot 10
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Approved--Ymblanter (talk) 19:34, 20 March 2024 (UTC)[reply]
Emijrpbot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Emijrp (talk • contribs • logs)
Task/s: Create pairs of written work (Q47461344)/version, edition or translation (Q3331189) for works/editions in the National Library of Spain datos.bne.es project (license CC-0).
Function details: The bot first checks whether a work/edition has already been created by other users, using ISBN, title, OpenLibrary ID, and Goodreads ID. If an item already exists, it skips that work/edition. If not, it creates the work/edition pair, adding labels (es), descriptions (en, es), title, author, publication date, number of pages, etc., and multiple IDs (ISBN, OpenLibrary, Goodreads).
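The check-then-create step described above can be sketched as follows. This is only an illustration, not the bot's actual code: the `Record` type, `match_keys`, and `should_create` are hypothetical names, the ISBN is a made-up placeholder, and the request does not say how the four signals are combined. The sketch assumes the most conservative reading, where a match on any one of them is enough to skip.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical minimal view of a work/edition (not the bot's real model)."""
    title: str
    isbn: Optional[str] = None
    openlibrary_id: Optional[str] = None
    goodreads_id: Optional[str] = None


def normalise(value: str) -> str:
    """Lower-case and drop punctuation/spaces so trivial variants compare equal."""
    return "".join(ch for ch in value.casefold() if ch.isalnum())


def match_keys(record: Record) -> Set[Tuple[str, str]]:
    """Collect every (kind, value) pair that could identify the record."""
    keys = {("title", normalise(record.title))}
    if record.isbn:
        keys.add(("isbn", normalise(record.isbn)))
    if record.openlibrary_id:
        keys.add(("openlibrary", normalise(record.openlibrary_id)))
    if record.goodreads_id:
        keys.add(("goodreads", normalise(record.goodreads_id)))
    return keys


def should_create(candidate: Record, existing: Iterable[Record]) -> bool:
    """Assumption: a match on ANY key means the item exists, so skip it.

    Matching on title alone will skip some legitimately distinct books;
    that trade-off favours avoiding duplicates over completeness.
    """
    seen: Set[Tuple[str, str]] = set()
    for item in existing:
        seen |= match_keys(item)
    return match_keys(candidate).isdisjoint(seen)


# Placeholder data: the ISBN below is invented for the example.
existing_items = [Record(title="La guerra civil en Huelva", isbn="978-84-00000-00-0")]
print(should_create(Record(title="la guerra civil en huelva"), existing_items))
print(should_create(Record(title="Otro título", isbn="123"), existing_items))
```

In a real run the `existing` records would come from Wikidata lookups rather than an in-memory list.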
Examples for author XX943530:
- Lucha de historias, lucha de memorias. España, 2002-2015 (Q124737989): written work by Francisco Espinosa Maestre
- Lucha de historias, lucha de memorias. España, 2002-2015 (Q124737991): 2015 edition of written work by Francisco Espinosa Maestre
- Contra la República. Los "sucesos de Almonte" de 1932 (Q124733394): written work by Francisco Espinosa Maestre
- Contra la República. Los "sucesos de Almonte" de 1932 (Q124737398): 2012 edition of written work by Francisco Espinosa Maestre
--Emijrp (talk) 20:00, 7 March 2024 (UTC)[reply]
- Hello @Emijrp, do you have an idea of the number of items that will be created or edited? Is it the whole catalog of the National Library of Spain? -Framawiki (please notify !) (talk) 15:43, 9 March 2024 (UTC)[reply]
- Hello Framawiki, IMHO I don't think that all books in BNE are indexed in the catalog. The datos.bne.es project has over 4,000,000 objects I think, but not all of them are works/editions (it includes sound recordings, films, etc., which are excluded by my bot). I think that a realistic figure is 1,000,000 works/editions, but I want to start gradually, perhaps creating works/editions only for those authors (like Francisco Espinosa Maestre (Q5865630)) who already have an item here in Wikidata, so works/editions can be linked to them. In future iterations, I can improve the approach and create items for missing authors too and then import their works. Emijrp (talk) 16:08, 9 March 2024 (UTC)[reply]
- For more details, I only import stuff with ISBN, so the bot excludes anything before 1970. My interest is creating high-quality items, reducing duplicate errors to a minimum, and adding several unique IDs for different databases, helping other users/bots to improve the items easily. Emijrp (talk) 16:21, 9 March 2024 (UTC)[reply]
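The discussion does not say how the bot decides that a record "has an ISBN". One plausible filter, shown here purely as an assumption, is to require at least one ISBN that passes the standard ISBN-10 or ISBN-13 check-digit test. The function names are hypothetical.

```python
from typing import Iterable


def _clean(isbn: str) -> str:
    """Strip hyphens and spaces; upper-case so a trailing 'x' becomes 'X'."""
    return isbn.replace("-", "").replace(" ", "").upper()


def is_valid_isbn10(isbn: str) -> bool:
    """ISBN-10: weights 10..1, total divisible by 11; last char may be 'X' (=10)."""
    s = _clean(isbn)
    if len(s) != 10 or not s.isascii() or not s[:9].isdigit():
        return False
    if not (s[9].isdigit() or s[9] == "X"):
        return False
    digits = [int(c) for c in s[:9]] + [10 if s[9] == "X" else int(s[9])]
    return sum(w * d for w, d in zip(range(10, 0, -1), digits)) % 11 == 0


def is_valid_isbn13(isbn: str) -> bool:
    """ISBN-13: alternating weights 1 and 3, total divisible by 10."""
    s = _clean(isbn)
    if len(s) != 13 or not s.isascii() or not s.isdigit():
        return False
    return sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(s)) % 10 == 0


def has_usable_isbn(isbns: Iterable[str]) -> bool:
    """Assumed import filter: keep a record only if some ISBN validates."""
    return any(is_valid_isbn10(i) or is_valid_isbn13(i) for i in isbns)


# 978-0-306-40615-7 is a widely used textbook example of a valid ISBN-13.
print(has_usable_isbn(["978-0-306-40615-7"]))  # True
print(has_usable_isbn([]))                     # False
```

Checksum validation also catches transcription errors in the source catalogue, which helps when ISBN is the main key for detecting duplicates.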
- It seems we're working on very similar issues. Any plan on expanding it to entries from the Biblioteca Digital Hispánica (Q14931576) with BDH edition ID (P4956)? No ISBN or different databases, but most of those items can be imported to Commons and Wikisource. --Ignacio Rodríguez (talk) 15:05, 10 March 2024 (UTC)[reply]
- Hello @Ignacio Rodríguez: BDH is an interesting dataset and I don't rule out importing stuff from it in the future, but I'll limit the current task to works with ISBN. Once I finish this task (if it's finally approved) I will explore BDH and the non-ISBN works in BNE. Regards. Emijrp (talk) 15:28, 11 March 2024 (UTC)[reply]
- Write to me when you do it in the future. In Spanish, obviously. Ignacio Rodríguez (talk) 15:55, 11 March 2024 (UTC)[reply]
- @Emijrp: mostly, it seems very good. First, ideally, publication date (P577) should not be used on works (I know that some people want to do it, especially when there is no item for the specific edition, but that's not relevant here). Did you think of the case where the edition already has an item but not the work (and vice versa)? Is it just strictly pairs (one work, one edition) or do you have cases of works with multiple editions (where the FRBR-like structure becomes really relevant)? Finally, I see that you will filter to create only high-quality items; I strongly approve of that. Cheers, VIGNERON (talk) 18:17, 11 March 2024 (UTC)[reply]
- Hello VIGNERON. I can exclude publication date (P577) in works; I added it because it's in the recommendations in Wikidata:WikiProject Books. In case there is already an item for some edition, the bot will skip it (based on ISBN) and create additional editions for other ISBNs (if any). About bot behavior with works with multiple editions, here is an example:
- La guerra civil en Huelva (Q124805540): written work by Francisco Espinosa Maestre
- La guerra civil en Huelva (Q124805545): 1997 edition of written work by Francisco Espinosa Maestre
- La guerra civil en Huelva (Q124805534): 2018 edition of written work by Francisco Espinosa Maestre
- Regards. Emijrp (talk) 15:28, 12 March 2024 (UTC)[reply]
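A rough sketch of the one-work, many-editions behaviour discussed in this exchange follows. It is illustrative only: `plan_items`, its parameters, and the `"<new work>"` placeholder are hypothetical, and the ISBN strings are placeholders because the discussion does not list the real ones. It assumes that editions whose ISBN already exists are skipped, that the remaining editions link to the work via edition or translation of (P629), and that publication date (P577) is set on editions only, as VIGNERON suggests. Whether the real bot creates a missing work for an existing edition is not stated, so the sketch simply takes an optional existing work ID.

```python
from typing import Dict, List, Optional, Set, Tuple

WRITTEN_WORK = "Q47461344"   # written work
EDITION = "Q3331189"         # version, edition or translation


def plan_items(
    title: str,
    editions: List[Tuple[str, int]],   # (isbn, publication year)
    existing_isbns: Set[str],
    work_qid: Optional[str] = None,    # set when the work item already exists
) -> List[Dict[str, object]]:
    """Return the items that would be created, as plain dicts (no API calls)."""
    new_editions = [(isbn, year) for isbn, year in editions
                    if isbn not in existing_isbns]
    if not new_editions:
        return []  # everything is already present, nothing to do

    items: List[Dict[str, object]] = []
    work_ref = work_qid
    if work_ref is None:
        # No P577 on the work: dates live on the editions.
        items.append({"P31": WRITTEN_WORK, "title": title})
        work_ref = "<new work>"  # placeholder until the item gets a real ID

    for isbn, year in new_editions:
        items.append({
            "P31": EDITION,
            "title": title,
            "isbn": isbn,
            "P577": year,       # publication date, edition level only
            "P629": work_ref,   # edition or translation of
        })
    return items


# Placeholder ISBNs standing in for the 1997 and 2018 editions listed above.
plan = plan_items(
    "La guerra civil en Huelva",
    [("isbn-1997", 1997), ("isbn-2018", 2018)],
    existing_isbns=set(),
)
for item in plan:
    print(item)
```

Keeping the planning step free of API calls makes it easy to review what a batch would create before anything is written to Wikidata.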
- @Emijrp have you considered emailing the datos.bne.es admin team to ask for feedback? They are familiar with Wikidata (they also use us as an external ID). mailto:info.datosenlazados@bne.es. —Ismael Olea (talk) 09:34, 12 March 2024 (UTC)[reply]
- Hello Olea. Not really, as content is CC-0 and both projects already link each other. Regards. Emijrp (talk) 15:59, 12 March 2024 (UTC)[reply]
- Anyhow, when you start the task, consider sending them an informative email. I'm sure they will be happy to know :-) —Ismael Olea (talk) 16:26, 12 March 2024 (UTC)[reply]