Please use this page to report any issue or comment concerning OAbot's activity.

You can also check out our FAQ.

I see that your bot is adding links to copies of articles on CiteSeerX; for example this change to abstract data type. CiteSeerX is very indiscriminate about where it gets copies of papers from; the accuracy of such a copy vis-a-vis the officially published version, and the provenance of the copy, cannot be verified by a bot. In this particular case, the online copy appears to be taken from a class web site unaffiliated with the authors of the original paper. The owner of the course web site is probably safe from copyright violation as course reading lists have powerful fair use exemptions to copyright, but CiteSeerX and Wikipedia do not. Using it here appears to be a copyright violation and a violation of WP:ELNEVER. If your bot cannot make such judgements accurately, it should not be making them at all. —David Eppstein (talk) 22:46, 26 February 2017 (UTC)

For the record, this concern is discussed here. − Pintoch (talk) 08:24, 27 February 2017 (UTC)Reply
Yes, I posted this before discovering that page, but let's centralize the discussion there. —David Eppstein (talk) 08:54, 27 February 2017 (UTC)Reply
OAbot is again adding CiteSeerX links, apparently automatically. The next one I see after this warning that is not traceable back to the author or publisher will lead to a block. —David Eppstein (talk) 18:09, 12 October 2019 (UTC)Reply

Biorxiv

In [1], the bot added a CHS Press doi, thinking it was a biorxiv doi. Please update the behaviour. Headbomb {talk / contribs / physics / books} 17:05, 23 March 2017 (UTC)Reply

@Headbomb: done, thanks. − Pintoch (talk) 17:59, 23 March 2017 (UTC)Reply
Instead of removing the section outright, you might want to simply check that such DOI point to the biorxiv repository. E.g. if you follow doi:10.1101/063081 and the link resolves to http://biorxiv.org/... then it's a biorxiv doi. Headbomb {talk / contribs / physics / books} 01:14, 24 March 2017 (UTC)Reply
@Headbomb: Unfortunately I do not have the time to do that. If anybody wants to implement that, I will be happy to merge it. − Pintoch (talk) 10:01, 24 March 2017 (UTC)Reply
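
The check suggested above (follow the DOI and see where it lands) could be sketched roughly as follows. This is only an illustration, not OAbot's code: the function name is hypothetical, and it assumes the caller has already followed the doi.org redirect and passes in the final URL.

```python
from urllib.parse import urlparse


def is_biorxiv_url(resolved_url: str) -> bool:
    """Return True if an already-resolved DOI URL lands on biorxiv.org.

    Hypothetical helper: following the doi.org redirect is left to the
    caller, so this function only inspects the host name.
    """
    host = (urlparse(resolved_url).hostname or "").lower()
    # Accept the bare domain and any subdomain such as www.biorxiv.org.
    return host == "biorxiv.org" or host.endswith(".biorxiv.org")
```

A 10.1101 DOI that resolves anywhere else would then be left alone rather than written to |biorxiv=.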

Please tag as "bot"

Hi, a lot of your recent changes such as to Sea urchin were NOT tagged as coming from a bot and so couldn't be filtered out. Chiswick Chap (talk) 05:22, 12 May 2018 (UTC)Reply

Thanks for noting it; I was about to check it this morning. It seems the bot flag is being ignored, we'll perform some more checks with Pintoch. --Nemo 09:50, 12 May 2018 (UTC)Reply
Hi Headbomb, thanks for approving the bot. It seems to me that although the BRFA has been closed, the account has not been tagged with the bot flag. Where should we request that? − Pintoch (talk) 11:49, 15 May 2018 (UTC)Reply
@Xaosflux: should be able to help here. Headbomb {t · c · p · b} 15:04, 15 May 2018 (UTC)Reply

Hi, this account is flagged as bot. Please note, in your bot software you must assert that you are a bot on edits to have the bot flag applied when using the writeapi. — xaosflux Talk 15:11, 15 May 2018 (UTC)

Yes, that was fixed. The problem was using OAuth credentials rather than a password. --Nemo 18:15, 15 May 2018 (UTC)Reply
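
For readers unfamiliar with the point about asserting bot status: in the MediaWiki Action API an edit request can carry a `bot` parameter (mark the edit as a bot edit) and `assert=bot` (fail if the session lacks the bot right). A minimal sketch of such a request payload, assuming the caller already holds a CSRF token; the helper name is made up here and this is not how OAbot itself builds its requests.

```python
def build_edit_params(title: str, text: str, summary: str, csrf_token: str) -> dict:
    """Build POST parameters for action=edit that mark the edit as a bot edit.

    Hypothetical helper for illustration; authentication and the HTTP
    request itself are out of scope.
    """
    return {
        "action": "edit",
        "title": title,
        "text": text,
        "summary": summary,
        "bot": "1",        # flag this edit as a bot edit
        "assert": "bot",   # refuse the edit if the account lacks the bot right
        "token": csrf_token,
        "format": "json",
    }
```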

Change matching

The bot briefly added links to hal.upmc.fr instead of pmc= identifiers. The error has been corrected. --Nemo 16:46, 16 May 2018 (UTC)Reply

See this edit. OAbot added |pmc=2000340 to a {{cite journal}} that had this as its title: |title=The coupling of synthesis and partitioning of EBV's [[plasmid]] replicon is revealed in live cells. Note that the value assigned to |title= contains a wikilink so the rendered citation had a URL–wikilink conflict error:

Nanbo, Asuka; Arthur Sugden; Bill Sugden (2007). "The coupling of synthesis and partitioning of EBV's [[plasmid]] replicon is revealed in live cells". The European Molecular Biology Organization Journal. 26 (19): 4252–4262. doi:10.1038/sj.emboj.7601853. PMC 2000340. {{cite journal}}: URL–wikilink conflict (help)

The bot should not be creating errors for editors to clean up.

Trappist the monk (talk) 14:22, 9 June 2018 (UTC)Reply

Hm, I thought this had been fixed but I'm not sure how. What's the recommended way to proceed? I'd remove the wikilink if anything. Alternatively, the template could avoid linkifying such titles. --Nemo 19:14, 9 June 2018 (UTC)Reply
I don't think we should be wikilinking terms in titles, as this example does. On the other hand, some references are themselves notable, and in those cases the title seems to be the logical place to put the link to the article about that reference. —David Eppstein (talk) 19:18, 9 June 2018 (UTC)Reply
I guess I would suggest:
  1. if the content of |title= is not wholly wikilinked:
    1. remove the wikilink(s) (the example above)
  2. if the whole of |title= is wikilinked or if the template has |title-link= then:
    1. do-not add |pmc= or
    2. add |pmc= with the identifier commented out (|pmc=<!--pmc identifier-->) or
    3. add |id={{pmc|pmc identifier}}
I agree that we should not be wikilinking individual terms or phrases in a template's title-holding parameters because such links, while perhaps useful in article text, are not really likely to help readers locate a copy of the source. We might modify Module:Citation/CS1 to detect wikilinked terms and phrases – that's a topic for WT:CS1.
Trappist the monk (talk) 19:58, 9 June 2018 (UTC)Reply
Still not fixed. See this edit and this edit. Do not break cs1|2 citations and leave the mess for editors to clean up.
Trappist the monk (talk) 14:16, 9 April 2019 (UTC)Reply
I was going to fix them manually myself as they are so rare and they all go into Category:CS1 errors: URL–wikilink conflict (or don't they?). Nemo 17:59, 9 April 2019 (UTC)Reply
Still not fixed. Do not break cs1|2 citations and leave the mess for editors to clean up.
Trappist the monk (talk) 15:20, 23 July 2019 (UTC)Reply
I'm fixing those, no worries. Nemo 15:31, 23 July 2019 (UTC)Reply
Apparently not, see this edit. Just fix the bot so that neither you nor I have to fix the broken templates.
Trappist the monk (talk) 13:19, 29 July 2019 (UTC)Reply
The bot is not approved to change titles and skipping identifiers would be a loss, so I prefer to fix those few cases manually. Nemo 15:36, 29 July 2019 (UTC)Reply
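
The decision procedure suggested earlier in this thread can be sketched as a small planning function. It only classifies a citation; whether the bot may act on the result (for example by unlinking a title) is exactly what is disputed above. The function names and return labels are hypothetical.

```python
import re

# Matches [[target]] and [[target|label]]; group 1 is the displayed text.
WIKILINK = re.compile(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]")


def plan_pmc_addition(params: dict) -> str:
    """Classify a citation before adding |pmc= (labels are illustrative)."""
    title = params.get("title", "").strip()
    if params.get("title-link", "").strip() or WIKILINK.fullmatch(title):
        # Whole title is linked: skip, comment out, or use |id={{pmc|...}}.
        return "skip-or-use-id"
    if WIKILINK.search(title):
        # Only a term inside the title is linked.
        return "unlink-then-add"
    return "add"


def strip_wikilinks(title: str) -> str:
    """Replace each wikilink in a title with its displayed text."""
    return WIKILINK.sub(r"\1", title)
```

For the example above, `plan_pmc_addition` returns "unlink-then-add" and `strip_wikilinks` turns "EBV's [[plasmid]] replicon" into "EBV's plasmid replicon".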

OAbot adding redundant parameters

Why, if a citation has the parameter PMC, and that field is not empty, is OAbot adding pmc=? Edits such as this and this result in CS1 errors where there were none before. Whilst I've not yet seen the same problem with pmid/PMID, it seems possible. Perhaps you could make parameter names case-insensitive? Cheers, BlackcurrantTea (talk) 13:43, 23 June 2018 (UTC)Reply

A valid question. Perhaps we could run a bot to change all "PMC=" to "pmc="? As far as I can see, the uppercase is non-standard. There are only 350 such usages currently, so I think it's best to fix the odd syntax rather than optimise for edge cases. --Nemo 07:13, 26 June 2018 (UTC)Reply
Edge case? In cs1|2, all identifier parameter names may be written uppercase or lowercase (mixed case not accepted). When deciding to add an identifier to a cs1|2 template, bots must look for all of the accepted parameter name forms or aliases before making the addition. This to me is only common sense.
Trappist the monk (talk) 08:37, 26 June 2018 (UTC)Reply
I'd add that limiting PMC and PMID to lowercase is counter-intuitive. When editors see them in a potential source, e.g. here, and when they appear in references in articles, they're capitalised. As dois are lowercase, editors are less likely to use DOI; those might be the edge cases. BlackcurrantTea (talk) 10:22, 26 June 2018 (UTC)Reply
Lowercase is by far more standard, and uppercase will be normalized to lowercase by bots/awb most of the time. So please use that. Uppercase is just to make it friendlier to people that may not know the convention. Headbomb {t · c · p · b} 10:33, 26 June 2018 (UTC)Reply
The acceptability of identifier parameter names in either uppercase or lowercase is documented, e.g. in cite journal for eissn/EISSN and isbn/ISBN. Although PMC and PMID are undocumented, they function in the template as pmc and pmid do. It doesn't make sense to me to require editors to adapt to the bot rather than adapting the bot to editors. BlackcurrantTea (talk) 10:40, 28 June 2018 (UTC)Reply
Indeed, which is why bots can and do fix the inconsistencies all the time to simplify the users' job. --Nemo 11:25, 28 June 2018 (UTC)Reply
I see others have mentioned this. Nemo, as you're one of the maintainers, have we a chance of this being fixed? BlackcurrantTea (talk) 22:55, 5 July 2018 (UTC)Reply
There are now fewer than 100 instances that require fixing; we'll take care of that with the bot in the near future. --Nemo 10:17, 2 August 2018 (UTC)
That's good news. Thank you. BlackcurrantTea (talk) 04:12, 11 April 2019 (UTC)Reply
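
The check requested in this thread (look for every accepted form of the parameter name before adding an identifier) might look like the sketch below. It assumes, as stated above, that cs1|2 accepts identifier names in all-lowercase or all-uppercase but not mixed case; the function name is hypothetical.

```python
def has_identifier(params: dict, name: str) -> bool:
    """Return True if the identifier is already present and non-empty.

    Checks the all-lowercase and all-uppercase spellings only, since
    mixed-case names are described above as not accepted by cs1|2.
    """
    for key in (name.lower(), name.upper()):
        if params.get(key, "").strip():
            return True
    return False
```

With this, a citation carrying |PMC=2000340 would be treated as already having a pmc identifier, so no duplicate |pmc= would be added.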

Hello, at rheumatic fever, the bot added a PMC link to an article from 1938 to a PMID from a 2012 article with the same title. Graham87 03:18, 7 April 2019 (UTC)Reply

Thanks! I see at https://dissem.in/p/70990242/rheumatic-heart-disease that this is one of those relatively rare but very annoying cases where dozens or even hundreds of articles have been published with the same title. We already have a patch for it at https://github.com/dissemin/dissemin/issues/512 and hopefully it will be fixed within a couple of weeks. Nemo 07:29, 7 April 2019 (UTC)Reply
Another example here. Please don't just match on title; please match on additional key parameters such as journal name, year, volume, page start number etc. I am surprised it was ever thought suitable to just match on title. Otherwise the bot is sometimes adding PMCs for papers with the same or nearly the same name by different authors, or a different version of a dated paper, or reprints 50 years later in different journals. While reprints may be the same paper, it is NOT in my view acceptable to simply add the PMC for a reprint (you don't know if it's a full reprint, partial, edited etc.) - if the original paper has no full free text and the reprint does then the bot would need to have approval to add a separate cite/link for the reprint so it's clear it is a reprint. Rjwilmsi 06:32, 22 June 2019 (UTC)
I'm not sure what made you think that this is the fault of a title match: in fact, in your example, authors and journal match, while the title is different. It seems my patch to avoid such overmerging on Dissemin is not going to be merged, so I'll try and add some more post-suggestion checks. Nemo 11:56, 21 July 2019 (UTC)Reply

I'm recreating the link suggestions with the new code and will launch a new bot run today. Nemo 07:51, 23 July 2019 (UTC)Reply

I've sampled a number of edits and they were all helpful. Nemo 20:28, 23 July 2019 (UTC)Reply
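
As an illustration of the kind of post-suggestion check discussed in this thread, the sketch below rejects a candidate whenever a field known on both sides disagrees. It is not the Dissemin patch or OAbot's actual code; the field names and the function name are assumptions made for the example.

```python
def metadata_agrees(citation: dict, candidate: dict,
                    fields=("journal", "year", "volume", "first_page")) -> bool:
    """Return False if any field present on both sides differs.

    Fields missing on either side are ignored, so this can only veto a
    match; it never confirms one on its own.
    """
    def norm(value) -> str:
        # Case-insensitive comparison with collapsed whitespace.
        return " ".join(str(value).lower().split())

    for field in fields:
        a, b = citation.get(field), candidate.get(field)
        if a and b and norm(a) != norm(b):
            return False
    return True
```

In the rheumatic fever case above, a 1938 citation and a 2012 candidate with the same title would disagree on the year and be rejected.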

url vs chapter-url

When a citation is of a chapter in a book (|chapter= or |contribution= is present), the bot needs to distinguish between a URL for the chapter vs one for the book. For example, in this edit, the bot found a URL for the cited chapter (not the whole book) and put it in |url=, when it should have put it in |chapter-url=. If it had done that, it might have noticed that |chapter-url= already contained an equivalent URL. Kanguole 13:00, 2 October 2019 (UTC)Reply

Thank you for the report. That edit is determined by the presence of the DOI: the citation is about the specific chapter, not about the book, otherwise the DOI would be wrong. It's therefore correct to use the URL parameter, although I agree it's better not to have two URLs pointing to the same resource. Nemo 16:02, 2 October 2019 (UTC)Reply
The DOI points at the chapter, so surely the corresponding URL would belong in |chapter-url=, because |url= is for a URL for the whole book. Kanguole 16:14, 2 October 2019 (UTC)Reply
What I read in Template:Citation#URL doesn't confirm it. The URL parameter points to the "publication" i.e. the entire work, but both the book and the individual chapter can be considered works by themselves (otherwise the chapter wouldn't have a DOI). If you want to make it clear that the citation is about the book, it's advisable to use {{cite book}}.
Again, I'm not saying I disagree with your suggestion, I'm just explaining why OAbot ended up suggesting that URL. Nemo 17:44, 2 October 2019 (UTC)Reply
To cite a chapter, one can use either {{citation}} or {{cite book}} with |chapter= – they both work in the same way but give slightly different formatting. The documentation isn't great, but there are separate parameters |url= and |chapter-url=, with the former attaching a link to the book title and the latter attaching it to the chapter name. Kanguole 17:56, 2 October 2019 (UTC)Reply
Yes. Hence, if you want the citation to be about the book, using {{cite book}} is the clearest option. Nemo 05:47, 3 October 2019 (UTC)Reply
I'm trying to explain to you that the choice between those templates is supposed to be based on formatting (punctuation and capitalization) and whether |href= is set by default, not whether bots misunderstand them. If the bot isn't going to be fixed, I guess the exclusion template is the easiest answer. Kanguole 07:38, 3 October 2019 (UTC)Reply
No, the choice of templates should be dictated by what those templates are designed to do. Are you saying that you want to cite a book but avoid using the apposite template {{cite book}} because of formatting preferences? Nemo 09:22, 3 October 2019 (UTC)Reply
No, I'm saying that {{citation}} (citation style 2) is intended as an alternative to the family of cite XXX templates (citation style 1). If the bot cannot handle a {{citation}} containing |chapter= (or |contribution=), it should leave it alone. Kanguole 10:49, 3 October 2019 (UTC)Reply
Thank you! I had definitely not understood this was your aim. As far as I know, {{cite book}} and friends can use CS2 as well, by setting the mode parameter. Did I miss something? Nemo 11:25, 3 October 2019 (UTC)Reply
If I have an article full of perfectly valid {{citation}} templates, it seems unreasonable to have to change one of them to {{cite book}} with |mode=cs2 just because a bot doesn't handle the template correctly. Kanguole 11:31, 3 October 2019 (UTC)Reply
Not because of the bot, but because the {{citation}} template isn't able to convey the information you intend (that the citation is about the entire book rather than the chapter only). Nemo 11:50, 3 October 2019 (UTC)Reply
It is: whether using {{citation}} or {{cite book}}, it is the presence of |chapter= that indicates that what is being cited is the chapter, not the whole book. In that situation, the URL found should go in |chapter-url=, not |url=. So the Phab task isn't quite right as stated: it's the presence of |chapter= that triggers the problem, not the presence of |chapter-url=. Kanguole 16:13, 3 October 2019 (UTC)Reply
Further on this point: {{citation}} treats whatever it is citing as a book when all of the 'work' parameters are omitted or empty: |journal=, |magazine=, |newspaper=, |periodical=, |website=, |work=. There is an oddity when |encyclopedia= is set but I don't think that is at issue here.
|chapter= has these aliases: |contribution=, |entry=, |article=, and |section=; each has its own matching |<param>-url= parameter. For the purposes of semantics, the pairs should match.
Trappist the monk (talk) 18:20, 3 October 2019 (UTC)Reply
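
The pairing rule described above (each chapter alias has its own matching -url parameter) could be expressed as the following sketch. It assumes the alias list given in this thread is complete and ignores the |encyclopedia= oddity; the function name is hypothetical.

```python
# Aliases of |chapter= as listed in the discussion above.
CHAPTER_ALIASES = ("chapter", "contribution", "entry", "article", "section")


def url_parameter_for(params: dict) -> str:
    """Pick the URL parameter that matches what the citation is citing.

    If a chapter-like parameter is set, return its paired "-url" name;
    otherwise fall back to plain "url".
    """
    for alias in CHAPTER_ALIASES:
        if params.get(alias, "").strip():
            return f"{alias}-url"
    return "url"
```

A bot using this would also look at the returned parameter before writing, which is where an existing equivalent |chapter-url= would be noticed.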

Blocked

I have blocked OAbot for adding copyvio links to references, after a previous warning was ignored. Specifically, the bot is adding CiteSeerX links without checking whether the links trace back to an author or publisher (not a copyvio), or to somebody else. Additionally, I don't believe the addition of such links was ever in the bot's remit; my recollection is that when the bot was reviewed, this issue was specifically discussed and removed from the list of approved bot tasks. As an example of a bad edit, see this diff, where the bot adds a citeseer link to a paper by László Székely, but the citeseer provenance of the link is to web pages of Micha Sharir and Bill Gasarch (neither of whom is an author or publisher of the paper). —David Eppstein (talk) 19:39, 12 October 2019 (UTC)Reply

Considering you opposed the task which was approved to perform these edits, I would consider this block WP:INVOLVED and I suggest that you reverse it, asking the intervention of an uninvolved admin instead.
There is no copyright infringement in that diff and the link is explicitly allowed by WP:COPYLINKS anyway. Nemo 23:14, 12 October 2019 (UTC)Reply
I have asked for administrative review of both the block and my involvement; see WP:ANI#Request for block review. —David Eppstein (talk) 00:35, 13 October 2019 (UTC)Reply
David Eppstein is correct here. OABot 3 was about flagging existing identifiers as free, and adding free dois and hdls and the like. CiteSeerX has been deemed too contentious to add automatically in the past and OABot 3 does not overturn that consensus. Headbomb {t · c · p · b} 00:37, 13 October 2019 (UTC)

April 2020

This user's unblock request has been reviewed by an administrator, who accepted the request.



Request reason:

After various requests, and having consulted the sole other active bot operator Pintoch, I request unblock of User:OAbot to add doi/hdl/arxiv/pmc parameters. Details below. Nemo 16:14, 10 April 2020 (UTC)Reply

Accept reason:

Ok for runs not including the addition of citeseerx, the reason for the block David Eppstein (talk) 16:41, 10 April 2020 (UTC)Reply

According to consensus and bot task 3 (and previous), for this run the bot will be launched with the command bot.py (hdl|doi|pmc|arxiv), which adds only those identifiers (and corresponding parameters like doi-access=free) and adds neither CiteSeerX parameters nor any URL.

For the sake of transparency, some statistics about the edits the bot will attempt to do: after having gone through most of the articles with relevant citations, we have found about 75k articles to work on (each requires a single edit) and the parameters to be touched have the following frequency so far:

 145759 doi
   7758 hdl
   1180 pmc
    464 arxiv

So this unblock is 99 % about adding doi-access=free and hdl-access=free to citations where the doi and hdl have been added by others (including recent citation cleanups). The addition of pmc and arxiv parameters has never been controversial but I can do these separately in the future if anyone prefers so.
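
The "99 %" figure can be checked from the counts listed above. A quick calculation, using only those four numbers:

```python
# Parameter counts as posted in the unblock request above.
counts = {"doi": 145759, "hdl": 7758, "pmc": 1180, "arxiv": 464}

total = sum(counts.values())
# Share of touched parameters that concern doi or hdl.
share = 100 * (counts["doi"] + counts["hdl"]) / total

print(f"{total} parameters in total, {share:.1f}% of them doi or hdl")
```

This gives 155161 parameters in total, of which about 98.9% are doi or hdl, consistent with the rounded 99 %.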

As a reminder, the operators of User:OAbot are not directly responsible for edits made with the sibling tool by individual users, some of whom remain separately blocked. Nemo 16:14, 10 April 2020 (UTC)Reply

Regarding this edit, I am unable to find access to the free full text in any of those links, yet the source is flagged by OABot. I don't speak bot; could someone explain, and help me locate a URL to free full text? @Nemo bis and Pintoch: SandyGeorgia (Talk) 15:13, 17 April 2020 (UTC)Reply

Thanks for your question. I'm not sure I understand what you are having problems with, though. The way to locate the full text is normally to
  1. go to the References section;
  2. choose a green lock;
  3. click the link before the green lock;
  4. look for the full text in the HTML page itself, or for a prominent download button or icon, or for some other link to HTML or PDF or other.
So for instance note 6 links https://hdl.handle.net/10871%2F36535 which has a download icon near the top left that links to the PDF [2].
Does this answer the question? Nemo 16:58, 17 April 2020 (UTC)Reply
In general search for "PDF" is a good quick way to find something, but here it's clearly marked with a download icon. Search for PDF also works, if you missed the icon. Headbomb {t · c · p · b} 18:06, 17 April 2020 (UTC)Reply
OK, I will start over, so you all will see how much I don't understand what is happening here.
  • But I finally figured out that I can find the PDF by clicking on the hdl link (I had never heard of hdl and did not know to click there-- I don't think our readers will either)
My additional confusion might be better understood by looking at Karel Styblo
  • I see a green link on the first citation, and clicking on that takes me to a free full-text URL
  • But the bot just added something to the Migliori citation which is different; there is no green OA lock.
    • But if I go to the Migliori citation, I can find a full-text PDF by clicking on the DOI, which is inconsistent with the Dementia with Lewy bodies situation, where I have to click on the hdl.
My aim is consistent citations, and I don't know why there is a green link on some, but not others, or why I have to click on hdl for the link on one, but DOI for another, and yet PMC for another-- confusing to readers ? Does this explain my confusion and need for clarification? SandyGeorgia (Talk) 18:30, 17 April 2020 (UTC)Reply

@SandyGeorgia: Do you not see the green lock? The reason there is a green lock on some, and not others, is that those with a green lock have been identified as open access resources. Like the Migliori doi, which is marked with a green lock. I don't know why you don't see it. Headbomb {t · c · p · b} 18:52, 17 April 2020 (UTC)

The plot thickens. I see the green lock in the link you give above. On my iPad, I see the green locks in the articles linked above. On my PC, I do not see the green locks in the articles, either with Google Chrome or with IE. It's a browser thing. But it is still odd that I can see some of the green locks on my PC, but not others. SandyGeorgia (Talk) 19:23, 17 April 2020 (UTC)Reply
Sometimes the locks do not load for some JavaScript or CSS failure in my browser, but a refresh fixes it. Just for the sake of clarity, I've uploaded some screenshots of references with green locks which should be what we're supposed to be seeing (apart from custom fonts and skins). The green/red squares I've added myself, of course. Nemo 19:24, 17 April 2020 (UTC)Reply
OK, so I shall stop worrying, then, about whether I see the green lock ... sorry for all the questions! SandyGeorgia (Talk) 19:37, 17 April 2020 (UTC)Reply
Sounds like a caching/WP:PURGE issue. Headbomb {t · c · p · b} 19:48, 17 April 2020 (UTC)Reply
Oh, yes ... that did the trick. Unwatching now-- thanks for the help, SandyGeorgia (Talk) 22:58, 17 April 2020 (UTC)Reply

Non-free flagged as free

https://en.wikipedia.org/w/index.php?title=MNDO&curid=2235160&diff=951504589&oldid=913941897 probably not your problem. Probably upstream data, but no full text. AManWithNoPlan (talk) 11:29, 18 April 2020 (UTC)Reply

Reported. Nemo 21:10, 18 April 2020 (UTC)Reply
How/where does one report those? AManWithNoPlan (talk) 22:30, 19 April 2020 (UTC)Reply
See https://support.unpaywall.org/support/solutions/folders/44000384007 Nemo 22:36, 19 April 2020 (UTC)Reply

Better edit summary, please

Can you please fix the generated edit summary to better reflect what the bot is actually doing? In this edit, the bot claimed Open access bot: doi added to citation with #oabot, but the previous version already had a doi, so that's misleading. The correct summary would have been, marked doi-access as free, or some such. Thanks. Mathglot (talk) 21:18, 19 April 2020 (UTC)Reply

Would it be enough to say "parameter added to citation for doi, hdl" etc.? Nemo 22:37, 19 April 2020 (UTC)Reply
That's still avoiding the point, because it doesn't say what parameter was added and how the addition changes the citation. Why do you object to the less-obfuscatory and shorter summary suggested by Mathglot? —David Eppstein (talk) 22:45, 19 April 2020 (UTC)Reply
Simply because oabot doesn't know what parameters were added, currently. That part of the job is done by a library, which merges existing and new parameters. I'd need to compute the diff to know what was actually changed. Or in other words, patches welcome. Nemo 23:01, 19 April 2020 (UTC)Reply
I see what the problem is. Are those the only two possibilities, then, namely either adding doi, or adding/changing doi-access? If so, the summary might say, altered doi and/or doi-access param (or better wording). If there's more than two, it might get complicated, but maybe you can extrapolate and suggest something. Adding David Eppstein. Mathglot (talk) 00:03, 20 April 2020 (UTC)Reply
@Mathglot: Just so you know, pinging another user doesn't work when you modify an existing comment on a talk page. I noticed this anyway because I happen to have this talk page watchlisted for now. —David Eppstein (talk) 01:54, 20 April 2020 (UTC)Reply
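
On "I'd need to compute the diff to know what was actually changed": if the citation's parameters are available as dictionaries before and after the merge, the diff and a more specific summary could be sketched as below. This is not a patch against OAbot; the function name and the summary wording are assumptions.

```python
def summarize_changes(before: dict, after: dict) -> str:
    """Describe which template parameters an edit added or changed."""
    added = sorted(k for k in after if k not in before)
    changed = sorted(k for k in after if k in before and before[k] != after[k])

    parts = []
    if added:
        parts.append("added " + ", ".join(added))
    if changed:
        parts.append("changed " + ", ".join(changed))
    detail = "; ".join(parts) if parts else "no parameter changes"
    return "Open access bot: " + detail
```

For a citation that already had a doi and only gained doi-access=free, this yields "Open access bot: added doi-access" rather than claiming a doi was added.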

A barnstar for you!

  The Original Barnstar
Dear OAbot, Thanks for working on the page "Madhu Verma", I appreciate that. Could you please let me know what the next steps are before it is made visible to the public? Nehamidha (talk) 07:35, 4 May 2020 (UTC)
@Nehamidha: thanks! The edit OAbot made on Madhu Verma is already visible (you can see a small green lock next to the DOI in the reference). − Pintoch (talk) 07:51, 4 May 2020 (UTC)Reply

Hi, in the section above, incorrect addition of PMC links was reported. It was stated there that the issue was resolved in July 2019, however here is an edit from April 2020 where again the bot has picked the PMC for a paper with a similar title but in a different journal, different volume, year etc. Please advise why this is still happening? Rjwilmsi 16:18, 17 May 2020 (UTC)Reply

This is a correct match found by Unpaywall: the actual article is on Animal Genetics, while the Elsevier DOI is a mere stand-in which only carries an abstract identical to that of the actual article. I've corrected the DOI. Nemo 15:05, 25 June 2020 (UTC)Reply

Adding more journals as doi-access=free

Hi, I'm adding content to Wikipedia using material published by Annual Reviews, starting with journals that had paywalls removed. Three journals are now freely-accessible: Annual Review of Political Science, Annual Review of Public Health, and Annual Review of Cancer Biology (read more here). It would be great if those three titles could be added to the OABot workflow, so that it can add doi-access=free where possible. Thanks, Elysia (AR) (talk) 14:58, 24 June 2020 (UTC)Reply

Elysia (AR), nice to see more promotion of open access works! I think OAbot already works with Annual Reviews: you can check whether the DOIs you're working on are currently marked as gold OA by Unpaywall, and/or you can test manually with the web interface at https://oabot.org . Nemo 14:57, 25 June 2020 (UTC)Reply

DML.cz in April 2020

A repeated trouble, with lingering ill effects

@David Eppstein, Nemo bis, and RobertFurber: The problem, or something seemingly closely related, appeared also in September 2019. I found this a few hours ago; and David found this and reported it in Wikipedia talk:OABOT#Old bad url — translation rather than text of English original in 2021. In both instances, as in the two examples David provided supra, the respective bot found another article, published in a Czech mathematical journal, and with the reasonable target present in the reference list. (David, you suggested that the article you found was a translation of the correct one; but I suspect that you know as little Czech as I do, and guessed. Look at the reference list at the end of the article!) Nemo, I strongly suspect that the reason two different users of OAbot made the same blatant mistake within a couple of days rather rested with the bot than with the users; and, if the bot was employing Unpaywall also in 2019, that the blame could be shifted one or two steps further, as in 2020.

However, finding blame for year-old errors is not very interesting. My reason for reactivating this thread is just this: Since I found this error instance today, and David one in 2021, and both David and Robert some in 2020, there probably were more of these errors, on at least two occasions; and very likely there are further instances as yet undetected. Nemo, I guess that you also eliminated these errors when you found them. Did you have the help of any bot (apart from OAbot) for this? Could someone compile a list of all still remaining additions of references to such Czech articles from the relevant years, from both OAbot and Citation bot?

The remaining check probably has to be done by hand. (Of course, it would be rather nice if a bot also could check if the given title in the linked item is the article title or just a title of a reference list item; but I do not think that the present level of AI in the WP bots is sufficient for this.) Regards, JoergenB (talk) 18:22, 17 October 2023 (UTC)Reply

doi-access=free does not work with title=none

In citations that use |title=none, adding doi-access=free now causes the citation template to emit an error message; see e.g. this diff. Unless/until the citation template is changed to re-allow this combination, I consider any additions of doi-access=free to such citations to be damage caused by the bot that must be stopped from happening. So to avoid messier ways of stopping it, please check for title=none and avoid altering these citations. —David Eppstein (talk) 22:24, 4 August 2020 (UTC)Reply

Now fixed on the template side of things? See Help talk:Citation Style 1 for discussion. —David Eppstein (talk) 23:12, 4 August 2020 (UTC)Reply

September 6th: NYC COVID-19 Multilingual Wikipedia Edit-a-thon - ONLINE

September 6, 2-4pm E.S.T: NYC COVID-19 Multilingual Wikipedia Edit-a-thon - ONLINE

You are invited to join the Sure We Can community for our NYC COVID-19 Multilingual Wikipedia Edit-a-thon - ONLINE - this Sunday, Sept 6th, 2020. The edit-a-thon is part of Sure We Can's work with NYC Health + Hospitals to stop the spread of Covid-19. We plan to work on translating the COVID-19 pandemic in New York City article into other languages, as well as brainstorm ideas about how we could use Wikipedia to slow the spread of Covid-19. Please join us; all skill levels welcome!

Is there an idea you'd like to share? A question you'd like answered? Have an idea how we can use Wikipedia to slow the spread of Covid-19? Please let us know by adding it to the agenda.

2:00pm - 4:00 pm online via Zoom (optional breakout rooms available)

--Wil540 art (talk) 20:04, 4 September 2020 (UTC)Reply

Do not add doi-access=free to cite journal with title=none

In this recent edit the bot broke one of the citations by adding |doi-access=free to a {{cite journal}} template with |title=none. That combination of parameters does not work and has not worked since the doi autolinking RFC was implemented. Bot edits like this should never cause a valid citation template to become a broken citation template. In the long term, maybe, the cite journal template maintainers can be persuaded to allow that combination of parameters to work. In the short term, the bot must be prevented from making broken citations. That could be done by making the bot recognize that |doi-access=free and |title=none are incompatible, and not adding the parameter in those cases. Or it could be done by holding off on making any more bot edits until the bug in the citation templates is fixed (if it ever is). Which would be preferable? —David Eppstein (talk) 22:06, 20 September 2020 (UTC)Reply

Ok, in this edit the bot is edit-warring to reinstate its bad version after it was reverted. To me that looks like a blockable offense. —David Eppstein (talk) 22:19, 20 September 2020 (UTC)Reply
Really this should be a temporary fix while the core problem (the template misbehaving) is fixed. Headbomb {t · c · p · b} 22:27, 20 September 2020 (UTC)Reply
How long is temporary? Is the core problem ever going to be fixed? It was discussed as a problem on Help talk:Citation Style 1 last May but with no movement towards getting it fixed. —David Eppstein (talk) 22:51, 20 September 2020 (UTC)Reply

Sorry for the lack of response here, but I was waiting for the template storm to settle down. What's the outcome, do we have an established consensus on how the template parameters are supposed to work? Nemo 11:12, 30 December 2020 (UTC)Reply
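
The first option proposed in this section (recognizing that |doi-access=free and |title=none are incompatible and skipping those citations) can be sketched as a small guard. This is only an illustration, not OAbot's actual code: the function name, the dict representation of template parameters, and the DOI are hypothetical.

```python
def may_add_doi_access_free(params: dict) -> bool:
    """Return True if adding doi-access=free looks safe for this citation.

    `params` is a hypothetical mapping of template parameter names to values.
    """
    title = params.get("title", "").strip().lower()
    # |title=none combined with |doi-access=free was reported to break the template.
    if title == "none":
        return False
    # Never overwrite a value that an editor (or an earlier run) already set.
    if params.get("doi-access", "").strip():
        return False
    # Without a DOI there is nothing to mark as free.
    return bool(params.get("doi", "").strip())


citation = {"doi": "10.1000/example", "title": "none"}  # hypothetical DOI
print(may_add_doi_access_free(citation))  # False
```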

December 2020 run


Based on the current refresh the bot is making several thousand edits now, mostly doi-access=free additions. Nemo 11:10, 30 December 2020 (UTC)Reply

doi-access at War guilt question


What is the point of this edit at War guilt question? Thanks, Mathglot (talk) 05:18, 19 March 2021 (UTC)Reply

@Mathglot: I believe you refer to this instead. The point is to indicate that the source can be accessed freely from the publisher. The |url= parameter can be removed and the title will automatically be linked with the DOI. If the publisher decides to change the format of its URLs, the DOI will remain valid and will point to the article, so that prevents link rot. − Pintoch (talk) 07:37, 19 March 2021 (UTC)Reply
Thanks for your reply. Yes, but the doi does not require the access param to be there in order to be linked if the url is removed, unless there's been a CS1 change I'm not aware of. The linkage is automatic, free or not, iirc, which makes this change not an improvement to the article, because it doesn't affect the rendering of the link either now, or in the future if the url is removed. That's what I meant by, "what's the point". Mathglot (talk) 08:40, 19 March 2021 (UTC)Reply
"the doi does not require the access param to be there in order to be linked " it does. Compare
with
  • Wittgens, Herman J. (1980). "War Guilt Propaganda Conducted by the German Foreign Ministry During the 1920s". Historical Papers / Communications Historiques. 15 (1). Canadian Historical Association: 228–247. doi:10.7202/030859ar. ISSN 0068-8878. OCLC 1159619139.
It also adds the free-to-read DOI icon. Headbomb {t · c · p · b} 19:32, 19 March 2021 (UTC)Reply

Odd but not incorrect match


I'm curious: how did the bot determine in Special:Diff/1020105627 that a preprint with a title beginning "Ideals" was a match for a published paper with a title beginning "Filters"? It is a match, but a bot should not be guessing that things match based on authors and similar but not identical titles, because in many cases the same authors will have different papers with similar titles. —David Eppstein (talk) 06:20, 27 April 2021 (UTC)Reply

Hm, good question. By looking at the diff alone I would have guessed it's a DOI match (i.e. the DOI was linked to the arxiv ID either on arxiv itself or on Unpaywall), but I'm not 100 % sure. I'd need to check.
We used to do title matching more, but we no longer really do it for the bot, although the tool may still do some when manually requested to examine a page. That is, we mostly check the titles (and authors, and date, IIRC) to reject a suggestion from Dissemin when it looks "unsafe". Nemo 06:32, 27 April 2021 (UTC)Reply
ArXiv definitely doesn't list a doi for this one. Is unpaywall a reliable source for this sort of information? —David Eppstein (talk) 07:35, 27 April 2021 (UTC)Reply
That's what confuses me: for Unpaywall to provide a match on ArXiv, usually the DOI would need to be on the ArXiv record itself. I've not verified that this match actually came from Unpaywall, so let's not get ahead of ourselves. At the moment the Unpaywall record for this DOI doesn't have any OA version, so this was probably a DOI match on this Dissemin record which is also an exact title match.
In general, Unpaywall is the best source there is. It's even used by Scopus and all the others nowadays. They have regular automatic and manual quality assurance on the links, all sorts of things. There are some bugs sometimes, usually produced by some new bug in one of their sources, but they're usually spotted and fixed quickly. Nemo 09:04, 27 April 2021 (UTC)Reply
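
To make the discussion of Unpaywall records more concrete, here is a rough sketch of how a record could be read before suggesting |doi-access=free. It assumes the record is a dict using the `is_oa`, `oa_status` and `best_oa_location` / `host_type` fields of the Unpaywall data format; the decision rule itself is an assumption for illustration and is not taken from OAbot's source.

```python
# oa_status values where the free copy is normally hosted by the publisher,
# i.e. where the DOI link itself would lead to the full text (assumption).
PUBLISHER_HOSTED = {"gold", "hybrid", "bronze"}


def suggests_free_doi(record: dict) -> bool:
    """Return True if an Unpaywall-style record suggests the DOI is free to read."""
    if not record.get("is_oa"):
        return False
    if record.get("oa_status") not in PUBLISHER_HOSTED:
        # "green" means a repository copy: a candidate for |url=, not |doi-access=.
        return False
    best = record.get("best_oa_location") or {}
    return best.get("host_type") == "publisher"


# Hypothetical records, trimmed to the fields used above.
closed = {"is_oa": False, "oa_status": "closed", "best_oa_location": None}
bronze = {"is_oa": True, "oa_status": "bronze",
          "best_oa_location": {"host_type": "publisher"}}
print(suggests_free_doi(closed), suggests_free_doi(bronze))
```

A record with no OA version, like the one described above for this DOI, would be rejected by the first check.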

Open access bot: doi added to citation with #oabot.


The bot however just added |doi-access=free so the summary is wrong. Matthias M. (talk) 13:44, 6 May 2021 (UTC)Reply

See above. Nemo 18:02, 6 May 2021 (UTC)Reply

Thank you.


Your edit here is much appreciated. --—Encephalon 21:28, 9 May 2021 (UTC)Reply

Should respect comments on doi-access parameters


Well-behaved bots will notice that a parameter has a comment as a value, such as, oh, let's say doi-access=<!-- DO NOT ADD DOI-ACCESS=FREE BECAUSE IT BREAKS THE CITATION TEMPLATE -->, and will leave that comment alone rather than changing it to doi-access=free. In this case the comment was stale because the problem it was intended to work around has apparently been fixed: it is no longer the case that adding doi-access to citation templates that have title=none causes the template to break. Nevertheless, OAbot fails to be well-behaved in this regard: Special:Diff/1023568392. Its failure to respect this kind of comment is a bug, and should be fixed. —David Eppstein (talk) 05:06, 17 May 2021 (UTC)Reply

Ah, that's for title=none, right? I was hoping the template would be fixed, but it seems we need to give up and just skip such occurrences. As for leaving parameters with comments untouched, we rely on the behaviour of a standard library to handle the template parameters, but I'll see if I can add a rule to skip these. Nemo 08:51, 29 May 2021 (UTC)Reply
I think the issue with title=none has been fixed — at least I didn't see problems after this edit. So I am not complaining about that, only about not respecting these comments as a way to disable changes. (This is, at least, a standard way to get Citation bot to not change things.) —David Eppstein (talk) 07:46, 30 May 2021 (UTC)Reply
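
The "rule to skip these" mentioned above could look roughly like the following. It is a minimal sketch assuming the bot can see the raw parameter value as a string; the function names are hypothetical, and the way OAbot's template library exposes comments is not known from this discussion.

```python
import re
from typing import Optional

# An HTML comment anywhere in the value signals that an editor left a note there.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def has_editor_comment(value: str) -> bool:
    """Return True if the raw parameter value contains an HTML comment."""
    return HTML_COMMENT.search(value) is not None


def may_set_parameter(raw_value: Optional[str]) -> bool:
    """Decide whether the bot may write a value into this parameter.

    None means the parameter is absent from the template.
    """
    if raw_value is None:
        return True
    if has_editor_comment(raw_value):
        return False  # respect the comment, whatever it says
    return raw_value.strip() == ""  # only fill genuinely empty parameters


print(may_set_parameter("<!-- DO NOT ADD DOI-ACCESS=FREE -->"))  # False
```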

See Special:Diff/1025587540, where it added a link to arXiv:1102.5568, "Counting (3+1) - Avoiding permutations", on a reference to doi:10.37236/225, "Counting 1324,4231-Avoiding Permutations". They are not the same paper, as a glance at their introductions verifies. They don't even have the same authors (although their author lists overlap). —David Eppstein (talk) 18:30, 28 May 2021 (UTC)Reply

Thank you, will give a look. Nemo 08:51, 29 May 2021 (UTC)Reply
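
One conservative way to avoid this kind of mismatch is to accept a preprint suggestion only when the titles are identical after normalization, rather than merely similar. The sketch below shows that rule; it is an assumption for illustration, not a description of how OAbot or Dissemin compare titles.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return " ".join(re.findall(r"[a-z0-9]+", text.casefold()))


def same_title(a: str, b: str) -> bool:
    """Return True only if the two titles are identical after normalization."""
    return normalize_title(a) == normalize_title(b)


# The two titles from this report differ after normalization, so the
# suggestion would be rejected.
print(same_title("Counting (3+1) - Avoiding permutations",
                 "Counting 1324,4231-Avoiding Permutations"))  # False
```

The trade-off is that a genuine match whose title changed between preprint and publication would also be rejected and would need a DOI-based link or a manual check instead.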

Incorrect doi-access=free


In Special:Diff/1027274899 the bot added |doi-access=free for an article for which only the abstract is available. Kanguole 09:20, 7 June 2021 (UTC)Reply

Hm, good catch. This journal used to be bronze OA and IA still has the publisher PDF. Nemo 17:50, 14 June 2021 (UTC)Reply

False positive


Here, OAbot marked doi:10.1515/9781614511984.1 as |doi-access=free even though it's paywalled. De Gruyter recently revamped their website so that may have to do with it, but in any case this should be fixed. Nardog (talk) 03:20, 14 June 2021 (UTC)Reply

another false positive


this edit introduces yet another doi-access= parameter to a citation to a paywalled article. Cambial foliage❧ 06:51, 14 June 2021 (UTC)Reply


Unpaywall just announced they added 400k newly discovered bronze open access (gratis nonfree open access PDFs) from Elsevier. The next round of the bot run will probably add many to citations. The errors mentioned in the previous three sections have been fixed as soon as they were reported. Nemo 05:48, 1 July 2021 (UTC)Reply

Is that why for the last several days some 90% of my watchlist changes have been OAbot? It is forcing me to hide bot edits in my watchlist in order to find anything else, and therefore making me miss other bot edits that might be worth checking. Is there some way to throttle this down to make it less obtrusive? —David Eppstein (talk) 18:41, 6 July 2021 (UTC)Reply
@David Eppstein:, see WP:HIDEBOT for how to only hide one specific bot, and the caveats that come with that. Headbomb {t · c · p · b} 21:12, 12 August 2023 (UTC)Reply

A Gift For You!

File:Amogus.png Sussy Baka
Here's a sussy baka! EzriGamer26 (talk) 17:40, 20 September 2021 (UTC)Reply

Upgrade and new run


The bot has been ported to Python3 (at last) and is now processing a backlog of changes, mostly based on suggestions cached from Unpaywall in January 2023. Afterwards I hope to resume a weekly processing schedule. Nemo 16:40, 28 January 2023 (UTC)Reply

503


I keep getting a "503 Service Temporarily Unavailable" message. Has the bot's address on toolforge changed? 73.44.31.228 (talk) 01:00, 16 March 2023 (UTC)Reply

Another incorrect doi-access=free


this edit incorrectly labels doi:10.1163/2405478X-00902002 as free. Kanguole 20:54, 12 August 2023 (UTC)Reply

Same here. Headbomb {t · c · p · b} 23:20, 12 August 2023 (UTC)Reply

Also this edit here. Access to the doi:10.1177/014362448600700203 article is not free through SAGE Publishing. Gricharduk (talk) 03:57, 13 August 2023 (UTC)Reply

Thanks for reporting. The first one was bronze OA (gratis but not libre) earlier, so one option is to add explicit URLs. Otherwise people will be able to retrieve the PDF from the landing page if they use appropriate browser extensions to access e.g. Unpaywall or Internet Archive Scholar.
The status of that PDF may have changed as recently as last month. Soon Unpaywall should pick up the changes and report it as non-OA again. Then I need to instruct OAbot to remove such outdated doi-access parameters. I've filed phabricator:T344114 to clarify this is in the works. (I first need to finish the current run.)
The other two cases seem to be similar. Nemo 07:53, 13 August 2023 (UTC)Reply
Another one: [4], [5] , [6], [7], [8], [9], [10] Headbomb {t · c · p · b} 11:06, 13 August 2023 (UTC)Reply

Outdated doi-access=free are now slowly being removed (example). I'll accelerate the process later if all goes well. Nemo 14:40, 14 August 2023 (UTC)Reply

If you are going to do that, remove the entire parameter along with its value. There is no need to leave an empty parameter around to clutter up the wikitext.
Trappist the monk (talk) 14:47, 14 August 2023 (UTC)Reply
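
The request above, removing the whole parameter instead of blanking its value, can be sketched on raw wikitext as follows. This is a simplified illustration using a regular expression on a single template; a real bot would more likely use a wikitext parser, and the function name and DOI are hypothetical.

```python
import re

# Matches "|doi-access=free" including the pipe and surrounding whitespace,
# but only when the value is exactly "free" and the parameter ends right after it.
DOI_ACCESS_FREE = re.compile(r"\|\s*doi-access\s*=\s*free\s*(?=[|}])")


def drop_doi_access_free(template_text: str) -> str:
    """Remove the entire |doi-access=free parameter from one citation template."""
    return DOI_ACCESS_FREE.sub("", template_text)


before = "{{cite journal |doi=10.1000/example |doi-access=free |title=T}}"
print(drop_doi_access_free(before))
# {{cite journal |doi=10.1000/example |title=T}}
```

Because the pattern requires the value to be exactly "free", a parameter whose value carries an editor's comment is left untouched.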

More open access DOIs


See the above link for a list of open-access DOI registrants. Headbomb {t · c · p · b} 21:11, 12 August 2023 (UTC)Reply

Thanks but we're not going to implement our own database of open access journals/publishers, if that's what you're suggesting. Nemo 07:25, 13 August 2023 (UTC)Reply
Why not? It should be trivial to implement this and would benefit thousands of citations. (And up to 30042 pages across mainspace.) Headbomb {t · c · p · b} 09:46, 13 August 2023 (UTC)Reply
Why would it? What makes you think all these DOIs aren't covered by Unpaywall? (Are these all DataCite DOIs or what?) I see several which Unpaywall correctly identifies as OA. Meanwhile, individual journals and even individual DOIs can be transferred to other publishers and stop being OA. Nemo 18:33, 13 August 2023 (UTC)Reply
I have similarly no idea what DataCite is and I don't know how Unpaywall works or how it determines if something is free-access or not, but these DOI prefixes are free and using them is a reliable and cheap (processing-wise) way of determining free DOIs. And OA articles don't cease to be OA if journals are sold. If they did, that would go against the publishing terms. New articles from the same journal may no longer be OA after it's sold, but that journal would have a new DOI prefix upon sale. Headbomb {t · c · p · b} 19:35, 13 August 2023 (UTC)Reply
https://unpaywall.org/faq explains. Unpaywall already has a list of fully open access journals and publishers, mostly thanks to DOAJ. If there are any issues, they can be reported to them. Nemo 14:13, 14 August 2023 (UTC)Reply

If you find

 
10\.(1100|1155|1186|1371|1629|1989|1999|2147|2196|3285|3389|3390|3410|3748|3814|3847|3897|4061|4089|4103|4172|4175|4236|4239|4240|4251|4252|4253|4254|4291|4292|4329|4330|4331|5194|5306|5312|5313|5314|5315|5316|5317|5318|5319|5320|5321|5334|5402|5409|5410|5411|5412|5492|5493|5494|5495|5496|5497|5498|5499|5500|5501|5527|5528|5662|6064|6219|7167|7217|7287|7482|7490|7554|7717|7766|11131|11569|11647|11648|12688|12703|12715|12998|13105|14293|14303|15215|15412|15560|16995|17645|19080|19173|20944|21037|21468|21767|22261|22459|24105|24196|24966|26775|30845|32545|35711|35712|35713|35995|36648|37126|37532|37871|47128|47622|47959|52437|52975|53288|54081|54947|55667|55914|57009|58647|59081)

in |doi= add |doi-access=free. Headbomb {t · c · p · b} 09:52, 13 August 2023 (UTC)Reply
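
For illustration, the prefix rule proposed above could be applied like this. The sketch uses only a small subset of the listed prefixes to stay short, and it anchors the pattern at the start of the DOI and at the slash after the registrant code, which the pattern as posted does not do; both choices are assumptions made here, and the suffix in the usage line is made up.

```python
import re

# Small subset of the registrant prefixes listed above, for brevity.
FREE_PREFIXES = ("1100", "1155", "1186", "1371", "3389", "3390")

# Anchoring on "^10." and the trailing "/" prevents a longer prefix that merely
# starts with one of these digit strings from matching by accident.
FREE_DOI = re.compile(r"^10\.(?:" + "|".join(FREE_PREFIXES) + r")/")


def has_free_prefix(doi: str) -> bool:
    """Return True if the DOI's registrant prefix is in the open-access list."""
    return FREE_DOI.match(doi.strip()) is not None


print(has_free_prefix("10.1371/hypothetical.suffix"))  # True
```

As noted in the replies above, a prefix list like this only reflects the registrant at the time the list was compiled, so it would need maintenance if journals change hands.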

False positives


I've literally been having edit wars with OAbot at List of Galerucinae genera and List of flea beetle genera, because it's labeling certain article DOIs as open access when they are not. I revert the bot's changes, but then it automatically relabels the same DOIs as OA again some time later.

Relevant edits:

Specifically I am referring to the "BezdekNie2019" reference in both cases (the "Moseyko2010" reference at List of flea beetle genera is fine, that actually is OA). Monster Iestyn (talk) 01:04, 15 August 2023 (UTC)Reply

And false negatives. See Gliese 710 for an example with three cases. Is the bot trusting NASA ADS (bibcode), which doesn't show a free-to-read link for any of these, while the direct doi link shows an open-access paper in each case? Lithopsian (talk) 14:54, 16 August 2023 (UTC)Reply
Thanks for reporting. I agree edit wars should be avoided. Perhaps we can come up with a parameter value that would confirm doi-access is explicitly not free? Otherwise you can ask the bot to skip the entire page. Please also report to Unpaywall support that the manuscript.elsevier.com is no longer accessible.
I'm not sure about the AANDA DOIs 10.1051/0004-6361/201629835 and 10.1051/0004-6361:20011330, they're considered open by Unpaywall. Sounds like a bug on my side.
10.3847/2515-5172/abd18d is a bit unusual. Are the RNAAS always like this, with a short HTML page and no PDF? Worth reporting to Unpaywall. Nemo 22:41, 17 August 2023 (UTC)Reply
What's weird about those? All three are freely accessible. Headbomb {t · c · p · b} 23:36, 17 August 2023 (UTC)Reply
Yes, that's pretty standard for RNAAS. I'm seeing multiple edits by the bot every day at the moment on astronomy-related articles, removing "doi-access=free". It seems to be hitting The Astronomical Journal and Publications of the Astronomical Society of the Pacific today, for example HD 105382 and V752 Centauri. So far, I haven't found any edits of this type where the bot was correct. Seems like it is going to be very rare that someone incorrectly adds this parameter such that it needs removing, even rarer that a free-to-read journal article would later not be. Can the bot be stopped from doing this? It is a little tiresome. Lithopsian (talk) 14:14, 18 August 2023 (UTC)Reply
Ha, found one! The bot was right about HD 169853‎‎. Journal of Astrophysics and Astronomy paper at SpringerLink, free to read at various places but behind a paywall at the DOI. Lithopsian (talk) 14:29, 18 August 2023 (UTC)Reply
I think this is the right thread (the issue reported by Trappist the monk below at #bot incorrectly adds | doi-access=free seems to be a different issue). This edit added two incorrect |doi-access=free, both to DOIs resolving to Duke University Press. I confirmed they both contain a link titled "Buy this digital article" on the publisher's page. Folly Mox (talk) 13:07, 8 November 2023 (UTC)Reply
Today I confirmed that registering an account with the publisher does not grant access to the sources tagged in the edit, which OAbot redid yesterday. Folly Mox (talk) 12:46, 30 November 2023 (UTC)Reply
Thanks. I've reported the false positive to Unpaywall. Nemo 13:02, 30 November 2023 (UTC)Reply

Question


Dear OAbot I have a question. Cologochideilia (talk) 13:55, 16 August 2023 (UTC)Reply

bot incorrectly removed manually added free access tag


In Special:diff/1170978048, the bot removed a free access tag from a citation to doi:10.4153/CJM-1962-042-6 for which a PDF scan is directly available. –jacobolus (t) 14:42, 18 August 2023 (UTC)Reply

Here are some more examples: 1170970237, 1171005296, 1170969078, 1170974664. Maybe someone should be checking on the bot's removals of doi-access=free a bit more carefully? These are just examples from articles on my watchlist, so I am guessing there are thousands more free articles being incorrectly categorized by the bot as not having free access. –jacobolus (t) 16:50, 18 August 2023 (UTC)Reply
I must have seen over a hundred in the last few days on the articles I follow. I think three were correct and the rest I reverted. I don't think a bot should be doing things like this. Lithopsian (talk) 18:25, 18 August 2023 (UTC)Reply
The bot should probably be temporarily shut down and all such edits by the bot from recent days should be mass-reverted or manually checked by the bot author(s) until the bot can be more carefully coded to not be making such a high proportion of mistakes. This kind of bot should seek to have a vanishingly low error rate. Otherwise it switches from being marginally helpful to being significantly harmful and disruptive to the project. –jacobolus (t) 18:29, 18 August 2023 (UTC)Reply
@Nemo bis can you please stop your bot? It's getting in edit wars with human editors to impose its incorrect changes. Or perhaps some admin (@David Eppstein?) can temporarily shut the bot down until this is sorted out? –jacobolus (t) 23:14, 18 August 2023 (UTC)Reply
One more here. I agree the automated runs need to be shut down until things are sorted out. Headbomb {t · c · p · b} 00:47, 19 August 2023 (UTC)Reply
Another one here where it removed a free access tag. Aithus (talk) 12:53, 19 August 2023 (UTC)Reply
And here to DOI:10.1074/jbc.M602297200 which is clearly open access. I've seen the bot remove the "free" tag on lots of articles on my watchlist recently. This must  ! Mike Turnbull (talk) 17:08, 19 August 2023 (UTC)Reply
I have blocked the bot indefinitely until this issue can be looked into. Any admin is free to unblock once the problem is fixed. firefly ( t · c ) 18:14, 19 August 2023 (UTC)Reply
Firefly, what's the point of blocking the bot when it had not been running for 15 hours? I was on a train and bus without internet while it was not running. Nemo 16:47, 20 August 2023 (UTC)Reply

The edits to correct overbroad doi-access=free were requested above. Bronze OA papers regularly switch between open and closed status, so inevitably if we add doi-access=free for bronze OA we also need to be ready to remove them. The bot is mostly reverting its own edits from 2020 (many of these papers were temporarily open for COVID-related initiatives, probably).

Re-adding doi-access=free manually is generally pointless (if you find a suitable URL target with an actual PDF you can add it in the url parameter: example), but to avoid edit wars you can exclude the bot from individual pages, as explained in User:OAbot#Scope.

It's true that Unpaywall currently detects fewer bronze OA DOIs than before. This is probably due to changes on the publishers' side which have made PDFs harder to access even when they're nominally gratis access. I've sampled the ongoing edits and I'm pretty sure such cases are a minority, while a majority of the removals are for now completely closed papers. I suggest letting the bot run.

As for the future, I'll look at the cases mentioned above. I was already making a list to be reported to Unpaywall. Most cases I found are about things other than usual article contributions (editorials, news, obituaries etc.). When they're detected as OA again, the bot will add doi-access=free again. I could also stop removing doi-access=free at all, if people prefer to make such edits manually. Nemo 16:47, 20 August 2023 (UTC)Reply

@Nemo bis - I had no way to know whether the bot was not running because you'd turned it off, or because it only runs on a set schedule. I don't know enough about the specifics here to respond to your other comments so will leave that to the subject-matter experts above. firefly ( t · c ) 16:52, 20 August 2023 (UTC)Reply
Firefly, ok. The bot was manually activated for a one-time run with this new feature, as I believe I mentioned above. Otherwise it's scheduled to run once a week. You can remove the block as I won't run it again manually while this discussion is ongoing, and I'll disable this feature in the scheduled weekly run. Nemo 17:07, 20 August 2023 (UTC)Reply
@Nemo bis - done, block removed as you've said you won't run the bot while the concerns are discussed. firefly ( t · c ) 17:25, 20 August 2023 (UTC)Reply
if you find a suitable URL target with an actual PDF you can add it in the url parameter – no this is not good advice. If the URL is redundant with the DOI it is much better to just put the DOI and add doi-access=free (readers benefit by hitting a journal metadata page with a "download PDF" button vs. a direct PDF link). If the bot is incorrectly removing those (like, anything more than a 0.01% error rate), there is something going very wrong, and a human should be regularly spot checking to make sure the bot is staying on target. –jacobolus (t) 21:03, 20 August 2023 (UTC)Reply
@Nemo bis can you please do a manual check of every instance of doi-access=free removed within the past few days, and revert any that were incorrect? Thanks! –jacobolus (t) 21:07, 20 August 2023 (UTC)Reply
Or if you don't want to do a manual check, can you please auto-revert every such edit from the past few days or week? Every one of the edits of this type that came up in my watchlist was OABot making a mistake. I'm sure there were some correct ones sprinkled in, but that's not good enough for bots that are editing thousands of pages in a short time frame. –jacobolus (t) 16:37, 21 August 2023 (UTC)Reply
User:Nemo bis – Could you please not ignore this? It needs to be fixed. I'd really rather not start a more dramatic process bringing in administrators or whatever. –jacobolus (t) 18:13, 27 August 2023 (UTC)Reply
Sorry if I didn't have new replies for you. It wasn't my intention to ignore your concerns. I'm still working on this, see phabricator:T344114#9118322.
More broadly, I understand that the bot run was surprising, and I'm very sorry it seems to have affected astronomy-related articles more than average, but I'd like to point out that in the grand scheme of things it was a rather small matter really. A query shows that only some 14k DOIs from 10k articles were touched, out of over 300k doi-access=free we have across all articles (most of which have been added by OAbot previously, at least the non-redundant ones). Many of these changes don't even affect which URL is linked. One week later the bot already has added more links than it removed the previous week. Nemo 08:59, 28 August 2023 (UTC)Reply
If the bot makes a big pile of errors removing doi-access=free labels, then "the bot separately added a bunch of doi-access=free so now the total number is higher" is not really an adequate response. The mistakes from the previous week should be fixed. If the bot can't fix them, the relevant edits should be manually checked or else mass-reverted until they can be done correctly.
Do you intend to do either of those things? –jacobolus (t) 13:56, 28 August 2023 (UTC)Reply

I've sampled the latest batch of doi-access=free the bot would remove. It's clear that Unpaywall has been updating large portions of their data. In about half of the cases, the edits are indisputably correct (there's no full text link to be found at least for me); in the other half, I found some full text copy but there are reasons to believe not everyone would be able to access it (due to captchas etc.), so the removal of the doi-access=free link is defensible because we do need to find better OA links. Therefore I'm planning to resume the removals. There are a few thousand more doi-access=free parameters to remove, fewer than the bot added just last week. Nemo 18:29, 23 November 2023 (UTC)Reply

Please do not remove these unless you are 100% sure edit is correct (there should ideally be a manual check involved). Also, can you please figure out how to automatically mass-revert or go manually check the many incorrect changes your bot previously put through? –jacobolus (t) 18:48, 23 November 2023 (UTC)Reply
By definition it's impossible to be 100 % sure of bronze OA status. The only way to be 99 % certain of persistent OA status is to add a green OA link to a stable open repository, but unfortunately the bot is not yet authorised to do that; you can help at https://oabot.toolforge.org if you want. If instead you want to change the meaning of doi-access=free to remove bronze OA from its scope, please open a discussion at Help talk:Citation Style 1.
The bot is gradually (re)adding doi-access=free to bronze OA works from Wiley and friends where it previously wouldn't (example), so I remain pretty sure that it will do so in due time for the citations where it was previously there (if the PDFs don't go away again). Nemo 07:47, 28 November 2023 (UTC)Reply
The bot should not be doing mass changes where a nontrivial proportion of them are incorrect. Period. It wastes huge amounts of time and attention for human editors to check every example, so people need to be able to trust that the bot is like 99.9% accurate. Otherwise it's more harmful than helpful.
I'd recommend never having this bot remove the doi-access=free label unless checked by a human or part of some specific set of examples known with surety to no longer be open access. –jacobolus (t) 17:20, 29 November 2023 (UTC)Reply

I'm now starting a small and very slow run so we can reassess and discuss more broadly. Nemo 15:01, 28 November 2023 (UTC)Reply

This was an incorrect removal of |doi-access=free. Kanguole 09:56, 29 November 2023 (UTC)Reply
Kanguole, thanks for reporting. That's an interesting case, I'm pretty sure it's because persee.fr recently made its rate limits very strict so even humans often have to enter a captcha to download a PDF (let alone Unpaywall's bots). This will probably be addressed soon by Unpaywall's work around rate limits, but in the meanwhile you can use oabot to ensure the OA PDF URL remains linked. (A direct PDF link is also much more usable. I happen to know the persee.fr interface so I was able to locate the well-hidden PDF link and adjust my browser settings so that something would actually happen when clicking it, but many users are probably completely lost when landing on such a page.) Nemo 12:10, 29 November 2023 (UTC)Reply
Here's another. Kanguole 14:25, 29 November 2023 (UTC)Reply
.... and another Mike Turnbull (talk) 16:57, 29 November 2023 (UTC)Reply
You sure? It looks closed here (authwalled): phabricator:F41547194. It used to be open between 2012 and 2019 though, so you can add an archive link. Nemo 22:39, 29 November 2023 (UTC)Reply
The DOI points at https://jamanetwork.com/journals/jama/fullarticle/183643 which has a "FREE" badge on it and includes the full text of the paper on the webpage. The linked PDF says "Sign in to access free PDF" (apparently requires a registration where you give the publisher your email). If you care about the PDF per se you could use doi-access=registration or doi-access=limited, but I'd recommend using doi-access=free to reflect that the full text is freely available to anyone who looks at the web page. –jacobolus (t) 22:47, 29 November 2023 (UTC)Reply
Publishers often put "FREE" badges etc. on closed articles which in the end ask for money, it's just a marketing ploy. Sure, adding a doi-access=limited is an option; OAbot will not remove these. Nemo 23:17, 29 November 2023 (UTC)Reply
The edit @Mike Turnbull noted was a broken change. OABot should not have made this edit. "doi-access=free" was a correct parameter (the content is freely available), and blank "doi-access=" is flat-out incorrect. There's no "marketing ploy" involved here: the full text is right there. Arguably "doi-access=limited" could also be used if someone really thinks the PDF is essential to the content. If a nontrivial proportion of OABot's edits are like this one, then OABot should have its operation entirely halted until the problem can be fixed. –jacobolus (t) 23:32, 29 November 2023 (UTC)Reply
There's nothing wrong about a blank doi-access parameter, it's just an empty parameter. That article is not open access, so whatever the right parameter is, doi-access=free is not it. Luckily this kind of authwalled articles among formerly gratis OA articles are a very small portion of the cases, from what I've seen. Nemo 00:24, 30 November 2023 (UTC)Reply
There's nothing wrong with a blank doi-access parameter if the paper is paywalled. If the paper is open access and the doi-access=free parameter is blanked, that's a clear and obvious problem. –jacobolus (t) 02:29, 30 November 2023 (UTC)Reply
This article is not open access. It's also not compliant with the documented definition of "free", which says "free to read for anyone" (emphasis added), and which is distinct from "registration" for the case where "a free registration with the provider is required". Please open a discussion at Help talk:Citation Style 1 to change the meaning of the template. Nemo 11:01, 30 November 2023 (UTC)Reply
Actually, I've opened the discussion for you: Help_talk:Citation_Style_1#Allow_setting_doi-access_to_subscription_or_limited. Let's continue there. Nemo 11:21, 30 November 2023 (UTC)Reply
A registration is not required. The full text of the article is available to everyone directly on the web page. –jacobolus (t) 15:27, 30 November 2023 (UTC)Reply

I reported to OurResearch that the JBC is supposed to be OA and it will show up as such in future updates to the data. I've manually removed the doi-access=free removals which were in the queue for JBC. A future run will revert the previous removals. Nemo 22:47, 29 November 2023 (UTC)Reply

Previous incorrect removals are being reversed by the bot now (example). It may take a few more weeks to finish. Nemo 16:30, 3 December 2023 (UTC)Reply
AME and AAS journals have also reportedly been manually marked OA now on Unpaywall's end, so the doi-access=free parameter should be re-added in the next weekly run where it was removed. Nemo 22:24, 4 December 2023 (UTC)Reply

bot incorrectly adds |doi-access=free


This edit marks doi:10.1016/0003-2697(83)90314-7 as free to read; it is not.

Further, still the bot continues to break citation templates by adding |doi-access=free when |title= has a wikilink. See in the example template: |title=A rapid method for the determination of naringin, prunin, and naringenin applied to the assay of [[naringinase]]. Please fix the bot so that it does not do that.

Trappist the monk (talk) 14:43, 7 November 2023 (UTC)Reply

The second one is a template issue, not a bot issue. If the doi is free, it should be flagged as free. If autolinking is borked, the solution is to fix autolinking. Headbomb {t · c · p · b} 00:08, 8 November 2023 (UTC)Reply
Indeed. Nemo 16:00, 8 November 2023 (UTC)Reply

I just want to acknowledge I've seen this and I'll look into it more later. It looks like ostensibly-bronze OA DOIs are on the rise again, partly countering the decrease we discussed previously.

The edit is correct in the sense that the DOI is considered bronze OA by Unpaywall. There is a delay in detecting changes to bronze OA papers, due to the nature of bronze OA. (Legacy publishers are increasingly unreliable, as captchawalls and loginwalls get placed in front of everything, even semi-free or semi-gratis resources.) It will eventually be removed, thanks to phabricator:T344114. Nemo 16:00, 8 November 2023 (UTC)Reply

As discussed above, I've sampled the new edits adding doi-access=free and the portion of false positives is negligible. I don't see a need for any corrective measure on this side. Nemo 07:51, 28 November 2023 (UTC)Reply

The bot has now started its normal weekly scheduled run, which only adds parameters and doesn't remove any. So it may add some more false positives again, if so please report. (I couldn't find any.) Nemo 16:04, 3 December 2023 (UTC)Reply

Why is the bot adding access dates to PubMed citations


This [11] seems pointless as the citations have PMID numbers. What value is the bot adding? Graham Beards (talk) 09:17, 10 November 2023 (UTC)Reply

You reverted the edit of a user using IABot, not OAbot's edit which simply added doi-access=free. I agree the URL is redundant; just remove it, so that people working on link rot know there's no point archiving it. Nemo 06:10, 17 November 2023 (UTC)Reply
I've also opened a discussion to make the task easier, so you'll see fewer of those pointless URL-archiving edits. Nemo 07:50, 28 November 2023 (UTC)Reply

URL maintenance


As discussed above, the easiest way to handle links for DOIs where the full text status isn't super clear is to "hardcode" a suitable link target, be it open or closed, and mark its status appropriately. While the discussion about the doi-access parameter is ongoing, we could already get started on using url-access more. To avoid adding it unnecessarily where there is an OA link, and to avoid unlinking DOIs where a previously open PDF was already archived, it would be best to also add Internet Archive Scholar and other OA links at the same time. Citation bot has already been adding OA links to the url parameter for years now.

A semi-manual example shows the kind of edit I'd like to see. I could open a new bot approval request soon but I'm open to ideas. Nemo 06:55, 1 December 2023 (UTC)Reply

Removes doi-access=free when the dois are free


See [12] [13] [14], etc... Headbomb {t · c · p · b} 00:27, 3 December 2023 (UTC)Reply

More [15], [16]. Headbomb {t · c · p · b} 02:17, 3 December 2023 (UTC)Reply

See discussion above: these are all either correct edits or temporary errors which will be reversed in short order. In more detail:

  • I've already reported the AAS journals to Unpaywall, they'll probably be fixed in a few days (as already happened with JBC). Don't hesitate to open a support ticket with Unpaywall to report specific journals whose entire archives are bronze OA. If you know the ABS people you could also suggest that they follow standards for repositories, so their PDFs are less hidden.
  • The Wiley etc. DOIs are authwalled via Atypon; there's no way of knowing who's able to access the full text there. They might come back once these authentication requirements are relaxed or worked around.
  • The Medknow DOI is broken, why does Citation bot re-add it? Reported there.
  • Why do you care about the Royal Society DOI? It's already linked to an archived copy.
  • The AME DOI leads to an interstitial before people can download a PDF. A direct link to the PDF is more helpful, one can use the archived copy as well for extra safety and to prevent the citation from going unavailable as happened with Medknow. Most of the journal has been previously preserved (probably when it was still accessible). It does look like a bug though, as Unpaywall considers it bronze. Will look into it, thanks for reporting.

Nemo 11:36, 3 December 2023 (UTC)Reply

All these DOIs are freely accessible, and they should accordingly be flagged as free. That the Medknow one is broken is irrelevant and a separate issue from its freeness, because you can report it and then it'll get fixed.
Concerning "there's no way of knowing who's able to access the full text there" yes there is. Everyone is able to access those. Headbomb {t · c · p · b} 11:59, 3 December 2023 (UTC)Reply
Broken DOIs usually stay broken. Also, if the DOI goes nowhere you can't know whether the full text is available. It's better to re-add any doi-access information after the DOI becomes stable again.
And no, I appreciate your confidence in your testing capabilities but you are not everybody. Even if you have personally tested every single DOI for thousands of journals, that doesn't tell us that everyone else will be served the same result by the publishers, which use algorithmic decision-making to restrict access. Or if you just meant Annual Reviews, yes that's being handled; it's a moving target but will soon get easier as the S2O conversion completes. Nemo 12:53, 3 December 2023 (UTC)Reply
"I appreciate your confidence in your testing capabilities but you are not everybody"
This is all public information. If OAbot keeps removing valid free access flags, it will need to be blocked until it no longer does so. Headbomb {t · c · p · b} 13:02, 3 December 2023 (UTC)Reply
I've stopped the bot now. What do you mean by "this"? Nemo 13:26, 3 December 2023 (UTC)Reply
That those DOI prefixes are all 100% open access DOIs. Headbomb {t · c · p · b} 13:52, 3 December 2023 (UTC)Reply
No it's not public information, where did you get it? Nemo 15:42, 3 December 2023 (UTC)Reply
Pick any of them. Medknow is an open access publisher. BioMed Central is an open access publisher. American Astronomical Society is an open access publisher. Athabasca University Press is an open access publisher. They all are. Headbomb {t · c · p · b} 15:48, 3 December 2023 (UTC)Reply
Which DOI prefix are you talking about? If you mean 10.4103, those DOIs belong to dozens of publishers including Springer, Elsevier, Thieme, de Gruyter, Wiley, SAGE and others, which are definitely not fully OA. So again, please be clear about what "public information" you're talking about. CrossRef certainly is not it, so I assume you're using some unofficial source, which is ok, but please clarify. Nemo 16:00, 3 December 2023 (UTC)Reply
I've linked the list many times now. 10.4103 are Medknow DOIs. Whatever location they point to now is irrelevant, because those started as Medknow DOIs and were published under open access licenses and that doesn't retroactively change whenever a journal is sold. Headbomb {t · c · p · b} 17:26, 3 December 2023 (UTC)Reply
The list made by you is not a source. You've still not stated how you verified that the DOIs with a 10.4103 prefix are OA. In reality, only 80 % of those DOIs are held by Medknow (the publisher now owned by LWW/WK), and there are over 30 publishers involved. Also, the supposed original OA status is no guarantee of anything because there is no free license, so those publishers can and do make those articles closed OA again. (Less than 1 % of those DOIs carry a free license and less than 10 % carry any license at all, according to CrossRef.) So once again, please state what kind of data verification procedure you've conducted that makes you more confident of your OA status determination than a process that involves actually checking the DOIs one by one. Nemo 22:13, 10 December 2023 (UTC)Reply
100% of these DOIs are free and were owned by Medknow (the 10.4103 ones). No exceptions. Zero. I'm not going to keep talking to a wall that's not interested in being convinced and who wants to ignore reality. Headbomb {t · c · p · b} 00:01, 11 December 2023 (UTC)Reply
Have you ever clicked on any of those DOIs? I suspect not, because reality is very different from how you picture it. Some 30 % don't go anywhere and some 10 % go to a 404 or similar. Will you check examples if I provide them, or is your 100 % certainty too strong to ever be pierced by facts? Nemo 07:41, 11 December 2023 (UTC)Reply
That Medknow was shit in updating CrossRef upon transfer does not change the fact that those are Medknow DOIs, or that they are free DOIs. Brokenness changes nothing. Headbomb {t · c · p · b} 08:43, 11 December 2023 (UTC)Reply
I think Headbomb is right: https://doi.org/10.4103 is Medknow.
Most free content licenses are irrevocable. RudolfoMD (talk) 09:54, 11 December 2023 (UTC)Reply
So when Nemo's bot notices that content previously made available for free by the publisher is no longer available from the publisher, does (or can) the bot check whether it has been stored or cached by an archive service, and only if it hasn't been stored by any archive service, then mark it as free? (And otherwise, that is, if it is archived but Wikipedia doesn't link to the archive, add a link to it?) Nemo, do you want the bot to do that? Do you feel you need support (that you don't currently have) from the community to have it do that? It certainly seems wrong to have the bot deleting free tags from content that is available for free from archive services that legitimately archived it, and bronze OA seems to clearly fall into this category. Am I understanding/describing the situation correctly? RudolfoMD (talk) 03:02, 14 December 2023 (UTC)Reply
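
To make the archive-check idea above concrete, here is a minimal Python sketch built around the Internet Archive's Wayback availability endpoint (`https://archive.org/wayback/available`). It is only one possible reading of the proposal: the function names and the two action labels are hypothetical, the code does not perform any network request itself, and nothing here describes what OAbot actually does.

```python
from urllib.parse import urlencode


def availability_query(url: str) -> str:
    """Build the Wayback availability query URL for a given target URL."""
    return "https://archive.org/wayback/available?" + urlencode({"url": url})


def closest_snapshot(response: dict):
    """Return the snapshot URL from a parsed availability response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def action_for_unreachable_doi(response: dict):
    """Hypothetical policy: prefer linking an archived copy over dropping the flag."""
    snapshot = closest_snapshot(response)
    if snapshot:
        return ("add_archive_link", snapshot)
    return ("review_free_flag", None)


# Illustrative, hand-written responses (not real API output).
archived = {"archived_snapshots": {"closest": {
    "available": True,
    "url": "http://web.archive.org/web/2023/https://example.org/paper.pdf",
}}}
missing = {"archived_snapshots": {}}

print(action_for_unreachable_doi(archived)[0])  # add_archive_link
print(action_for_unreachable_doi(missing)[0])   # review_free_flag
```

A real implementation would also need to confirm that the snapshot is the full text rather than a paywall or interstitial page, which this sketch does not attempt.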
This bot needs to be shut down until it is fixed. It continues making heaps of incorrect edits, and the maintainer continually refuses to acknowledge the problem or act responsibly to fix it. Here are more examples I have reverted (nearly every example of OAbot edits I have checked from recent days was incorrect): 1188110729, 1188116446, 1188094221, 1188096255, 1188021066, 1188006733, 1187978981, 1187944816, 1187450391, 1187377185. For completeness, this edit seems to be correct: special:diff/1187552735. –jacobolus (t) 20:53, 3 December 2023 (UTC)Reply
As written above, the bot had already stopped removing doi-access=free before your message. The removals which actually were incorrect are being gradually reversed. I've also added some more information on Wikipedia:OABOT#Why did the bot remove a doi-access parameter?. Nemo 22:19, 10 December 2023 (UTC)Reply
Please do not start removing doi-access again until there's some community consensus that the bot is functioning correctly. –jacobolus (t) 22:25, 10 December 2023 (UTC)Reply
Actually, let me word this strongly: You need to demonstrate that you understand your bot's problems, take clear responsibility for your bot's malfunctioning and show how you intend to fix it (manually if necessary), and provide some assurance that it won't ever happen again. The cavalier attitude you are taking toward a bot which is making mistakes on such a large scale seems frankly unacceptable for bot operators. –jacobolus (t) 00:07, 11 December 2023 (UTC)Reply
I'm sorry you feel that way. I'm taking all this heat from you because I went out of my way to make the bot reverse some of its previous edits that people complained about. It took way longer than I had hoped for (the bot wasn't editing at all for many weeks) but all/most errors you reported back in August are in the process of being fixed this week, AFAICT. Nemo 08:01, 11 December 2023 (UTC)Reply
No, you're taking heat from me because your bot malfunctions and when people complain you don't respond in such a way that gives the impression that you understand the problem, care, or intend to fix it. If you are currently fixing some problem, then you'll avoid "taking heat" by explaining clearly what fix you are doing, and showing what other steps you're taking to make sure it doesn't happen again.
It's not "going out of your way" to fix errors that your bot caused; that's an expected part of running a bot, arguably the single most basic responsibility of any bot operator. Instead, everyone else here is "going out of our way" to pay attention to your bot, repeatedly explain what it's doing wrong, ask for the bot to stop, etc., even though none of us want to be doing that and it's otherwise a waste of our time. –jacobolus (t) 20:06, 11 December 2023 (UTC)Reply
Anyway though, thanks for trying to "reverse some of its previous edits that people complained about". I'm hoping that this "some" means all errors of the same general type are going to be fixed? Is there some explanation for what went wrong in the bot's data source / code / heuristics for it to wrongly consider this broad class of papers to be closed, when it is immediately obvious to any human who visits these DOIs that the papers are accessible? –jacobolus (t) 20:17, 11 December 2023 (UTC)Reply
While we're at it, the bot's edit summaries and user page are dramatically insufficient. "Open access bot: doi updated in citation with #oabot." is not specific enough as an edit summary. Instead the bot should say something like, "Open access bot: removed doi-access=free from a non-open-access source, see XYZ linked page for details" with the link pointing at somewhere explaining the bot's decisionmaking process. On the page user:OAbot, there should be a full list of all of the things the bot routinely does, with a detailed explanation and a link showing where each separate type of action was authorized. –jacobolus (t) 22:14, 3 December 2023 (UTC)Reply
The details of the operations are on Wikipedia:OABOT which is linked from the userpage. Nemo 22:14, 10 December 2023 (UTC)Reply
Nothing like "Open access bot: removed doi-access=free from a non-open-access source, see XYZ linked page for details" is on that page. Not even in part. RudolfoMD (talk) 09:55, 11 December 2023 (UTC)Reply

E.g. [17], [18], [19], [20]

Headbomb {t · c · p · b} 20:16, 10 December 2023 (UTC)Reply

Ah sorry, that was supposed to be only for manual testing, fixing now. The last edit wasn't wrong though, the link had been taken over by malware. I'm not sure what https://www.isrctn.com/ISRCTN14173715 is: doi:10.1186/isrctn14173715 is supposed to be a dataset. Nemo 21:20, 10 December 2023 (UTC)Reply
I manually fixed the remaining broken link to PIA. The bot should be running correctly now. Nemo 22:07, 10 December 2023 (UTC)Reply

Bot incorrectly added |doi-access=free


Diff link: [21]

Citation in question:

I'm not sure if there's a more effective way to report bugs but hopefully whatever caused this won't happen again. Umimmak (talk) 23:52, 10 December 2023 (UTC)Reply

Thanks for the report. I've reported this DOI and journal to Unpaywall (you can also do the same yourself for other cases, if you want to help).
As discussed above, there is one possible fix: to stop adding doi-access=free for bronze OA works (gratis OA without a Creative Commons license). These are the works where detection is most unreliable and which most often change status. However, there are two factions of users battling on this user talk page, some asking for more doi-access=free and some for fewer, and so far nobody has engaged with this proposal, so we just keep going back and forth depending on which kind of edit is more common in any given week. Nemo 07:39, 11 December 2023 (UTC)Reply
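
The proposal above can be sketched as a small decision function over the `oa_status` field of an Unpaywall record (values such as "gold", "hybrid", "bronze", "green", "closed"). This is a minimal illustration in Python, not OAbot's actual code: the function name, the `include_bronze` switch and the sample records are assumptions made for the example.

```python
# Statuses where the publisher copy is free under an open license.
LICENSED_OA = {"gold", "hybrid"}


def should_add_doi_access_free(record: dict, include_bronze: bool = False) -> bool:
    """Decide whether to add doi-access=free, given a parsed Unpaywall record.

    Bronze (free to read at the publisher, no open license) is only
    accepted when include_bronze is True, mirroring the proposal.
    """
    status = record.get("oa_status")
    if status in LICENSED_OA:
        return True
    if status == "bronze":
        return include_bronze
    # "green" (repository copy only), "closed" or unknown: the DOI itself
    # is not treated as free.
    return False


# Illustrative records; only the oa_status field matters here.
bronze = {"oa_status": "bronze"}
gold = {"oa_status": "gold"}

print(should_add_doi_access_free(bronze))                       # False
print(should_add_doi_access_free(bronze, include_bronze=True))  # True
print(should_add_doi_access_free(gold))                         # True
```

Keeping the bronze decision behind a single switch would make the trade-off discussed in this thread explicit, instead of it shifting with each weekly run.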

Lousy edit summary


This edit should have had a better edit summary - https://en.wikipedia.org/w/index.php?title=Paracetamol&diff=prev&oldid=1187943393 - explaining that the removed free tags were incorrect. Can the bot not do better at that with little work? RudolfoMD (talk) 08:14, 11 December 2023 (UTC)Reply

Another issue is that 1 of the 3 papers is still open access. –jacobolus (t) 20:12, 11 December 2023 (UTC)Reply
Oh? I commented above at User talk:OAbot#Removes doi-access=free when the dois are free. I may or may not grok the big picture. RudolfoMD (talk) 05:18, 14 December 2023 (UTC)Reply

List of bibliographies of works on Catullus

  Resolved

The book Introduzione a Catullo by Paolo Fedeli in the Myrtia journal has a free HDL access but does not have an HDL value. First, Second, and third edits. Achmad Rachmani (talk) 06:41, 18 December 2023 (UTC)Reply

I'm afraid we don't support that way of adding comments yet. Is there a specific reason to reject the hdl identifier? Ah I see, there's a mismatch. The reason is that the DOI was wrong, should be ok now. Nemo 18:42, 5 January 2024 (UTC)Reply
It’s not that the DOI was wrong, per se, it’s that that journal only had one DOI — for the journal as a whole — as opposed to unique DOIs for each article. Is it still not worth including? Umimmak (talk) 17:31, 6 January 2024 (UTC)Reply

In this edit, the bot added a link to a 2005 conference proceedings claiming it to be a free version of this 1966 book review. Title, names of authors, number of authors, and journal are all completely different. I assume this is an error propagated from somewhere else. Did this pass the bot's sanity checks? —Kusma (talk) 15:53, 5 January 2024 (UTC)Reply

The source of the error appears to be https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.263.7400 , which according to Unpaywall comes with the incorrect doi:10.2307/2004316 despite being about a 2004 paper. It looks like CiteSeerX is undergoing some frontend updates and I can't find anything any more, but there's a "report error" button which might do something useful. Nemo 18:31, 5 January 2024 (UTC)Reply
So does OAbot believe anything CiteSeerX says, no matter how implausible it looks? The entry you linked to is so thoroughly messed up that I am not sure it is even possible to correct it (and I don't have the correct bibliographic data anyway). —Kusma (talk) 20:57, 5 January 2024 (UTC)Reply
No, we don't use CiteSeerX as a source directly, there's some more information on Wikipedia:OABOT. Unpaywall usually uses various signals to verify that a match is correct. An incorrect DOI match is quite rare. Nemo 21:09, 5 January 2024 (UTC)Reply
The bot did it again, so I assume the band-aid fix linked to from phab isn't live yet? —Kusma (talk) 09:39, 15 January 2024 (UTC)Reply

PMC for wrong version of paper


In Special:Diff/1193833171, OAbot added a PMC that points to a brief announcement of a result in PNAS, to a reference to the full publication of the same result in a different journal. That sort of edit is incorrect and bad. It's the sort of thing that leads to mangled citations as the error is then built on with more bot edits that treat the erroneous id as definitive and replace more of the citation with garbage. Do not do that. If the journals do not match, regardless of similarities in authorship and title, do not add metadata. —David Eppstein (talk) 23:53, 5 January 2024 (UTC)Reply

More OAbot-mangled citations: Special:Diff/1193818981 (same reference), Special:Diff/1193815125 (same problem with an unrelated reference), Special:Diff/1193584598 (same problem with a third unrelated reference). —David Eppstein (talk) 23:58, 5 January 2024 (UTC)Reply

The bot has also been edit-warring to reinstate this bad edit three times at Blumberg theorem. It can be locked out of this article but does it need to be blocked to prevent more widespread damage? —David Eppstein (talk) 07:44, 6 January 2024 (UTC)Reply

Thank you for reporting. The first diff seems unrelated, probably one missing digit. I was already looking into it and thanks to your kind explanations in Talk:Blumberg theorem and here I should be able to apply a workaround by today.
I've checked these title matches before, and unless something dramatically changed recently these should be pretty rare errors. They've happened multiple times here because of the unusual coincidence where PMC has scans of two journals which had articles with identical author, year and title but different content. Nemo 10:34, 6 January 2024 (UTC)Reply
The years don't necessarily match, and the titles are not always an exact match. I have shown the errors in a table below, calling the paper whose citation is erroneously added to paper L, and the paper whose PMC ID is erroneously added paper S.
| Diff | Field | Paper S | Paper L |
| --- | --- | --- | --- |
| 1193833171 | Title | Non-Separable and Planar Graphs | Non-Separable and Planar Graphs |
| | Author | Hassler Whitney | Hassler Whitney |
| | Journal | Proceedings of the National Academy of Sciences | Transactions of the American Mathematical Society |
| | Year | 1931 | 1932 |
| | DOI | doi:10.1073/pnas.17.2.125 | doi:10.1090/S0002-9947-1932-1501641-2 |
| 1193815125 | Title | On the Theory of Dynamic Programming | The theory of dynamic programming |
| | Author | Richard Bellman | Richard Bellman |
| | Journal | Proceedings of the National Academy of Sciences | Bulletin of the American Mathematical Society |
| | Year | 1952 | 1954 |
| | DOI | doi:10.1073/pnas.38.8.716 | doi:10.1090/S0002-9904-1954-09848-8 |
| 1193584598 | Title | Dynamical Systems with Two Degrees of Freedom | Dynamical Systems with Two Degrees of Freedom |
| | Author | George D. Birkhoff | George D. Birkhoff |
| | Journal | Proceedings of the National Academy of Sciences | Transactions of the American Mathematical Society |
| | Year | 1917 | 1917 |
| | DOI | doi:10.1073/pnas.3.4.314 | doi:10.1090/S0002-9947-1917-1501070-3 |
| 1193758371 and others | Title | New properties of all real functions | New properties of all real functions |
| | Author | Henry Blumberg | Henry Blumberg |
| | Journal | Proceedings of the National Academy of Sciences | Transactions of the American Mathematical Society |
| | Year | 1922 | 1922 |
| | DOI | doi:10.1073/pnas.8.10.283 | doi:10.1090/S0002-9947-1922-1501216-9 |
Dmoews (talk) 16:32, 6 January 2024 (UTC)Reply
Thanks, all these cases and similar ones should be fixed now. (By ignoring title matches.) Nemo 21:17, 7 January 2024 (UTC)Reply
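
The rule asked for in this thread ("if the journals do not match, regardless of similarities in authorship and title, do not add metadata") can be sketched as a conservative match check. The Python below is an illustration under stated assumptions, not the fix that was deployed: the `Work` record, `normalise` and `safe_to_attach_id` are hypothetical names, and the two sample records are the Whitney papers reported above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    title: str
    author: str
    journal: str
    year: int
    doi: str = ""


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace for a lenient comparison."""
    return " ".join(text.lower().split())


def safe_to_attach_id(cited: Work, candidate: Work) -> bool:
    """Accept a candidate only if it is the same publication as the citation.

    If both sides carry a DOI, the DOIs must agree. Otherwise title,
    author, journal and year must all agree; a title match alone is
    never sufficient.
    """
    if cited.doi and candidate.doi:
        return normalise(cited.doi) == normalise(candidate.doi)
    return (
        normalise(cited.title) == normalise(candidate.title)
        and normalise(cited.author) == normalise(candidate.author)
        and normalise(cited.journal) == normalise(candidate.journal)
        and cited.year == candidate.year
    )


paper_l = Work("Non-Separable and Planar Graphs", "Hassler Whitney",
               "Transactions of the American Mathematical Society", 1932,
               "10.1090/S0002-9947-1932-1501641-2")
paper_s = Work("Non-Separable and Planar Graphs", "Hassler Whitney",
               "Proceedings of the National Academy of Sciences", 1931,
               "10.1073/pnas.17.2.125")

print(safe_to_attach_id(paper_l, paper_s))  # False
```

The check is deliberately strict: it will miss some legitimate matches (for example, reprints with different years), which is the price of not attaching an identifier for a different paper.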

See also § Wrong PMC link from 2019 and § Wrong PMC link - April 2020. It seems that this insidious garbaging of citations has been going on for a long time and that the weak patches applied to fix specific instances of the problem have not actually fixed the problem. The bot needs to be much more careful about checking these matches than it apparently has been. —David Eppstein (talk) 17:37, 6 January 2024 (UTC)Reply

The April 2020 case was unrelated and caused by an incorrect DOI. The 2019 cases were because of our own title matching on Dissemin, which is not used by the bot now. Back then they were fixed by making the title matches more restrictive, in a way that should prevent all the cases above (PMC title matching multiple DOIs): phabricator:T228666. I had plans to revisit those restrictions at some point, will keep this in mind: phabricator:T228702.
Generally speaking, I agree it would be bad to have "weak patches applied to fix specific instances of the problem". I tend to avoid exceptions for specific papers or journals in OAbot, though sometimes I contribute exceptions to Unpaywall. We try to maintain fixes for specific occurrences in the form of unit tests to avoid regressions. Nemo 21:17, 7 January 2024 (UTC)Reply

see https://en.wikipedia.org/w/index.php?title=Woodlark_Basin&diff=1193931371&oldid=1179325953

The hdl generated by the bot, which has suddenly become active with new functionality, is not recognised: 20.500.12210/63872

The doi still works: 10.1038/s43247-022-00387-9

Suggest the hdl functionality be disabled for the time being for anything identified by a doi, as this is unnecessary duplication since doi is a subset of hdl.

There could be some cleanup to do! Does the bot need to be disabled/blocked yet again? ChaseKiwi (talk) 00:52, 8 January 2024 (UTC)Reply

Actually it's the opposite: every DOI is also a handle, but not vice versa. You can resolve a DOI like https://hdl.handle.net/10.1038/s43247-022-00387-9 but we don't generally put DOIs in the hdl parameter.
Thank you for reporting, I'll inform the repository admins. Usually, handle resolution failures like this are temporary issues with specific repositories. You can also add a direct link to the intended target which is https://hal.science/hal-03611693v1/document . Nemo 10:48, 8 January 2024 (UTC)Reply
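
The relationship described above (every DOI is a handle, but not every handle is a DOI) can be illustrated with a short Python sketch that builds resolver URLs. The helper names are hypothetical; the only facts relied on are the two resolver hosts mentioned in this thread and that DOI prefixes begin with "10.".

```python
from urllib.parse import quote


def handle_url(identifier: str) -> str:
    """Build a hdl.handle.net URL; works for plain handles and for DOIs."""
    return "https://hdl.handle.net/" + quote(identifier, safe="/")


def doi_url(doi: str) -> str:
    """Build a doi.org URL; only meaningful for DOIs."""
    return "https://doi.org/" + quote(doi, safe="/")


def looks_like_doi(identifier: str) -> bool:
    """DOIs are the handles whose prefix starts with '10.'."""
    return identifier.startswith("10.") and "/" in identifier


doi = "10.1038/s43247-022-00387-9"
hdl = "20.500.12210/63872"

print(handle_url(doi))      # https://hdl.handle.net/10.1038/s43247-022-00387-9
print(looks_like_doi(doi))  # True
print(looks_like_doi(hdl))  # False
```

Building the URL says nothing about whether the handle currently resolves; as noted above, that depends on the repository behind it.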

Another incorrect hdl link: Special:Diff/1195808135. The reference goes to a journal paper but the hdl goes to a Ph.D. dissertation. (They have the same name and author but that is not a good enough match to make this decision.) —David Eppstein (talk) 18:29, 15 January 2024 (UTC)Reply

Bot repeatedly adding doi-access=free (Giant pangolin)


Article: Giant pangolin
Referenced page:

https://doi.org/10.1111%2Faje.13279
(automatic redirection) → https://onlinelibrary.wiley.com/doi/10.1111/aje.13279
(after clicking 'Read the full text') → https://onlinelibrary.wiley.com/doi/full/10.1111/aje.13279

Bot's edit: Special:Diff/1228253952
Reverted: Special:Diff/1228447858
Re-insertion: Special:Diff/1229502557

But the access is not free – as I stated in the revert action, doi-access=free not true, the full text available through registration or purchase. --CiaPan (talk) 11:37, 18 June 2024 (UTC)Reply

Incorrect doi-access=free


Special:Diff/950835738 added incorrect doi-access=free to "Restricted access" sources. 2600:4041:35E:4A00:AC48:659B:3743:6105 (talk) 06:05, 3 July 2024 (UTC)Reply

One barnstar


Thanks for citing! Have a nice day 14.102.171.218 (talk) 17:09, 8 July 2024 (UTC)Reply

No longer running?


The last edit was almost a month ago, on September 18, 2024. So9q (talk) 08:10, 9 October 2024 (UTC)Reply

A barnstar for you!

  The Special Barnstar
Open Access Electrou (formerly Susbush) (talk) 09:53, 11 October 2024 (UTC)Reply