
Hebrew Wikipedia data from citations with identifiers is empty
Open, Stalled, Medium, Public

Description

Mark Graham just reported that the hewiki* files in the "citations with identifiers" dataset [1] are empty. This looks like a parsing error, since Hebrew Wikipedia definitely contains DOIs and ISBNs. Filing this as a bug in case we have bandwidth to look into it.

cc @Miriam and @bmansurov

[1] https://analytics.wikimedia.org/datasets/archive/public-datasets/all/mwrefs/mwcites-20180301/
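One quick way to sanity-check the claim that identifiers are present in the source text is to scan a sample of hewiki wikitext directly. The following is a minimal sketch only: the regular expressions are simplified assumptions and are not the patterns mwcites uses, and `find_identifiers` is a hypothetical helper name introduced here for illustration.

```python
import re

# Simplified illustrative patterns (assumptions, NOT the mwcites extractors).
# DOI: "10.", a 4-9 digit registrant code, "/", then a suffix that ends at
# whitespace or common wikitext delimiters.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s|\]}<>]+")

# ISBN: the literal "ISBN" (any case), optional separators, then 10 or 13
# digits that may be split by spaces or hyphens; the last ISBN-10 character
# may be "X".
ISBN_RE = re.compile(
    r"\bISBN[\s:=]*((?:97[89][\s-]?)?\d(?:[\s-]?\d){8}[\s-]?[\dXx])\b",
    re.IGNORECASE,
)


def find_identifiers(wikitext):
    """Return the DOIs and ISBNs found in a string of wikitext (hypothetical helper)."""
    # Strip trailing sentence punctuation that the DOI suffix pattern may swallow.
    dois = [m.group(0).rstrip(".,;") for m in DOI_RE.finditer(wikitext)]
    # Normalize ISBNs by dropping spaces and hyphens.
    isbns = [re.sub(r"[\s-]", "", m.group(1)) for m in ISBN_RE.finditer(wikitext)]
    return {"doi": dois, "isbn": isbns}


if __name__ == "__main__":
    # Made-up sample text, not taken from an actual hewiki article.
    sample = "{{cite|doi=10.1000/xyz123|title=דוגמה}} ISBN 978-3-16-148410-0"
    print(find_identifiers(sample))
```

If a check like this finds identifiers in hewiki page text while the published hewiki* files stay empty, that would point at the extraction step rather than at the source data.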

Event Timeline

DarTar created this task.
bmansurov raised the priority of this task from Low to Medium. May 6 2019, 5:49 PM
bmansurov changed the task status from Open to Stalled. May 28 2019, 3:04 AM

I tried generating citations from the hewiki dumps of 2019/05/01, but ran into a bug in the mwcites software, as described here. My guess is that this bug is what caused the issue in the task description. Let's wait until it is fixed.

@Miriam should we prioritize this bug to be fixed?

Aklapper subscribed.

Removing inactive task assignee account. (Please do so as part of offboarding.)