Hi everyone,
We regularly get feedback that Wikidata (and Wikibase in general) needs an
API that’s easier to understand and use in order to get more people to
build tools that use data from Wikidata and other Wikibase instances. The
existing action API has several problems. The biggest ones are that it’s
not a widely used standard, that it is not versioned, and that it is not
well suited for Wikibase’s structured data (as opposed to MediaWiki’s usual
wikitext). Over the past few weeks we’ve looked at ways to improve the
situation and have come up with a draft for a REST API for Wikibase. We’d
love to have your feedback on it.
*What we want to achieve with it:*
- Provide a more industry-standard and versioned way to access and
manipulate data in Wikibase. This will make it easier for programmers to
get started building tools with and around Wikidata and other Wikibase
instances.
- Provide an API that is more tailored to the Wikibase data model to,
for example, make it possible to get exactly the part of an Entity you need
instead of the whole Entity.
- Solve a number of issues in the current API that are easier to solve
with REST.
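To illustrate the second goal, a tool could request just one part of an Item rather than the full entity document. The base path and route layout below are purely hypothetical, sketched only to show the idea; the actual routes will be defined in the specification:

```python
# Hypothetical sketch only: the base path and route layout are
# illustrative assumptions, not part of the draft specification.
BASE = "https://www.wikidata.org/w/rest.php/wikibase/v0"

def entity_part_url(item_id: str, part: str) -> str:
    """Build a URL for one part (e.g. 'labels', 'descriptions',
    'statements') of an Item, instead of fetching the whole entity."""
    return f"{BASE}/entities/items/{item_id}/{part}"

print(entity_part_url("Q42", "labels"))
```

A client would then GET that URL and receive only the labels, which is cheaper than downloading and parsing the complete entity.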
*A few things to keep in mind:*
- This is only touching the Wikibase-specific API modules, not any of
the others that MediaWiki provides.
- We’ve started with the specification around Items and Properties. Once
we are sure the direction is good, we will look at other parts of the data
model and content like Lexemes and MediaInfo.
- For existing users of the action API: Nothing changes for you for now.
If the feedback is positive, and we go ahead, it’ll take us some time to
actually implement the proposed changes. It is very important for us to
hear your feedback now to ensure that the new API meets your needs in the
future.
If you are building tools around and on top of Wikidata/Wikibase, please
have a look at the draft and give us your feedback. You can find all the
information at Wikidata:REST API
<https://www.wikidata.org/wiki/Wikidata:REST_API>. Please leave your
feedback on the talk page there, so we have it all in one place.
Thank you!
--
Mohammed Sadat
*Community Communications Manager for Wikidata/Wikibase*
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Hi all,
Join the Research Team at the Wikimedia Foundation [1] for their monthly
Office hours on Tuesday, 2020-11-03 at 17:00-18:00 UTC (9am PT / 6pm CET).
To participate, join the video-call via this Wikimedia-meet link [2]. There
is no set agenda - feel free to add your item to the list of topics in the
etherpad [3] (you can do this after you join the meeting, too); otherwise,
you are welcome to just hang out. More detailed information (e.g.
about how to attend) can be found here [4].
Through these office hours, we aim to make ourselves more available to
answer some of the research-related questions that you as Wikimedia
volunteer editors, organizers, affiliates, staff, and researchers face in
your projects and initiatives. Some example cases we hope to be able to
support you in:
- You have a specific research-related question that you suspect you
should be able to answer with the publicly available data, and you
don’t know how to find an answer for it, or you just need some more
help with it. For example: how can I compute the ratio of anonymous
to registered editors in my wiki?
- You run into repetitive or very manual work as part of your Wikimedia
contributions and you wish to find out if there are ways to use
machines to improve your workflows. These types of questions can
sometimes be harder to answer during an office hour; however,
discussing them helps us understand your challenges better, and we may
find ways to work with each other to address them in the future.
- You want to learn what the Research team at the Wikimedia Foundation
does and how we can potentially support you. Specifically for
affiliates: if you are interested in building relationships with the
academic institutions in your country, we would love to talk with you
and learn more. We have a series of programs that aim to expand the
network of Wikimedia researchers globally, and we would love to
collaborate more closely with those of you interested in this space.
- You want to talk with us about one of our existing programs [5].
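For the first example question above (the ratio of anonymous to registered editors), one possible approach uses the public Wikimedia Analytics (AQS) editors endpoint. The parameter values below follow its documented URL scheme, but please double-check the current API documentation before relying on them:

```python
# Sketch of answering "what is the ratio of anonymous to registered
# editors in my wiki?" with the Wikimedia Analytics REST API.
# Parameter values are assumptions based on the documented URL scheme.
AQS = "https://wikimedia.org/api/rest_v1/metrics/editors/aggregate"

def editors_url(project: str, editor_type: str, start: str, end: str) -> str:
    """Monthly count of editors of one type ('anonymous' or 'user')."""
    return (f"{AQS}/{project}/{editor_type}/all-page-types/"
            f"all-activity-levels/monthly/{start}/{end}")

anon_url = editors_url("de.wikipedia.org", "anonymous", "20200101", "20200201")
reg_url = editors_url("de.wikipedia.org", "user", "20200101", "20200201")
# Fetch both URLs (e.g. with urllib.request.urlopen), read the
# "editors" count from each JSON response, and divide the two numbers
# to get the anonymous-to-registered ratio.
```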
Hope to see many of you,
Martin (WMF Research Team)
[1] https://research.wikimedia.org/team.html
[2] https://meet.wmcloud.org/ResearchOfficeHours
[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours
[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours
[5] https://research.wikimedia.org/projects.html
--
Martin Gerlach
Research Scientist
Wikimedia Foundation
The Search Platform Team
<https://www.mediawiki.org/wiki/Wikimedia_Search_Platform> usually holds
office hours the first Wednesday of each month. Come talk to us about
anything related to Wikimedia search, Wikidata Query Service, Wikimedia
Commons Query Service, etc.!
Feel free to add your items to the Etherpad Agenda for the next meeting.
Details for our next meeting:
Date: Wednesday, November 4th, 2020
Time: 16:00-17:00 GMT / 08:00-09:00 PST / 11:00-12:00 EST / 17:00-18:00 CET
Etherpad: https://etherpad.wikimedia.org/p/Search_Platform_Office_Hours
Google Meet link: https://meet.google.com/vyc-jvgq-dww
Join by phone in the US: +1 786-701-6904 PIN: 262 122 849#
Hope to talk to you in a week!
—Trey
Trey Jones
Sr. Computational Linguist, Search Platform
Wikimedia Foundation
UTC-4 / EDT
Hello all,
Today, on Wikidata’s birthday, I’m excited to announce some news about the
next edition of the WikidataCon
<https://www.wikidata.org/wiki/Wikidata:WikidataCon_2021>.
The WikidataCon is an event organized by Wikimedia Germany and focused on
the *Wikidata community in a broad sense*: editors, tools builders, but
also third-party reusers, partner organizations that are using or
contributing to the data, the ecosystem of organizations working with
Wikibase. The content of the conference will have some parts dedicated to
people who want to learn more about Wikidata, some workshops and
discussions for the community to share skills and exchange about their
practices, and some space left to include side events for specific projects
(WikiCite, Wikibase, GLAM, etc.).
During the first two editions, we gathered an international crowd in Berlin
for a few days. However, with the global COVID-19 pandemic still affecting
the world, and the forecast for 2021 not indicating much improvement, the
situation doesn’t allow us to plan a traditional onsite international
conference. In order to allow the event to take place, and to ensure the
safety of all participants, we had to make some clear decisions. In 2021,
we will not gather all participants in Berlin, and we will avoid any
international travel. Instead, we are experimenting with a *hybrid format* for
the conference: most of the content and interactions will *take place
online*, and small, *local gatherings* will be possible, if the situation
allows it.
In a similar way to the distributed Wikidata birthday events, we will
encourage *Wikimedia chapters, local groups and communities* to organize
their part of the event, to support and gather people inside their country,
and to contribute to the content, for example by running talks, workshops
and discussions. Wikimedia Germany will provide the technical
infrastructure and support the coordination of these distributed events.
We’re already inviting Wikimedia organizations to include the event in
their 2021 plan (especially if you’re requesting an APG), and to reach out
to me if you want to be involved in the distributed conference.
On top of this, we would like to partner with a *Wikimedia organization
outside of Europe/North America*, to strengthen the Wikidata community in
their country and to allow a fairer distribution of content and speakers.
We are currently evaluating several possibilities, focusing on groups who
have been very active with Wikidata-related events over the past years. Our
criteria include not only the motivation and past activities, but also the
ability of the group to support part of the WikidataCon organization
workload. The choice of the partner organization will be announced at the
end of November. If you would like to know more about the process, feel
free to contact me off-list and I’ll be glad to give you some details.
In these quite unpredictable times, organizing an event is a challenge, and
the WikidataCon 2021 will definitely be one of a kind: we will have to
adapt, to be agile, to make the best out of the situation, and to work
closely with the local Wikidata groups all over the world.
Since the first WikidataCon in 2017, I have been deeply committed to providing
nice experience for everyone and making sure that this event remains a
gathering for and by the Wikidata community. I’m really excited to
coordinate this project and run this new experiment with you all!
If you’re interested in joining the effort, if you have questions,
suggestions for formats, etc.: feel free to use this talk page
<https://www.wikidata.org/wiki/Wikidata_talk:WikidataCon_2021> or to reach
out to me off-list.
Cheers,
--
Léa Lacroix
Community Engagement Coordinator
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/029/42207.
Hello all,
Wikidata’s content is growing and our data is used in more and more
high-profile places. This means the pressure around data quality is rising.
We want to provide people with good data. One important piece in the data
quality puzzle is being able to *understand where we currently stand
quality-wise and how that changes over time*. We need to be able to do this
at scale and in an automated and repeatable way, because none of us wants
to check 90 million Items by hand.
That’s where ORES <https://www.mediawiki.org/wiki/ORES>, the machine
learning system, comes in. One of the things it can do is judge the quality
of an Item. Or, to be more exact, it can judge some aspects of the quality
of an Item. It puts each Item into a quality class between A (amazing) and
E (ewwww, terrible). It has been doing this for a while already, but the
quality judgments it provided were not very good. The reasons for this were
that it took only a limited number of signals into account (something like
the number of References on the Item or the number of Labels) and that it
was trained on rather old data. Since then Wikidata’s data has changed a
lot, so ORES could not tell what to do with new kinds of Items, like
astronomical objects, because it had never seen them before.
We wanted to improve that and *make the quality judgments ORES provides
better*. We did this by:
- adding a number of new signals (e.g. does this Item have an image
attached)
- changing existing signals (e.g. missing references on external ID
statements no longer punish the Item so much)
- retraining the model on more current data so it better understands
scientific papers, astronomical objects, etc.
While we were at it, we also wanted to better understand how data quality
changes over time on Wikidata. Previously we only looked at the global average
quality score. But how do Items change over time? How many Items are being
improved from D to C or even B class for example? To better understand this
we started creating diagrams like this one
<https://commons.wikimedia.org/wiki/File:Wikidata_quality_diagram,_January_2…>.
It shows the development from January 2019 to January 2020.
We’re happy to present these improvements for Wikidata birthday
<https://www.wikidata.org/wiki/Wikidata:Eighth_Birthday/Presents>, and we
hope this will help us get a better and more accurate view of the data
quality on Wikidata now.
If you want to see the quality score near the header on each Item you can
include the following user script in your Common.js
<https://www.wikidata.org/wiki/Special:MyPage/common.js> page:
importScript("User:EpochFail/ArticleQuality.js")
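If you would rather query the scores programmatically than display them via the user script, the same model can be reached through the ORES scoring API. The request below follows the general ORES v3 URL scheme; the revision ID is an illustrative placeholder:

```python
# Sketch of requesting an itemquality prediction from ORES for a given
# Wikidata revision. The revision ID below is an illustrative placeholder.
def itemquality_url(rev_ids):
    """Build an ORES v3 scores URL for the Wikidata itemquality model."""
    revids = "|".join(str(r) for r in rev_ids)
    return ("https://ores.wikimedia.org/v3/scores/wikidatawiki/"
            f"?models=itemquality&revids={revids}")

print(itemquality_url([123456789]))
# The JSON response contains, per revision, a predicted class (A-E)
# and a probability for each class.
```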
*What’s coming next on the same topic?*
- ORES can’t judge all aspects of quality. For example, it cannot tell
if a statement is generally considered true. We will look at ways of
judging this aspect of quality as well, but it’s considerably harder. If
you have ideas on how to go about it, let us know.
- We will build a small tool that’ll make it possible for you to provide
a list of Items and then get the quality of that subset of Wikidata, as
well as its lowest- and highest-quality Items. This will hopefully help
WikiProjects etc. to get a good overview of their data.
If you have any questions or feedback, or want to keep discussing Item
quality, feel free to use this talk page
<https://www.wikidata.org/wiki/Wikidata_talk:Item_quality>.
Cheers,
--
Léa Lacroix
Community Engagement Coordinator
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Hello all,
As you may know, in May 2020 we released new data for automated references,
as well as a game that you can use to associate references with statements.
This game contained 4,200 potential references (see
statistics). In the meantime, we parsed many more websites and collected
529K potential new references.
These new references will not be added to the game, because there are too
many of them to have their relevance checked by hand. As requested by some of you
after the previous announcement, we published the list of all references in
a dump available here
<https://analytics.wikimedia.org/published/datasets/periodic/wikidata-potent…>
.
Subsets of this dump can be reused by bots and tools. However, we advise
you to be careful when using it, and not to mass-import references into
Wikidata without careful review: the data is quite raw, and some references
may be wrong or irrelevant. In order to help you analyze these references
and filter the
most useful ones, we are also providing a dashboard
<https://wmdeanalytics.wmflabs.org/WD_GameReferenceHunt> containing an
overview of the judgements made in the game so you can see which parts are
more likely to be of higher or lower quality.
We’re happy to release the dumps and the dashboard just in time for
the Wikidata
birthday <https://www.wikidata.org/wiki/Wikidata:Eighth_Birthday/Presents>
:)
If you have any questions or encounter issues with the dump or the
dashboard, please let us know on the talk page
<https://www.wikidata.org/wiki/Wikidata_talk:Automated_finding_references_in…>
.
Cheers,
--
Léa Lacroix
Community Engagement Coordinator
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Hello all,
As a reminder, the 24-hour online meetup to celebrate Wikidata's birthday
is starting today at 17:00 UTC/GMT (18:00 in Central Europe), in a bit less
than two hours.
This informal discussion is open to everyone who would like to chat about
Wikidata and virtually meet other people from the community. The main topic
is "what makes you feel enthusiastic about Wikidata" and some people will
facilitate certain slots with a specific topic. You can find more
information here
<https://www.wikidata.org/wiki/Wikidata:Eighth_Birthday/24-hours_meetup>.
You're welcome to jump in at any point that is convenient for you! To
access the call, you will first need to register here
<https://pretix.eu/wikidata/wikidata24meetup/> (free of charge, valid email
address needed). An individual link to the BigBlueButton call will then be
sent to you one hour before the beginning of the event. (The service
provider complies with the GDPR, and the email address you provide will be
deleted after the event; we will not access or reuse it in any way.)
We're looking forward to chatting with you and to learning more about your
favorite projects, tools or anecdotes on Wikidata!
If you encounter any technical issue, feel free to contact me by email.
Cheers,
--
Léa Lacroix
Community Engagement Coordinator
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Today the 2020 WikiCite ‘virtual conference’ begins. We have 14 sessions,
over 3 days, in 5 languages.
All sessions are listed and described at the event homepage here:
https://meta.wikimedia.org/wiki/WikiCite/2020_Virtual_conference
(short URL for easy sharing https://w.wiki/gwo )
All sessions will be live-streamed on the WikipediaWeekly YouTube channel
(playlist:
https://www.youtube.com/playlist?list=PLL05-NbVFLBbr5vfeTN-LZReRRarH-zAh )
and also via the WikiCite Twitter account (
https://twitter.com/wikicite ). There are sessions being hosted at
appropriate times for all timezones. Certain hosts will also be live-streaming
into their community’s Facebook page (including sessions hosted by the
Brazilian, Swedish, Australian and Indonesian communities). Conversation
will be hosted on the WikiCite Telegram group
https://t.me/joinchat/FeOscRxFj17BRhwX-Mg7Sg All sessions will be uploaded
to Wikimedia Commons afterwards.
The program for today (Monday 26) is below, with YouTube links. For
details of each session and alternative viewing options, see the program
page linked above.
One hour from this message (at 10:00 UTC) is the opening session: “State
of WikiCite in 2020” [in English] presented by Daniel Mietchen, and hosted
by Jakob Voß & Eva Seidlmayer:
https://youtu.be/TmGpJnukbYU
Then,
- "Author items" [English] 13:00 - 16:45 UTC, hosted by Simon Cobb & Jason
Evans. Speakers: Simon Cobb; Arthur P Smith; Maria Gould; Tom Demeranville;
Finn Årup Nielsen; Jack Nunn. https://www.youtube.com/watch?v=wZUB62hp5dU
- "Hands-on: Wikidata-Einführung (Library Carpentry style)" [Deutsch] 14:00
- 17:00 UTC, hosted by Rabea Müller & Konrad Förstner
https://www.youtube.com/watch?v=l3gIZemmRnI
- "Citations in Swedish Parliamentary documents" [English] 17:00 - 20:00
UTC, hosted by Jan Ainali & Daniel Eriksson. Speakers: Daniel Mietchen;
Robin Linderborg. https://www.youtube.com/watch?v=yMtP64yFKn0
- "Research output items" [English] 22:00 - 24:00 UTC, hosted by Thomas
Shafee & Alex Lum. Speakers: Daniel Mietchen, Margaret Donald, Federico
Leva, Antonin Delpeuch, Amanda Lawrence, Thomas Shafee
https://www.youtube.com/watch?v=oUNw878Xonw
Sincerely,
*Liam Wyatt [Wittylama]*
WikiCite <https://meta.wikimedia.org/wiki/WikiCite> Program Manager
Wikimedia Foundation
For technical reasons, we can't send you the newsletter directly in this
email this week, but you can read it here:
https://www.wikidata.org/wiki/Wikidata:Status_updates/2020_10_26
Cheers,
--
Mohammed Sadat
*Community Communications Manager for Wikidata/Wikibase*
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Hi,
A group of volunteers has worked over the last year to add open data
from the Swedish Parliament to Wikidata. One of them, Daniel Eriksson,
wrote a blog post that we shared on Wikimedia Sverige's blog today, on what
the data means and what you can do with it.
Perhaps it would interest someone on this mailing list!
https://wikimedia.se/2020/10/26/analyze-swedish-politics-with-wikidata/
Best,
*Eric Luth*
Projektledare engagemang och påverkan | Project Manager, Involvement and
Advocacy
Wikimedia Sverige
eric.luth@wikimedia.se
+46 (0) 765 55 50 95
Support free knowledge, become a member of Wikimedia Sverige.
Read more at blimedlem.wikimedia.se