Inflection-94 Improve Wikidata coverage in dictionary-parser #95
#
# These are lexemes that should either be ignored due to irrelevance that can't be easily tagged as irrelevant,
# or words that are just not that common that should be sorted last in the inflection patterns.
L128740=omit
This noun is atypical, and it conflicts with the common pronoun. Omit it for now to avoid the conflict.
# These are lexemes that should either be ignored due to irrelevance that can't be easily tagged as irrelevant,
# or words that are just not that common that should be sorted last in the inflection patterns.
L128740=omit
L166820=omit
This is стола, which is also the feminine form of стол (table), so the two conflict. There are ways to resolve this, but let's exclude this lexeme for now.
@@ -2,9 +2,10 @@
#
# These are lexemes that should either be ignored due to irrelevance that can't be easily tagged as irrelevant,
# or words that are just not that common that should be sorted last in the inflection patterns.
L15388=rare
L299075=omit
# TODO remove this, since it is fixed upstream.
L342586=omit
This can be removed after the next Wikidata dump is consumed.
@@ -120,6 +120,7 @@ public String toString() {

    enum Tense {
        PAST,
        DISTANT_PAST,
This tense is a Hindi concept that applies to a single word.
Resolves #94
Several changes can be made to reduce the number of test failures across the languages being transitioned to Wikidata.
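The override entries discussed in this review follow a simple `L<id>=<action>` shape, with `omit` and `rare` as the only values visible in the diffs. As a rough illustration of how such a file could be read, here is a minimal Java sketch. The class name `LexemeOverrides`, the `Action` enum, and the `parse` method are hypothetical and are not taken from the dictionary-parser code; the real parser may handle these files differently.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: none of these names come from the dictionary-parser source.
final class LexemeOverrides {
    /** Assumed actions, mirroring the two values that appear in the diffs. */
    enum Action { OMIT, RARE }

    /** Parses lines such as "L128740=omit", skipping blank lines and '#' comments. */
    static Map<String, Action> parse(List<String> lines) {
        Map<String, Action> overrides = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // comment or blank line
            }
            int eq = line.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Expected <lexeme>=<action>: " + line);
            }
            String lexemeId = line.substring(0, eq).trim();
            String action = line.substring(eq + 1).trim();
            // Enum.valueOf throws IllegalArgumentException for unknown actions.
            overrides.put(lexemeId, Action.valueOf(action.toUpperCase(Locale.ROOT)));
        }
        return overrides;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "#",
            "# Lexemes to ignore or to sort last.",
            "L15388=rare",
            "L299075=omit");
        System.out.println(parse(sample));
    }
}
```

Keeping the actions in an enum means a typo in the override file fails loudly at load time instead of being silently ignored, which seems useful for a list that is edited by hand as conflicts are found.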