-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[MRG] Updated docstrings for TfidfVectorizer functions #15509
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hailey0huong ! A few comments below.
sklearn/feature_extraction/text.py
Outdated
Returns | ||
------- | ||
feature_names : list | ||
A list of feature name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line should be indented only 4 spaces more than the one above
sklearn/feature_extraction/text.py
Outdated
@@ -1770,11 +1807,14 @@ def fit(self, raw_documents, y=None): | |||
Parameters | |||
---------- | |||
raw_documents : iterable | |||
an iterable which yields either str, unicode or file objects | |||
An iterable which has either str, unicode or file objects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please revert "has -> yields", as an iterable does yield objects.
sklearn/feature_extraction/text.py
Outdated
"""Return a callable that handles preprocessing, tokenization | ||
|
||
""" | ||
Return a callable that handles preprocessing, tokenization |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please revert the added line break.
It should be
"""Return ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just updated the format as your suggestions. Thanks!
Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>
Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>
Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>
Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>
Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>
Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On last comment otherwise LGTM.
Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>
Thanks, merging! The Codecov failure is not relevant (only docstrings changed in this PR). |
Reference Issues/PRs
Reference #15440
What does this implement/fix? Explain your changes.
Updated docstrings for TfidfVectorizer functions to pass numpydoc validation
Any other comments?
There are some methods left under this class got
TypeError
while runningpython maint_tools/test_docstrings.py
that I don't know how to fix:idf_
,norm
,smooth_idf
,sublinear_tf
,use_idf
The error is below:
Traceback (most recent call last): File "maint_tools/test_docstrings.py", line 173, in <module> msg = repr_errors(res, method=args.import_path) File "maint_tools/test_docstrings.py", line 112, in repr_errors for code, message in res["errors"] TypeError: sequence item 0: expected str instance, NoneType found