[MRG] Updated docstrings for TfidfVectorizer functions #15509

hailey0huong · 2019-11-02T22:02:19Z

Reference Issues/PRs

Reference #15440

What does this implement/fix? Explain your changes.

Updated docstrings for TfidfVectorizer functions so that they pass numpydoc validation.

Any other comments?

Some remaining members of this class raise a TypeError when I run python maint_tools/test_docstrings.py, and I don't know how to fix them: idf_, norm, smooth_idf, sublinear_tf, use_idf

The error is below:
Traceback (most recent call last):
  File "maint_tools/test_docstrings.py", line 173, in <module>
    msg = repr_errors(res, method=args.import_path)
  File "maint_tools/test_docstrings.py", line 112, in repr_errors
    for code, message in res["errors"]
TypeError: sequence item 0: expected str instance, NoneType found
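For context, this TypeError message is the one `str.join` raises when one of the joined items is `None`. Below is a minimal sketch of that failure mode. It assumes a hypothetical list of `(code, message)` pairs standing in for the script's real `res["errors"]`; the function names and error codes are illustrative and are not taken from `maint_tools/test_docstrings.py`.

```python
def format_errors_strict(errors):
    """Join the codes directly; fails if any code is None."""
    # str.join only accepts str items, so a None code raises TypeError.
    return "\n".join(code for code, message in errors)


def format_errors_tolerant(errors):
    """Format each pair with an f-string, which accepts None values."""
    return "\n".join(f"{code}: {message}" for code, message in errors)


# Hypothetical validation result: the first entry has no error code.
errors = [(None, "no docstring found"), ("X01", "summary is missing")]

try:
    format_errors_strict(errors)
except TypeError as exc:
    print(exc)  # sequence item 0: expected str instance, NoneType found

print(format_errors_tolerant(errors))
```

Whether the real script should coerce such values or skip those members entirely is a separate design question; the sketch only shows where the message comes from.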

rth

Thanks @hailey0huong! A few comments below.


rth · 2019-11-04T00:08:33Z

sklearn/feature_extraction/text.py

+        Returns
+        -------
+        feature_names : list
+                    A list of feature name.


This line should be indented only 4 spaces more than the one above.
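As an illustration of the requested layout, here is a sketch of a numpydoc-style `Returns` section in which the description sits exactly 4 spaces deeper than the `name : type` line. The function is a hypothetical stub, not the scikit-learn method, and the returned values are made up.

```python
def get_feature_names():
    """Return the feature names (illustrative stub).

    Returns
    -------
    feature_names : list
        A list of feature names.
    """
    # Placeholder values, only so that the stub returns something.
    return ["apple", "banana"]
```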


rth · 2019-11-04T00:10:07Z

sklearn/feature_extraction/text.py

@@ -1770,11 +1807,14 @@ def fit(self, raw_documents, y=None):
        Parameters
        ----------
        raw_documents : iterable
-            an iterable which yields either str, unicode or file objects
+            An iterable which has either str, unicode or file objects.


Please revert "has -> yields", as an iterable does yield objects.
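To make the wording concrete, here is a small sketch of an iterable that yields documents one at a time, mixing a plain string and a file-like object. The helper names are hypothetical, and `read_all` is only a stand-in consumer; it is not how scikit-learn decodes its input.

```python
import io


def iter_raw_documents():
    """Yield documents lazily (hypothetical helper)."""
    yield "the first document"
    yield io.StringIO("a file-like document")


def read_all(raw_documents):
    """Consume an iterable of str or file-like objects into a list of str."""
    texts = []
    for doc in raw_documents:
        # File-like objects expose .read(); plain strings are used as-is.
        texts.append(doc.read() if hasattr(doc, "read") else doc)
    return texts


print(read_all(iter_raw_documents()))
```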


rth · 2019-11-04T00:11:39Z

sklearn/feature_extraction/text.py

-        """Return a callable that handles preprocessing, tokenization
-
+        """
+        Return a callable that handles preprocessing, tokenization


Please revert the added line break.
It should be

"""Return ...
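A sketch of the requested form, with the summary starting on the same line as the opening quotes. Both functions are hypothetical: `build_analyzer` is a stub that simply returns `str.split`, and `summary_on_first_line` is an illustrative check, not part of any validation tool.

```python
def build_analyzer():
    """Return a callable that handles preprocessing and tokenization.

    Illustrative stub only; the real method lives in scikit-learn.
    """
    return str.split


def summary_on_first_line(func):
    """Check that the docstring summary directly follows the opening quotes."""
    first_line = func.__doc__.splitlines()[0]
    # A docstring that starts with a line break has an empty first line.
    return bool(first_line.strip())


print(summary_on_first_line(build_analyzer))
```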

I just updated the formatting per your suggestions. Thanks!

Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>

rth

One last comment, otherwise LGTM.

sklearn/feature_extraction/text.py

Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>

rth · 2019-11-15T20:31:49Z

Thanks, merging! The Codecov failure is not relevant (only docstrings changed in this PR).

hailey0huong added 2 commits November 2, 2019 14:49

update docstrings for TfidfVectorizer functions

fa85558

fixed some formatting errors

1a5ed9c

TomDLT added the Documentation label Nov 2, 2019

rth mentioned this pull request Nov 4, 2019

MAINT Improve method detection in numpydoc validation script #15532

Merged

rth reviewed Nov 4, 2019


hailey0huong and others added 7 commits November 3, 2019 19:41

Update sklearn/feature_extraction/text.py

47aa30d

Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>

Update sklearn/feature_extraction/text.py

fb41ad4

Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>

Update sklearn/feature_extraction/text.py

0eb2d55

Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>

Update sklearn/feature_extraction/text.py

caf3628

Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>

Update sklearn/feature_extraction/text.py

a53f2d8

Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>

Update sklearn/feature_extraction/text.py

931b7ef

Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>

Updated some formatting issues

8783f1b

rth approved these changes Nov 4, 2019


Update sklearn/feature_extraction/text.py

eb01c8b

Co-Authored-By: Roman Yurchak <rth.yurchak@gmail.com>

rth merged commit 25a88b4 into scikit-learn:master Nov 15, 2019

adrinjalali pushed a commit to adrinjalali/scikit-learn that referenced this pull request Nov 18, 2019

DOC docstrings validation in TfidfVectorizer (scikit-learn#15509)

bf85fb4

adrinjalali pushed a commit to adrinjalali/scikit-learn that referenced this pull request Nov 18, 2019

DOC docstrings validation in TfidfVectorizer (scikit-learn#15509)

ef27989

adrinjalali pushed a commit that referenced this pull request Nov 19, 2019

DOC docstrings validation in TfidfVectorizer (#15509)

012841a

panpiort8 pushed a commit to panpiort8/scikit-learn that referenced this pull request Mar 3, 2020

DOC docstrings validation in TfidfVectorizer (scikit-learn#15509)

3bb37a9
