Commit a0b64fb

Fix text data tutorial
- Fix a typo.
- Fix a floating point error in doctests.
- Fix `VisibleDeprecationWarning` due to conversion of an array with ndim > 0 to an index.

Signed-off-by: Rohan Jain <crodjer@gmail.com>
1 parent 1dbc069 commit a0b64fb
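
The third item in the commit message concerns using the array returned by ``predict`` directly as a list index. A minimal sketch of the pattern, assuming a hypothetical three-entry ``target_names`` list and a hand-built length-1 array standing in for the result of ``gs_clf.predict(...)``:

```python
import numpy as np

# Hypothetical stand-ins (not the tutorial's real data): a short list of
# class names and a length-1 array like the one predict() returns for a
# single input document.
target_names = ["alt.atheism", "comp.graphics", "soc.religion.christian"]
predicted = np.array([2])

# Using `predicted` itself as the index relies on an implicit
# array-to-index conversion, which the commit message reports as
# triggering a deprecation warning for arrays with ndim > 0.
# Selecting element [0] first gives a NumPy integer scalar,
# which is a valid list index.
label = target_names[predicted[0]]
print(label)
```

Taking ``[0]`` makes the intent explicit: one document was passed in, so exactly one predicted class index is read back out.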

1 file changed (+3, -3 lines)

doc/tutorial/text_analytics/working_with_text_data.rst

Lines changed: 3 additions & 3 deletions
@@ -184,7 +184,7 @@ The most intuitive way to do so is the bags of words representation:
 
 The bags of words representation implies that ``n_features`` is
 the number of distinct words in the corpus: this number is typically
-larger that 100,000.
+larger than 100,000.
 
 If ``n_samples == 10000``, storing ``X`` as a numpy array of type
 float32 would require 10000 x 100000 x 4 bytes = **4GB in RAM** which
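
The memory estimate quoted in the context lines of this hunk can be checked with a few lines of arithmetic. This is only a sketch of the calculation; the variable names are illustrative and nothing is actually allocated:

```python
import numpy as np

# Figures taken from the tutorial text shown in the hunk above.
n_samples = 10_000
n_features = 100_000

# Size of one float32 value in bytes (4).
bytes_per_value = np.dtype(np.float32).itemsize

# Total size of a dense (n_samples, n_features) float32 array,
# computed without creating the array.
dense_bytes = n_samples * n_features * bytes_per_value
print(dense_bytes)        # 4000000000
print(dense_bytes / 1e9)  # 4.0, i.e. 4 GB in decimal units
```
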
@@ -443,13 +443,13 @@ to speed up the computation::
 The result of calling ``fit`` on a ``GridSearchCV`` object is a classifier
 that we can use to ``predict``::
 
->>> twenty_train.target_names[gs_clf.predict(['God is love'])]
+>>> twenty_train.target_names[gs_clf.predict(['God is love'])[0]]
 'soc.religion.christian'
 
 The object's ``best_score_`` and ``best_params_`` attributes store the best
 mean score and the parameters setting corresponding to that score::
 
->>> gs_clf.best_score_
+>>> gs_clf.best_score_ # doctest: +ELLIPSIS
 0.900...
 >>> for param_name in sorted(parameters.keys()):
 ... print("%s: %r" % (param_name, gs_clf.best_params_[param_name]))
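
The ``+ELLIPSIS`` directive added in this hunk lets ``...`` in the expected output match any remaining digits, so the doctest no longer depends on the exact floating point value. A small sketch using the standard ``doctest`` module; the ``SCORE`` value is a made-up stand-in for ``gs_clf.best_score_``, not a result from the tutorial:

```python
import doctest

# Hypothetical score; only its leading digits matter for the check.
SCORE = 0.9003314917127072

# A one-example doctest that mirrors the changed tutorial line.
example = """
>>> SCORE  # doctest: +ELLIPSIS
0.900...
"""

# Parse and run the example with SCORE available as a global.
parser = doctest.DocTestParser()
test = parser.get_doctest(example, {"SCORE": SCORE}, "ellipsis_demo", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

print(results.failed, results.attempted)
```

Without the directive, the expected text ``0.900...`` would be compared literally against the full printed float and the example would fail.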