TarFile.extractall() got an unexpected keyword argument 'filter' #31521

@TE-YongweiSun

Describe the bug

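The latest version, 1.7.0, can be installed on Python 3.10, but it calls TarFile.extractall() with the filter parameter, which was added in Python 3.12 and backported only to later security releases of older versions (see: https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractall ). On a Python 3.10 build without the backport, the following call fails: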

fp.extractall(path=target_dir, filter="data")

As a result, when I attempted to download the 20newsgroups dataset, an error occurred:

  File "\xxx\sklearn\utils\_param_validation.py", line 218, in wrapper
    return func(*args, **kwargs)
  File "\xxx\sklearn\datasets\_twenty_newsgroups.py", line 322, in fetch_20newsgroups
    cache = _download_20newsgroups(
  File "\xxx\sklearn\datasets\_twenty_newsgroups.py", line 87, in _download_20newsgroups
    fp.extractall(path=target_dir, filter="data")
TypeError: TarFile.extractall() got an unexpected keyword argument 'filter'
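
One possible way to tolerate interpreters without filter support is feature detection, which the tarfile documentation suggests via hasattr(tarfile, "data_filter"). The sketch below is only an illustration, not the scikit-learn fix: safe_extractall and build_demo_archive are hypothetical names, and falling back to an unfiltered extraction is an assumption that gives up the protection the "data" filter provides.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def safe_extractall(tar, target_dir):
    """Extract `tar` into `target_dir`, passing filter="data" only if supported.

    Hypothetical helper, not part of scikit-learn.
    """
    # Detect the feature instead of comparing version numbers, because
    # extraction filters were backported to some older Python releases.
    if hasattr(tarfile, "data_filter"):
        tar.extractall(path=target_dir, filter="data")
    else:
        # Assumed fallback: unfiltered extraction, only for trusted archives.
        tar.extractall(path=target_dir)


def build_demo_archive():
    """Build a small in-memory .tar.gz holding one file (demo data only)."""
    payload = b"hello"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name="demo/hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as target_dir:
        with tarfile.open(fileobj=build_demo_archive(), mode="r:gz") as fp:
            safe_extractall(fp, target_dir)
        print((Path(target_dir) / "demo" / "hello.txt").read_bytes())
```

Until a fix is released, upgrading to a Python 3.10 patch release that includes the backported filter support may also avoid the error.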

Steps/Code to Reproduce

from sklearn.datasets import fetch_20newsgroups
cats = ['alt.atheism', 'sci.space']
newsgroups_train = fetch_20newsgroups(subset='train', categories=cats)

Expected Results

list(newsgroups_train.target_names)
newsgroups_train.filenames.shape
newsgroups_train.target.shape
newsgroups_train.target[:10]

Actual Results

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "\xxx\sklearn\utils\_param_validation.py", line 218, in wrapper
    return func(*args, **kwargs)
  File "\xxx\sklearn\datasets\_twenty_newsgroups.py", line 322, in fetch_20newsgroups
    cache = _download_20newsgroups(
  File "\xxx\sklearn\datasets\_twenty_newsgroups.py", line 87, in _download_20newsgroups
    fp.extractall(path=target_dir, filter="data")
TypeError: TarFile.extractall() got an unexpected keyword argument 'filter'

Versions

`1.7.0`
