Commit d442ec3

ambv authored and gsnedders committed
use charade instead of chardet for Python 2.6 - 3.3 compatibility
1 parent efa4e61 commit d442ec3

8 files changed, 17 additions and 19 deletions
.travis.yml

Lines changed: 0 additions & 2 deletions
@@ -28,8 +28,6 @@ before_install:
 install:
   - pip install -r requirements.txt -r requirements-test.txt --use-mirrors
   - if [[ $USE_OPTIONAL == "true" ]]; then pip install -r requirements-optional.txt --use-mirrors; fi
-  - if [[ $TRAVIS_PYTHON_VERSION != 3.* && $USE_OPTIONAL == "true" ]]; then pip install -r requirements-optional-2.txt --use-mirrors; fi
-  - if [[ $TRAVIS_PYTHON_VERSION == 3.* && $USE_OPTIONAL == "true" ]]; then pip install -r requirements-optional-3.txt --use-mirrors; fi
   - if [[ $TRAVIS_PYTHON_VERSION != "pypy" && $USE_OPTIONAL == "true" ]]; then pip install -r requirements-optional-cpython.txt --use-mirrors; fi
   - if [[ $FLAKE == "true" ]]; then pip install --use-mirrors flake8; fi
 

README.rst

Lines changed: 3 additions & 4 deletions
@@ -30,10 +30,9 @@ Optionally:
 
 - ``genshi`` has a treewalker (but not builder); and
 
-- ``chardet`` can be used as a fallback when character encoding cannot
-  be determined (note currently this is only packaged on PyPI for
-  Python 2, though several package managers include unofficial ports
-  to Python 3).
+- ``charade`` can be used as a fallback when character encoding cannot
+  be determined; ``chardet``, from which it was forked, can also be used
+  on Python 2.
 
 
 Installation
Installation

debug-info.py

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
     "maxsize": sys.maxsize
 }
 
-search_modules = ["chardet", "datrie", "genshi", "html5lib", "lxml", "six"]
+search_modules = ["charade", "chardet", "datrie", "genshi", "html5lib", "lxml", "six"]
 found_modules = []
 
 for m in search_modules:
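
Note: the hunk above only shows the start of the loop over search_modules. As a rough sketch of how such a probe could work (this is an illustration, not the body of debug-info.py; the helper name find_modules is hypothetical), each name can be imported in turn and kept only if the import succeeds:

```python
import importlib


def find_modules(module_names):
    """Return the subset of module_names that can be imported (hypothetical helper)."""
    found = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Optional dependency is not installed; leave it out of the report.
            continue
        found.append(name)
    return found


if __name__ == "__main__":
    search_modules = ["charade", "chardet", "datrie", "genshi", "html5lib", "lxml", "six"]
    print(find_modules(search_modules))
```

Listing both charade and chardet means the report shows which of the two detectors, if any, is present on the machine.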

html5lib/inputstream.py

Lines changed: 4 additions & 1 deletion
@@ -457,7 +457,10 @@ def detectEncoding(self, parseMeta=True, chardet=True):
         if encoding is None and chardet:
             confidence = "tentative"
             try:
-                from chardet.universaldetector import UniversalDetector
+                try:
+                    from charade.universaldetector import UniversalDetector
+                except ImportError:
+                    from chardet.universaldetector import UniversalDetector
                 buffers = []
                 detector = UniversalDetector()
                 while not detector.done:
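
Note: the change above prefers charade and falls back to chardet. A minimal standalone sketch of the same pattern follows, assuming both packages expose UniversalDetector with feed(), close(), done and result, as chardet does. The function names load_universal_detector and guess_encoding and the chunk_size parameter are hypothetical and do not appear in html5lib.

```python
def load_universal_detector():
    """Return UniversalDetector from charade, else chardet, else None."""
    try:
        from charade.universaldetector import UniversalDetector
    except ImportError:
        try:
            from chardet.universaldetector import UniversalDetector
        except ImportError:
            return None
    return UniversalDetector


def guess_encoding(data, chunk_size=512):
    """Feed bytes to the detector in chunks; return an encoding name or None."""
    detector_class = load_universal_detector()
    if detector_class is None:
        # Neither detector is installed, so no guess can be made.
        return None
    detector = detector_class()
    for start in range(0, len(data), chunk_size):
        if detector.done:
            # The detector is already confident; stop feeding early.
            break
        detector.feed(data[start:start + chunk_size])
    detector.close()
    return detector.result.get("encoding")
```

Returning None when neither package is importable lets the caller treat the detector as optional, which matches how the README describes it: a fallback, not a requirement.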

html5lib/tests/test_encoding.py

Lines changed: 5 additions & 2 deletions
@@ -53,9 +53,12 @@ def test_encoding():
             yield (runPreScanEncodingTest, test[b'data'], test[b'encoding'])
 
 try:
-    import chardet # flake8: noqa
+    try:
+        import charade # flake8: noqa
+    except ImportError:
+        import chardet # flake8: noqa
 except ImportError:
-    print("chardet not found, skipping chardet tests")
+    print("charade/chardet not found, skipping chardet tests")
 else:
     def test_chardet():
         with open(os.path.join(test_dir, "encoding" , "chardet", "test_big5.txt"), "rb") as fp:
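
Note: the test module above defines test_chardet only when one of the two detectors imports. For comparison, here is a sketch of the same idea written with unittest skip decorators. It is not what this commit does, and it relies on importlib.util.find_spec, which needs Python 3.4 or later, so it would not run on the 2.6 - 3.3 range this commit targets. The names has_any and DetectorSmokeTest are hypothetical.

```python
import importlib
import importlib.util
import unittest


def has_any(module_names):
    """True if at least one of the named top-level modules can be located."""
    return any(importlib.util.find_spec(name) is not None for name in module_names)


@unittest.skipUnless(has_any(["charade", "chardet"]),
                     "charade/chardet not found, skipping chardet tests")
class DetectorSmokeTest(unittest.TestCase):
    def test_detect_returns_encoding_key(self):
        # Prefer charade, mirroring the import order used in the commit.
        name = "charade" if has_any(["charade"]) else "chardet"
        detector = importlib.import_module(name)
        self.assertIn("encoding", detector.detect(b"hello"))
```

With this form the skipped test appears in the runner's report instead of only producing a print at import time.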

requirements-optional-2.txt

Lines changed: 0 additions & 3 deletions
This file was deleted.

requirements-optional-3.txt

Lines changed: 0 additions & 6 deletions
This file was deleted.

requirements-optional.txt

Lines changed: 4 additions & 0 deletions
@@ -5,3 +5,7 @@ genshi
 # DATrie can be used in place of our Python trie implementation for
 # slightly better parsing performance.
 datrie
+
+# charade can be used as a fallback in case we are unable to determine
+# the encoding of a document.
+charade
