Skip to content

JSON type fails in Python 3 w/ any string, Python 2 w/ non-ascii data #488

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
zzzeek opened this issue Jul 24, 2016 · 10 comments · Fixed by #490 or #496
Closed

JSON type fails in Python 3 w/ any string, Python 2 w/ non-ascii data #488

zzzeek opened this issue Jul 24, 2016 · 10 comments · Fixed by #490 or #496

Comments

@zzzeek
Copy link
Contributor

zzzeek commented Jul 24, 2016

The JSON string comes back as bytes on Python 3 which is not what json.loads() expects. This is also inconsistent vs. mysqlclient. I'm not 100% sure there isn't something else that makes this work, since I had some trouble reproducing and I thought maybe it was working in 0.7.1, but now it's not working there either.

import MySQLdb

import pymysql


import json

def run_test(dbapi):
    print("Running DBAPI: %s" % dbapi)
    conn = dbapi.connect(user='scott', passwd='tiger', db="test", charset='utf8', use_unicode=0)
    cursor = conn.cursor()
    cursor.execute("drop table if exists json_test")
    cursor.execute("create table json_test (data JSON)")
    cursor.execute("insert into json_test (data) values (%s)", ['{"foo": "bar"}'])
    cursor.execute("select data from json_test")
    result = cursor.fetchone()[0]
    print(json.loads(result))

run_test(MySQLdb)
run_test(pymysql)

output:

Running DBAPI: <module 'MySQLdb' from '/opt/python3.5/lib/python3.5/site-packages/MySQLdb/__init__.py'>
{'foo': 'bar'}
Running DBAPI: <module 'pymysql' from '/opt/python3.5/lib/python3.5/site-packages/pymysql/__init__.py'>
Traceback (most recent call last):
  File "test.py", line 20, in <module>
    run_test(pymysql)
  File "test.py", line 17, in run_test
    print(json.loads(result))
  File "/opt/python3.5/lib/python3.5/json/__init__.py", line 312, in loads
    s.__class__.__name__))
TypeError: the JSON object must be str, not 'bytes'
zzzeek added a commit to zzzeek/sqlalchemy that referenced this issue Jul 24, 2016
Change-Id: I4f935cce878eb1678b432f4cb4ba6ee122f8e9d4
@zzzeek zzzeek changed the title JSON type fails in Python 3 JSON type fails in Python 3 w/ any string, Python 2 w/ non-ascii data Jul 24, 2016
@zzzeek
Copy link
Contributor Author

zzzeek commented Jul 24, 2016

On Py2k, things don't work when there's unicode data present:

# coding: utf-8

import MySQLdb

import pymysql


data = u'{"foo": "méil"}'

def run_test(dbapi):
    print("Running DBAPI: %s" % dbapi)
    conn = dbapi.connect(user='scott', passwd='tiger', db="test", charset='utf8')
    cursor = conn.cursor()
    cursor.execute("drop table if exists json_test")
    cursor.execute("create table json_test (data JSON)")
    cursor.execute("insert into json_test (data) values (%s)", [data])
    cursor.execute("select data from json_test")
    result = cursor.fetchone()[0]
    print(result)

run_test(MySQLdb)
run_test(pymysql)

output:

Traceback (most recent call last):
File "test.py", line 22, in
run_test(pymysql)
File "test.py", line 17, in run_test
cursor.execute("select data from json_test")
File "/home/classic/dev/sqlalchemy/.tox/py27-cext-mysql/lib/python2.7/site-packages/pymysql/cursors.py", line 146, in execute
result = self._query(query)
File "/home/classic/dev/sqlalchemy/.tox/py27-cext-mysql/lib/python2.7/site-packages/pymysql/cursors.py", line 296, in _query
conn.query(q)
File "/home/classic/dev/sqlalchemy/.tox/py27-cext-mysql/lib/python2.7/site-packages/pymysql/connections.py", line 819, in query
self._affected_rows = self._read_query_result(unbuffered=unbuffered)
File "/home/classic/dev/sqlalchemy/.tox/py27-cext-mysql/lib/python2.7/site-packages/pymysql/connections.py", line 1001, in _read_query_result
result.read()
File "/home/classic/dev/sqlalchemy/.tox/py27-cext-mysql/lib/python2.7/site-packages/pymysql/connections.py", line 1292, in read
self._read_result_packet(first_packet)
File "/home/classic/dev/sqlalchemy/.tox/py27-cext-mysql/lib/python2.7/site-packages/pymysql/connections.py", line 1347, in _read_result_packet
self._read_rowdata_packet()
File "/home/classic/dev/sqlalchemy/.tox/py27-cext-mysql/lib/python2.7/site-packages/pymysql/connections.py", line 1385, in _read_rowdata_packet
rows.append(self._read_row_from_packet(packet))
File "/home/classic/dev/sqlalchemy/.tox/py27-cext-mysql/lib/python2.7/site-packages/pymysql/connections.py", line 1396, in _read_row_from_packet
data = data.decode(encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 10: ordinal not in range(128)

@methane
Copy link
Member

methane commented Jul 25, 2016

@zzzeek I think #484 fixes it. Regression was introduced in 0.7.5.
I'll release 0.7.6 in this week. But could you confirm the issue is fixed in master branch?

@zzzeek
Copy link
Contributor Author

zzzeek commented Jul 25, 2016

master branch does not resolve, I still get bytes back on py3k, "'ascii' codec can't decode byte 0xc3 in position 10: ordinal not in range(128)" on py2k. I am also in some circumstances getting the error "(pymysql.err.InternalError) Packet sequence number wrong - got 5 expected 1" which I've also seen with previous versions though it is more difficul to reproduce.

@zzzeek
Copy link
Contributor Author

zzzeek commented Jul 26, 2016

yeah that seems to have fixed thanks!

@zzzeek
Copy link
Contributor Author

zzzeek commented Jul 26, 2016

Im still having some issues, not with the simple test here, ill try to make another test.

@zzzeek
Copy link
Contributor Author

zzzeek commented Jul 26, 2016

here we go

# coding: utf-8

import MySQLdb
import pymysql


statement = 'SELECT CAST(%(param_1)s AS JSON) AS anon_1'
params = {'param_1': '{"data": {"k1": "dr\\u00f4le"}, "r\\u00e9veill\\u00e9": "r\\u00e9veill\\u00e9"}'}

def run_test(dbapi):
    print("Running DBAPI: %s" % dbapi)
    conn = dbapi.connect(user='scott', passwd='tiger', db="test", charset='utf8')
    cursor = conn.cursor()
    cursor.execute(statement, params)
    result = cursor.fetchone()[0]
    print(result)

run_test(MySQLdb)
run_test(pymysql)

py27 output:

Running DBAPI: <module 'MySQLdb' from '/home/classic/dev/sqlalchemy/.tox/py27/lib/python2.7/site-packages/MySQLdb/__init__.pyc'>
{"data": {"k1": "drôle"}, "réveillé": "réveillé"}
Running DBAPI: <module 'pymysql' from '/home/classic/dev/sqlalchemy/.tox/py27/lib/python2.7/site-packages/pymysql/__init__.pyc'>
Traceback (most recent call last):
  File "test2.py", line 19, in <module>
    run_test(pymysql)
  File "test2.py", line 14, in run_test
    cursor.execute(statement, params)
  File "/home/classic/dev/sqlalchemy/.tox/py27/lib/python2.7/site-packages/pymysql/cursors.py", line 161, in execute
    result = self._query(query)
  File "/home/classic/dev/sqlalchemy/.tox/py27/lib/python2.7/site-packages/pymysql/cursors.py", line 317, in _query
    conn.query(q)
  File "/home/classic/dev/sqlalchemy/.tox/py27/lib/python2.7/site-packages/pymysql/connections.py", line 837, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/home/classic/dev/sqlalchemy/.tox/py27/lib/python2.7/site-packages/pymysql/connections.py", line 1021, in _read_query_result
    result.read()
  File "/home/classic/dev/sqlalchemy/.tox/py27/lib/python2.7/site-packages/pymysql/connections.py", line 1311, in read
    self._read_result_packet(first_packet)
  File "/home/classic/dev/sqlalchemy/.tox/py27/lib/python2.7/site-packages/pymysql/connections.py", line 1370, in _read_result_packet
    self._read_rowdata_packet()
  File "/home/classic/dev/sqlalchemy/.tox/py27/lib/python2.7/site-packages/pymysql/connections.py", line 1408, in _read_rowdata_packet
    rows.append(self._read_row_from_packet(packet))
  File "/home/classic/dev/sqlalchemy/.tox/py27/lib/python2.7/site-packages/pymysql/connections.py", line 1427, in _read_row_from_packet
    data = converter(data)
  File "/home/classic/dev/sqlalchemy/.tox/py27/lib/python2.7/site-packages/pymysql/converters.py", line 339, in convert_json
    return b.decode('utf-8')
  File "/home/classic/dev/sqlalchemy/.tox/py27/lib64/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf4' in position 19: ordinal not in range(128)

py35 output:

  Running DBAPI: <module 'MySQLdb' from '/home/classic/dev/sqlalchemy/.tox/py35/lib/python3.5/site-packages/MySQLdb/__init__.py'>
  {"data": {"k1": "drôle"}, "réveillé": "réveillé"}
  Running DBAPI: <module 'pymysql' from '/home/classic/dev/sqlalchemy/.tox/py35/lib/python3.5/site-packages/pymysql/__init__.py'>
  Traceback (most recent call last):
    File "test2.py", line 19, in <module>
      run_test(pymysql)
    File "test2.py", line 14, in run_test
      cursor.execute(statement, params)
    File "/home/classic/dev/sqlalchemy/.tox/py35/lib/python3.5/site-packages/pymysql/cursors.py", line 161, in execute
      result = self._query(query)
    File "/home/classic/dev/sqlalchemy/.tox/py35/lib/python3.5/site-packages/pymysql/cursors.py", line 317, in _query
      conn.query(q)
    File "/home/classic/dev/sqlalchemy/.tox/py35/lib/python3.5/site-packages/pymysql/connections.py", line 837, in query
      self._affected_rows = self._read_query_result(unbuffered=unbuffered)
    File "/home/classic/dev/sqlalchemy/.tox/py35/lib/python3.5/site-packages/pymysql/connections.py", line 1021, in _read_query_result
      result.read()
    File "/home/classic/dev/sqlalchemy/.tox/py35/lib/python3.5/site-packages/pymysql/connections.py", line 1311, in read
      self._read_result_packet(first_packet)
    File "/home/classic/dev/sqlalchemy/.tox/py35/lib/python3.5/site-packages/pymysql/connections.py", line 1370, in _read_result_packet
      self._read_rowdata_packet()
    File "/home/classic/dev/sqlalchemy/.tox/py35/lib/python3.5/site-packages/pymysql/connections.py", line 1408, in _read_rowdata_packet
      rows.append(self._read_row_from_packet(packet))
    File "/home/classic/dev/sqlalchemy/.tox/py35/lib/python3.5/site-packages/pymysql/connections.py", line 1427, in _read_row_from_packet
      data = converter(data)
    File "/home/classic/dev/sqlalchemy/.tox/py35/lib/python3.5/site-packages/pymysql/converters.py", line 339, in convert_json
      return b.decode('utf-8')
  AttributeError: 'str' object has no attribute 'decode'

@zzzeek zzzeek reopened this Jul 26, 2016
@methane
Copy link
Member

methane commented Jul 27, 2016

PyMySQL respects encoding specified in column descriptor packet.
But it makes many issues since packets returned by MySQL (or MariaDB, etc...) are unstable.

In next major version (0.8), I will stop reading encoding in column descriptor.
But for now, I should hack to make it works.

@methane
Copy link
Member

methane commented Jul 27, 2016

note.

test 1

#488 (comment)

column charsetnr=63 (binary)

When changing connection encoding to 'latin1', charsetnr=binary, but contents is encoded in latin1.

test 2

#488 (comment)

column charsetnr=33 (utf8_general_ci)

When changing connection encoding to 'latin1', charsetnr=8 (latin1_swedish_ci)

methane added a commit to methane/PyMySQL that referenced this issue Jul 28, 2016
methane added a commit that referenced this issue Jul 28, 2016
* Fix SELECT CAST(... AS JSON) cause UnicodeError

fixes #488
@zzzeek
Copy link
Contributor Author

zzzeek commented Jul 28, 2016

works great. Release! :) thanks for the great response time.

@methane
Copy link
Member

methane commented Jul 29, 2016

Released 0.7.6

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 4, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
2 participants