-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Add documentation about type conversion. #556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Including how to disable conversion for specific types. For PyMySQL#520
=============== | ||
|
||
By default PyMySQL will convert MySQL data types into native Python types. The | ||
full list of conversions is: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's only half. PyMySQL converts Python data into SQL too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point; I'll add a separate table based on the list of encoders
.
=============== ============================================= | ||
MySQL type Python type | ||
=============== ============================================= | ||
``BIT`` [No conversion] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What [No conversion]
means? Maybe, bytes are better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, just "bytes"? Or bytes
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
``bytes``
MySQL type Python type | ||
=============== ============================================= | ||
``BIT`` [No conversion] | ||
``TINY`` ``Int`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use int
instead of Int
for Python type.
full list of conversions is: | ||
|
||
=============== ============================================= | ||
MySQL type Python type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What "MySQL type" means?
It's not directly relating to MySQL Data types.
It's actually Query Response's column types.
``LONG_BLOB`` ``String`` | ||
``STRING`` ``String`` | ||
``VAR_STRING`` ``String`` | ||
``VARCHAR`` ``String`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of them are str or bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, like:
str
or bytes
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes.
=============== ============================================= | ||
|
||
Items marked * are returned as ``None`` if they are of an invalid format or are | ||
invalid dates. For example, these valid ``DATETIME`` values:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it's current behavior, I think it is implementation detail, and I'm thinking about changing this
to return string instead of None.
Documenting it makes harder to change it. Maybe, I won't change if it's documented behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's useful to mention the issue with invalid dates not being converted to native python date/time types, as it took me a while to figure out what was happening when this happened to me.
Instead of removing this section we could say that this behaviour is likely to change in the future, to return strings (which sounds a good idea) rather than None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add note about (sorry, my English is bad, please rewrite to more good sentence):
- Customizing converter is hack and not recommended. It's integrated implementation detail very tightly, so it's difficult to keep backward compatibility sometimes.
- You should try converting at higher layer like ORM always when possible.
- If your DB contains broken data, you should fix it, while you can use converter hack for temporal workaround.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you'd prefer not to have this page in the documentation at all that's fine :) I thought it might be useful, but if you think it's going to cause you difficulties then we can just leave it alone? Let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't will wasting your work. I just want strong message "don't use this, unless no other way!" in this document.
This feature is kept with many pitfalls and special dirty hacks. This feature makes difficult to finding and fixing bugs without regression or backward incompatibility.
For example, see PyMySQL/mysqlclient#145 is one recent regression caused by converter, and PyMySQL/mysqlclient#155 is my attempt to fix the issue by dirty hack.
Another example is this dirty code in https://github.com/PyMySQL/mysqlclient-python/blob/master/MySQLdb/connections.py
:
db = proxy(self)
def _get_string_literal():
# Note: string_literal() is called for bytes object on Python 3 (via bytes_literal)
def string_literal(obj, dummy=None):
return db.string_literal(obj)
return string_literal
converter (both of encoder and decoder) should know encoding of connection.
But converters doesn't have encoding argument. So string encoder is overwritten by connection constructor. And weakref is required to avoid circular reference (it cause leak at Python 2. I now hate Python 2 too...).
Decoder has another dirty hack too.
If no one use converters, I can add new argument and pass necessary information via argument. Then I'll be able to remove lots of unmaintainable dirty hacks!!!
I want to do it someday. But it breaks backward compatibility.
That's why I don't want to people using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem - I'll revise things next week; no more time until then I'm afraid.
See `this StackOverflow answer <http://stackoverflow.com/a/6882884/250962>`_ for | ||
more detail on MySQL's handling of dates/times like these. | ||
|
||
If you know your database contains such dates you may want to prevent PyMSQL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/PyMSQL/PyMySQL/
=============== | ||
Data Conversion | ||
=============== | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add these notes at top:
- Information in this page is implementation detail. Normal PyMySQL users don't need this information.
- This is backdoor for low level problem. Don't use this for (de)serializing application layer types. Use ORM instead.
I'm not sure this will be as uncommon as you may think. While at first the change in #520 to convert to strings sounds ok, it actually caused issues for us. We also had many rows with 0000-00-00 00:00:00 in a datetime field in DB which is fine in MySQL but not in python. Up through 0.7.11, this was not an issue as they were handled as None through the stack. This is similar to how other mysql drivers function, fwiw (returning None in such cases). I tested an upgrade to 0.8.0 and the change modifies data coming back not only by not returning None in cases where it used to, but returning a different type altogether than what you would expect (a string rather than datetime). We can certainly account for it application side and add other checks, but defaulting to this change in 0.8.0 without appropriate documentation does indeed cause issues, especially given ORMs such as SQLAlchemy don't account for it and handle either (as of the latest version of SQLAlchemy, you can define your attribute as datetime but it'll still come back a string if using pymysql 0.8.0) |
What the "this" means? |
I meant the change in #520 that would require overriding conv or custom conversion hack as of 0.8.0. In 0.7.11, pymysql did automatic conversion to python datetime objects or None, as documented above in this PR.
However, this actually isn't accurate anymore as of 0.8.0. You mentioned above you were thinking of returning a String (not doing a conversion to None, I assume) for invalid dates. This change (#520) looks like it's already merged. At the very least, the documentation above should be adjusted to address that. Personally, I prefer it to return |
You can update your DB to drop all invalid dates too. Or you can convert string to None in application layer too.
What other drivers mean? Do you have list of behavior of drivers? I want to make mysqlclient and PyMySQL as close as possible. |
Hm, MySQL Connector/Python seems return None for invalid date. |
@methane this is my point. As of 0.8.0 pymysql does not return None. It returns a string. On dropping invalid dates from DB, I think it's nice in theory, but unfortunately not always possible in some environments (ie here, where multiple teams are using the DB that is the way it's been for years and changing/dropping the way it works is easier said than done due to concern around causing issues in production). Personally, I agree that if a date is not provided, it should be NULL rather than a NOT NULL field with a default value of 0000-00-00, but sometimes I don't get my way :( Converting to None in application layer is a little bit of work but doable. It would be nice if SQLAlchemy handled this automatically to maintain consistency between different connectors it supports (ie always return None, or the string, since it may vary between connectors and SQLAlchemy is supposed to abstract it all away), but currently it doesn't seem to enforce that (ie you can list an attribute as a In any case, just pointing out that at least for us, the recent change to returning a 'string' did lead to issues and wasn't entirely backwards compatible, and I had to roll back to previous version of pymysql. |
Hi Everyone, What is the status of this pull request? These conversions are very important piece of documentation and I'm wondering how accurate these changes are. Can I assume these type conversions are correct? Thanks. |
Including how to disable conversion for specific types.
I think what I've said is correct but do make sure that I haven't got something wrong!
I added a whole new page for this, but happy to organise it differently if you have a better idea in mind.
For #520