Skip to content

Add documentation about type conversion. #556

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

philgyford
Copy link

Including how to disable conversion for specific types.

I think what I've said is correct but do make sure that I haven't got something wrong!

I added a whole new page for this, but happy to organise it differently if you have a better idea in mind.

For #520

Including how to disable conversion for specific types.

For PyMySQL#520
===============

By default PyMySQL will convert MySQL data types into native Python types. The
full list of conversions is:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only half. PyMySQL converts Python data into SQL too.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point; I'll add a separate table based on the list of encoders.

=============== =============================================
MySQL type Python type
=============== =============================================
``BIT`` [No conversion]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What [No conversion] means? Maybe, bytes are better.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, just "bytes"? Or bytes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

``bytes``

MySQL type Python type
=============== =============================================
``BIT`` [No conversion]
``TINY`` ``Int``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use int instead of Int for Python type.

full list of conversions is:

=============== =============================================
MySQL type Python type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What "MySQL type" means?
It's not directly relating to MySQL Data types.
It's actually Query Response's column types.

``LONG_BLOB`` ``String``
``STRING`` ``String``
``VAR_STRING`` ``String``
``VARCHAR`` ``String``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of them are str or bytes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, like:

str or bytes

?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes.

=============== =============================================

Items marked * are returned as ``None`` if they are of an invalid format or are
invalid dates. For example, these valid ``DATETIME`` values::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it's current behavior, I think it is implementation detail, and I'm thinking about changing this
to return string instead of None.
Documenting it makes harder to change it. Maybe, I won't change if it's documented behavior.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's useful to mention the issue with invalid dates not being converted to native python date/time types, as it took me a while to figure out what was happening when this happened to me.

Instead of removing this section we could say that this behaviour is likely to change in the future, to return strings (which sounds a good idea) rather than None?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add note about (sorry, my English is bad, please rewrite to more good sentence):

  • Customizing converter is hack and not recommended. It's integrated implementation detail very tightly, so it's difficult to keep backward compatibility sometimes.
  • You should try converting at higher layer like ORM always when possible.
  • If your DB contains broken data, you should fix it, while you can use converter hack for temporal workaround.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you'd prefer not to have this page in the documentation at all that's fine :) I thought it might be useful, but if you think it's going to cause you difficulties then we can just leave it alone? Let me know.

Copy link
Member

@methane methane Mar 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't will wasting your work. I just want strong message "don't use this, unless no other way!" in this document.

This feature is kept with many pitfalls and special dirty hacks. This feature makes difficult to finding and fixing bugs without regression or backward incompatibility.

For example, see PyMySQL/mysqlclient#145 is one recent regression caused by converter, and PyMySQL/mysqlclient#155 is my attempt to fix the issue by dirty hack.

Another example is this dirty code in https://github.com/PyMySQL/mysqlclient-python/blob/master/MySQLdb/connections.py
:

        db = proxy(self)
        def _get_string_literal():
            # Note: string_literal() is called for bytes object on Python 3 (via bytes_literal)
            def string_literal(obj, dummy=None):
                return db.string_literal(obj)
            return string_literal

converter (both of encoder and decoder) should know encoding of connection.
But converters doesn't have encoding argument. So string encoder is overwritten by connection constructor. And weakref is required to avoid circular reference (it cause leak at Python 2. I now hate Python 2 too...).
Decoder has another dirty hack too.

If no one use converters, I can add new argument and pass necessary information via argument. Then I'll be able to remove lots of unmaintainable dirty hacks!!!

I want to do it someday. But it breaks backward compatibility.
That's why I don't want to people using it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem - I'll revise things next week; no more time until then I'm afraid.

See `this StackOverflow answer <http://stackoverflow.com/a/6882884/250962>`_ for
more detail on MySQL's handling of dates/times like these.

If you know your database contains such dates you may want to prevent PyMSQL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/PyMSQL/PyMySQL/

===============
Data Conversion
===============

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add these notes at top:

  • Information in this page is implementation detail. Normal PyMySQL users don't need this information.
  • This is backdoor for low level problem. Don't use this for (de)serializing application layer types. Use ORM instead.

@rholloway
Copy link

I'm not sure this will be as uncommon as you may think. While at first the change in #520 to convert to strings sounds ok, it actually caused issues for us. We also had many rows with 0000-00-00 00:00:00 in a datetime field in DB which is fine in MySQL but not in python.

Up through 0.7.11, this was not an issue as they were handled as None through the stack. This is similar to how other mysql drivers function, fwiw (returning None in such cases).

I tested an upgrade to 0.8.0 and the change modifies data coming back not only by not returning None in cases where it used to, but returning a different type altogether than what you would expect (a string rather than datetime).

We can certainly account for it application side and add other checks, but defaulting to this change in 0.8.0 without appropriate documentation does indeed cause issues, especially given ORMs such as SQLAlchemy don't account for it and handle either (as of the latest version of SQLAlchemy, you can define your attribute as datetime but it'll still come back a string if using pymysql 0.8.0)

@methane
Copy link
Member

methane commented Apr 4, 2018

I'm not sure this will be as uncommon as you may think.

What the "this" means?
If you meant conv argument, you many underestimate how it's evil.
It waste my time a lot, may be more than 100 hours.
It made me tired about maintaining PyMySQL and mysqlclient.
It is broken by design before I start maintaining these projects.

@rholloway
Copy link

I meant the change in #520 that would require overriding conv or custom conversion hack as of 0.8.0.

In 0.7.11, pymysql did automatic conversion to python datetime objects or None, as documented above in this PR.

`DATETIME``    ``datetime.datetime`` *

However, this actually isn't accurate anymore as of 0.8.0. You mentioned above you were thinking of returning a String (not doing a conversion to None, I assume) for invalid dates. This change (#520) looks like it's already merged.

At the very least, the documentation above should be adjusted to address that. Personally, I prefer it to return datetime or None, as this aligns with other drivers and SQLAlchemy. If it is going to return datetime or string, hopefully we can get SQLAlchemy to do the conversion from string to None for invalid dates to maintain consistency with other drivers in the ORM layer...

@methane
Copy link
Member

methane commented Apr 4, 2018

I meant the change in #520 that would require overriding conv or custom conversion hack as of 0.8.0.

You can update your DB to drop all invalid dates too. Or you can convert string to None in application layer too.
So hack is not strong requirement for driver layer.

to maintain consistency with other drivers in the ORM layer...

What other drivers mean? Do you have list of behavior of drivers?

I want to make mysqlclient and PyMySQL as close as possible.
So I will change mysqlclient too. Or I will revert #520 in PyMySQL.
You are the only person who report suffering from #520. So I'm not sure about I should revert or not.

@methane
Copy link
Member

methane commented Apr 4, 2018

@rholloway
Copy link

@methane this is my point. As of 0.8.0 pymysql does not return None. It returns a string.

On dropping invalid dates from DB, I think it's nice in theory, but unfortunately not always possible in some environments (ie here, where multiple teams are using the DB that is the way it's been for years and changing/dropping the way it works is easier said than done due to concern around causing issues in production). Personally, I agree that if a date is not provided, it should be NULL rather than a NOT NULL field with a default value of 0000-00-00, but sometimes I don't get my way :(

Converting to None in application layer is a little bit of work but doable. It would be nice if SQLAlchemy handled this automatically to maintain consistency between different connectors it supports (ie always return None, or the string, since it may vary between connectors and SQLAlchemy is supposed to abstract it all away), but currently it doesn't seem to enforce that (ie you can list an attribute as a Date() attribute or DATETIME() but still get a string back in the case of pymysql). I suppose it could be converted using properties or other techniques rather than the standard attributes, but that feels pretty messy as well to do for every date related attribute rather than rely on the standards SQLAlchemy provides.

In any case, just pointing out that at least for us, the recent change to returning a 'string' did lead to issues and wasn't entirely backwards compatible, and I had to roll back to previous version of pymysql.

@jon-goodrum
Copy link

Hi Everyone,

What is the status of this pull request? These conversions are very important piece of documentation and I'm wondering how accurate these changes are. Can I assume these type conversions are correct?

Thanks.

@methane methane closed this Nov 21, 2019
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 31, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants