-
Notifications
You must be signed in to change notification settings - Fork 442
Regression in bulk insert behavior #163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
A |
@methane should we revert that commit? This is a pretty bad regression preventing us from upgrading |
Why can't you just remove comment? |
We use comments to uniquely identify queries such that it can be known from the server where the (problematic) query came from. It's incredibly useful for detecting "where did this come from and who and why and how can we stop it". Comments are valid sql, refusing to support them seems not the best idea. |
executemany() don't refuse it. |
Correct it doesn't refuse it, but it turns a bulk insert (1.3.8) which was a single fast query into thousands (millions even!). This kills the db server. |
execute() sends the query to MySQL. It should support all valid SQL as possible. |
understood. This used to work but was broken in a patch release. This works correctly with |
Because it may cause false positive and break query. |
You already break query (this issue)! If there's a demonstrated test with the fix will you accept a small pull request? |
This is an almost identical (in idea) PR which was accepted: #154 Why the difference of opinion here? |
Because I don't think |
#161 may have false positive. While I didn't test it, I feel it can match entire #162 may have false positive too. Most important problem is maintenance cost. Both of PyMySQL and MySQL-python had been unmaintained for years. And I'm not active enough too. Regular expression is one of part which makes project maintenance difficult. You can override cursor and implement your own executemany(). |
If I craft a regular expression which provably cannot match nested comments would you accept a pull request? |
It's also bad practice on github to outright close pull requests without suggesting improvements. I understand you don't want to maintain this project but that doesn't mean we have to be rude to each other :) |
Here's a regex which does not match too much:
I'll make a PR with this instead |
Err sorry, need a little bit more complicated (but still pretty readable):
|
For what it's worth, we can use |
If it's:
|
I don't have any idea improving such pull request. |
Easy to understand is very important. Current regex is cleary don't support last comment. |
So I think user should adjust their SQL to match the regular expression. My plan is make the regex more strict, zero false positive. |
Additionally, I'm making difference between PyMySQL and mysqlclient smaller. Since this project is GPL, please send pull request to PyMySQL. |
I've updated #165 and opened PyMySQL/PyMySQL#566 |
I saw it. But I dislike such complex regex only for trailing comment. I don't want merge it. I have more safer idea about
But I don't have time to verify the idea. Especially, I want to know how If you really want trailing comma support, could you try it? |
The regex is dead simple (much much simpler than the parentheses one right above it!), it looks more complicated because of the non capturing groups. Without them it is much more readable:
|
@methane trailing comment is becoming quite a common pattern in SQL querying, specifically for the purposes of annotating where in source code a query came from. For example, here is a ruby gem whose sole feature is to add these sorts of annotations to queries. It's developed and maintained by basecamp (owned by Rails creator DHH), and they've been using it in production since 2012. Notably,
Supporting query annotations (via comments) seems perfectly valid to me, and there's plenty of prior art for supporting this in other ORMs. |
In general, yes. But executemany() is very special purpose method. |
Can you explain to me how executemany() is "very special purpose"? I genuinely don't feel like it needs to be. |
execute() is general, executemany() is special. Is there any question about that??? |
If I accept this, why not heading comment? comment between
My design is queries matching simple pattern are convert to "bulk insert" query. But keeping the pattern minimum is my design decision. Instead, you can make your library for building query. I have no time and energy to talk about this issue anymore. |
Your argument is a tautology. "executemany is special because it is special"
Yes!!! These are all valid SQL queries that can be executed in bulk!! Supporting these would be fantastic for users of your library. I am happy to help add support for these.
I would agree with you, were it not for two points:
|
On the other hand, building "bulk insert" is not driver's job. It needs knowledge of SQL.
I hate it!! That's what I don't want to maintain!!!
Current complicity is needed to avoid false positive. Avoiding false positive is important to use
NO! It's not broken. It just doesn't support all insert SQL. But I'm tired to talking with you. I'm considering completely remove the feature. |
The following query should be able to be executed in bulk:
With mysqlclient 1.3.6, this query is emitted as a bulk insert.
I upgraded mysqclient to 1.3.9, and suddenly, this query is emitted as two individual inserts.
The text was updated successfully, but these errors were encountered: