connection: fix O(n^2) reply parsing. #207
The data method in the Connection object handles incoming data chunks by concatenating them with a Buffer object. If the data chunks are each of bounded size, then this algorithm is O(n^2) in the total size of the reply. This leads to excessive CPU consumption and hangs when the replies are large -- reading a 40 MB BYTEA causes a very long pause when tested here. This commit introduces an efficient double-ended queue, with O(n) push and shift operations, and uses that instead of allocating a new concatenated Buffer for each chunk.
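The PR's actual implementation lives in lib/byte_queue.js (linked later in this thread); as a rough illustration of the idea only, not the PR's code, a queue of buffers copies each byte once, when it is shifted out:

```js
// Illustrative sketch, not the PR's code (see lib/byte_queue.js for the real
// one). Pushed chunks are stored as-is, so assembling a reply from N chunks
// costs O(total bytes) instead of O(total bytes * N).
class ByteQueue {
  constructor() {
    this.chunks = [] // queued Buffers, oldest first
    this.length = 0  // total queued bytes
  }

  push(chunk) {
    this.chunks.push(chunk)
    this.length += chunk.length
  }

  // Remove and return the first n bytes as a single Buffer.
  shift(n) {
    const out = Buffer.allocUnsafe(n)
    let offset = 0
    while (offset < n) {
      const head = this.chunks[0]
      const take = Math.min(head.length, n - offset)
      head.copy(out, offset, 0, take)
      if (take === head.length) this.chunks.shift()
      else this.chunks[0] = head.slice(take)
      offset += take
    }
    this.length -= n
    return out
  }
}
```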
Hi @dlbeer I tried your fork with https://github.com/porsager/postgres-benchmarks and found it to be 50-100% slower than current master. There are no big "bytea" responses in that test, but maybe you could add a benchmark for that too, so we can see if it's faster in practice. Even so, I wouldn't want to sacrifice performance for the 99% of use cases to improve 40 MB bytea responses.
On Tue, Aug 03, 2021 at 04:27:45AM -0700, Rasmus Porsager wrote:
> I tried your fork with https://github.com/porsager/postgres-benchmarks and found it to be 50-100% slower than current master.
> There are no big "bytea" responses in that test, but maybe you could add a benchmark for that too, so we can see if it's faster in practice. Even so, I wouldn't want to sacrifice performance for the 99% of use cases to improve 40 MB bytea responses.
Not sure if it was clear from my original description, but this isn't
specific to BYTEA, and it's not merely a special-case performance
improvement (that's just how I came across it).
If a malicious user is able to control the size of a response from the
database server, they can lock up the process for hours at a time.
I probably don't have time at the moment to optimize the ByteQueue
implementation myself, but I'd be happy to answer questions or check
over improvements that anyone else has to contribute.
Alright. Would be very nice if you could write a reproduction case for when this lock happens.
On Wed, Aug 04, 2021 at 12:13:46AM -0700, Rasmus Porsager wrote:
> Alright.
> Would be very nice if you could write a reproduction case for when this lock happens.
Do you mean add something to the benchmark suite, or just describe it?
The simplest way to trigger it is to create a table with a BYTEA or TEXT
column, store a large value (tens of MB, perhaps?), then SELECT it.
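As a rough sketch of that reproduction (table name and connection string are
made up here), something like the following shows the pause on an unpatched
version:

```js
import postgres from 'postgres'

const sql = postgres('postgres://localhost/test') // hypothetical connection string

// A single wide TEXT value is enough; the reply arrives in many small
// chunks, each of which triggers a full re-concatenation of the buffer.
await sql`CREATE TABLE IF NOT EXISTS big (id int, data text)`
await sql`INSERT INTO big VALUES (1, ${'x'.repeat(40 * 1024 * 1024)})`

console.time('select')
await sql`SELECT data FROM big WHERE id = 1` // a very long pause pre-fix
console.timeEnd('select')

await sql.end()
```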
No, just describe it like that 😉 Now I can take a better look at it. Thanks!
Not sure if it can help, but maybe we could pre-allocate the array passed to `backend`:

```js
let next_data = null

function data(x) {
  if (next_data) {
    // Grow the view over the pre-allocated message buffer by as many
    // bytes of x as fit.
    const new_length = Math.min(next_data.byteLength + x.byteLength, next_data.buffer.byteLength)
    const to_read_from_x = new_length - next_data.byteLength
    next_data = new Uint8Array(
      next_data.buffer,
      next_data.byteOffset,
      new_length
    );
    next_data.set(x.subarray(0, to_read_from_x), new_length - to_read_from_x)
    if (new_length < next_data.buffer.byteLength)
      return
    // next_data now holds the full `length + 1` bytes of the message
    backend[next_data[0]](Buffer.from(next_data.buffer, next_data.byteOffset, new_length))
    next_data = null
    x = x.slice(to_read_from_x)
  }
  buffer = buffer.length === 0
    ? x
    : Buffer.concat([buffer, x], buffer.length + x.length)
  while (buffer.length > 4) {
    length = buffer.readInt32BE(1)
    if (length >= buffer.length) {
      if (length > 65535) { // heuristic (should be based on benchmark results)
        // Pre-allocate the whole message up front and copy in what we
        // already have; the view's length tracks the bytes filled so far.
        next_data = Buffer.allocUnsafe(length + 1)
        next_data = new Uint8Array(
          next_data.buffer,
          next_data.byteOffset,
          buffer.byteLength
        );
        next_data.set(buffer, 0)
        buffer = Buffer.alloc(0)
      }
      break
    }
    backend[buffer[0]](buffer.slice(0, length + 1))
    buffer = buffer.slice(length + 1)
  }
}
```

(not tested) This is kind of a mix of previous solutions (avoid many …)
On Wed, Aug 04, 2021 at 03:06:36PM -0700, Minigugus wrote:
> buffer = buffer.length === 0
>   ? x
>   : Buffer.concat([buffer, x], buffer.length + x.length)
This is the part that causes the problem. What happens is that incoming
data arrives in chunks of no more than some fixed size (k).
If you need to assemble a sequence of N chunks before you have a
complete reply, then you have a sequence of N calls to Buffer.concat,
and in each case you're concatenating all the data you've received so
far with k more bytes. The i-th concatenation copies i*k bytes, so you
end up copying overall:
k*(1 + 2 + ... + N) = k*N*(N+1)/2 = (k/2)*N^2 + (k/2)*N bytes
The first term (k/2*N^2) is what causes the problem. For every doubling
of the message size, you end up doing (roughly) four times as much work.
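To make that concrete, here is a tiny standalone script (not from the PR; the numbers are chosen to mimic the 40 MB case) that counts the bytes copied:

```js
// Standalone illustration: repeatedly concatenating fixed-size chunks
// copies O(N^2) bytes in total.
const k = 64 * 1024        // chunk size
const N = 640              // 640 * 64 KiB = 40 MiB reply
const chunk = Buffer.alloc(k)

let buffer = Buffer.alloc(0)
let copied = 0
for (let i = 0; i < N; i++) {
  buffer = Buffer.concat([buffer, chunk]) // copies buffer.length + k bytes
  copied += buffer.length
}
console.log(copied) // 13442744320, i.e. ~13.4 GB of copying for a 40 MiB reply
```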
@dlbeer You're right, but I think most of the time … Moreover, instead of `buffer = buffer.slice(length + 1)`, your code calls `buffer.shift(length + 1)`, which [allocates a new buffer every time](https://github.com/porsager/postgres/blob/04c93b72f68b4193d29e1bdbeef940feee1fe0ed/lib/byte_queue.js#L94), whereas `.slice()` [reuses the same buffer, only offsets are updated](https://nodejs.org/api/buffer.html#buffer_buf_slice_start_end). I haven't tested your code, so there are a lot of assumptions in this comment; I might be wrong on some points. My hypothesis is that the current implementation is better for small packets, but yours is better for big packets. That's why I suggested the code in my previous comment, so that we can let the benchmarks be our judge.
On Wed, Aug 04, 2021 at 04:16:50PM -0700, Minigugus wrote:
> Moreover, instead of `buffer = buffer.slice(length + 1)`, your code calls `buffer.shift(length + 1)`, which [allocates a new buffer every time](https://github.com/porsager/postgres/blob/04c93b72f68b4193d29e1bdbeef940feee1fe0ed/lib/byte_queue.js#L94), whereas `.slice()` [reuses the same buffer, only offsets are updated](https://nodejs.org/api/buffer.html#buffer_buf_slice_start_end).
Yes, that's correct.
A small and simple optimization (which I haven't benchmarked) might be
to keep a pre-allocated 5-byte header buffer and use shiftTo instead.
This would avoid a buffer allocation on each loop iteration.
Another possibility is to parse the header only once, when the buffer
first grows above 5 bytes, and cache the length and type fields for
when the full message arrives.
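A hypothetical sketch of that second idea (`queue.peek` is an assumed helper, not something the PR provides):

```js
// Hypothetical sketch: parse the 5-byte header (type + length) once,
// cache it, and only touch the queue again when the whole message is in.
let type = -1
let length = -1 // cached message length; -1 means "header not parsed yet"

function data(x) {
  queue.push(x)
  for (;;) {
    if (length < 0) {
      if (queue.length < 5) return
      const header = queue.peek(5) // assumed helper: copy without consuming
      type = header[0]
      length = header.readInt32BE(1)
    }
    if (queue.length < length + 1) return  // full message not queued yet
    backend[type](queue.shift(length + 1)) // single allocation per message
    length = -1
  }
}
```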
I used the 187 MB json file from here, and tested load time and select time in both psql and postgres.js, with the following script:

```js
import postgres from "postgres";
import fs from "fs";

const sql = postgres(process.env.DB_URL);

async function select() {
  const s = performance.now();
  await sql`SELECT data from "data" LIMIT 1`;
  print(performance.now() - s);
}

async function insert() {
  const content = fs.readFileSync(`${process.cwd()}/../sf-city-lots-json/citylots.json`, 'utf-8');
  const s = performance.now();
  await sql`INSERT INTO "data" (id, "data") VALUES (5, ${content})`;
  return print(performance.now() - s);
}

function print(ms) {
  console.log(`Time - ${ms}ms; ${ms / 1000}s; ${ms / 60000}min`);
}

async function main() {
  await insert();
  await select();
}

main().finally(() => sql.end());
```

Conclusion: Insert time in postgres.js is the same as in psql (I'm impressed!) but select is about 100 times slower.
@stalniy psql doesn't parse the json. Could you try to see how long the `JSON.parse(x)` call takes on its own? Unless the column is of type …
I tried to replace `JSON.parse(x)` with just `x` … I actually did a bit of investigation, and the root cause I think is the repeated buffer concatenation discussed above.
@porsager If you replace the buffer concat logic with BufferList, specifically the dependency-free implementation at https://github.com/rvagg/bl/blob/master/BufferList.js, then performance for large payloads gets close to the theoretical max: 50x faster than right now, and ~2x slower than …
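A rough sketch of that substitution, using the BufferList API from the linked file (the module path and integration details are assumed):

```js
// Rough sketch: BufferList appends chunks without copying and can read
// and slice across chunk boundaries, so each message is copied only once.
const BufferList = require('bl/BufferList') // assumed import path

const bl = new BufferList()

function data(x) {
  bl.append(x)                       // stores the chunk, no concatenation
  while (bl.length > 4) {
    const length = bl.readInt32BE(1) // reads across chunk boundaries
    if (length >= bl.length) break
    backend[bl.get(0)](bl.slice(0, length + 1)) // one copy per message
    bl.consume(length + 1)           // drop the message from the front
  }
}
```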
Ok, just took some time to look properly at this, and @dlbeer's issue is right on the money. Sorry for not digging into it sooner, and thanks a lot for pointing it out!! I've got a working fix that doesn't introduce any dependencies or big abstractions, and no performance loss for normal usage. It seems to be the same speed as psql or faster, so I guess we just got about an 80x gain with @stalniy's example 😂 You can try it out here: #229

@stalniy it's because we're receiving a single really big message in many small chunks, and for every chunk we created a new buffer by concatenating what we had with the next chunk. In the fix we just push all chunks to an array until we're done, and then create a buffer from that array.
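A minimal sketch of that approach (names assumed; see #229 for the actual fix):

```js
// Minimal sketch, not the actual #229 code: collect chunks in an array and
// concatenate at most twice per message (once to read the header, once when
// the message is complete), which is O(n) in the message size.
let chunks = []
let queued = 0
let expected = -1 // full message size incl. type byte; -1 = header unknown

function data(x) {
  chunks.push(x)
  queued += x.length
  for (;;) {
    if (expected < 0) {
      if (queued < 5) return
      const head = chunks.length === 1 ? chunks[0] : Buffer.concat(chunks, queued)
      chunks = [head]
      expected = head.readInt32BE(1) + 1
    }
    if (queued < expected) return
    const buffer = chunks.length === 1 ? chunks[0] : Buffer.concat(chunks, queued)
    backend[buffer[0]](buffer.slice(0, expected))
    chunks = [buffer.slice(expected)] // keep any bytes of the next message
    queued -= expected
    expected = -1
  }
}
```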