This repository was archived by the owner on Oct 29, 2024. It is now read-only.

Fix chunked query to return chunk resultsets #753

Merged
sebito91 merged 1 commit into influxdata:master from hrbonz:wip/chunked_query on Apr 10, 2020

Conversation

@hrbonz (Contributor) commented on Sep 8, 2019

When querying large data sets, it's vital to get chunked responses to manage memory usage. Wrapping the query response in a generator and streaming the request provides the desired result.
It also fixes `InfluxDBClient.query()` behavior for chunked queries, which currently does not work according to spec.

Closes #585
Closes #531
Closes #538
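
A minimal usage sketch of the chunked behavior described above. The host, database, and measurement names are assumptions for illustration, not taken from this PR:

```python
from influxdb import InfluxDBClient

# Assumed connection details and schema, for illustration only.
client = InfluxDBClient(host="localhost", port=8086, database="telemetry")

# With chunked=True, query() returns a generator that yields one ResultSet
# per chunk instead of loading the whole response into memory at once.
chunks = client.query("SELECT * FROM cpu", chunked=True, chunk_size=10000)

for resultset in chunks:                  # one ResultSet per chunk
    for point in resultset.get_points():  # points within that chunk
        print(point)                      # handle one point at a time here
```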

@hrbonz (Contributor, Author) commented on Sep 8, 2019

I'm currently running the tox tests and will fix whatever fails in Travis CI.

@hrbonz force-pushed the wip/chunked_query branch 2 times, most recently from 2a2d459 to a6bc05d, on September 8, 2019
hrbonz added a commit to hrbonz/influxdump that referenced this pull request Sep 14, 2019
Option to dump data into a folder as fragment files of a given chunk size.
This only works if the chunksize option is properly implemented in
influxdb-python (see influxdata/influxdb-python#753).
@psy0rz commented on Oct 9, 2019

Does this also close #538?

@hrbonz (Contributor, Author) commented on Oct 11, 2019

Yes, it covers that as well. I've had issues with my laptop, so I got caught in a bit of a snag; I'll get back to fixing the tests.
The tests were designed to pass rather than to reflect the method's documented behavior.

@hrbonz force-pushed the wip/chunked_query branch from a6bc05d to 3832bd3 on March 23, 2020
@hrbonz marked this pull request as ready for review on March 23, 2020
@hrbonz (Contributor, Author) commented on Mar 23, 2020

I had to change the test substantially, as it was testing the broken behavior rather than the documented (and, imho, better) one.
I'm now testing that I get the proper number of chunked ResultSets and that each ResultSet returns the proper elements.
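
For reference, a rough sketch of the kind of test described above, using requests_mock as the project's test suite does. The payload and names are illustrative assumptions, not the actual test code:

```python
import json

import requests_mock
from influxdb import InfluxDBClient


def test_chunked_query_yields_one_resultset_per_chunk():
    # InfluxDB streams chunked responses as newline-delimited JSON documents.
    chunk = {"results": [{"statement_id": 0,
                          "series": [{"name": "cpu",
                                      "columns": ["time", "value"],
                                      "values": [["2020-03-23T00:00:00Z", 1]]}]}]}
    body = "\n".join([json.dumps(chunk), json.dumps(chunk)])

    with requests_mock.Mocker() as m:
        m.register_uri(requests_mock.GET, "http://localhost:8086/query", text=body)
        client = InfluxDBClient(database="db")
        resultsets = list(client.query("SELECT * FROM cpu",
                                       chunked=True, chunk_size=1))

    # One ResultSet per chunk, each carrying its own points.
    assert len(resultsets) == 2
    assert all(len(list(rs.get_points())) == 1 for rs in resultsets)
```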

@hrbonz (Contributor, Author) commented on Mar 23, 2020

On a side note, I've been using this code to move data from an old version of InfluxDB to 1.7 (doing some type casting in the middle) with no issues: about 50G of data across 1500 measurements.
Without proper chunking and the generator, my memory usage was going through the roof.
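
For context, a rough sketch of that kind of streaming copy. The hosts, database name, and the cast_point() helper are placeholders rather than the actual migration script:

```python
from influxdb import InfluxDBClient

# Placeholder endpoints and database name; adjust to the real servers involved.
src = InfluxDBClient(host="old-influx", port=8086, database="metrics")
dst = InfluxDBClient(host="new-influx", port=8086, database="metrics")


def cast_point(measurement, point):
    """Placeholder for the type casting done 'in the middle': numeric-looking
    values are coerced to float, everything else is kept as-is."""
    fields = {}
    for key, value in point.items():
        if key == "time" or value is None:
            continue
        try:
            fields[key] = float(value)
        except (TypeError, ValueError):
            fields[key] = value
    return {"measurement": measurement, "time": point["time"], "fields": fields}


for meas in (m["name"] for m in src.get_list_measurements()):
    # Stream each measurement in chunks so memory usage stays flat.
    chunks = src.query('SELECT * FROM "{}"'.format(meas),
                       chunked=True, chunk_size=10000)
    for resultset in chunks:
        points = [cast_point(meas, p) for p in resultset.get_points()]
        if points:
            dst.write_points(points, batch_size=5000)
```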

When querying large data sets, it's vital to get chunked responses to
manage memory usage. Wrapping the query response in a generator and
streaming the request provides the desired result.
It also fixes `InfluxDBClient.query()` behavior for chunked queries,
which currently does not work according to the
[specs](https://github.com/influxdata/influxdb-python/blob/master/influxdb/client.py#L429).

Closes influxdata#585.
Closes influxdata#531.
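
A simplified sketch of the generator-over-streamed-response approach the commit message describes. This is illustrative only; the function name and parameters are assumptions, not the code merged in this PR:

```python
import json

import requests
from influxdb.resultset import ResultSet


def read_chunked_query(url, params):
    # stream=True keeps requests from buffering the whole body in memory.
    response = requests.get(url, params=params, stream=True)
    for line in response.iter_lines():  # one JSON document per chunk
        if not line:
            continue
        data = json.loads(line)
        for result in data.get("results", []):
            # Yield a ResultSet per result in the chunk instead of
            # accumulating everything into a single response object.
            yield ResultSet(result)
```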
@russorat (Contributor) commented

@hrbonz Thank you! I've reached out to @sebito91 to take a look.

@hrbonz (Contributor, Author) commented on Apr 7, 2020

Thanks @russorat, hopefully this PR gets some attention while the new library is getting all the love ;)

@sebito91 self-assigned this on Apr 8, 2020
@sebito91 (Contributor) left a comment

This is great stuff, thank you @hrbonz!

@sebito91 merged commit c903d73 into influxdata:master on Apr 10, 2020
ocworld pushed a commit to AhnLab-OSS/influxdb-python that referenced this pull request Apr 13, 2020
When querying large data sets, it's vital to get chunked responses to
manage memory usage. Wrapping the query response in a generator and
streaming the request provides the desired result.
It also fixes `InfluxDBClient.query()` behavior for chunked queries,
which currently does not work according to the
[specs](https://github.com/influxdata/influxdb-python/blob/master/influxdb/client.py#L429).

Closes influxdata#585.
Closes influxdata#531.
Closes influxdata#538.
@hrbonz (Contributor, Author) commented on Apr 13, 2020

@sebito91 quite happy to see it finally done and merged! Now to ping @psy0rz for the inspiration.

@psy0rz commented on Apr 14, 2020

awesome! thank YOU for actually implementing it all the way :)

Successfully merging this pull request may close these issues.

Out of memory error when querying large data set with chunked=True