This repository was archived by the owner on Oct 29, 2024. It is now read-only.

Out of memory error when querying large data set with chunked=True #531

Closed
rmah opened this issue Oct 29, 2017 · 7 comments · Fixed by #753

Comments


rmah commented Oct 29, 2017

Large queries over very large result sets produce out-of-memory errors. This can be traced back to the function _read_chunked_response in client.py, which reads the entire response into RAM before returning a ResultSet.
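The problematic pattern can be sketched as follows. This is a simplified illustration, not the library's actual code; `FakeResponse` and the JSON line format are stand-ins for a streamed HTTP response:

```python
import json

class FakeResponse:
    """Stand-in for a requests.Response carrying newline-delimited JSON."""
    def __init__(self, lines):
        self._lines = lines

    def iter_lines(self):
        yield from self._lines

def read_chunked_buffered(response):
    # The problem: every chunk is parsed and accumulated in a list,
    # so the entire result set sits in RAM before anything is returned.
    results = []
    for line in response.iter_lines():
        if line:
            results.append(json.loads(line))
    return results

resp = FakeResponse([json.dumps({"results": [i]}) for i in range(3)])
chunks = read_chunked_buffered(resp)  # all three chunks in memory at once
```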


psy0rz commented Nov 25, 2017

I can confirm this. Although chunked mode is supported, it's effectively useless this way.

They even went to the trouble of returning a generator, but it's no use since the whole response is read into RAM anyway.

I need this to process millions of measurements in chunked mode, so I expected it to return some kind of generator which I could then iterate while processing the data.
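What that generator-based interface might look like from the caller's side (a hypothetical sketch; `iter_chunks` is not part of the library, and the chunk shape is illustrative):

```python
import json

def iter_chunks(lines):
    # Parse one chunk at a time and yield it, so memory use stays
    # bounded by a single chunk rather than the whole result set.
    for line in lines:
        if line:
            yield json.loads(line)

# Simulate a stream of 1000 chunked JSON lines.
stream = (json.dumps({"series": [{"values": [[i]]}]}) for i in range(1000))

total = 0
for chunk in iter_chunks(stream):
    total += chunk["series"][0]["values"][0][0]
```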


psy0rz commented Nov 25, 2017

The first step in solving this is to pass stream=True when calling self._session.request().

With that change, the response makes it all the way to _read_chunked_response(), where response.iter_lines() now behaves as a generator.
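The key property of stream=True is laziness: requests does not download the body up front, and response.iter_lines() reads it incrementally. A minimal demonstration of that laziness, with a generator standing in for iter_lines() (no network involved):

```python
consumed = []

def fake_iter_lines():
    # Stand-in for response.iter_lines() on a streamed response:
    # each line is produced only when the consumer asks for it.
    for i in range(5):
        consumed.append(i)  # record how far the "stream" has been read
        yield ('{"chunk": %d}' % i).encode()

lines = fake_iter_lines()
first = next(lines)  # only the first line has been read so far
```

Without stream=True, requests buffers the whole body before iter_lines() yields anything, which defeats the point of chunked queries.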


psy0rz commented Nov 25, 2017

Also found an extra problem when using DataFrameClient.query(): it also uses InfluxDBClient.query(), HOWEVER, for some reason it then calls self._to_dataframe() with the result set, which in turn processes all results in memory instead of returning some kind of generator.
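A chunk-at-a-time conversion would avoid that. In this sketch, to_frame() is a hypothetical stand-in for DataFrameClient._to_dataframe() (which builds a pandas.DataFrame in the real client); the point is converting and yielding each chunk separately:

```python
def to_frame(chunk):
    # Hypothetical stand-in for DataFrameClient._to_dataframe();
    # here it just extracts one chunk's rows.
    return chunk["series"][0]["values"]

def iter_frames(chunks):
    # Convert each chunk as it arrives and yield it, instead of
    # materializing the whole result set before conversion.
    for chunk in chunks:
        yield to_frame(chunk)

chunks = ({"series": [{"values": [[i, i * 2]]}]} for i in range(4))
frames = list(iter_frames(chunks))
```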


psy0rz commented Nov 26, 2017

OK, I think I fixed it; I will make a pull request.

psy0rz added a commit to psy0rz/influxdb-python that referenced this issue Nov 26, 2017
@goriccardo

Any update on this?

@pedrofaria09

Hello, any update on this? It's impossible for me to query 100,000,000 rows.


psy0rz commented Jun 24, 2019

Just add the change from my pull request manually: #538

Someone needs to create a test for it and update the pull request so the devs can accept the change. (I don't have time for it.)

hrbonz added a commit to hrbonz/influxdb-python that referenced this issue Sep 8, 2019
When querying large data sets, it's vital to get a chunked response to
manage memory usage. Wrapping the query response in a generator and
streaming the request provides the desired result.
It also fixes `InfluxDBClient.query()` behavior for chunked queries that
is currently not working according to
[specs](https://github.com/influxdata/influxdb-python/blob/master/influxdb/client.py#L410)

close influxdata#585
close influxdata#531
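The approach the commit message describes might be sketched roughly like this (a simplified assumption of the fix; the real _read_chunked_response in influxdb/client.py wraps each chunk in a ResultSet rather than yielding raw dicts):

```python
import json

def _read_chunked_response(response):
    # Wrap the streamed response in a generator: parse and yield one
    # chunk at a time instead of assembling everything in memory.
    for line in response.iter_lines():
        if line:
            yield json.loads(line)

class FakeStreamingResponse:
    """Minimal stand-in for a requests.Response opened with stream=True."""
    def iter_lines(self):
        for i in range(3):
            yield json.dumps({"results": [{"id": i}]})

ids = [chunk["results"][0]["id"]
       for chunk in _read_chunked_response(FakeStreamingResponse())]
```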
sebito91 pushed a commit that referenced this issue Apr 10, 2020
When querying large data sets, it's vital to get a chunked response to
manage memory usage. Wrapping the query response in a generator and
streaming the request provides the desired result.
It also fixes `InfluxDBClient.query()` behavior for chunked queries that
is currently not working according to
[specs](https://github.com/influxdata/influxdb-python/blob/master/influxdb/client.py#L429)

Closes #585.
Closes #531.
Closes #538.