Out of memory error when querying large data set with chunked=True #531
Comments
I can confirm this. Although chunked mode is supported, it's completely useless this way. They even went to the trouble of returning a generator, but it's no use since the whole response is read into RAM anyway. I need this to process millions of measurements in chunked mode, so I expected it to return some kind of generator which I could then iterate over while processing the data.
The first step in solving this is to add `stream=True` when calling `self._session.request()`. The request then makes it all the way to `_read_chunked_response()`, where `response.iter_lines()` reads lines lazily as they arrive instead of from a fully downloaded body.
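A minimal sketch of the idea, not the library's actual code: once the response is streamed, each line can be decoded and yielded as it arrives. The names `iter_chunked_results` and `FakeStreamedResponse` are hypothetical. The fake class only stands in for a `requests` response opened with `stream=True`, so the example runs without a server, and the JSON layout of each line is an assumption about what a chunked query returns.

```python
import json


def iter_chunked_results(response):
    """Yield one decoded JSON object per non-empty line of a streamed response."""
    for line in response.iter_lines():
        if not line:
            continue  # skip blank keep-alive lines
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        yield json.loads(line)


class FakeStreamedResponse:
    """Hypothetical stand-in for a requests response opened with stream=True."""

    def __init__(self, lines):
        self._lines = lines
        self.lines_read = 0  # counts how many lines have been pulled so far

    def iter_lines(self):
        for line in self._lines:
            self.lines_read += 1
            yield line


# Assumed shape of two chunks, separated by a blank line.
fake = FakeStreamedResponse([
    b'{"results": [{"series": [{"name": "cpu", "columns": ["time", "value"], "values": [[1, 0.5]]}]}]}',
    b'',
    b'{"results": [{"series": [{"name": "cpu", "columns": ["time", "value"], "values": [[2, 0.7]]}]}]}',
])

chunks = iter_chunked_results(fake)
first = next(chunks)  # only the first line has been consumed at this point
```

Because the generator pulls one line at a time, memory use is bounded by the size of a single chunk rather than the whole result set.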
I also found an extra problem when using `DataFrameClient.query()`: it also goes through `InfluxDBClient.query()`, but for some reason it then calls `self._to_dataframe()` with the result set, which processes all results in memory instead of returning some kind of generator.
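One way to keep the conversion step lazy as well is to convert each chunk as it is consumed. This is a sketch under assumptions, not what `DataFrameClient` does: `iter_rows` is a hypothetical helper, plain dicts stand in for data frames so that the example has no dependencies, and the `results`/`series`/`columns`/`values` layout of a chunk is assumed.

```python
def iter_rows(chunks):
    """Lazily turn an iterable of chunk dicts into one dict per row."""
    for chunk in chunks:
        for result in chunk.get("results", []):
            for series in result.get("series", []):
                columns = series["columns"]
                for values in series.get("values", []):
                    # Pair each column name with the value in the same position.
                    yield dict(zip(columns, values))


# Assumed chunk layout; a real client would supply these one at a time.
sample_chunks = iter([
    {"results": [{"series": [{"name": "cpu", "columns": ["time", "value"],
                              "values": [[1, 0.5], [2, 0.7]]}]}]},
    {"results": [{"series": [{"name": "cpu", "columns": ["time", "value"],
                              "values": [[3, 0.9]]}]}]},
])

rows = iter_rows(sample_chunks)
first_row = next(rows)  # the second chunk has not been touched yet
```

The same pattern would apply if each chunk were converted to a data frame instead: the caller iterates over per-chunk results and never holds the full result set.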
OK, I think I fixed it. I will make a pull request.
…ing query() in chunked mode
Any update on this?
Hello, any update on this? It's impossible for me to query 100.000.000 rows.
Just add the change from my pull request manually: #538. Someone needs to create a test for it and update the pull request so the devs can accept the change. (I don't have time for it.)
When querying large data sets, it's vital to get chunked responses to manage memory usage. Wrapping the query response in a generator and streaming the request provides the desired result. It also fixes the `InfluxDBClient.query()` behavior for chunked queries, which currently does not work according to the [specs](https://github.com/influxdata/influxdb-python/blob/master/influxdb/client.py#L410) close influxdata#585 close influxdata#531
When querying large data sets, it's vital to get chunked responses to manage memory usage. Wrapping the query response in a generator and streaming the request provides the desired result. It also fixes the `InfluxDBClient.query()` behavior for chunked queries, which currently does not work according to the [specs](https://github.com/influxdata/influxdb-python/blob/master/influxdb/client.py#L429) Closes influxdata#585. Closes influxdata#531. Closes influxdata#538.
Queries that return very large result sets fail with out-of-memory errors. This can be traced back to the function `_read_chunked_response` in client.py, which reads the entire response into RAM before returning a ResultSet.