Skip to content
This repository was archived by the owner on Oct 29, 2024. It is now read-only.

DataFrame client takes >50s to get a response for 9k rows with 12 columns #872

Open
raokrutarth opened this issue Dec 20, 2020 · 0 comments

Comments

@raokrutarth
Copy link

  • InfluxDB version: influxdb:1.7-alpine
  • InfluxDB-python version: 5.3.1
  • Python version: 3.8.5
  • Operating system version:
    $ uname -a
    Linux 5.4.0-56-generic Readme improvements #62-Ubuntu SMP Mon Nov 23 19:20:19 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
    $ cat /proc/cpuinfo | head -n 10
    processor : 0
    vendor_id : GenuineIntel
    cpu family : 6
    model : 62
    model name : Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz
    stepping : 4
    microcode : 0x42e
    cpu MHz : 2190.312
    cache size : 15360 KB

SnakeViz profile snippet:
image

Query made:

 SELECT
        FIRST(volume) AS volume_start,
        FIRST(trade_price) AS trade_price_start,
        FIRST(ask_price) AS ask_price_start,

        LAST(volume) AS volume_end,
        LAST(trade_price) AS trade_price_end,
        LAST(ask_price) AS ask_price_end,
        LAST(day_change_percentage) AS day_change_percentage
    FROM full_market_snapshot
    WHERE
        time > now() - 48h30m
    GROUP BY ticker

Database entries/scale:

> select count(*) from full_market_snapshot
name: full_market_snapshot
time                 count_ask_price count_bid_price count_day_change_percentage count_last_trade_size count_trade_price count_volume count_weighted_volume count_yesterday_max
----                 --------------- --------------- --------------------------- --------------------- ----------------- ------------ --------------------- -------------------
1970-01-01T00:00:00Z 15054207        15054207        15054207                    15054207              15054207          15054207     15054207              15054207
> select count(*) from full_market_snapshot where time > now() - 48h30m
name: full_market_snapshot
time                           count_ask_price count_bid_price count_day_change_percentage count_last_trade_size count_trade_price count_volume count_weighted_volume count_yesterday_max
----                           --------------- --------------- --------------------------- --------------------- ----------------- ------------ --------------------- -------------------
2020-12-18T09:58:42.300308784Z 3949856         3949856         3949856                     3949856               3949856           3949856      3949856               3949856
> 

Returned dataframe dimensions: [9203 rows x 8 columns]

What can I do to improve the response time? The InfluxDB container uses <10% of the allotted CPU and main memory and storage is SSD with a docker volume mount.

@raokrutarth raokrutarth changed the title DataFrame client takes >5s to get a response for 9k rows with 12 columns DataFrame client takes >50s to get a response for 9k rows with 12 columns Dec 20, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant