This repository was archived by the owner on Oct 29, 2024. It is now read-only.

Improve DataFrameClient tags line-protocol conversion performance #503

Merged (2 commits) on Oct 12, 2017


tzonghao
Contributor

In `_convert_dataframe_to_lines`, if only `global_tags` is specified and
`tag_columns` is not, take a faster route to process the tags. Previously,
in that case, the global tags were duplicated as tag columns and processed
as if they were tag columns. That processing is wasteful and causes
a slowdown that becomes noticeable when batch loading many thousands of
data points with a handful of global tags.
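
To illustrate the idea behind the faster route, here is a minimal sketch. It is not the code from this PR: it uses plain Python tuples instead of a DataFrame, and the helper names (`escape_tag`, `global_tag_prefix`, `rows_to_lines`) are hypothetical. The point is only that tags shared by every point can be rendered once and reused as a prefix, rather than being expanded into per-row columns.

```python
def escape_tag(value):
    """Escape commas, equals signs and spaces, which are special in
    line-protocol tag keys and tag values (simplified for this sketch)."""
    text = str(value)
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def global_tag_prefix(global_tags):
    """Render the global tags once as ',k1=v1,k2=v2', sorted by key."""
    return "".join(
        ",{}={}".format(escape_tag(key), escape_tag(val))
        for key, val in sorted(global_tags.items())
    )


def rows_to_lines(measurement, rows, global_tags):
    """Build one line per row; rows are (timestamp, {field: float}) pairs.

    Field keys and the measurement name are assumed to need no escaping.
    """
    # The tag set is identical for every point, so build it a single time.
    prefix = measurement + global_tag_prefix(global_tags)
    lines = []
    for timestamp, fields in rows:
        field_str = ",".join(
            "{}={}".format(key, val) for key, val in sorted(fields.items())
        )
        lines.append("{} {} {}".format(prefix, field_str, timestamp))
    return lines
```

The cost of rendering the tags no longer grows with the number of rows; only the field and timestamp parts are built per point.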

@tzonghao changed the title from "Improve DataFrameClient tag line-protocol conversion performance" to "Improve DataFrameClient tags line-protocol conversion performance" on Sep 25, 2017
@tzonghao
Contributor Author

I did a simple performance test on ~70K data points with 20 global tags. The change seems to make the conversion at least 2 times faster. Here's the test (ipynb).
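
A comparison of this kind could be set up roughly as follows. This is not the linked notebook and does not call `DataFrameClient`; it is a stand-in that times two hypothetical pure-Python functions, one rendering the tags for every point and one rendering them once, on 70,000 points with 20 tags. Timings will vary by machine, so none are quoted here.

```python
import timeit


def render_tags(tags):
    """Render a tag dict as ',k1=v1,k2=v2' (no escaping in this sketch)."""
    return "".join(",{}={}".format(k, v) for k, v in sorted(tags.items()))


def per_row(values, tags):
    """Slow route: the same tags are rendered again for every point."""
    return ["m{} value={}".format(render_tags(tags), v) for v in values]


def once(values, tags):
    """Fast route: the tags are rendered a single time and reused."""
    prefix = "m" + render_tags(tags)
    return ["{} value={}".format(prefix, v) for v in values]


# Synthetic data sized like the test described above.
values = [float(i) for i in range(70000)]
tags = {"tag{:02d}".format(i): "v{}".format(i) for i in range(20)}

# Both routes must produce identical output before timing them.
assert per_row(values, tags) == once(values, tags)

slow = min(timeit.repeat(lambda: per_row(values, tags), number=1, repeat=3))
fast = min(timeit.repeat(lambda: once(values, tags), number=1, repeat=3))
print("per-row: {:.3f}s  once: {:.3f}s".format(slow, fast))
```

Taking the minimum of several repeats reduces noise from other processes on the machine.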

@aviau
Collaborator

aviau commented Oct 12, 2017

Nice!

@aviau aviau merged commit f8bba58 into influxdata:master Oct 12, 2017