This repository was archived by the owner on Oct 29, 2024. It is now read-only.

Fix performance degradation with line protocol #592

Merged
merged 1 commit into from
Jun 30, 2018

Conversation

Contributor

@shushen shushen commented Jun 1, 2018

Assembling the lines one by one, as introduced in commit bf232a7 to remove
NaN fields, has a significant performance impact.

This change fixes the issue by keeping track of the NaN fields before
stringifying the dataframe, replacing those fields with empty strings, and
reverting to the pd.DataFrame.sum() function to yield the lines.

Fixes: #591
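
For readers who want to see the idea in isolation, here is a minimal sketch of the approach described above. It is not the code from this PR: the function name `fields_to_line_parts` and the sample columns are hypothetical, and escaping, type suffixes, tags and timestamps are left out. It only assumes that summing string columns with `DataFrame.sum(axis=1)` concatenates them row by row.

```python
import numpy as np
import pandas as pd


def fields_to_line_parts(field_df):
    """Build the 'k=v,k=v' field section for every row without a row loop.

    Hypothetical helper for illustration only, not the library's API.
    """
    # Remember where the NaN fields are before the values become strings.
    mask_null = field_df.isnull().values

    # Stringify the values; whatever ends up at the NaN positions is
    # discarded further down, so its exact form does not matter.
    str_df = field_df.astype(str).astype(object)

    # Prefix each value with "<column>=", and with "," for every column
    # after the first, so that a plain concatenation yields "a=1,b=2".
    parts = {}
    for i, col in enumerate(field_df.columns):
        prefix = ("," if i else "") + str(col) + "="
        parts[col] = prefix + str_df[col]
    out = pd.DataFrame(parts, index=field_df.index)

    # Replace the NaN positions with empty strings so they vanish on sum.
    out = out.where(~mask_null, "")

    # Summing object columns along axis=1 concatenates the strings per row.
    joined = out.sum(axis=1)

    # A row whose first field was NaN would start with a stray comma.
    return joined.str.lstrip(",")


# Made-up sample data: each row is missing a different field.
demo = pd.DataFrame({
    "a": [1.0, np.nan],
    "b": [np.nan, 2.5],
    "c": [3.0, 4.0],
})
lines = fields_to_line_parts(demo)
print(lines.tolist())
```

The only Python-level loop here runs over the columns, not the rows, which is why this style scales better than assembling each line individually.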

@shushen
Contributor Author

shushen commented Jun 29, 2018

Unfortunately, v5.1.0 has just been released with the performance regression relative to v5.0 that this PR addresses.

@xginn8 Could you please kindly take a look at this PR?

@aviau
Collaborator

aviau commented Jun 29, 2018

I'll wait for his input, since he has been leading the data frame client, but I will take a look if he does not respond.

Thanks for contributing and sorry for the wait :)

Collaborator

@xginn8 xginn8 left a comment

Thanks for contributing!

@xginn8 xginn8 merged commit c300105 into influxdata:master Jun 30, 2018
Development

Successfully merging this pull request may close these issues.

Performance degradation with line protocol on master vs. v5.0.0
3 participants