This repository was archived by the owner on Oct 29, 2024. It is now read-only.

Insert pandas dataframe into InfluxDB issues #576

Closed
@dz123


@tzonghao @aviau @xginn8 @sebito91

I am trying to store some trading data into InfluxDB using the DataFrameClient with the write_points method. I have read the documentation online as well as the following two issues, #286 and #510.

Here is what my data looks like:

Date                       Ticker  Close  Volume
2018-04-15 00:00:00+00:00  MSFT    1.3    2.50
2018-04-14 00:00:00+00:00  MSFT    3.5    4.24
2018-04-15 00:00:00+00:00  AAPL    7.0    11.00
2018-04-14 00:00:00+00:00  AAPL    6.0    1.00

Below is my code:

import pandas as pd
from influxdb import DataFrameClient

# host, port, user, password and dbname are my connection settings (defined elsewhere)
client = DataFrameClient(host, port, user, password, dbname)
headers = ["Date", "Ticker", "Close", "Volume"]
data = [["2018-04-15", "MSFT", 1.3, 2.5], ["2018-04-14", "MSFT", 3.5, 4.24], ["2018-04-15", "AAPL", 7, 11], ["2018-04-14", "AAPL", 6, 1]]
df = pd.DataFrame(data, columns=headers)
df.Date = pd.to_datetime(df["Date"])
df = df.set_index("Date")
# The tag value here is a one-column DataFrame, not a plain string
tags = {"Ticker": df[["Ticker"]]}
client.write_points(df, 'test', tags=tags, protocol="json")

However, this gives the error message below when I call write_points:

InfluxDBClientError: 400: {"error":"partial write: unable to parse 'test,Ticker=\\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ Ticker': missing fields\nunable to parse 'Date\\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ ':
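
The runs of escaped spaces followed by `Ticker` and `Date` in the error look like the printed form of a DataFrame. The sketch below uses only pandas and shows what `df[["Ticker"]]` looks like once it is converted to text. It assumes the client converts each tag value to a string before building the line it sends; that is an assumption about the library's internals, not something confirmed in this issue.

```python
import pandas as pd

# Same sample data as in the question
data = [
    ["2018-04-15", "MSFT", 1.3, 2.5],
    ["2018-04-14", "MSFT", 3.5, 4.24],
    ["2018-04-15", "AAPL", 7.0, 11.0],
    ["2018-04-14", "AAPL", 6.0, 1.0],
]
df = pd.DataFrame(data, columns=["Date", "Ticker", "Close", "Volume"])
df["Date"] = pd.to_datetime(df["Date"])
df = df.set_index("Date")

# This is the object stored under "Ticker" in the tags dict above
tag_value = df[["Ticker"]]

# Text form of that object: a padded header line with the column name,
# then a line with the index name, then one line per row
rendered = str(tag_value)
header_lines = rendered.splitlines()[:2]

print(type(tag_value).__name__)
for line in header_lines:
    print(repr(line))
```

If that assumption holds, the whole multi-line table would be sent as a single tag value, which would explain why the server sees `Ticker` and `Date` surrounded by padding and cannot find any fields.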

A similar message shows up when I separate the "Ticker" tag column from the data frame I write:

timeValues = df[["Close","Volume"]]
client.write_points(timeValues, 'test', tags = tags, protocol = "json")

This leads to the same error message as above. I have three questions I would really appreciate help with:

  1. How do I fix what I am doing above? Is it the protocol that is wrong? In the documentation, a comment suggests using "json" as a workaround for some reported bugs.
  2. I also have the same timestamp for two different tag values (i.e. the same dates for both MSFT and AAPL). Is this an issue when I write into the database?
  3. For the time series I am trying to write, some tickers will have NaN values. For example, what if the Volume field value on 2018-04-14 is NaN for AAPL? Will this still work? A few bug reports seem to suggest I can't write NaN into the database. EDIT: I found issue #422 (DataFrame write "nan", "inf" error in influxdb), and it seems the workaround is to have separate measurements by field and then drop the NaN rows before writing to the database.
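
The workaround mentioned in the EDIT (one measurement per field, dropping NaN rows first) could be sketched as below. This is a hedged sketch, not a confirmed fix: `write_by_field`, the `<prefix>_<field>` measurement naming, and `RecordingClient` are hypothetical names introduced for illustration. The sketch passes `tag_columns` and `field_columns` to `write_points` instead of a `tags` dict, which assumes the installed influxdb-python version accepts those keyword arguments. `RecordingClient` only records calls so the example runs without a server; a real `DataFrameClient` would take its place.

```python
import pandas as pd


def write_by_field(client, df, measurement_prefix, tag_columns, field_columns):
    """Write each field to its own measurement, skipping rows where it is NaN."""
    for field in field_columns:
        # Keep the tag columns plus this one field, then drop rows with no value
        subset = df[tag_columns + [field]].dropna(subset=[field])
        if subset.empty:
            continue
        client.write_points(
            subset,
            f"{measurement_prefix}_{field}",  # hypothetical naming scheme
            tag_columns=tag_columns,
            field_columns=[field],
        )


class RecordingClient:
    """Hypothetical stand-in for DataFrameClient that only records calls."""

    def __init__(self):
        self.calls = []

    def write_points(self, dataframe, measurement, tag_columns=None, field_columns=None):
        self.calls.append((measurement, len(dataframe), tag_columns, field_columns))


data = [
    ["2018-04-15", "MSFT", 1.3, 2.5],
    ["2018-04-14", "MSFT", 3.5, 4.24],
    ["2018-04-15", "AAPL", 7.0, 11.0],
    ["2018-04-14", "AAPL", 6.0, float("nan")],  # Volume missing for AAPL on 2018-04-14
]
df = pd.DataFrame(data, columns=["Date", "Ticker", "Close", "Volume"])
df["Date"] = pd.to_datetime(df["Date"])
df = df.set_index("Date")

client = RecordingClient()
write_by_field(client, df, "test", ["Ticker"], ["Close", "Volume"])
for call in client.calls:
    print(call)
```

On the second question: InfluxDB identifies a point by its measurement, tag set, and timestamp, so if Ticker is written as a tag, the MSFT and AAPL rows that share a date belong to different series and should not overwrite each other.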
