Insert pandas dataframe into InfluxDB issues #576
Description
@tzonghao @aviau @xginn8 @sebito91
I am trying to store some trading data into InfluxDB using the DataFrameClient with the write_points method. I have read the documentation online as well as the following two issues, #286 and #510.
Here is what my data looks like:
Date | Ticker | Close | Volume |
---|---|---|---|
2018-04-15 00:00:00+00:00 | MSFT | 1.3 | 2.50 |
2018-04-14 00:00:00+00:00 | MSFT | 3.5 | 4.24 |
2018-04-15 00:00:00+00:00 | AAPL | 7.0 | 11.00 |
2018-04-14 00:00:00+00:00 | AAPL | 6.0 | 1.00 |
Below is my code:
import pandas as pd
from influxdb import DataFrameClient

client = DataFrameClient(host, port, user, password, dbname)
headers = ["Date","Ticker","Close", "Volume"]
data = [["2018-04-15","MSFT",1.3,2.5], ["2018-04-14","MSFT",3.5,4.24], ["2018-04-15","AAPL",7,11], ["2018-04-14","AAPL",6,1]]
df = pd.DataFrame(data, columns = headers)
df.Date = pd.to_datetime(df["Date"])
df = df.set_index("Date")
tags = { "Ticker": df[["Ticker"]]}
client.write_points(df, 'test', tags = tags, protocol = "json")
However, this gives the error message below when I call write_points:
InfluxDBClientError: 400: {"error":"partial write: unable to parse 'test,Ticker=\\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ Ticker': missing fields\nunable to parse 'Date\\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ ':
A similar message shows up when I try to separate the "Ticker" tag column out of the data frame I write, like so:
timeValues = df[["Close","Volume"]]
client.write_points(timeValues, 'test', tags = tags, protocol = "json")
This leads to the same error message as above. I have three questions I would really love to get help on:
- How do I fix what I am doing above? Is it the protocol that's wrong? In the documentation, a comment suggests using "json" as a workaround for some reported bugs.
- I also have the same timestamp for two different tag values (i.e. the same dates for both MSFT and AAPL). Is this an issue when I write into the database?
- For the time series I am trying to write, there will be nan values for some of the tickers. For example, what if the Volume field value on 4-14 is nan for AAPL? Will this still work? There were a few bug reports that seemed to suggest I can't write nan into the database. EDIT: I found issue #422 (DataFrame write "nan", "inf" error in influxdb), and it seems the workaround is to have separate measurements by field and then drop the na rows before writing to the database.
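
To make the second and third questions concrete, here is a minimal pure-Python sketch of what the points would look like in InfluxDB line protocol. It does not use the DataFrameClient at all. The helper name `to_line_protocol` is hypothetical and not part of influxdb-python, it skips the escaping of special characters that real line protocol requires, and setting the AAPL volume on 2018-04-14 to nan is an assumption made only for illustration. InfluxDB identifies a point by its measurement, tag set, and timestamp, so rows that share a date but differ in Ticker should end up as separate points. The sketch also drops nan fields per row instead of sending them.

```python
import math
from datetime import datetime, timezone


def to_line_protocol(measurement, rows, tag_keys, field_keys, time_key="Date"):
    """Hypothetical helper: build one line-protocol string per row.

    Simplified on purpose: no escaping of spaces or commas, all fields
    are written as floats, and dates are parsed as midnight UTC.
    """
    lines = []
    for row in rows:
        # Each row carries its own tag value, so the tag is per point.
        tags = ",".join(f"{key}={row[key]}" for key in sorted(tag_keys))
        # Skip fields whose value is NaN instead of writing them.
        fields = {
            key: row[key]
            for key in field_keys
            if not (isinstance(row[key], float) and math.isnan(row[key]))
        }
        if not fields:
            continue  # a point needs at least one field
        field_str = ",".join(f"{key}={float(value)}" for key, value in fields.items())
        ts = datetime.strptime(row[time_key], "%Y-%m-%d").replace(tzinfo=timezone.utc)
        nanoseconds = int(ts.timestamp()) * 10**9
        lines.append(f"{measurement},{tags} {field_str} {nanoseconds}")
    return lines


rows = [
    {"Date": "2018-04-15", "Ticker": "MSFT", "Close": 1.3, "Volume": 2.5},
    {"Date": "2018-04-14", "Ticker": "MSFT", "Close": 3.5, "Volume": 4.24},
    {"Date": "2018-04-15", "Ticker": "AAPL", "Close": 7.0, "Volume": 11.0},
    # Assumed for illustration: the AAPL volume on 2018-04-14 is missing.
    {"Date": "2018-04-14", "Ticker": "AAPL", "Close": 6.0, "Volume": float("nan")},
]

lines = to_line_protocol("test", rows, ["Ticker"], ["Close", "Volume"])
for line in lines:
    print(line)
```

In this sketch the two 2018-04-15 rows share a timestamp but carry different Ticker tags, and the last row keeps only its Close field. The DataFrameClient's write_points also takes a tag_columns argument that names DataFrame columns to use as per-row tags, which may be closer to what is needed here than passing a DataFrame inside the tags dictionary.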