This repository was archived by the owner on Oct 29, 2024. It is now read-only.

Writing NaN values makes the client fail #195

Closed
svernede opened this issue Jun 15, 2015 · 7 comments
Comments

@svernede

Writing a NaN value makes the client fail. I have only tested this with influxdb08, using the DataFrameClient:

import pandas as pd
from influxdb.influxdb08 import DataFrameClient

# One float column with a missing value in the middle
data = pd.DataFrame(
    {'value': [1.5, float('nan'), 3.2]},
    index=pd.date_range('01-01-2015', periods=3, freq='D'))

# host, port, user, password and dbname are your own connection settings
client = DataFrameClient(host, port, user, password, dbname)

client.write_points({'test': data})

raises the error

<class 'influxdb.influxdb08.client.InfluxDBClientError'> : 400: b"invalid character 'N' looking for beginning of value"

The client tries to write the value NaN, which is not a valid JSON value. Python NaN should be mapped to JSON null. This is consistent with the pandas convention of encoding missing data as NaN.
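To illustrate the mapping proposed above, here is a minimal sketch that uses only the standard `json` module. The helper name `nan_to_none` is hypothetical and is not part of influxdb-python. It shows one possible way to get `null` instead of `NaN`, not how the client actually fixed it.

```python
import json
import math


def nan_to_none(value):
    """Recursively replace float NaN with None so json.dumps emits null.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, dict):
        return {key: nan_to_none(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [nan_to_none(item) for item in value]
    return value


points = [1.5, float("nan"), 3.2]

# By default the standard encoder emits the bare token NaN,
# which a strict JSON parser rejects.
raw = json.dumps(points)

# After sanitizing, the missing value is encoded as null.
clean = json.dumps(nan_to_none(points))

print(raw)
print(clean)
```

Sanitizing the data before encoding avoids adding a dependency, at the cost of one extra pass over the payload.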

svernede added a commit to svernede/influxdb-python that referenced this issue Jun 15, 2015
…ll` in the JSON data

see influxdata#195

Adds a new dependency on the `simplejson` module, because the same effect is difficult to achieve with the base `json` module
see http://stackoverflow.com/questions/28639953/python-nan-json-encoder
@aviau
Collaborator

aviau commented Jul 5, 2015

Fixed in #204

@aviau aviau closed this as completed Jul 5, 2015
@julienvienne

Hello,
Has this fix been tested with the latest version?
Some numpy NaN values do not seem to be handled in the latest revision:

Best regards,

Traceback (most recent call last):
  File "/home/julienv/weatherdata/weatherdata/acquisition/observation/weenat/WeenatInflux.py", line 45, in
    influx_connexion.write_points(df_wmo, 'synop', {'wmo': wmo})
  File "/usr/local/lib/python3.4/dist-packages/influxdb-2.11.0-py3.4.egg/influxdb/_dataframe_client.py", line 68, in write_points
    points, time_precision, database, retention_policy)
  File "/usr/local/lib/python3.4/dist-packages/influxdb-2.11.0-py3.4.egg/influxdb/client.py", line 391, in write_points
    tags=tags)
  File "/usr/local/lib/python3.4/dist-packages/influxdb-2.11.0-py3.4.egg/influxdb/client.py", line 436, in _write_points
    expected_response_code=204
  File "/usr/local/lib/python3.4/dist-packages/influxdb-2.11.0-py3.4.egg/influxdb/client.py", line 278, in write
    headers=headers
  File "/usr/local/lib/python3.4/dist-packages/influxdb-2.11.0-py3.4.egg/influxdb/client.py", line 248, in request
    raise InfluxDBClientError(response.content, response.status_code)
influxdb.exceptions.InfluxDBClientError: 400: {"error":"partial write:\nunable to parse 'synop,wmo=7002 tp01h=0i,tp03h=nan 1420570800000000000': invalid number\nunable to parse 'synop,wmo=7002 tp01h=0i,tp03h=nan 1420574400000000000': invalid number\nunable to parse 'synop,wmo=7002 tp01h=0i,tp03h=nan 1420581600000000000': invalid number\nunable to parse 'synop,wmo=7002 tp01h=0i,tp03h=nan 1420585200000000000': invalid number"}

@julienvienne

Hello,
Here is a simple script to reproduce the bug.
Test data: nan_data_test.zip

from influxdb import DataFrameClient
import pandas as pd

host = 'server1'
port = 8086
user = 'root'
password = 'root'
dbname = 'weather_test'

influx_connexion = DataFrameClient(host, port, user, password, dbname)
print("Delete database: " + dbname)
influx_connexion.drop_database(dbname)

print("Create database: " + dbname)
influx_connexion.create_database(dbname)

# header=0: the first row holds the column names
# (recent pandas versions reject a boolean such as header=False)

# This insertion is OK (uncomment to test)
# df_nonan = pd.read_csv("data_test_no_na.csv", sep=";", header=0, index_col=[0], parse_dates=True, na_values="")
# influx_connexion.write_points(df_nonan, 'weather', {'station_code': '07002'})

# This insertion with NaN fails
df_withnan = pd.read_csv("data_test_with_na.csv", sep=";", header=0, index_col=[0], parse_dates=True, na_values="")
influx_connexion.write_points(df_withnan, 'weather', {'station_code': '07002'})


# Traceback :
Traceback (most recent call last):
  File "/home/julienv/weatherdata/weatherdata/acquisition/observation/weenat/test_nan.py", line 25, in <module>
    influx_connexion.write_points(df_withnan, 'weather', {'station_code': '07002'})
  File "/usr/local/lib/python3.4/dist-packages/influxdb-2.11.0-py3.4.egg/influxdb/_dataframe_client.py", line 68, in write_points
    points, time_precision, database, retention_policy)
  File "/usr/local/lib/python3.4/dist-packages/influxdb-2.11.0-py3.4.egg/influxdb/client.py", line 391, in write_points
    tags=tags)
  File "/usr/local/lib/python3.4/dist-packages/influxdb-2.11.0-py3.4.egg/influxdb/client.py", line 436, in _write_points
    expected_response_code=204
  File "/usr/local/lib/python3.4/dist-packages/influxdb-2.11.0-py3.4.egg/influxdb/client.py", line 278, in write
    headers=headers
  File "/usr/local/lib/python3.4/dist-packages/influxdb-2.11.0-py3.4.egg/influxdb/client.py", line 248, in request
    raise InfluxDBClientError(response.content, response.status_code)
influxdb.exceptions.InfluxDBClientError: 400: {"error":"partial write:\nunable to parse 'weather,station_code=07002 td=3.2,tp01h=nan 1420092000000000000': invalid number\nunable to parse 'weather,station_code=07002 td=nan,tp01h=0.0 1420113600000000000': invalid number"}

@julienvienne

I found a workaround: replace numpy.nan with None before inserting the data:

df_withnan = df_withnan.where((pd.notnull(df_withnan)), None)

Best regards

@t-pfaff

t-pfaff commented Feb 7, 2017

@julienvienne: The downside is that converting to None changes the column dtype to object. If a field in InfluxDB already holds, e.g., float values and you then insert new data containing NaNs, the conversion to None changes the type and InfluxDB throws an error.

For example:

InfluxDBClientError: 400: {"error":"field type conflict: input field \"kas_AnlMenge\" on measurement \"artikel_tuning\" is type string, already exists as type float"}

@aviau: Maybe you want to reopen the issue?

@shagru

shagru commented Mar 7, 2017

I've encountered the same error in the latest version. The problem is in "synop,wmo=7002 tp01h=0i,tp03h=nan 1420585200000000000", where "tp03h=nan" cannot be interpreted by the InfluxDB line protocol. The best solution might be simply to omit the NaN field from the line protocol input, as in "synop,wmo=7002 tp01h=0i 1420585200000000000". That way, InfluxDB will put an empty value in that field. We might also need to check for "inf", since InfluxDB does not seem to support that either. So maybe we need to check with "np.isfinite()" and omit any field whose value is not finite from the line protocol input.
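The suggestion above could be sketched roughly as follows. This is an illustration, not the client's actual serializer: the function `to_line` is hypothetical, it uses `math.isfinite` from the standard library in place of `np.isfinite()`, and it ignores escaping of special characters in names and values. It returns None when every field is dropped, on the assumption that a line without any field cannot be written.

```python
import math


def to_line(measurement, tags, fields, timestamp_ns):
    """Build one line-protocol line, omitting NaN and +/-inf float fields.

    Hypothetical helper: escaping is not handled, and only bool, int,
    float and string field values are covered.
    Returns None when no field survives the filtering.
    """
    kept = []
    for key, value in fields.items():
        if isinstance(value, bool):
            # bool is checked before int because bool is a subclass of int
            kept.append("{}={}".format(key, "true" if value else "false"))
        elif isinstance(value, int):
            kept.append("{}={}i".format(key, value))
        elif isinstance(value, float):
            if math.isfinite(value):
                kept.append("{}={}".format(key, value))
            # non-finite floats (NaN, inf, -inf) are silently skipped
        else:
            kept.append('{}="{}"'.format(key, value))

    if not kept:
        return None

    tag_part = "".join(",{}={}".format(k, v) for k, v in tags.items())
    return "{}{} {} {}".format(measurement, tag_part, ",".join(kept), timestamp_ns)


line = to_line("synop", {"wmo": 7002},
               {"tp01h": 0, "tp03h": float("nan")},
               1420585200000000000)
print(line)

# A row in which every value is missing yields no line at all.
empty = to_line("synop", {"wmo": 7002},
                {"tp03h": float("nan")},
                1420585200000000000)
print(empty)
```

A caller would skip any point for which the helper returns None before sending the batch.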

@acezzz

acezzz commented Feb 3, 2018

I have encountered a similar error in version 5.0.0. The workaround julienvienne provided works fine, except when every value in a row is NaN.
