This repository was archived by the owner on Oct 29, 2024. It is now read-only.
Filter by tags appears to be broken? #251
Closed
Description
I've written some data points (listed at the bottom) to my instance in a measurement called 'abc':
>>> indb.write_points(data, tags = {"imei" : "test_imei"}, time_precision='ms')
True
I can get the data-points out:
>>> rs = indb.query("select * from abc")
And print them:
>>> print list(rs.get_points())
[
{u'imei': u'test_imei', u'value': 1111, u'time': u'2015-10-06T19:50:29.007Z'},
{u'imei': u'test_imei', u'value': 2222, u'time': u'2015-10-06T19:50:29.008Z'},
{u'imei': u'test_imei', u'value': 3333, u'time': u'2015-10-06T19:50:29.009Z'},
{u'imei': u'test_imei', u'value': 4444, u'time': u'2015-10-06T19:50:29.01Z'}
]
When I specify the tag value the result is empty:
>>> print list(rs.get_points(tags={"imei" : "test_imei"}))
[]
Data Points:
time = 1444161029007
data = [
{ "measurement" : "abc", "time": time, "fields" : {"value" : 1111} },
{ "measurement" : "abc", "time": time+1, "fields" : {"value" : 2222} },
{ "measurement" : "abc", "time": time+2, "fields" : {"value" : 3333} },
{ "measurement" : "abc", "time": time+3, "fields" : {"value" : 4444} }
]
>>> indb.write_points(data, tags = {"imei" : "test_imei"}, time_precision='ms')
True
Activity
drmclean commented on Oct 7, 2015
Also related, the whole handling of tags doesn't appear to work correctly:
I have a measurement called 'current'.
I can find the tags inside current using:
This correctly shows one tag called "imei".
I can find the tag values inside current using:
Clearly showing a working tag called "imei" and 4 different tagValues.
I can get only the results for imei = imei_004 using a direct query:
But if I use the rs.keys() option it shows no tags:
According to the docs, rs.keys() ought to return a tuple with (serie_name, tags), but it seems to think that there are no tags despite happily querying tags from the db.
aviau commented on Oct 13, 2015
On 13/10/15 10:37 AM, David McLean wrote:
Hello David,
Absolutely, and this has nothing to do with influxdb-python. The InfluxDB server does not return tags if you don't ask it to.
However, this can be misleading because there was a recent change to the InfluxDB API:
The implicit GROUP BY * that was added to every SELECT * has been
removed. Instead any tags in the data are now part of the columns in
the returned query.
The "tags" that you see in #251 are actually columns!
The fact that tags are now part of the columns makes it impossible for us to differentiate tags and columns :(!
It should work when using "group by"; I cannot break that.
What I am getting out of this is that there are no bugs in influxdb-python. To get things working like you want, all you have to do is request the right tags by using the "GROUP BY" keyword.
So, there are two types of tags:
However, it looks like one would want to filter by the tags that are now included in the columns. What do you think?
Once again, I think that this discussion should be on GitHub so that everyone can see, I will post this here:
thank you for your work David
Best regards,
Alexandre Viau
alexandre@alexandreviau.net
drmclean commented on Oct 13, 2015
Hi Alex,
The above makes sense although it should be noted that the "bug" is now just that the documentation is no longer accurate. In the current docs the following:
rs = cli.query("SELECT * from cpu")
cpu_influxdb_com_points = list(rs.get_points(tags={"host_name": "influxdb.com"}))
is suggested as working code, but because of the API changes it runs without error yet no longer filters correctly, which is a bit confusing for the first-time user!
I think it's possible to implement filtering by tags included in the columns, but I'm not sure how to do so in a way that maintains the previous functionality on group by queries and doesn't break other parts of the library. I'll remove my pull request as it neither solves the problems correctly nor passes the build!
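One way to picture filtering that covers both cases, as a minimal sketch and not the library's implementation: the plain dicts below mimic the shape of a series in InfluxDB's JSON query response (name, optional tags, columns, values). The function name iter_points and the tag value "other_imei" are hypothetical, introduced only for illustration.

```python
def iter_points(series_list, tags=None):
    """Yield one dict per point, keeping only points that match `tags`.

    A requested tag matches if it appears either in the series-level
    "tags" mapping (GROUP BY queries) or as an ordinary column of the
    point (plain SELECT * queries).
    """
    wanted = tags or {}
    for series in series_list:
        series_tags = series.get("tags") or {}
        for row in series.get("values", []):
            point = dict(zip(series["columns"], row))
            # Look up tag values in the columns first, then let
            # series-level tags take precedence on a key clash.
            merged = dict(point)
            merged.update(series_tags)
            if all(k in merged and merged[k] == v for k, v in wanted.items()):
                yield point


# Shape of a GROUP BY result: the tag lives on the series.
grouped = [
    {"name": "abc", "tags": {"imei": "test_imei"},
     "columns": ["time", "value"],
     "values": [["2015-10-06T19:50:29.007Z", 1111]]},
    {"name": "abc", "tags": {"imei": "other_imei"},  # hypothetical value
     "columns": ["time", "value"],
     "values": [["2015-10-06T19:50:29.008Z", 2222]]},
]

# Shape of a plain SELECT * result: the tag is just another column.
flat = [
    {"name": "abc",
     "columns": ["time", "imei", "value"],
     "values": [["2015-10-06T19:50:29.007Z", "test_imei", 1111],
                ["2015-10-06T19:50:29.008Z", "other_imei", 2222]]},
]

grouped_hits = list(iter_points(grouped, tags={"imei": "test_imei"}))
flat_hits = list(iter_points(flat, tags={"imei": "test_imei"}))
```

Both calls select the single point whose value is 1111. Whether a column/tag name clash should resolve in favour of the series tags is a design choice made here for the sketch, not something this thread settles.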
aviau commented on Oct 13, 2015
You are right! That is a bug :)
I'll think about this, we should use this issue to discuss how to do that.
3fr61n commented on Mar 2, 2016
We had the same problem :(
With NO tags (works)
With EMPTY tags (works)
With ANY specific tags (does NOT work)
etc
anoopkhandelwal commented on Mar 14, 2016
Hi,
I am also facing the same issue. I need to filter the data by 3-4 tags.
Is there an alternative solution or wrapper function that we can use to achieve this?
anoopkhandelwal commented on Mar 15, 2016
Hi,
I wrote a wrapper function -
def filter_fun(data, key, allowed):
    return filter(lambda x: key in x and x[key] in allowed, data)

def filter_data(data, fitered_tags):
    response_list = data
    for tag_key, tag_value in fitered_tags.iteritems():
        response_list = filter_fun(response_list, tag_key, tag_value)
    return response_list
Now, to filter, all we need to do is pass data_list and a dict of tags to the filter_data function; it returns the list of dict elements that pass the filter for every tag.
e.g.
data_list = filter_data(data_list, fitered_tags={'key_1': value1, 'key_2': 'value2'})
Since get_points also filters after the data has been fetched (that is, the filtering is not executed at the query level), performance should be about the same as with get_points.
Let me know if we can make it more correct.
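On making it more correct, a possible variant as a sketch only: the wrapper above relies on Python 2 (iteritems, and filter returning a list), and its "in allowed" test does a substring check when the tag value is a string. The version below compares with exact equality and also runs on Python 3. The name filter_points is hypothetical, and the second imei value in the sample data is made up for illustration.

```python
def filter_points(points, wanted_tags):
    """Return the points whose columns equal every key/value in wanted_tags."""
    return [
        point for point in points
        if all(key in point and point[key] == value
               for key, value in wanted_tags.items())
    ]


# Sample points shaped like the output of list(rs.get_points()).
points = [
    {"imei": "test_imei", "value": 1111, "time": "2015-10-06T19:50:29.007Z"},
    {"imei": "test_imei_2", "value": 2222, "time": "2015-10-06T19:50:29.008Z"},
    {"imei": "test_imei", "value": 3333, "time": "2015-10-06T19:50:29.009Z"},
]

matching = filter_points(points, {"imei": "test_imei"})
both = filter_points(points, {"imei": "test_imei", "value": 3333})
```

Here matching keeps the two points tagged "test_imei" and skips "test_imei_2", which a substring test would not reliably distinguish. Like get_points, this filters after the data has been fetched, so it does not reduce what the server returns.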
TwitchChen commented on Jan 17, 2017
Has this bug been resolved now?
xginn8 commented on Nov 25, 2017
This bug should be fixed in the latest release -- if not, we can revisit.