
genfromtxt with names=True doesn't work with python3 #5411

Closed
hsgg opened this issue Jan 2, 2015 · 7 comments
@hsgg

hsgg commented Jan 2, 2015

Using python 3.4.2 on Arch Linux with numpy 1.9.1, the following code doesn't work:

#!/usr/bin/env python

import sys
import numpy as np

data = np.genfromtxt(sys.stdin, names=True)

When running with a simple input file, it gives the following error:

Traceback (most recent call last):
  File "./test.py", line 6, in <module>
    data = np.genfromtxt(sys.stdin, names=True)
  File "/usr/lib/python3.4/site-packages/numpy/lib/npyio.py", line 1395, in genfromtxt
    if comments in first_line:
TypeError: 'in <string>' requires string as left operand, not bytes

The input file can be constructed with:

import io

f = io.StringIO(u"""\
x y z
2.4 2.3 0.1
3.5 5.6 0.2""")

There is no problem with python2.
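
For reference, the TypeError at the bottom of the traceback can be reproduced in plain Python 3 without NumPy. This is only a sketch: it assumes (going by the traceback) that `comments` is a byte string while `first_line` is text, and the values below are illustrative rather than taken from the NumPy source.

```python
# Illustrative values: a comment marker held as bytes, a header line held as str.
comments = b"#"
first_line = "x y z\n"

try:
    # bytes on the left of `in`, str on the right: Python 3 refuses to mix them.
    found = comments in first_line
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Under Python 2 both values would be plain `str`, which fits the report that the script works there.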

@sylviamic

I've tested this a bit on my own Arch Linux install, and it seems that this has to do with how Python 3 handles byte strings and your use of sys.stdin to request filenames. From the docstring for genfromtxt:

fname : file or str
File, filename, or generator to read. If the filename extension is
`.gz` or `.bz2`, the file is first decompressed. Note that
generators must return byte strings in Python 3k.

"Note that generators must return byte strings in Python 3k." I suspect that sys.stdin is not returning the byte string that python3 requires, but I'm not familiar enough with it to know why.

Solution
Replace sys.stdin with input, which is a builtin function for python3 (raw_input for python2). This should work almost identically to your current implementation.

data = np.genfromtxt(input(), names=True)

Or for a more interactive experience, you can have it print a string to the console first:

data = np.genfromtxt(input("Please input a filename here: "), names=True)

@hsgg
Author

hsgg commented Jan 7, 2015

Thanks @nigeil for your comment, but that solves a different problem. I want to call my program as

$ ./test.py < test.dat

That way I don't need to bother with the filename in python.
I believe you are right that sys.stdin returns strings, whereas genfromtxt requires bytestrings. I will need to read up on how to get a file-like object that returns bytestrings.
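
The str/bytes difference can be seen without touching the real stdin. A small sketch, assuming an in-memory `io.TextIOWrapper` is a fair stand-in for `sys.stdin` (which is a text wrapper of this kind in Python 3); the `.buffer` attribute of such a wrapper is its underlying binary stream. The names `text_stdin` and `binary_stdin` are made up for the illustration.

```python
import io

payload = b"x y z\n2.4 2.3 0.1\n3.5 5.6 0.2\n"

# Two independent stand-ins for sys.stdin, each a text layer over a binary stream.
text_stdin = io.TextIOWrapper(io.BytesIO(payload), encoding="utf-8")
binary_stdin = io.TextIOWrapper(io.BytesIO(payload), encoding="utf-8")

text_line = text_stdin.readline()             # decoded text (str)
bytes_line = binary_stdin.buffer.readline()   # raw bytes from the binary layer

print(type(text_line).__name__, type(bytes_line).__name__)
```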

@sylviamic

Ah, I see the difference. Darned python3. You could replicate the functionality you're looking for with my solution by instead passing a file containing only the filename of the data file - but that is pretty lame. If you find a solution, please share it!

@jaimefrio
Member

With Python 3.3.2 and numpy 1.8, I can reproduce your error, and I can get it to work by using sys.stdin.buffer instead of sys.stdin. You may want to give it a try.
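
A minimal sketch of that workaround, written so it can be tried without a shell redirect. The helper name `load_table` is hypothetical, and `io.BytesIO` stands in for `sys.stdin.buffer`; in the actual script the call would be `load_table(sys.stdin.buffer)`.

```python
import io

import numpy as np


def load_table(binary_stream):
    """Parse a whitespace-delimited table with a header row from a binary stream."""
    return np.genfromtxt(binary_stream, names=True)


# Stand-in for sys.stdin.buffer: a binary stream holding the sample file.
sample = io.BytesIO(b"x y z\n2.4 2.3 0.1\n3.5 5.6 0.2\n")
data = load_table(sample)
print(data.dtype.names)
```

With the real stream, the script can then be run as `./test.py < test.dat`, as in the original report.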

@hsgg
Author

hsgg commented Jan 10, 2015

Thanks @jaimefrio. sys.stdin.buffer does indeed work. Personally, I think genfromtxt() should support sys.stdin directly, as it feels more natural to me. What do you think?

@jaimefrio
Member

Yes, I think I agree: it seems like an important shortcoming in Python 3 support. That said, I don't know that code well enough to understand what the implications would be.

My general feeling is that numpy's read-from-file functionality needs a serious overhaul to bring it up to the standard that pandas has set, and I am not sure that small incremental changes will take us there.

But PRs are always welcome! ;-)

@eric-wieser
Member

Fixed in 1.14
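
As a rough check, assuming NumPy 1.14 or later is installed, a text stream can be passed directly. In this sketch `io.StringIO` stands in for `sys.stdin`, and the sample data is the file from the original report.

```python
import io

import numpy as np

# Text stream (yields str, not bytes), standing in for sys.stdin.
text_stream = io.StringIO("x y z\n2.4 2.3 0.1\n3.5 5.6 0.2\n")
data = np.genfromtxt(text_stream, names=True)
print(data["z"])
```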

agriyakhetarpal added a commit to agriyakhetarpal/numpy that referenced this issue Feb 27, 2024
See cython/cython#5411, which is now resolved for Cython>3 and we are at Cython>=3.0.6.
agriyakhetarpal added a commit to agriyakhetarpal/numpy that referenced this issue Feb 27, 2024
This commit performs the following:

1. Skip `RuntimeWarnings` on exotic `np.where()` tests on WASM because of the lack of floating point exception support
2. Skip NumPy config tests that use subprocess module on WASM
3. Ignore threaded tests for PRNGs on WASM
4. Remove numpygh-5411 Cython `AttributeError` check. See cython/cython#5411, which is now resolved for Cython>3, and we are at Cython>=3.0.6.
5. For f2py, check compilers only if not on WASM
6. Skip pickle serialisation tests for `stringdtype` on WASM runtimes

4 participants