numpy.loadtxt with skiprows much slower than custom generator #7480

I noticed when trying to load somewhat large text files (~12 MB) that using the `skiprows` option of `numpy.loadtxt` is much slower than implementing the same thing with a custom generator.

For example, this takes ~1.2 s per invocation on average on my laptop:

``` python
import time

import numpy as np

times = []
for _ in range(30):
    s = time.time()
    # Let loadtxt skip the first 1001 lines itself
    tmp = np.loadtxt("test.txt", skiprows=1001)
    times.append(time.time() - s)

print(np.mean(times))
```

whereas this takes ~0.4 s per invocation:

``` python
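import time

import numpy as np

def my_generator(fname, skiprows=0):
    # Yield every line after the first `skiprows` lines;
    # the `with` block closes the file once the generator is exhausted
    with open(fname, 'r') as f:
        for i, line in enumerate(f):
            if i >= skiprows:
                yield line

times = []
for _ in range(30):
    s = time.time()
    tmp = np.loadtxt(my_generator("test.txt", skiprows=1001))
    times.append(time.time() - s)

print(np.mean(times))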
```
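
Since `test.txt` itself is not attached, here is a rough, self-contained sketch that anyone could use to try reproducing the comparison. The file contents are an assumption (a synthetic table of random floats written with `np.savetxt`, not my original data), and the helper names `make_test_file`, `skip_lines` and `compare` are made up for illustration. It also checks that both approaches return the same array. Timings will depend on the NumPy version and the machine, so the numbers may not match the ones above.

``` python
import os
import tempfile
import timeit

import numpy as np


def make_test_file(path, n_rows=5000, n_cols=5, seed=0):
    """Write a synthetic whitespace-delimited table (stand-in for test.txt)."""
    rng = np.random.RandomState(seed)
    np.savetxt(path, rng.rand(n_rows, n_cols))


def skip_lines(fname, skiprows=0):
    """Yield the lines of `fname`, dropping the first `skiprows` lines."""
    with open(fname, "r") as f:
        for i, line in enumerate(f):
            if i >= skiprows:
                yield line


def compare(path, skiprows, repeats=3):
    """Return (best time with skiprows, best time with generator, arrays equal?)."""
    a = np.loadtxt(path, skiprows=skiprows)
    b = np.loadtxt(skip_lines(path, skiprows=skiprows))
    t_skip = min(timeit.repeat(
        lambda: np.loadtxt(path, skiprows=skiprows),
        number=1, repeat=repeats))
    t_gen = min(timeit.repeat(
        lambda: np.loadtxt(skip_lines(path, skiprows=skiprows)),
        number=1, repeat=repeats))
    return t_skip, t_gen, np.array_equal(a, b)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "test.txt")
        make_test_file(path)
        t_skip, t_gen, same = compare(path, skiprows=1001)
        print("skiprows:  %.4f s" % t_skip)
        print("generator: %.4f s" % t_gen)
        print("identical result:", same)
```

Taking the minimum over a few `timeit.repeat` runs instead of the mean of `time.time()` differences should make the comparison a bit less sensitive to background noise, but the idea is the same as in the snippets above.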

