readline() causes output to be written at eof unless seek() is used #113439

Open
ghost opened this issue Dec 23, 2023 · 3 comments
Labels
topic-IO type-bug An unexpected behavior, bug, or error

Comments

@ghost
ghost commented Dec 23, 2023

Bug report

Bug description:

In Python 3, there seems to be a bug with the readline() method.

I have a file txt.txt that contains two lines:

1234567890
abcdefghij

I then run the following code:

g = open("txt.txt","r+")
g.write("xxx")
g.flush()
g.close()

It modifies the file as expected:

xxx4567890
abcdefghij

I then run the following code:

g = open("txt.txt","r+")
g.readline()
Out[99]: 'xxx4567890\n'
g.tell()
Out[100]: 12
g.write("XXX")
g.flush()
g.close()

I get the following:

xxx4567890
abcdefghij
XXX

Why is "XXX" being written to the end of the file instead of just after the first line?

If I run the following:

g = open("txt.txt","r+")
g.readline()
Out[99]: 'xxx4567890\n'
g.tell()
Out[100]: 12
g.seek(12)
g.tell()
g.write("XXX")
g.flush()
g.close()

I get:

xxx4567890
XXXdefghij
XXX

This seems like a bug in readline(): tell() reports the cursor at 12, but the write lands at EOF unless I call seek() first.
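
For anyone who wants to try this without preparing txt.txt by hand, a self-contained version of the steps above might look like the sketch below. It is only an approximation of the session in the report: the temporary file, the `reproduce` function name, and the `newline=""` / `encoding="utf-8"` arguments are assumptions added here so that byte offsets are the same on every OS. On an interpreter that behaves as described in the report, the printed bytes would be expected to end with XXX rather than have it after the first line.

```python
import os
import tempfile


def reproduce() -> bytes:
    """Read one line in text mode, write "XXX", and return the file's final bytes."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # newline="" disables newline translation, so offsets match on every OS.
        with open(path, "w", encoding="utf-8", newline="") as f:
            f.write("1234567890\nabcdefghij\n")

        with open(path, "r+", encoding="utf-8", newline="") as g:
            g.readline()    # consumes the first line
            g.write("XXX")  # no seek() between the read and the write

        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


result = reproduce()
print(result)
```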

CPython versions tested on:

3.11

Operating systems tested on:

Windows

ghost added the type-bug label Dec 23, 2023
@benjaminJohnson2204

I was able to reproduce this issue, and I'd like to work on fixing it.

I looked into the bug, and it seems like it's being caused by the TextIOWrapper class (in both the C io and Python pyio modules) reading an entire chunk at a time, then not rewinding its stream pointer before performing the write.

Reproducing this example with the file opened in binary mode (rather than text mode) works as expected. The TextIOWrapper class, which is used for text file I/O, has its own buffer that it reads an entire chunk into on every read or readline call, and it doesn't seek to the correct position when writing after reading. So I think the TextIOWrapper class should be changed.
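
The binary-mode comparison can be checked with a sketch along these lines. The temporary file and the `binary_write_after_readline` name are assumptions made for illustration; the point is only that the buffered binary object returned by `open(..., "r+b")` places the write right after the line that was read.

```python
import os
import tempfile


def binary_write_after_readline() -> bytes:
    """Same steps as the report, but with the file opened in binary mode."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"1234567890\nabcdefghij\n")

        with open(path, "r+b") as g:
            g.readline()     # consumes b'1234567890\n'
            g.write(b"XXX")  # written at the logical position, offset 11

        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


print(binary_write_after_readline())
```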

The simplest fix would be to essentially add self.seek(self.tell()) to both the C and Python implementations of TextIOWrapper whenever we are writing to a stream that is readable and has a non-empty read buffer. For non-seekable streams, we may just want to leave the implementation as is (i.e. calling self.seek(self.tell()) only if the stream is seekable). The only way I see to make this bug not occur for non-seekable streams would be to change TextIOWrapper to not buffer its read calls if it is both readable and writable, but not seekable.
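
Until something like that lands, the same idea can be applied from user code. The sketch below is an application-level workaround, not the proposed patch to TextIOWrapper: `write_at_logical_position` and `demo` are hypothetical names, and the helper simply calls `seek(tell())` on seekable streams before writing, which matches the seek() behaviour the original report observed.

```python
import os
import tempfile


def write_at_logical_position(stream, text):
    """Hypothetical helper: resync a text stream before writing after a read."""
    if stream.seekable():
        # seek(tell()) drops the read-ahead and moves the underlying
        # buffer back to the position that tell() reports.
        stream.seek(stream.tell())
    return stream.write(text)


def demo() -> bytes:
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # newline="" keeps "\n" as a single byte on every OS.
        with open(path, "w", encoding="utf-8", newline="") as f:
            f.write("1234567890\nabcdefghij\n")

        with open(path, "r+", encoding="utf-8", newline="") as g:
            g.readline()
            write_at_logical_position(g, "XXX")

        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


print(demo())
```

Non-seekable streams are left untouched by the helper, which mirrors the caveat above: without seek() there is no way to rewind past the chunk that was already read.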

I'd be interested to hear other opinions on this.

@vadmium
Member

vadmium commented Jul 26, 2024

Probably the same as Issue #82891 (and #56424, closed as not worth fixing)

@StanFromIreland
Contributor

Duplicate: #117095
