support READ_COMMITTED isolation_level #1707

Closed
maslovalex opened this issue Jan 29, 2019 · 4 comments

@maslovalex

No description provided.

@maslovalex
Author

From the Kafka KafkaConsumer javadoc:

Reading Transactional Messages

Transactions were introduced in Kafka 0.11.0 wherein applications can write to multiple topics and partitions atomically. In order for this to work, consumers reading from these partitions should be configured to only read committed data. This can be achieved by setting the isolation.level=read_committed in the consumer's configuration.

In read_committed mode, the consumer will read only those transactional messages which have been successfully committed. It will continue to read non-transactional messages as before. There is no client-side buffering in read_committed mode. Instead, the end offset of a partition for a read_committed consumer would be the offset of the first message in the partition belonging to an open transaction. This offset is known as the 'Last Stable Offset'(LSO).

A read_committed consumer will only read up to the LSO and filter out any transactional messages which have been aborted. The LSO also affects the behavior of seekToEnd(Collection) and endOffsets(Collection) for read_committed consumers, details of which are in each method's documentation. Finally, the fetch lag metrics are also adjusted to be relative to the LSO for read_committed consumers.

Partitions with transactional messages will include commit or abort markers which indicate the result of a transaction. These markers are not returned to applications, yet have an offset in the log. As a result, applications reading from topics with transactional messages will see gaps in the consumed offsets. These missing messages would be the transaction markers, and they are filtered out for consumers in both isolation levels. Additionally, applications using read_committed consumers may also see gaps due to aborted transactions, since those messages would not be returned by the consumer and yet would have valid offsets.
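For reference, the Java configuration described above (isolation.level=read_committed) would map naturally onto kafka-python. A minimal sketch of the requested API, assuming a hypothetical isolation_level constructor argument that kafka-python does not accept at the time of this issue:

    from kafka import KafkaConsumer

    # Hypothetical: kafka-python does not support this parameter yet; it
    # mirrors the Java client's isolation.level=read_committed setting.
    consumer = KafkaConsumer(
        'my-topic',
        bootstrap_servers='localhost:9092',
        isolation_level='read_committed',  # requested feature
    )

    for record in consumer:
        # With read_committed, only non-transactional records and records
        # from committed transactions would be delivered; commit/abort
        # markers and aborted data would never reach the application.
        print(record.offset, record.value)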

At the moment, KafkaConsumer.poll() returns, alongside the expected records, entries like the following for messages that were produced transactionally:

ConsumerRecord(topic=u'my-topic', partition=0, offset=12, timestamp=1548765417767, timestamp_type=0, key='\x00\x00\x00\x01', value='\x00\x00\x00\x00\x00\x00', headers=[], checksum=None, serialized_key_size=4, serialized_value_size=6, serialized_header_size=-1)
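Until isolation_level is supported natively, the only stopgap appears to be a heuristic client-side filter that drops records resembling transaction control markers. This is a sketch under assumptions: the 4-byte keys below (2-byte version 0 followed by a 2-byte type, 0 for abort and 1 for commit) are inferred from the observed record above and the Kafka control-batch layout, and the filter only hides the marker records; it does not provide real read_committed semantics, since data from aborted transactions is still returned.

    from kafka import KafkaConsumer

    # Keys that look like transaction control markers (assumption based on
    # the Kafka control-record key layout: int16 version, int16 type).
    CONTROL_MARKER_KEYS = {b'\x00\x00\x00\x00',   # abort marker
                           b'\x00\x00\x00\x01'}   # commit marker

    consumer = KafkaConsumer('my-topic', bootstrap_servers='localhost:9092')

    for record in consumer:
        if record.key in CONTROL_MARKER_KEYS:
            # Skip what looks like a commit/abort marker. Caveats: a normal
            # record with one of these exact keys would be wrongly skipped,
            # and records from aborted transactions are NOT filtered here.
            continue
        print(record.offset, record.value)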

@nandajavarma

I would like to see this supported as well. Is there any workaround for this at the moment?

@jkasprzak-dragon

+1, that would be really useful.

@jeffwidman
Contributor

jeffwidman commented Sep 13, 2019

This is an all-volunteer project, so pull requests are welcome!

If you work for a company that needs this enough to pay for it, there is a chance @dpkp may be willing to do sponsored work.
