RFC Confusing seek implementation  #146

Closed
@GregBowyer

Description

Hi all

I find the current seek implementation in SimpleConsumer rather confusing. There seem to be two issues:

  1. For a consumer with multiple partitions, the approach of dividing the offset across the partitions to find each partition's offset does not seem likely to produce the desired effect. This is described to some extent in "SimpleConsumer seek & pending" (#67).
  2. The whence behaviour works from a delta. With whence 0, for example, seek does not move to the given absolute offset; it moves to the first available message offset plus the seek offset.
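To make the two points concrete, here is a toy model of the behaviour as I understand it. It is not the library's code: `model_seek_from_start` and its arguments are names made up for illustration, and handing the remainder of the division to the first partitions is an assumption.

```python
def model_seek_from_start(earliest_offsets, offset):
    """Toy model of seek(offset, whence=0) as described above.

    earliest_offsets maps partition id -> first available offset.
    Hypothetical helper, not part of kafka-python.
    """
    partitions = sorted(earliest_offsets)
    # Point 1: the requested offset is divided across the partitions.
    share, remainder = divmod(offset, len(partitions))
    positions = {}
    for index, partition in enumerate(partitions):
        # Assumption: any remainder goes to the first partitions, one each.
        extra = 1 if index < remainder else 0
        # Point 2: the share is a delta added to the earliest offset,
        # not an absolute position.
        positions[partition] = earliest_offsets[partition] + share + extra
    return positions


earliest = {0: 100, 1: 250}
print(model_seek_from_start(earliest, 10))  # {0: 105, 1: 255}
```

A caller who asked for "offset 10" ends up at 105 and 255, neither of which is the number that was passed in.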

The API as described does make sense as an analogue of fseek(), but it is confusing when Kafka is used as a queue. Since 0.8.0 does not at present maintain client offsets automatically, most clients are forced to maintain them directly.

Because seek always adds the delta to the lowest offset, it becomes difficult to maintain these numbers, especially as there is no natural way to get the lowest offset in a partition (which would allow computing the delta ...)
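If the lowest offset were available, the conversion a client would need is simple arithmetic for a single partition. A minimal sketch, assuming a hypothetical helper `delta_for_absolute` that is handed the earliest offset by the caller:

```python
def delta_for_absolute(absolute_offset, earliest_offset):
    """Return the delta to pass to a whence=0 style seek.

    Hypothetical helper for a single partition; it assumes the caller
    already knows the earliest available offset.
    """
    if absolute_offset < earliest_offset:
        raise ValueError("offset is older than the earliest available message")
    return absolute_offset - earliest_offset


# A client that stored absolute offset 1234 must seek by 234 when the
# partition currently starts at offset 1000.
print(delta_for_absolute(1234, 1000))  # 234
```

The arithmetic is not the hard part; obtaining `earliest_offset` is, and with several partitions a single delta no longer maps onto one stored offset.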

I feel that the API in general is very confusing and extremely subtle. For my own purposes I have created a new API call, seek_absolute, which can be found in this commit (https://github.com/GregBowyer/kafka-python/commit/056cba565f356787f1b027a881a7487be5a758ce). Rather than make a pull request, though, I think more discussion is needed about how the seek contract should work in general.
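As a starting point for that discussion, this is roughly the contract I have in mind, shown on a toy class. It is a sketch only and not the code in the commit above: `ToyConsumer` and the `(partition, offset)` signature are assumptions made for illustration.

```python
class ToyConsumer:
    """Stand-in for a consumer that tracks one offset per partition.

    Hypothetical class, not kafka-python's SimpleConsumer.
    """

    def __init__(self, partitions):
        self.offsets = {partition: 0 for partition in partitions}

    def seek_absolute(self, partition, offset):
        """Move one partition to exactly the given offset."""
        if partition not in self.offsets:
            raise KeyError("unknown partition: %r" % (partition,))
        if offset < 0:
            raise ValueError("offset must be non-negative")
        # No division across partitions and no earliest offset added:
        # the stored value is the value the caller passed in.
        self.offsets[partition] = offset


consumer = ToyConsumer([0, 1])
consumer.seek_absolute(1, 1234)
print(consumer.offsets)  # {0: 0, 1: 1234}
```

The point of the sketch is that a client which persists offsets itself can restore them one partition at a time without knowing anything about the earliest available message.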

Thoughts?
