Kafka Python client

Coverage badge: https://coveralls.io/repos/dpkp/kafka-python/badge.svg?branch=master&service=github
Build badge: https://travis-ci.org/dpkp/kafka-python.svg?branch=master
$ pip install kafka-python

kafka-python is a client for the Apache Kafka distributed stream processing system. It is designed to function much like the official java client, with a sprinkling of pythonic interfaces (e.g., iterators).

KafkaConsumer

>>> from kafka import KafkaConsumer
>>> consumer = KafkaConsumer('my_favorite_topic')
>>> for msg in consumer:
...     print(msg)

KafkaConsumer is a full-featured, high-level message consumer class that is similar in design and function to the new 0.9 java consumer. Most configuration parameters defined by the official java client are supported as optional kwargs, with generally similar behavior. Gzip and Snappy compressed messages are supported transparently.

In addition to the standard KafkaConsumer.poll() interface (which returns micro-batches of messages, grouped by topic-partition), kafka-python supports single-message iteration, yielding ConsumerRecord namedtuples, which include the topic, partition, offset, key, and value of each message.
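For intuition, a ConsumerRecord can be pictured as a plain namedtuple with the five fields listed above. The stand-in below is illustrative only, not the library's actual class definition (the real record may carry additional fields):

```python
from collections import namedtuple

# Illustrative stand-in for kafka-python's ConsumerRecord; the field
# names match those described in the text above.
ConsumerRecord = namedtuple(
    "ConsumerRecord", ["topic", "partition", "offset", "key", "value"]
)

record = ConsumerRecord(
    topic="my_favorite_topic", partition=0, offset=42,
    key=None, value=b"some_message_bytes",
)
print(record.topic, record.partition, record.offset)
```

Because it is a namedtuple, fields are accessible both by attribute (`record.value`) and by position.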

By default, KafkaConsumer will attempt to auto-commit message offsets every 5 seconds. When used with 0.9 kafka brokers, KafkaConsumer will dynamically assign partitions using the kafka GroupCoordinator APIs and a RoundRobinPartitionAssignor partitioning strategy, enabling relatively straightforward parallel consumption patterns. See ReadTheDocs for examples.
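To build intuition for what a round-robin strategy does, the sketch below deals topic-partitions out to group members one at a time, like cards. This is an illustration of the general idea only, not the library's RoundRobinPartitionAssignor implementation; all names are hypothetical:

```python
# Illustrative round-robin assignment: sort the partitions, then hand
# them to consumers in turn. Not kafka-python's actual implementation.
def round_robin_assign(consumers, partitions):
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment

members = ["consumer-1", "consumer-2"]
parts = [("my_favorite_topic", p) for p in range(4)]
print(round_robin_assign(members, parts))
# consumer-1 receives partitions 0 and 2; consumer-2 receives 1 and 3
```

With two members and four partitions, each member ends up consuming two partitions in parallel, which is the "straightforward parallel consumption" pattern described above.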

KafkaProducer

KafkaProducer is a high-level, asynchronous message producer. The class is intended to operate as similarly as possible to the official java client. See ReadTheDocs for more details.

>>> from kafka import KafkaProducer
>>> producer = KafkaProducer(bootstrap_servers='localhost:1234')
>>> producer.send('foobar', b'some_message_bytes')
>>> # Blocking send
>>> producer.send('foobar', b'another_message').get(timeout=60)
>>> # Use a key for hashed-partitioning
>>> producer.send('foobar', key=b'foo', value=b'bar')
>>> # Serialize json messages
>>> import json
>>> producer = KafkaProducer(value_serializer=lambda v: json.dumps(v).encode('utf-8'))
>>> producer.send('fizzbuzz', {'foo': 'bar'})
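A value_serializer must turn whatever is passed to send() into bytes before it hits the wire; json.dumps alone returns str, so the result needs encoding. A quick way to sanity-check a serializer in isolation (the function name here is illustrative):

```python
import json

# A value_serializer callable receives the value passed to send() and
# must return bytes; json.dumps returns str, so encode the result.
def json_value_serializer(value):
    return json.dumps(value).encode("utf-8")

payload = json_value_serializer({"foo": "bar"})
print(type(payload), payload)
```

The same check applies to key_serializer: `str.encode`, used below, likewise maps str keys to bytes.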
>>> # Serialize string keys
>>> producer = KafkaProducer(key_serializer=str.encode)
>>> producer.send('flipflap', key='ping', value=b'1234')
>>> # Compress messages
>>> producer = KafkaProducer(compression_type='gzip')
>>> for i in range(1000):
...     producer.send('foobar', b'msg %d' % i)

Protocol

A secondary goal of kafka-python is to provide an easy-to-use protocol layer for interacting with kafka brokers via the python repl. This is useful for testing, probing, and general experimentation. The protocol support is leveraged to enable a KafkaClient.check_version() method that probes a kafka broker and attempts to identify which version it is running (0.8.0 to 0.9).

Low-level

Legacy support is maintained for low-level consumer and producer classes, SimpleConsumer and SimpleProducer. See ReadTheDocs for API details.
