Once events are in Kafka, we need a standardized way to import them into downstream systems. This task will describe and track that work. This task is not about a stream processing system, which would consume events from Kafka, transform them, and then produce them back to Kafka. This is about consuming events out of Kafka and saving them into a storage system.
Currently, downstream systems consume events from Kafka using custom consumers and glue code. EventLogging kafka + jrm, Camus, Refinery Spark 'Refine' code, statsv, kafkatee, etc. are all examples of custom 'downstream connectors' we use at WMF.
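For illustration, each of these amounts to some variant of a hand-rolled consumer loop like the minimal sketch below (the broker address, topic name, and `writeToStorage` helper are all hypothetical, not taken from any of the tools above):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ExampleDownstreamConnector {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.put("group.id", "example-downstream-connector");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // The 'glue': every downstream system reimplements this
                    // write-to-storage step, plus error handling, batching, etc.
                    writeToStorage(record.value());
                }
            }
        }
    }

    // Hypothetical placeholder for whatever storage system is downstream
    // (MySQL, HDFS, Graphite, files, ...).
    private static void writeToStorage(String event) {
        System.out.println(event);
    }
}
```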
We'd like to standardize the way this is done.
Using Kafka Connect could be nice, but in 2018 most of the useful Confluent connector implementations were switched to a non-FLOSS license (the Confluent Community License).
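For context, Kafka Connect standardizes this kind of work by having each storage system implement a `SinkTask`, while the framework handles consuming, offsets, rebalancing, and scaling. A minimal sketch of what a connector author writes follows; the class name and storage write are hypothetical, and a complete connector would also need a matching `SinkConnector` class:

```java
import java.util.Collection;
import java.util.Map;

import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;

// Hypothetical sink task sketch, not an existing connector.
public class ExampleStorageSinkTask extends SinkTask {

    @Override
    public String version() {
        return "0.1.0";
    }

    @Override
    public void start(Map<String, String> props) {
        // Open connections to the downstream storage system here,
        // using the connector configuration passed in props.
    }

    @Override
    public void put(Collection<SinkRecord> records) {
        // The only storage-specific logic the author writes; Connect
        // itself does the consuming, offset tracking, and retries.
        for (SinkRecord record : records) {
            writeToStorage(record.value());
        }
    }

    @Override
    public void stop() {
        // Close storage connections.
    }

    // Hypothetical storage write.
    private void writeToStorage(Object value) {
        System.out.println(value);
    }
}
```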
Flink has built-in connectors. These might also be useful to standardize on outside of a streaming context.
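As one hedged reading of that idea: Flink's built-in Kafka source and file sink can run in bounded (batch) execution mode, so a 'connector' job doesn't have to be a long-running stream. The sketch below assumes Flink 1.14+ APIs; the broker, topic, and output path are hypothetical:

```java
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.file.sink.FileSink;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KafkaToFileJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Bounded execution: consume a fixed range of offsets and stop,
        // i.e. use the streaming connectors outside of a streaming context.
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);

        KafkaSource<String> source = KafkaSource.<String>builder()
            .setBootstrapServers("localhost:9092")    // hypothetical broker
            .setTopics("events")                      // hypothetical topic
            .setGroupId("example-flink-connector")
            .setStartingOffsets(OffsetsInitializer.earliest())
            .setBounded(OffsetsInitializer.latest())  // stop at the current end of the topic
            .setValueOnlyDeserializer(new SimpleStringSchema())
            .build();

        FileSink<String> sink = FileSink
            .forRowFormat(new Path("/tmp/events-out"), // hypothetical output path
                          new SimpleStringEncoder<String>("UTF-8"))
            .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-events")
           .sinkTo(sink);

        env.execute("kafka-to-file");
    }
}
```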