flume-user mailing list archives

From "Mangtani, Kushal" <Kushal.Mangt...@viasat.com>
Subject Flume && Kafka Integration
Date Tue, 17 Mar 2015 21:24:08 GMT
Hello,

We are using Flume in our production environment to ingest data. A while back, we decided to extend the
functionality and added Kafka for real-time monitoring.
The Flume source now fans out and deposits the data into two separate channels: one for HDFS (required
mapping) and the other for Kafka (optional mapping). We configured the Kafka channel as an optional
selector mapping<http://flume.apache.org/releases/content/1.4.0/FlumeUserGuide.html#fan-out-flow>
so that any issue with Kafka would not block the HDFS pipeline.
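
A setup like the one described might be sketched as the fragment below. This is an illustration only, not the actual configuration in use: the agent, source, channel, and sink names (a1, r1, hdfsCh, kafkaCh, hdfsSink, kafkaSink) are hypothetical, source/channel/sink types are left out, and it assumes a replicating selector whose `selector.optional` property is available in the Flume version being run.

```properties
# Hypothetical names throughout: agent a1, source r1, channels hdfsCh and kafkaCh.
a1.sources = r1
a1.channels = hdfsCh kafkaCh
a1.sinks = hdfsSink kafkaSink

# Replicating selector: every event is written to both channels.
a1.sources.r1.channels = hdfsCh kafkaCh
a1.sources.r1.selector.type = replicating
# Channels listed here are treated as optional: a failed put to them is
# meant to be ignored rather than failing the source's write.
a1.sources.r1.selector.optional = kafkaCh

# Each sink drains its own channel.
a1.sinks.hdfsSink.channel = hdfsCh
a1.sinks.kafkaSink.channel = kafkaCh
```

With this layout, hdfsCh is the required channel and kafkaCh is the optional one, so a failure to write to kafkaCh is expected to leave the HDFS path unaffected.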
However, I have noticed that this isolation never holds: any issue with the Kafka cluster eventually also brings
down the HDFS ingestion. So my question is: does optional channel mapping in the Flume
source code not work correctly, or is the Kafka sink/Kafka cluster I am using outdated? Any
input will be appreciated.

Env:

  *   Ubuntu 12.04
  *   CDH 5, Flume 1.4
  *   Kafka source download: 2.9.1-0.8.1.1
  *   Custom Flume-Kafka sink: https://github.com/baniuyao/flume-ng-kafka-sink

- Kushal
