flume-user mailing list archives

From Thomas.B...@continental-corporation.com
Subject HDFS sink - rollover
Date Mon, 21 Sep 2015 10:05:01 GMT
Hi,

I'm using the Flume Kafka source and the Flume HDFS sink to write 
SequenceFiles. I would like to roll over to a new SequenceFile after a 
specific number of events/messages has been written, e.g. after 50 
messages (see the rollCount parameter below).
My configuration seems to be incorrect, as a rollover already happens 
after 4 messages (instead of 50).
I'm using the following "rollover" configuration:

"a_cobepa_probe.sources  = kafka-cobepa_probe
a_cobepa_probe.channels = hdfs-channel-notused
a_cobepa_probe.sinks    = hdfs_cobepa_probe
 
a_cobepa_probe.sources.kafka-cobepa_probe.type = org.apache.flume.source.kafka.KafkaSource
a_cobepa_probe.sources.kafka-cobepa_probe.zookeeperConnect = <secret>
a_cobepa_probe.sources.kafka-cobepa_probe.topic = cobepa_probe
a_cobepa_probe.sources.kafka-cobepa_probe.batchSize = 1
a_cobepa_probe.sources.kafka-cobepa_probe.channels = hdfs-channel-notused
 
a_cobepa_probe.channels.hdfs-channel-notused.type = memory
a_cobepa_probe.sinks.hdfs_cobepa_probe.channel = hdfs-channel-notused
a_cobepa_probe.sinks.hdfs_cobepa_probe.type = hdfs
a_cobepa_probe.sinks.hdfs_cobepa_probe.hdfs.writeFormat = de.conti.backend.asw.flume.serializer.MyBuilder
a_cobepa_probe.sinks.hdfs_cobepa_probe.hdfs.fileType = SequenceFile
# a_cobepa_probe.sinks.hdfs_cobepa_probe.hdfs.fileType = DataStream
a_cobepa_probe.sinks.hdfs_cobepa_probe.hdfs.filePrefix = %k%M
a_cobepa_probe.sinks.hdfs_cobepa_probe.hdfs.fileSuffix = .cobr
a_cobepa_probe.sinks.hdfs_cobepa_probe.hdfs.useLocalTimeStamp = true
a_cobepa_probe.sinks.hdfs_cobepa_probe.hdfs.path = /etl/%{topic}/%y%m%d
a_cobepa_probe.sinks.hdfs_cobepa_probe.hdfs.rollCount=50
a_cobepa_probe.sinks.hdfs_cobepa_probe.hdfs.rollSize=0
a_cobepa_probe.sinks.hdfs_cobepa_probe.hdfs.batchSize=50
 
a_cobepa_probe.channels.hdfs-channel-notused.capacity = 100
a_cobepa_probe.channels.hdfs-channel-notused.transactionCapacity = 100"


Are there any dependencies on other configuration parameters (in addition 
to the rollCount and rollSize parameters)?
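
For example, the configuration above does not set hdfs.rollInterval. If I 
read the Flume user guide correctly, it defaults to 30 seconds, so a 
time-based roll could close a file before 50 events have arrived. A 
minimal sketch of what I assume would leave rollCount as the only roll 
trigger (that the time-based roll is the cause here is an assumption, not 
something I have verified):

```properties
# Assumption: the time-based trigger (default 30 s) is what closes files early.
# Setting it to 0 disables rolling based on elapsed time.
a_cobepa_probe.sinks.hdfs_cobepa_probe.hdfs.rollInterval = 0
# Unchanged from above: 0 disables size-based rolling.
a_cobepa_probe.sinks.hdfs_cobepa_probe.hdfs.rollSize = 0
# Unchanged from above: roll after 50 events.
a_cobepa_probe.sinks.hdfs_cobepa_probe.hdfs.rollCount = 50
# idleTimeout defaults to 0 (disabled); listed only to make that explicit.
a_cobepa_probe.sinks.hdfs_cobepa_probe.hdfs.idleTimeout = 0
```

With rollInterval at 0, a file that never reaches 50 events would stay 
open until the agent stops, so a non-zero idleTimeout might be needed for 
low-traffic topics.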

Thank you very much and kind regards,
Thomas