flume-user mailing list archives

From higkoohk <higko...@gmail.com>
Subject Re: Why does flume create one file per milliseconds to HDFS ?
Date Wed, 15 May 2013 08:13:04 GMT
Sorry, everyone. My config has a typo:
tengine.sinks.hdfs4log.hdfs.rollCouont = 3600

It should be:
tengine.sinks.hdfs4log.hdfs.rollCount = 3600

Because of the typo, that setting was ignored and the default hdfs.rollCount of 10 applied, so a new file was rolled after every 10 events.
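
For reference, a corrected sketch of just the roll-related settings (values taken from the config quoted below; the defaults in the comments are the standard Flume 1.x HDFS sink defaults, which silently apply when a roll property is misspelled or omitted):

# Flume falls back to these defaults when a roll property is missing or misspelled:
# rollInterval = 30 (seconds), rollSize = 1024 (bytes), rollCount = 10 (events).
# A value of 0 disables that particular roll trigger.
tengine.sinks.hdfs4log.hdfs.rollInterval = 3600
tengine.sinks.hdfs4log.hdfs.rollSize = 506870912
tengine.sinks.hdfs4log.hdfs.rollCount = 3600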

Many thanks @ Hari Shreedharan
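
A minimal way to verify the fix, assuming the hadoop CLI shown in the launcher output below (/usr/bin/hadoop) is on the PATH: list the sink's output directory (the hdfs.path from the config quoted below). It should now show roughly one access.*.log file per hour instead of one per handful of events:

hadoop fs -ls hdfs://hdfs.kisops.org:8020/flume/tengine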


2013/5/15 higkoohk <higkoohk@gmail.com>

> Hello, all!
>
>    I'm new to Flume; today I'm using Flume to collect web server logs.
>
>    My flume config is:
>
> tengine.sources = tengine
> tengine.sources.tengine.type = exec
> tengine.sources.tengine.command = tail -n +0 -F /data/log/tengine/access.log
> tengine.sources.tengine.channels = file4log
>
> tengine.sinks = hdfs4log
> tengine.sinks.hdfs4log.type = hdfs
> tengine.sinks.hdfs4log.channel = file4log
> tengine.sinks.hdfs4log.serializer = avro_event
> tengine.sinks.hdfs4log.hdfs.path = hdfs://hdfs.kisops.org:8020/flume/tengine
> tengine.sinks.hdfs4log.hdfs.filePrefix = access
> tengine.sinks.hdfs4log.hdfs.fileSuffix = .log
> tengine.sinks.hdfs4log.hdfs.rollInterval = 3600
> tengine.sinks.hdfs4log.hdfs.rollCouont = 3600
> tengine.sinks.hdfs4log.hdfs.rollSize = 506870912
> tengine.sinks.hdfs4log.hdfs.batchSize = 1048576
> tengine.sinks.hdfs4log.hdfs.threadsPoolSize = 38
> tengine.sinks.hdfs4log.hdfs.fileType = DataStream
> tengine.sinks.hdfs4log.hdfs.writeFormat = Text
>
> tengine.channels = file4log
> tengine.channels.file4log.type = file
> tengine.channels.file4log.capacity = 1048576
> tengine.channels.file4log.transactionCapacity = 1048576
> tengine.channels.file4log.checkpointDir = /data/log/hdfs
> tengine.channels.file4log.dataDirs = /data/log/tengine
>
> And here is its log:
>
> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access
> Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.6.1.jar from classpath
> Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar from classpath
> Info: Including HBASE libraries found via (/usr/bin/hbase) for HBASE access
> Info: Excluding /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hbase/bin/../lib/slf4j-api-1.6.1.jar from classpath
> Info: Excluding /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/bin/../lib/zookeeper/lib/slf4j-api-1.6.1.jar from classpath
> Info: Excluding /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/bin/../lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar from classpath
> Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.6.1.jar from classpath
> Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar from classpath
>
> 13/05/15 11:27:52 INFO conf.FlumeConfiguration: Processing:hdfs4log
> 13/05/15 11:27:52 INFO conf.FlumeConfiguration: Added sinks: hdfs4log Agent: tengine
> 13/05/15 11:27:52 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [tengine]
> 13/05/15 11:27:52 INFO node.AbstractConfigurationProvider: Creating channels
> 13/05/15 11:27:52 INFO channel.DefaultChannelFactory: Creating instance of channel file4log type file
> 13/05/15 11:27:52 INFO node.AbstractConfigurationProvider: Created channel file4log
> 13/05/15 11:27:52 INFO source.DefaultSourceFactory: Creating instance of source tengine, type exec
> 13/05/15 11:27:52 INFO sink.DefaultSinkFactory: Creating instance of sink: hdfs4log, type: hdfs
> 13/05/15 11:27:52 INFO hdfs.HDFSEventSink: Hadoop Security enabled: false
> 13/05/15 11:27:52 INFO node.AbstractConfigurationProvider: Channel file4log connected to [tengine, hdfs4log]
> 13/05/15 11:27:52 INFO node.Application: Starting Channel file4log
> 13/05/15 11:27:52 INFO file.FileChannel: Starting FileChannel file4log { dataDirs: [/data/log/tengine] }...
> 13/05/15 11:27:52 INFO file.Log: Encryption is not enabled
> 13/05/15 11:27:52 INFO file.Log: Replay started
> 13/05/15 11:27:52 INFO file.Log: Found NextFileID 0, from []
> 13/05/15 11:27:53 INFO file.EventQueueBackingStoreFile: Preallocated /data/log/hdfs/checkpoint to 8396840 for capacity 1048576
> 13/05/15 11:27:53 INFO file.EventQueueBackingStoreFileV3: Starting up with /data/log/hdfs/checkpoint and /data/log/hdfs/checkpoint.meta
> 13/05/15 11:27:53 INFO file.Log: Last Checkpoint Wed May 15 11:27:52 CST 2013, queue depth = 0
> 13/05/15 11:27:53 INFO file.Log: Replaying logs with v2 replay logic
> 13/05/15 11:27:53 INFO file.ReplayHandler: Starting replay of []
> 13/05/15 11:27:53 INFO file.ReplayHandler: read: 0, put: 0, take: 0, rollback: 0, commit: 0, skip: 0, eventCount:0
> 13/05/15 11:27:53 INFO file.Log: Rolling /data/log/tengine
> 13/05/15 11:27:53 INFO file.Log: Roll start /data/log/tengine
> 13/05/15 11:27:53 INFO tools.DirectMemoryUtils: Unable to get maxDirectMemory from VM: NoSuchMethodException: sun.misc.VM.maxDirectMemory(null)
> 13/05/15 11:27:53 INFO tools.DirectMemoryUtils: Direct Memory Allocation: Allocation = 1048576, Allocated = 0, MaxDirectMemorySize = 18677760, Remaining = 18677760
> 13/05/15 11:27:53 INFO file.LogFile: Opened /data/log/tengine/log-1
> 13/05/15 11:27:53 INFO file.Log: Roll end
> 13/05/15 11:27:53 INFO file.EventQueueBackingStoreFile: Start checkpoint for /data/log/hdfs/checkpoint, elements to sync = 0
> 13/05/15 11:27:53 INFO file.EventQueueBackingStoreFile: Updating checkpoint metadata: logWriteOrderID: 1368588473111, queueSize: 0, queueHead: 0
> 13/05/15 11:27:53 INFO file.LogFileV3: Updating log-1.meta currentPosition = 0, logWriteOrderID = 1368588473111
> 13/05/15 11:27:53 INFO file.Log: Updated checkpoint for file: /data/log/tengine/log-1 position: 0 logWriteOrderID: 1368588473111
> 13/05/15 11:27:53 INFO file.FileChannel: Queue Size after replay: 0 [channel=file4log]
> 13/05/15 11:27:53 INFO instrumentation.MonitoredCounterGroup: Monitoried counter group for type: CHANNEL, name: file4log, registered successfully.
> 13/05/15 11:27:53 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: file4log started
> 13/05/15 11:27:53 INFO node.Application: Starting Sink hdfs4log
> 13/05/15 11:27:53 INFO node.Application: Starting Source tengine
> 13/05/15 11:27:53 INFO source.ExecSource: Exec source starting with command:tail -n +0 -F /data/log/tengine/access.log
> 13/05/15 11:27:53 INFO instrumentation.MonitoredCounterGroup: Monitoried counter group for type: SINK, name: hdfs4log, registered successfully.
> 13/05/15 11:27:53 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: hdfs4log started
> 13/05/15 11:27:59 INFO hdfs.HDFSDataStream: Serializer = avro_event, UseRawLocalFileSystem = false
> 13/05/15 11:27:59 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479399.log.tmp
> 13/05/15 11:28:00 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479399.log.tmp
> 13/05/15 11:28:01 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479399.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479399.log
> 13/05/15 11:28:01 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479400.log.tmp
> 13/05/15 11:28:01 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479400.log.tmp
> 13/05/15 11:28:01 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479400.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479400.log
> 13/05/15 11:28:01 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479401.log.tmp
> 13/05/15 11:28:01 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479401.log.tmp
> 13/05/15 11:28:01 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479401.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479401.log
> 13/05/15 11:28:01 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479402.log.tmp
> 13/05/15 11:28:01 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479402.log.tmp
> 13/05/15 11:28:01 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479402.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479402.log
> 13/05/15 11:28:01 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479403.log.tmp
> 13/05/15 11:28:01 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479403.log.tmp
> 13/05/15 11:28:01 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479403.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479403.log
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479404.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479404.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479404.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479404.log
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479405.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479405.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479405.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479405.log
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479406.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479406.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479406.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479406.log
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479407.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479407.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479407.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479407.log
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479408.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479408.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479408.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479408.log
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479409.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479409.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479409.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479409.log
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479410.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479410.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479410.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479410.log
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479411.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479411.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479411.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479411.log
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479412.log.tmp
> 13/05/15 11:28:02 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479412.log.tmp
> 13/05/15 11:28:03 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479412.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479412.log
> 13/05/15 11:28:03 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479413.log.tmp
> 13/05/15 11:28:03 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479413.log.tmp
> 13/05/15 11:28:03 INFO hdfs.BucketWriter: Renaming hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479413.log.tmp to hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479413.log
> 13/05/15 11:28:03 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479414.log.tmp
> 13/05/15 11:28:03 INFO hdfs.BucketWriter: Creating hdfs://hdfs.kisops.org:8020/flume/tengine/access.1368588479414.log.tmp
> 13/05/15 11:28:22 INFO file.EventQueueBackingStoreFile: Start checkpoint for /data/log/hdfs/checkpoint, elements to sync = 160
> 13/05/15 11:28:22 INFO file.EventQueueBackingStoreFile: Updating checkpoint metadata: logWriteOrderID: 1368588473441, queueSize: 0, queueHead: 158
> 13/05/15 11:28:22 INFO file.LogFileV3: Updating log-1.meta currentPosition = 27844, logWriteOrderID = 1368588473441
> 13/05/15 11:28:23 INFO file.Log: Updated checkpoint for file: /data/log/tengine/log-1 position: 27844 logWriteOrderID: 1368588473441
>
>    I found that Flume creates one file per millisecond. Why is that?
>
