Brock, 

This looks like FLUME-1417. The logs on the JIRA show the problem being hit during startup. I actually managed to get the "Log Id is null" error at runtime while testing that issue; it shows up if you switch to a small file size and checkpoint very often.

Thanks,
Hari

-- 
Hari Shreedharan

On Friday, October 5, 2012 at 11:19 AM, Brock Noland wrote:

Hi,

Just curious if you got around this or figured out what was going on?
Makes me a little nervous about a file channel bug.

Brock

On Tue, Oct 2, 2012 at 6:28 AM, Brock Noland <brock@cloudera.com> wrote:
Also, if you could send us your full log that would be great. The
email list doesn't take attachments, so either:

1) post it on pastebin
or
2) zip it and mail it to me directly

Brock

On Tue, Oct 2, 2012 at 6:06 AM, Brock Noland <brock@cloudera.com> wrote:
Hi,

What version of Flume? If trunk (1.3.0-SNAPSHOT), what is the last
patch you have?

Can you show us an ls -la of your data and checkpoint directories?

Brock

On Tue, Oct 2, 2012 at 3:43 AM, Raymond Ng <raymondair@gmail.com> wrote:
Just to add more info to this, I've checked the File channel that the
"ChannelException: Cannot acquire capacity" is reported against. There, file
log-1 has a size of 0 and log-2 has over 300 MB of data. By comparison,
another File channel has files log-2 and log-3, both with data in them, but
no log-1 file.
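
The check described above (looking for a log-N file with a size of 0) can be scripted. The following is only a minimal sketch: it assumes the data files are named log-<id> as in the listing above, and the function name find_empty_logs is made up for illustration, not part of Flume.

```python
import os
import re

# Assumption: file channel data files are named log-<id>, as described above.
LOG_NAME = re.compile(r"^log-(\d+)$")


def find_empty_logs(data_dir):
    """Return (file id, path) for every zero-byte log-<id> file in data_dir."""
    empty = []
    for name in sorted(os.listdir(data_dir)):
        match = LOG_NAME.match(name)
        path = os.path.join(data_dir, name)
        # Only regular files matching the log-<id> pattern are considered.
        if match and os.path.isfile(path) and os.path.getsize(path) == 0:
            empty.append((int(match.group(1)), path))
    return empty
```

Pointing it at a channel's data directory (for example the filechannel3/data directory mentioned in the log below) lists any zero-length log files without touching them.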

It sounds like log-1 is the one causing the "NullPointerException: LogFile is
null for id 1" below. I can confirm there was no manual tampering in the file
channel directory. When I restarted Flume, I got the following warnings:

2012-10-02 09:38:10,231 INFO [conf-file-poller-0] DefaultLogicalNodeManager.java - Starting Channel probeFileChannel1
2012-10-02 09:38:10,239 INFO [conf-file-poller-0] DefaultLogicalNodeManager.java - Starting Channel probeFileChannel3
2012-10-02 09:38:10,313 WARN [lifecycleSupervisor-1-2] ReplayHandler.java - Hit EOF on /home/user/flume-ng/filechannel3/data/log-1
2012-10-02 09:38:10,314 INFO [lifecycleSupervisor-1-1] DirectMemoryUtils.java - Unable to get maxDirectMemory from VM: NoSuchMethodException: sun.misc.VM.maxDirectMemory(null)
2012-10-02 09:38:10,317 INFO [lifecycleSupervisor-1-1] DirectMemoryUtils.java - Direct Memory Allocation: Allocation = 1048576, Allocated = 0, MaxDirectMemorySize = 954466304, Remaining = 954466304
2012-10-02 09:38:10,381 WARN [lifecycleSupervisor-1-1] LogFile.java - Checkpoint for file(/home/user/flume-ng/filechannel1/data/log-2) is: 1349166469095, which is beyond the requested checkpoint time: 0.
2012-10-02 09:38:10,381 WARN [lifecycleSupervisor-1-2] LogFile.java - Checkpoint for file(/home/user/flume-ng/filechannel3/data/log-2) is: 1349166991594, which is beyond the requested checkpoint time: 0.
2012-10-02 09:41:52,144 ERROR [lifecycleSupervisor-1-2] ReplayHandler.java - Pending takes 32103 exist after the end of replay. Duplicate messages will exist in destination.
2012-10-02 09:41:52,709 INFO [lifecycleSupervisor-1-2] MonitoredCounterGroup.java - Component type: CHANNEL, name: probeFileChannel3 started
2012-10-02 09:42:31,413 WARN [lifecycleSupervisor-1-1] LogFile.java - Checkpoint for file(/home/cluster_admin/flume-ng/filechannel1/data/log-3) is: 1349166981020, which is beyond the requested checkpoint time: 0.
2012-10-02 09:45:14,836 ERROR [lifecycleSupervisor-1-1] ReplayHandler.java - Pending takes 8409 exist after the end of replay. Duplicate messages will exist in destination.
2012-10-02 09:45:15,453 INFO [lifecycleSupervisor-1-1] MonitoredCounterGroup.java - Component type: CHANNEL, name: probeFileChannel1 started


On Tue, Oct 2, 2012 at 9:19 AM, Raymond Ng <raymondair@gmail.com> wrote:

Hi

Could I have some advice on the following exception, please? Is this
related to the "ChannelException: Cannot acquire capacity" that I
experience from time to time?


2012-10-02 09:16:53,563 ERROR [Log-BackgroundWorker] Log.java - General error in checkpoint worker
java.lang.NullPointerException
at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:738)
at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:692)
at org.apache.flume.channel.file.Log.access$300(Log.java:57)
at org.apache.flume.channel.file.Log$BackgroundWorker.run(Log.java:892)
2012-10-02 09:16:56,317 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] HDFSEventSink.java - process failed
java.lang.NullPointerException: LogFile is null for id 1
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
at org.apache.flume.channel.file.Log.get(Log.java:316)
at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:373)
at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:91)
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:383)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
at java.lang.Thread.run(Thread.java:662)
2012-10-02 09:16:56,318 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] SinkRunner.java - Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.NullPointerException: LogFile is null for id 1
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:450)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NullPointerException: LogFile is null for id 1
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
at org.apache.flume.channel.file.Log.get(Log.java:316)
at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:373)
at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:91)
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:383)
... 3 more
2012-10-02 09:16:56,625 ERROR [Log-BackgroundWorker] Log.java - General error in checkpoint worker
java.lang.NullPointerException
at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:738)
at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:692)
at org.apache.flume.channel.file.Log.access$300(Log.java:57)
at org.apache.flume.channel.file.Log$BackgroundWorker.run(Log.java:892)
2012-10-02 09:16:59,678 ERROR [Log-BackgroundWorker] Log.java - General error in checkpoint worker
java.lang.NullPointerException
at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:738)
at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:692)
at org.apache.flume.channel.file.Log.access$300(Log.java:57)
at org.apache.flume.channel.file.Log$BackgroundWorker.run(Log.java:892)
2012-10-02 09:17:01,318 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] HDFSEventSink.java - process failed
java.lang.NullPointerException: LogFile is null for id 1
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
at org.apache.flume.channel.file.Log.get(Log.java:316)
at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:373)
at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:91)
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:383)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
at java.lang.Thread.run(Thread.java:662)
2012-10-02 09:17:01,318 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] SinkRunner.java - Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.NullPointerException: LogFile is null for id 1
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:450)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NullPointerException: LogFile is null for id 1
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
at org.apache.flume.channel.file.Log.get(Log.java:316)
at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:373)
at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:91)
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:383)
... 3 more
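
To see at a glance which log file ids an agent log complains about, the "LogFile is null for id N" messages in the trace above can be tallied. This is a rough sketch only: the function name missing_log_ids is hypothetical, and the pattern is taken from the message text in this trace rather than from any documented Flume format.

```python
import re

# Pattern copied from the exception message above; \s+ tolerates the
# line wrapping that mail clients introduce into pasted logs.
MISSING_LOG = re.compile(r"LogFile\s+is\s+null\s+for\s+id\s+(\d+)")


def missing_log_ids(log_text):
    """Count how often each log file id is reported as null in log_text."""
    counts = {}
    for match in MISSING_LOG.finditer(log_text):
        file_id = int(match.group(1))
        counts[file_id] = counts.get(file_id, 0) + 1
    return counts
```

If every hit names the same id, as in the trace above, that narrows the problem down to a single data file.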



--
Rgds
Ray







--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/


