flume-user mailing list archives

From Brock Noland <br...@cloudera.com>
Subject Re: flume-ng data recovery
Date Tue, 15 Jan 2013 01:48:47 GMT
Hi,

OK. The replay fails once the queue depth reaches the configured capacity
of 1000000, so I would increase the capacity of the channel to, say,
2000000 and retry with the original, unmodified files.

I would also upgrade to the latest 1.3.1 since there are many file
channel fixes in 1.3.0 and 1.3.1.
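
As a rough sketch of that capacity change, the file channel section of the
agent's properties file might look like the fragment below. The agent name
"agent" is a placeholder and not taken from this thread; the channel name
and directories are the ones that appear in the quoted log output.

```properties
# Hypothetical agent name "agent"; substitute the real one.
agent.channels = collectorfilefix
agent.channels.collectorfilefix.type = file
agent.channels.collectorfilefix.checkpointDir = /var/log/flume-ng/collectorfix/checkpoint
agent.channels.collectorfilefix.dataDirs = /var/log/flume-ng/collectorfix/data
# Raised from 1000000, the capacity reported in the replay error.
agent.channels.collectorfilefix.capacity = 2000000
```

The agent needs a restart to pick up the new capacity before it replays the
data files.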

On Mon, Jan 14, 2013 at 5:33 PM, Camp, Roy <rcamp@ebay.com> wrote:
> When I deleted both, the error changes to the one below.  However, I
> removed data file log-241 and was able to replay log-240 with no problem.
> After that completed I removed the log-240 and put the log-241 in the data
> directory.  It didn't appear to be working but I split log-241 into chunks
> and the first chunks seem to be working so far.
>
> Thanks,
>
> Roy
>
>
>
> 2013-01-14 17:29:00,346 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.channel.file.Log.replay(Log.java:304)] Found NextFileID 241, from [/var/log/flume-ng/collectorfix/data/log-240, /var/log/flume-ng/collectorfix/data/log-241]
> 2013-01-14 17:29:00,381 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.channel.file.EventQueueBackingStoreFile.<init>(EventQueueBackingStoreFile.java:71)] Preallocated /var/log/flume-ng/collectorfix/checkpoint/checkpoint to 8008232 for capacity 1000000
> 2013-01-14 17:29:00,384 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.channel.file.EventQueueBackingStoreFileV3.<init>(EventQueueBackingStoreFileV3.java:46)] Starting up with /var/log/flume-ng/collectorfix/checkpoint/checkpoint and /var/log/flume-ng/collectorfix/checkpoint/checkpoint.meta
> 2013-01-14 17:29:00,442 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.channel.file.Log.replay(Log.java:336)] Last Checkpoint Mon Jan 14 17:29:00 GMT-07:00 2013, queue depth = 0
> 2013-01-14 17:29:00,454 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.channel.file.Log.replay(Log.java:355)] Replaying logs with v2 replay logic
> 2013-01-14 17:29:00,458 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:223)] Starting replay of [/var/log/flume-ng/collectorfix/data/log-240, /var/log/flume-ng/collectorfix/data/log-241]
> 2013-01-14 17:29:00,459 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:236)] Replaying /var/log/flume-ng/collectorfix/data/log-240
> 2013-01-14 17:29:00,474 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.tools.DirectMemoryUtils.getDefaultDirectMemorySize(DirectMemoryUtils.java:113)] Unable to get maxDirectMemory from VM: NoSuchMethodException: sun.misc.VM.maxDirectMemory(null)
> 2013-01-14 17:29:00,477 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.tools.DirectMemoryUtils.allocate(DirectMemoryUtils.java:47)] Direct Memory Allocation:  Allocation = 1048576, Allocated = 0, MaxDirectMemorySize = 2033909760, Remaining = 2033909760
> 2013-01-14 17:29:00,527 (lifecycleSupervisor-1-1) [WARN - org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition(LogFile.java:431)] Checkpoint for file(/var/log/flume-ng/collectorfix/data/log-240) is: 1355687437770, which is beyond the requested checkpoint time: 0 and position 1621631818
> 2013-01-14 17:29:00,548 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:236)] Replaying /var/log/flume-ng/collectorfix/data/log-241
> 2013-01-14 17:29:00,548 (lifecycleSupervisor-1-1) [WARN - org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition(LogFile.java:431)] Checkpoint for file(/var/log/flume-ng/collectorfix/data/log-241) is: 1355687437770, which is beyond the requested checkpoint time: 0 and position 490073589
> 2013-01-14 17:29:40,621 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.channel.file.LogFile$SequentialReader.next(LogFile.java:452)] Encountered EOF at 1623187930 in /var/log/flume-ng/collectorfix/data/log-240
> 2013-01-14 17:29:48,183 (lifecycleSupervisor-1-1) [ERROR - org.apache.flume.channel.file.Log.replay(Log.java:373)] Failed to initialize Log on [channel=collectorfilefix]
> java.lang.IllegalStateException: Unable to add FlumeEventPointer [fileID=241, offset=510579142]. Queue depth = 1000000, Capacity = 1000000
>     at org.apache.flume.channel.file.ReplayHandler.processCommit(ReplayHandler.java:374)
>     at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:309)
>     at org.apache.flume.channel.file.Log.replay(Log.java:356)
>     at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:258)
>     at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:236)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>     at java.util.concurrent.FutureTask$Sync.innerRunAndReset(Unknown Source)
>     at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(Unknown Source)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(Unknown Source)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.lang.Thread.run(Unknown Source)
> 2013-01-14 17:29:48,184 (lifecycleSupervisor-1-1) [ERROR - org.apache.flume.channel.file.FileChannel.start(FileChannel.java:269)] Failed to start the file channel [channel=collectorfilefix]
> java.lang.IllegalStateException: Unable to add FlumeEventPointer [fileID=241, offset=510579142]. Queue depth = 1000000, Capacity = 1000000
>     at org.apache.flume.channel.file.ReplayHandler.processCommit(ReplayHandler.java:374)
>     at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:309)
>     at org.apache.flume.channel.file.Log.replay(Log.java:356)
>     at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:258)
>     at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:236)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>     at java.util.concurrent.FutureTask$Sync.innerRunAndReset(Unknown Source)
>     at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(Unknown Source)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(Unknown Source)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.lang.Thread.run(Unknown Source)
-- 
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
