Hi Chen,
I don't see a SinkRunner thread in the dump, so startup didn't get very far. I'm not sure what went wrong, but it looks like an error was swallowed somewhere, possibly by a thread. That seems strange, and the cause isn't immediately obvious to me: if you are using a recent version of Flume, it should print an error to the log when it hits a ClassNotFoundException, for example when it can't find the Hadoop classes.
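
For what it's worth, here is a minimal sketch of how one could check a saved dump for a given thread. It assumes the jstack output is available as plain text and that the sink runner's thread name contains the string "SinkRunner" (an assumption on my part). The helper names parse_threads and has_thread are hypothetical, not part of Flume or the JDK.

```python
import re

# Header line of each thread in a HotSpot dump starts with the quoted name, e.g.
# "main" prio=10 tid=0x... nid=0x... waiting on condition [0x...]
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# State line, e.g. "   java.lang.Thread.State: TIMED_WAITING (sleeping)"
THREAD_STATE = re.compile(r'java\.lang\.Thread\.State:\s+(?P<state>\w+)')


def parse_threads(dump_text):
    """Return (thread_name, state) pairs; state is None when the dump omits it."""
    threads = []
    for line in dump_text.splitlines():
        header = THREAD_HEADER.match(line.strip())
        if header:
            threads.append([header.group("name"), None])
            continue
        state = THREAD_STATE.search(line)
        # Attach the state to the most recent thread header only once.
        if state and threads and threads[-1][1] is None:
            threads[-1][1] = state.group("state")
    return [(name, state) for name, state in threads]


def has_thread(dump_text, fragment):
    """True if any thread name in the dump contains the given fragment."""
    return any(fragment in name for name, _ in parse_threads(dump_text))


# A few lines taken from the dump in this thread, used as sample input.
SAMPLE = '''
"conf-file-poller-0" prio=10 tid=0x00002aaab50d6000 nid=0x3210 waiting on condition [0x0000000041139000]
   java.lang.Thread.State: WAITING (parking)

"main" prio=10 tid=0x000000004011a000 nid=0x3201 waiting on condition [0x000000004022a000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)

"VM Thread" prio=10 tid=0x00002aaab4bc7800 nid=0x3204 runnable
'''

if __name__ == "__main__":
    for name, state in parse_threads(SAMPLE):
        print(name, state)
    print("SinkRunner present:", has_thread(SAMPLE, "SinkRunner"))
```

Run against the full dump, a check like this would report no thread whose name contains "SinkRunner", which is what I was looking for by eye.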

Regards,
Mike


On Mon, Oct 8, 2012 at 8:48 PM, Lichen <kevinli.li@huawei.com> wrote:

Mike, I took a thread dump of Flume and got this:

 

Full thread dump Java HotSpot(TM) 64-Bit Server VM (14.3-b01 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00002aaab5088000 nid=0x3229 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"conf-file-poller-0" prio=10 tid=0x00002aaab50d6000 nid=0x3210 waiting on condition [0x0000000041139000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00002aaab49f8aa8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
        at java.util.concurrent.DelayQueue.take(DelayQueue.java:160)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:583)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:576)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)

   Locked ownable synchronizers:
        - None

"lifecycleSupervisor-1-0" prio=10 tid=0x00002aaab50fe000 nid=0x320f waiting on condition [0x0000000041038000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00002aaab49ff488> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
        at java.util.concurrent.DelayQueue.take(DelayQueue.java:164)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:583)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:576)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)

   Locked ownable synchronizers:
        - None

"lifecycleSupervisor-1-2" prio=10 tid=0x00002aaab5cf7800 nid=0x320e waiting on condition [0x0000000040f37000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00002aaab4a23be8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
        at java.util.concurrent.DelayQueue.take(DelayQueue.java:164)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:583)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:576)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)

   Locked ownable synchronizers:
        - None

"lifecycleSupervisor-1-1" prio=10 tid=0x00002aaab5cf7000 nid=0x320d waiting on condition [0x0000000040e36000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00002aaab4a23be8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
        at java.util.concurrent.DelayQueue.take(DelayQueue.java:164)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:583)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:576)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)

   Locked ownable synchronizers:
        - None

"lifecycleSupervisor-1-0" prio=10 tid=0x00002aaab5cf6000 nid=0x320c waiting on condition [0x0000000040d35000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00002aaab4a23be8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
        at java.util.concurrent.DelayQueue.take(DelayQueue.java:164)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:583)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:576)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)

   Locked ownable synchronizers:
        - None

"Low Memory Detector" daemon prio=10 tid=0x00002aaab4dcc800 nid=0x320a runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"CompilerThread1" daemon prio=10 tid=0x00002aaab4dc9800 nid=0x3209 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"CompilerThread0" daemon prio=10 tid=0x00002aaab4dc5800 nid=0x3208 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"Signal Dispatcher" daemon prio=10 tid=0x00002aaab4bed800 nid=0x3207 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"Finalizer" daemon prio=10 tid=0x00002aaab4bd0800 nid=0x3206 in Object.wait() [0x000000004072f000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00002aaab49ffb68> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
        - locked <0x00002aaab49ffb68> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

   Locked ownable synchronizers:
        - None

"Reference Handler" daemon prio=10 tid=0x00002aaab4bce800 nid=0x3205 in Object.wait() [0x000000004062e000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00002aaab49f8358> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:485)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
        - locked <0x00002aaab49f8358> (a java.lang.ref.Reference$Lock)

   Locked ownable synchronizers:
        - None

"main" prio=10 tid=0x000000004011a000 nid=0x3201 waiting on condition [0x000000004022a000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.flume.lifecycle.LifecycleController.waitForOneOf(LifecycleController.java:73)
        at org.apache.flume.lifecycle.LifecycleController.waitForOneOf(LifecycleController.java:49)
        at org.apache.flume.node.Application.run(Application.java:167)
        at org.apache.flume.node.Application.main(Application.java:70)

   Locked ownable synchronizers:
        - None

"VM Thread" prio=10 tid=0x00002aaab4bc7800 nid=0x3204 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x0000000040124000 nid=0x3202 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x0000000040126000 nid=0x3203 runnable

"VM Periodic Task Thread" prio=10 tid=0x00002aaab4dcf000 nid=0x320b waiting on condition

JNI global references: 615

 

Is this enough to tell why Flume is stuck? Thanks.

Chen

 

On Mon, Oct 9, 2012 at 10:58 AM, Mike Percy <mpercy@cloudera.org> wrote:

 

Chen, if you can send the jstack output, that could help. Alternatively, run sudo kill -3 <pid> against the Flume process and send what it prints to Flume's stdout, which is the same thing.

Regards,
Mike