flume-user mailing list archives

From "Dan Everton" <...@iocaine.org>
Subject Fwd: Possible Bug in AvroEventSource
Date Thu, 21 Jul 2011 03:00:00 GMT
Sorry, sent this to the wrong mailing list.


----- Original message -----
From: "Dan Everton" <dan@iocaine.org>
To: flume-user@cloudera.org
Date: Thu, 21 Jul 2011 12:59:19 +1000
Subject: Possible Bug in AvroEventSource

We're using a custom logging library to write to a locally installed
Flume Node instance with an avroSource configured. Every so often the
Flume Node stops responding and its thread count starts going up
dramatically. Checking the thread dumps from the node, we see this:

Thread 611 (783209204@qtp-689559925-452):
  State: WAITING
  Blocked count: 0
  Waited count: 1
  Waiting on
  java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@23085bf6
  Stack:
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
    java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:254)
    com.cloudera.flume.handlers.avro.AvroEventSource.enqueue(AvroEventSource.java:116)
    com.cloudera.flume.handlers.avro.AvroEventSource$1.append(AvroEventSource.java:137)
    sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    java.lang.reflect.Method.invoke(Method.java:597)
    org.apache.avro.ipc.specific.SpecificResponder.respond(SpecificResponder.java:88)
    org.apache.avro.ipc.Responder.respond(Responder.java:150)
    org.apache.avro.ipc.Responder.respond(Responder.java:100)
    org.apache.avro.ipc.ResponderServlet.doPost(ResponderServlet.java:48)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
    org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    org.mortbay.jetty.Server.handle(Server.java:326)

over and over again.

I think what's happening is that something causes the Avro server to
fail, so the events just queue up, eventually causing the node to fail.
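
For what it's worth, the WAITING state in the dump is what a put() on a
full, bounded LinkedBlockingQueue looks like when nothing is draining
it. Here is a minimal standalone sketch of that behaviour. The class and
method names are made up for illustration, and none of this is Flume
code:

```java
import java.util.concurrent.LinkedBlockingQueue;

class BlockedPutDemo {
    // Fills a capacity-1 queue, starts a producer that calls put() again,
    // and reports the producer's thread state once it has parked.
    static Thread.State stateOfBlockedProducer() {
        final LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<String>(1);
        queue.offer("first"); // queue is now full and nobody consumes from it

        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    queue.put("second"); // blocks until space frees up
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        producer.setDaemon(true);
        producer.start();

        // Poll for up to five seconds until the producer has parked.
        long deadline = System.currentTimeMillis() + 5000;
        try {
            while (producer.getState() != Thread.State.WAITING
                    && System.currentTimeMillis() < deadline) {
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Thread.State state = producer.getState();
        producer.interrupt(); // release the producer so the demo can exit
        return state;
    }

    public static void main(String[] args) {
        System.out.println(stateOfBlockedProducer());
    }
}
```

If every incoming request ends up parked like this, each one holds on
to a Jetty thread, which would fit the thread count climbing.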

Poking around the AvroEventSource.java code around those lines, I see
this:

    this.svr = new FlumeEventAvroServerImpl(port) {
      @Override
      public void append(AvroFlumeEvent evt) {
        // convert AvroEvent evt -> e
        AvroEventAdaptor adapt = new AvroEventAdaptor(evt);
        try {
          enqueue(adapt.toFlumeEvent());
        } catch (IOException e1) {
          e1.printStackTrace();
        }
        super.append(evt);
      }
    };

The fact that any exceptions from the enqueue process get swallowed
seems problematic to me, but I'm not sure if it's why things eventually
fail.
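
One possible alternative, purely as a sketch: wait for queue space only
for a bounded time and report the failure to the caller instead of
printing it and carrying on. I don't know what enqueue() does beyond the
put() call visible in the stack trace, so the class below, its String
event type and its timeout parameter are all hypothetical stand-ins
rather than the real Flume types:

```java
import java.io.IOException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the source's internal queue handling.
class BoundedEnqueueSketch {
    private final LinkedBlockingQueue<String> queue;
    private final long timeoutMillis;

    BoundedEnqueueSketch(int capacity, long timeoutMillis) {
        this.queue = new LinkedBlockingQueue<String>(capacity);
        this.timeoutMillis = timeoutMillis;
    }

    // Waits at most timeoutMillis for space, then fails loudly so the
    // caller can pass the error on rather than block forever.
    void enqueue(String event) throws IOException {
        try {
            boolean accepted = queue.offer(event, timeoutMillis, TimeUnit.MILLISECONDS);
            if (!accepted) {
                throw new IOException("queue still full after " + timeoutMillis + " ms");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IOException("interrupted while enqueueing", e);
        }
    }

    int size() {
        return queue.size();
    }
}
```

With something like this, a stalled consumer would show up as errors on
the sending side rather than as a growing pile of parked threads. It
does nothing about whatever stops the queue from being drained in the
first place.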

Has anyone else seen something like this? For the moment we're going to
switch back to Thrift as that seems to be better tested.

Cheers,
Dan

