ode-user mailing list archives

From "Matthieu Riou" <matth...@offthelip.org>
Subject Re: Client calling retired process?
Date Tue, 25 Nov 2008 16:20:07 GMT
On Mon, Nov 24, 2008 at 7:14 AM, Chris Taylor <saursoor@yahoo.com> wrote:

> Some more information regarding this error:
>
> We are still seeing this even with the ODE Trunk 1.2.1 deployment. It
> occurs quite rarely, but it seems the catalyst is an OutOfMemoryError raised
> by ODE when a new request comes in:
>

Reviewing the code again, I couldn't spot anything that would produce this
behavior. Neither the process nor the process data is stored in structures
that would be sensitive to an OOM. One thing that could help is a debug log of
BpelEngineImpl from when the problem occurs, since routing a message to a
given process happens in BpelEngineImpl.route(). So you could set that logger
to debug and see what it shows the next time it happens.
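
To make the reported symptom easier to picture, here is a toy Java model of
version routing. It is only a sketch under assumptions: ToyRouter, Version,
deploy() and routeNewRequest() are hypothetical names, not ODE classes and not
BpelEngineImpl.route() itself, and the stale-route mechanism is a guess at how
the behaviour could arise, not a diagnosis.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: none of these types exist in ODE. */
class ToyRouter {
    /** One deployed version of a process, e.g. "ProcessA-20". */
    static final class Version {
        final String id;
        boolean retired;
        Version(String id) { this.id = id; }
    }

    // Service name -> the version that currently accepts new instances.
    private final Map<String, Version> active = new HashMap<>();
    // Service name -> the version the router last resolved (this can go stale).
    private final Map<String, Version> routeCache = new HashMap<>();

    /**
     * Deploys a new version and retires the previous one.
     * Passing refreshRoutes=false simulates a router that misses the update.
     */
    void deploy(String service, String versionId, boolean refreshRoutes) {
        Version previous = active.put(service, new Version(versionId));
        if (previous != null) {
            previous.retired = true;
        }
        if (refreshRoutes) {
            routeCache.remove(service);
        }
    }

    /** Routes a request that would create a new instance; returns the version id used. */
    String routeNewRequest(String service) {
        Version target = routeCache.computeIfAbsent(service, active::get);
        if (target == null) {
            throw new IllegalArgumentException("No such service: " + service);
        }
        if (target.retired) {
            // Same symptom as the InvalidProcessException in the logs.
            throw new IllegalStateException("Process is retired: " + target.id);
        }
        return target.id;
    }
}
```

In this model, deploying ProcessA-20 without refreshing the route leaves new
requests failing against ProcessA-19, and deploying ProcessA-21 with a refresh
makes them route correctly again, which mirrors the workaround described in
this thread. A debug log of the real routing would show whether anything like
this is actually going on.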

Thanks,
Matthieu


>
>
> java.lang.OutOfMemoryError
> at org.apache.ode.bpel.engine.MyRoleMessageExchangeImpl$ResponseFuture.get(MyRoleMessageExchangeImpl.java:201)
> at org.apache.ode.axis2.ODEService.onAxisMessageExchange(ODEService.java:149)
> at org.apache.ode.axis2.hooks.ODEMessageReceiver.invokeBusinessLogic(ODEMessageReceiver.java:67)
> at org.apache.ode.axis2.hooks.ODEMessageReceiver.invokeBusinessLogic(ODEMessageReceiver.java:50)
> at org.apache.axis2.receivers.AbstractMessageReceiver.receive(AbstractMessageReceiver.java:96)
> at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:145)
> at org.apache.axis2.transport.http.HTTPTransportUtils.processHTTPPostRequest(HTTPTransportUtils.java:275)
> at org.apache.axis2.transport.http.AxisServlet.doPost(AxisServlet.java:120)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:763)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:856)
> at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1075)
> at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:550)
> at com.ibm.ws.wswebcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:478)
> at com.ibm.ws.webcontainer.servlet.CacheServletWrapper.handleRequest(CacheServletWrapper.java:90)
> at com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:744)
> at com.ibm.ws.wswebcontainer.WebContainer.handleRequest(WebContainer.java:1455)
> at com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:115)
> at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:458)
> at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewInformation(HttpInboundLink.java:387)
> at com.ibm.ws.http.channel.inbound.impl.HttpICLReadCallback.complete(HttpICLReadCallback.java:102)
> at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165)
> at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217)
> at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161)
> at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:136)
> at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:195)
> at com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:743)
> at com.ibm.io.async.ResultHandler$2.run(ResultHandler.java:873)
> at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1473)
>
>
>
> After WebSphere recovers, from this point on until we redeploy the process
> in question to a new version, ODE attempts to route subsequent requests to a
> retired version.
>
>
>
> [11/20/08 14:29:26:968 CST] 00000046 SystemOut O 14:29:26,967 ERROR
> [BpelEngineImpl] Scheduled job failed; jobDetail={type=INVOKE_INTERNAL,
> pid={http://eclipse.org/bpel/sample}AdminYNProcess-195, inmem=true,
> mexid=4611686018427387977}
>
> org.apache.ode.bpel.runtime.InvalidProcessException: Process is retired.
> at org.apache.ode.bpel.engine.PartnerLinkMyRoleImpl.invokeNewInstance(PartnerLinkMyRoleImpl.java:173)
> at org.apache.ode.bpel.engine.BpelProcess.invokeProcess(BpelProcess.java:204)
> at org.apache.ode.bpel.engine.BpelProcess.handleWorkEvent(BpelProcess.java:372)
> at org.apache.ode.bpel.engine.BpelEngineImpl.onScheduledJob(BpelEngineImpl.java:326)
> at org.apache.ode.bpel.engine.BpelServerImpl.onScheduledJob(BpelServerImpl.java:373)
> at org.apache.ode.scheduler.simple.SimpleScheduler$4$1.call(SimpleScheduler.java:337)
> at org.apache.ode.scheduler.simple.SimpleScheduler$4$1.call(SimpleScheduler.java:336)
> at org.apache.ode.scheduler.simple.SimpleScheduler.execTransaction(SimpleScheduler.java:174)
> at org.apache.ode.scheduler.simple.SimpleScheduler$4.call(SimpleScheduler.java:335)
> at org.apache.ode.scheduler.simple.SimpleScheduler$4.call(SimpleScheduler.java:332)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:284)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:665)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:690)
> at java.lang.Thread.run(Thread.java:810)
>
> Attached is the Java core dump file from the time of the original
> OutOfMemoryError, showing that it was caused by excessive garbage
> collection. The VM this runs under is allocated 1 GB of heap memory.
>
> - Chris Taylor
>
>  ------------------------------
> *From:* Matthieu Riou <matthieu@offthelip.org>
> *To:* user@ode.apache.org
> *Cc:* Dave Cecchi <dave.cecchi@perficient.com>
> *Sent:* Thursday, October 16, 2008 10:40:57 AM
> *Subject:* Re: Client calling retired process?
>
> On Wed, Oct 15, 2008 at 9:27 AM, Chris Taylor <saursoor@yahoo.com> wrote:
>
> > Matthieu, yes, I would appreciate it if you could put the latest built
> > WAR somewhere. We have attempted to build with buildr without success.
> >
>
> Here it is:
>
> http://people.apache.org/~mriou/ode-axis2-war-1.2.1-SNAPSHOT.war
>
> Let me know how it goes.
>
> Cheers,
> Matthieu
>
>
> >
> >
> >
> > ----- Original Message ----
> > From: Matthieu Riou <matthieu@offthelip.org>
> > To: user@ode.apache.org
> > Sent: Monday, October 13, 2008 1:30:56 PM
> > Subject: Re: Client calling retired process?
> >
> > > On Mon, Oct 13, 2008 at 10:55 AM, Chris Taylor <saursoor@yahoo.com> wrote:
> >
> > > > Thanks, Matthieu.  Some background:
> > > >
> > > > We're running ODE 1.2 on WebSphere 6.1, with Oracle 10g as the
> > > > process store.
> > > >
> > > > This scenario consistently fails in the manner I described, but it
> > > > seems only for certain processes.
> > > >
> > > > So, for example, if I have the following:
> > > >
> > > > ProcessA-20
> > > > ProcessB-21
> > > > ProcessC-22
> > > >
> > > > deployed in my environment, the scenario would be that something
> > > > causes ProcessA-20 to hang, at which point it goes into recovery mode
> > > > and spawns an ODE job to retry.  From this point on, not only do new
> > > > requests to ProcessA get routed to the now-retired ProcessA-19, but
> > > > new requests to ProcessB also get routed to the now-retired
> > > > ProcessB-20!  The weird thing is, ProcessC-22 is apparently
> > > > unaffected.  It still gets calls legitimately routed to its latest
> > > > versioned deployment, ProcessC-22.
> > > >
> > > > I do not know if this happens under other scenarios unrelated to
> > > > recovery.  I think I just do not have enough data points yet to say.
> > >
> > >
> >
> > > If you have a reproducible test scenario, it would be great if you
> > > could try it with the current stable branch. I fixed something related
> > > to what you're describing a couple of months ago. If doing a build is
> > > an issue for you, I can upload the WAR to a public place.
> >
> > Thanks,
> > Matthieu
> >
> >
> > >
> > >
> > >
> > > ----- Original Message ----
> > > From: Matthieu Riou <matthieu@offthelip.org>
> > > To: user@ode.apache.org
> > > Sent: Monday, October 13, 2008 12:33:18 PM
> > > Subject: Re: Client calling retired process?
> > >
> > > > On Mon, Oct 13, 2008 at 8:17 AM, Chris Taylor <saursoor@yahoo.com> wrote:
> > >
> > > > > Thanks, Alexis, but I'm no closer to fully understanding why this
> > > > > occurs.  It now happens periodically, almost every day, with
> > > > > different deployed processes.  Although I don't understand it, I
> > > > > have done some research into the behaviour.  Here's a scenario:
> > > > >
> > > > > We'll deploy ProcessA-19, then retire it by deploying ProcessA-20.
> > > > > At some point it, or another process, will fail and attempt to go
> > > > > into recovery mode (excuse me if I state this incorrectly).  At
> > > > > this point ODE will create a scheduled job in an attempt to retry
> > > > > the service later.
> > > > >
> > > > > Here's where it gets screwy.  From then on, all new calls to
> > > > > ProcessA will not route to ProcessA-20; instead ODE will attempt to
> > > > > route them to ProcessA-19, which is of course retired.  ODE does
> > > > > not recover from this.  It seems the only way to compensate is to
> > > > > redeploy ProcessA as ProcessA-21.  New requests will then route
> > > > > correctly.
> > > > >
> > > > > Any ideas here?
> > > >
> > >
> > > > I'll have to ask a few more questions to narrow it down and make sure
> > > > I understand correctly:
> > > >
> > > >  * Does the exact same scenario sometimes work and sometimes not?
> > > >  * Does it always happen in relation to recovery and retry, or did
> > > >    you see it happen in other situations as well?
> > > >  * Which version of ODE are you using? Have you tried with a recent
> > > >    1.X branch?
> > >
> > > Thanks,
> > > Matthieu
> > >
> > >
> > > >
> > > >
> > > >
> > > > ----- Original Message ----
> > > > From: Alexis Midon <midon@intalio.com>
> > > > To: user@ode.apache.org
> > > > Sent: Wednesday, October 8, 2008 7:26:54 PM
> > > > Subject: Re: Client calling retired process?
> > > >
> > > > Hi Chris,
> > > >
> > > > > No new executions can be started on a retired process, but running
> > > > > instances can still finish their job. [1]
> > > > >
> > > > > I'm not really familiar with this part of the code, but after
> > > > > looking at it, it seems to me that the deployment of a new version
> > > > > is not atomic, meaning that a process could be flagged as retired
> > > > > while the creation of a new instance is in progress, hence your
> > > > > exception.
> > > > >
> > > > > Does that make sense for your scenario? Is it possible that the
> > > > > process gets retired while messages are coming in?
> > > >
> > > > [1] further details here:
> > > > http://ode.apache.org/user-guide.html#UserGuide-Versioning
> > > >
> > > >
> > > >
> > > > > On Wed, Oct 8, 2008 at 11:37 AM, Chris Taylor <saursoor@yahoo.com> wrote:
> > > >
> > > > > > Okay, I've a deployment bundle (called GetCodes) that includes 5
> > > > > > processes.  4 of the processes make calls to the fifth (it's an
> > > > > > abstraction layer of process business logic).  When I deploy this
> > > > > > "GetCodes" bundle using the DeploymentService utility, I can see
> > > > > > an incremented deployment (say, GetCodes-40) alongside previous
> > > > > > iterations.
> > > > > >
> > > > > > Occasionally, I'll have a client making SOAP calls to one of the
> > > > > > processes under this logical bundle that will fail with the
> > > > > > following error:
> > > > > >
> > > > > > InvalidProcessException: Process is retired.
> > > > > >
> > > > > > In the logs, it's clear that ODE is directing this client call to
> > > > > > GetCodes-39, though the client isn't explicitly attempting to
> > > > > > call a specific version (is that even possible?).  Any clue why
> > > > > > some clients are periodically, and erroneously, directed by ODE
> > > > > > to a retired process version?
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
> >
> >
>
>
>
