ode-user mailing list archives

From Chris Taylor <saurs...@yahoo.com>
Subject Re: Client calling retired process?
Date Tue, 25 Nov 2008 20:35:20 GMT
One other thing - I have no idea if this is related or not:  We also use the DeploymentService
for bpel deployments into this test environment (and all of our environments, actually). 
Every time we start up now, we see errors like the following:
[11/25/08 14:12:04:463 CST] 00000060 SystemOut O 14:12:04,462 ERROR [ProcessStoreImpl] Error
loading DU from store: GetProviderDetails-107
org.apache.ode.bpel.iapi.ContextException: Deployed directory null no longer there!
at org.apache.ode.store.ProcessStoreImpl.load(ProcessStoreImpl.java:606)
at org.apache.ode.store.ProcessStoreImpl$6.call(ProcessStoreImpl.java:461)
at org.apache.ode.store.ProcessStoreImpl$Callable.call(ProcessStoreImpl.java:701)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:284)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:665)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:690)
at java.lang.Thread.run(Thread.java:810)
which makes perfect sense, of course, since we are no longer on version GetProviderDetails-107
but are now on, let's say, GetProviderDetails-200.  But why would ODE continue to look for retired
versions?  And now, given that we are many versions deep on many of these bpels, these errors
take several pages upon startup.




________________________________
From: Matthieu Riou <matthieu@offthelip.org>
To: user@ode.apache.org
Sent: Tuesday, November 25, 2008 10:20:07 AM
Subject: Re: Client calling retired process?

On Mon, Nov 24, 2008 at 7:14 AM, Chris Taylor <saursoor@yahoo.com> wrote:

> Some more information regarding this error:
>
> We are still seeing this even with the ODE trunk 1.2.1 deployment. It
> occurs quite rarely, but it seems the catalyst is an OutOfMemoryError raised
> by ODE when a new request comes in:
>

Reviewing the code again, I couldn't spot anything that would produce this
behavior. Neither the process nor the process data is stored in structures that
would be sensitive to OOM. One thing that would help is a debug log of
BpelEngineImpl from when the problem occurs, since routing a message to a
given process happens in BpelEngineImpl.route(). You could set that logger
to debug and check the output the next time it happens.
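
For reference, with a log4j 1.x setup the suggestion above might look like the
fragment below. This is a sketch under assumptions: that this ODE deployment
logs through log4j and reads a log4j.properties file (commonly
WEB-INF/classes/log4j.properties inside the WAR), and that WebSphere has not
redirected commons-logging elsewhere. The logger name is taken from the
package shown in the stack traces in this thread.

```properties
# Assumption: ODE logs via log4j 1.x and picks up this properties file.
# Raise only the engine's routing class to DEBUG; other loggers keep their levels.
log4j.logger.org.apache.ode.bpel.engine.BpelEngineImpl=DEBUG
```

Scoping the change to a single logger keeps the extra output small, which
matters here because the problem is rare and the setting may have to stay on
for a while.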

Thanks,
Matthieu


>
>
> java.lang.OutOfMemoryError
>
> at
> org.apache.ode.bpel.engine.MyRoleMessageExchangeImpl$ResponseFuture.get(MyRoleMessageExchangeImpl.java:201)
>
> at
> org.apache.ode.axis2.ODEService.onAxisMessageExchange(ODEService.java:149)
>
> at
> org.apache.ode.axis2.hooks.ODEMessageReceiver.invokeBusinessLogic(ODEMessageReceiver.java:67)
>
> at
> org.apache.ode.axis2.hooks.ODEMessageReceiver.invokeBusinessLogic(ODEMessageReceiver.java:50)
>
> at
> org.apache.axis2.receivers.AbstractMessageReceiver.receive(AbstractMessageReceiver.java:96)
>
> at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:145)
>
> at
> org.apache.axis2.transport.http.HTTPTransportUtils.processHTTPPostRequest(HTTPTransportUtils.java:275)
>
> at org.apache.axis2.transport.http.AxisServlet.doPost(AxisServlet.java:120)
>
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:763)
>
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:856)
>
> at
> com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1075)
>
> at
> com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:550)
>
> at
> com.ibm.ws.wswebcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:478)
>
> at
> com.ibm.ws.webcontainer.servlet.CacheServletWrapper.handleRequest(CacheServletWrapper.java:90)
>
> at
> com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:744)
>
> at
> com.ibm.ws.wswebcontainer.WebContainer.handleRequest(WebContainer.java:1455)
>
> at
> com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:115)
>
> at
> com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:458)
>
> at
> com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewInformation(HttpInboundLink.java:387)
>
> at
> com.ibm.ws.http.channel.inbound.impl.HttpICLReadCallback.complete(HttpICLReadCallback.java:102)
>
> at
> com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165)
>
> at
> com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217)
>
> at
> com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161)
>
> at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:136)
>
> at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:195)
>
> at
> com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:743)
>
> at com.ibm.io.async.ResultHandler$2.run(ResultHandler.java:873)
>
> at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1473)
>
>
>
> After WebSphere recovers, and until we redeploy the process in question
> as a new version, ODE attempts to route subsequent requests to a
> retired version.
>
>
>
> [11/20/08 14:29:26:968 CST] 00000046 SystemOut O 14:29:26,967 ERROR
> [BpelEngineImpl] Scheduled job failed; jobDetail={type=INVOKE_INTERNAL,
> pid={http://eclipse.org/bpel/sample}AdminYNProcess-195, inmem=true,
> mexid=4611686018427387977}
>
> org.apache.ode.bpel.runtime.InvalidProcessException: Process is retired.
>
> at
> org.apache.ode.bpel.engine.PartnerLinkMyRoleImpl.invokeNewInstance(PartnerLinkMyRoleImpl.java:173)
>
> at
> org.apache.ode.bpel.engine.BpelProcess.invokeProcess(BpelProcess.java:204)
>
> at
> org.apache.ode.bpel.engine.BpelProcess.handleWorkEvent(BpelProcess.java:372)
>
> at
> org.apache.ode.bpel.engine.BpelEngineImpl.onScheduledJob(BpelEngineImpl.java:326)
>
> at
> org.apache.ode.bpel.engine.BpelServerImpl.onScheduledJob(BpelServerImpl.java:373)
>
> at
> org.apache.ode.scheduler.simple.SimpleScheduler$4$1.call(SimpleScheduler.java:337)
>
> at
> org.apache.ode.scheduler.simple.SimpleScheduler$4$1.call(SimpleScheduler.java:336)
>
> at
> org.apache.ode.scheduler.simple.SimpleScheduler.execTransaction(SimpleScheduler.java:174)
>
> at
> org.apache.ode.scheduler.simple.SimpleScheduler$4.call(SimpleScheduler.java:335)
>
> at
> org.apache.ode.scheduler.simple.SimpleScheduler$4.call(SimpleScheduler.java:332)
>
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:284)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:665)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:690)
>
> at java.lang.Thread.run(Thread.java:810)
>
> Attached is the Java core dump file from the time of the original
> OutOfMemoryError, showing that it was caused by excessive garbage
> collection.  The VM this runs under is allocated 1 GB of heap memory.
>
> - Chris Taylor
>
>  ------------------------------
> *From:* Matthieu Riou <matthieu@offthelip.org>
> *To:* user@ode.apache.org
> *Cc:* Dave Cecchi <dave.cecchi@perficient.com>
> *Sent:* Thursday, October 16, 2008 10:40:57 AM
> *Subject:* Re: Client calling retired process?
>
> On Wed, Oct 15, 2008 at 9:27 AM, Chris Taylor <saursoor@yahoo.com> wrote:
>
> > Matthieu, yes, we would appreciate it if you could put the latest built
> > WAR somewhere.  We have attempted to build with buildr without success.
> >
>
> Here it is:
>
> http://people.apache.org/~mriou/ode-axis2-war-1.2.1-SNAPSHOT.war
>
> Let me know how it goes.
>
> Cheers,
> Matthieu
>
>
> >
> >
> >
> > ----- Original Message ----
> > From: Matthieu Riou <matthieu@offthelip.org>
> > To: user@ode.apache.org
> > Sent: Monday, October 13, 2008 1:30:56 PM
> > Subject: Re: Client calling retired process?
> >
> > > On Mon, Oct 13, 2008 at 10:55 AM, Chris Taylor <saursoor@yahoo.com> wrote:
> >
> > > > Thanks, Matthieu.  Some background:
> > > >
> > > > We're running ODE 1.2 on WebSphere 6.1, with Oracle 10g as the process
> > > > store.
> > > >
> > > > This scenario consistently fails in the manner I described, but it
> > > > seems only for certain processes.
> > > >
> > > > So, for example, if I have the following:
> > >
> > > ProcessA-20
> > > ProcessB-21
> > > ProcessC-22
> > >
> > > > deployed in my environment, the scenario would be that something causes
> > > > ProcessA-20 to hang - at which point it goes into recovery mode and
> > > > spawns an ODE job to retry.  From this point on, not only do new
> > > > requests to ProcessA get routed to the now-retired ProcessA-19, but new
> > > > requests to ProcessB also get routed to (now-retired) ProcessB-20!  The
> > > > weird thing is, ProcessC-22 is apparently unaffected.  It still gets
> > > > calls legitimately routed to its latest versioned deployment,
> > > > ProcessC-22.
> > >
> > > > I do not know if this happens under other scenarios unrelated to
> > > > recovery.  I think I just do not have enough data points yet to say.
> > >
> > >
> >
> > > If you have a reproducible test scenario, it would be great if you could
> > > try it with the current stable branch. I fixed something related to what
> > > you're describing a couple of months ago. If doing a build is an issue
> > > for you, I can upload the WAR to a public place.
> >
> > Thanks,
> > Matthieu
> >
> >
> > >
> > >
> > >
> > > ----- Original Message ----
> > > From: Matthieu Riou <matthieu@offthelip.org>
> > > To: user@ode.apache.org
> > > Sent: Monday, October 13, 2008 12:33:18 PM
> > > Subject: Re: Client calling retired process?
> > >
> > > > On Mon, Oct 13, 2008 at 8:17 AM, Chris Taylor <saursoor@yahoo.com> wrote:
> > >
> > > > > Thanks, Alexis, but I'm no closer to fully understanding why this
> > > > > occurs.  It happens periodically now, almost every day, with different
> > > > > deployed processes.  Although I don't understand it, I have done some
> > > > > research into the behaviour.  Here's a scenario:
> > > > >
> > > > > We'll deploy ProcessA-19, then retire it with a ProcessA-20 deployment.
> > > > > At some point it, or another process, will fail and attempt to go into
> > > > > recovery mode (excuse me if I state this incorrectly); at this point
> > > > > ODE will create a scheduled job in an attempt to retry the service
> > > > > later.
> > > > >
> > > > > Here's where it gets screwy.  From then on, all new calls to ProcessA
> > > > > will not route to ProcessA-20; instead ODE will attempt to route them
> > > > > to ProcessA-19, which is of course retired.  ODE does not recover from
> > > > > this.  It seems the only way to compensate is to redeploy ProcessA as
> > > > > ProcessA-21.  New requests will then route correctly.
> > > > >
> > > > > Any idea here?
> > > >
> > >
> > > > I'll have to ask a few more questions to narrow it down and make sure I
> > > > understand correctly:
> > > >
> > > >  * Does the exact same scenario sometimes work and sometimes not?
> > > >  * Does it always happen in relation to recovery and retry, or did you
> > > > see it happen in other situations as well?
> > > >  * Which version of ODE are you using? Have you tried with a recent 1.X
> > > > branch?
> > >
> > > Thanks,
> > > Matthieu
> > >
> > >
> > > >
> > > >
> > > >
> > > > ----- Original Message ----
> > > > From: Alexis Midon <midon@intalio.com>
> > > > To: user@ode.apache.org
> > > > Sent: Wednesday, October 8, 2008 7:26:54 PM
> > > > Subject: Re: Client calling retired process?
> > > >
> > > > Hi Chris,
> > > >
> > > > > No new executions can be started on a retired process, but running
> > > > > instances can still finish their job. [1]
> > > > >
> > > > > I'm not really familiar with this part of the code, but after looking
> > > > > at it, it seems to me that the deployment of a new version is not
> > > > > atomic, meaning that a process could be flagged as retired while the
> > > > > creation of a new instance is in progress, hence your exception.
> > > > >
> > > > > Does that make sense for your scenario? Is it possible that the
> > > > > process gets retired while messages are coming in?
> > > >
> > > > [1] further details here:
> > > > http://ode.apache.org/user-guide.html#UserGuide-Versioning
> > > >
> > > >
> > > >
> > > > > On Wed, Oct 8, 2008 at 11:37 AM, Chris Taylor <saursoor@yahoo.com> wrote:
> > > >
> > > > > > Okay, I have a deployment bundle (called GetCodes) that includes 5
> > > > > > processes.  4 of the processes make calls to the fifth (it's an
> > > > > > abstraction layer of process business logic).  When I deploy this
> > > > > > "GetCodes" bundle using the DeploymentService utility, I can see an
> > > > > > incremented deployment (say, GetCodes-40) alongside previous
> > > > > > iterations.
> > > > > >
> > > > > > Occasionally, I'll have a client making SOAP calls to one of the
> > > > > > processes under this logical bundle that will fail with the
> > > > > > following error:
> > > > > >
> > > > > > InvalidProcessException: Process is retired.
> > > > > >
> > > > > > In the logs, it's clear that ODE is directing this client call to
> > > > > > GetCodes-39 - though the client isn't explicitly attempting to call
> > > > > > a specific version (is that even possible?).  Any clue why some
> > > > > > clients are periodically - erroneously - directed by ODE to a
> > > > > > retired process version?
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
> >
> >
>
>
>


      