ode-user mailing list archives

From René Bos <r....@pagelink.nl>
Subject RE: INTERNAL ERROR: No ENTRY for RESPONSE CHANNEL 69
Date Mon, 17 Dec 2007 16:23:38 GMT
Hello!!

I did some research with one of my colleagues and found a strange thing. I turned on PostgreSQL
logging and saw this:
2007-12-17 15:20:00 LOG:  execute <unnamed>: SELECT t0.CORRELATOR_ID, t1.MESSAGE_ROUTE_ID,
t1.CORRELATION_KEY, t1.CORR_ID, t1.GROUP_ID, t1.ROUTE_INDEX, t1.PROCESS_INSTANCE_ID FROM ODE_CORRELATOR
t0 INNER JOIN ODE_MESSAGE_ROUTE t1 ON t0.CORRELATOR_ID = t1.CORR_ID WHERE (t0.CORRELATOR_KEY
= $1 AND t0.PROC_ID = $2) ORDER BY t0.CORRELATOR_ID ASC 
2007-12-17 15:20:00 DETAIL:  parameters: $1 = '104.saveOrAanbieden', $2 = '51' 

When I ran this query myself, I found that it returned two rows (I displayed all columns
of both tables):
53;"104.saveOrAanbieden";51;174;"103~nl.pagelink.torque.opm.ObjectenMut_20581##1188214622828283";"69";0;53;63
53;"104.saveOrAanbieden";51;302;"103~nl.pagelink.torque.opm.ObjectenMut_20581##1188214622828283";"149";0;53;63

It looks like old routes are not cleaned up, so when execution reaches findRoute in
PartnerLinkMyRoleImpl it can return an old route with the wrong channel.
Another possibility is that when the process passes the saveOrAanbieden receive for the
second time, it creates a new route while a route already exists because it was not removed
(and perhaps was not meant to be removed; I don't know exactly how this works).
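
The first scenario can be sketched in isolation. This is only an illustration, not ODE code:
`Route` and `findFirst` are hypothetical stand-ins, and the sketch assumes that the quoted
"69" and "149" columns are the channels of the two routes. Note that the logged query orders
only by CORRELATOR_ID, which is 53 in both rows, so the query itself does not determine which
of the two rows comes first.

```java
import java.util.ArrayList;
import java.util.List;

public class StaleRouteDemo {
    // Hypothetical stand-in for one ODE_MESSAGE_ROUTE row; not the real ODE class.
    static final class Route {
        final long routeId;
        final String correlationKey;
        final String channel; // assumed meaning of the quoted "69" / "149" column

        Route(long routeId, String correlationKey, String channel) {
            this.routeId = routeId;
            this.correlationKey = correlationKey;
            this.channel = channel;
        }
    }

    // Mimics a lookup that returns the first matching row, or null if none matches.
    static Route findFirst(List<Route> rows, String key) {
        for (Route r : rows) {
            if (r.correlationKey.equals(key)) {
                return r;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String key = "103~nl.pagelink.torque.opm.ObjectenMut_20581##1188214622828283";
        List<Route> rows = new ArrayList<>();
        rows.add(new Route(174, key, "69"));  // leftover route from the first pass
        rows.add(new Route(302, key, "149")); // route created on the second pass

        Route hit = findFirst(rows, key);
        // Prints: route 174 -> channel 69
        System.out.println("route " + hit.routeId + " -> channel " + hit.channel);
    }
}
```

If the rows happen to come back in this order, the lookup hands out channel 69, which matches
the channel number in the error message.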

Please note that the problem appears only when the process reaches saveOrAanbieden or
approveOrDisapprove for the second time (because the process uses a while loop).

In the following code fragment from PartnerLinkMyRoleImpl I see that it takes the first
route found. Note that this is the Ode 1.1 source, not the current trunk (because we use
Ode 1.1).

// Try to find a route for one of our keys.
for (CorrelationKey key : keys) {
	messageRoute = correlator.findRoute(key);
	if (messageRoute != null) {
		if (__log.isDebugEnabled()) {
			__log.debug("INPUTMSG: " + correlatorId + ": ckey " + key + " route is to " + messageRoute);
		}
		matchedKey = key;
		break;
	}
}

I hope you can see exactly what the problem is and give us a fix. Because the crashing
processes (2 of them) are already running at a customer, we would like to get a solution
soon.
Can you please tell us whether we can make a temporary fix in the source so that we can make
our customer happy again? We are thinking of something that finds only the newest route and
discards the previous ones. Maybe an ORDER BY in a query? We don't know where.
I was also thinking of removing the break in the code fragment above; could this fix the problem?
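
The "newest route" idea could look roughly like the sketch below. It is not a patch for ODE:
`Route` and `findNewest` are hypothetical names, and it assumes that a higher
MESSAGE_ROUTE_ID means a newer route, which the thread does not confirm. As for the break:
the loop above iterates over keys, while the two rows share one key, so removing the break
would probably not change which of the duplicate rows findRoute returns.

```java
import java.util.Arrays;
import java.util.List;

public class NewestRouteDemo {
    // Hypothetical stand-in for one ODE_MESSAGE_ROUTE row; not the real ODE class.
    static final class Route {
        final long routeId;
        final String correlationKey;
        final String channel;

        Route(long routeId, String correlationKey, String channel) {
            this.routeId = routeId;
            this.correlationKey = correlationKey;
            this.channel = channel;
        }
    }

    // Returns the matching route with the highest id, or null if none matches.
    // Assumption: ids grow over time, so the highest id is the newest route.
    static Route findNewest(List<Route> rows, String key) {
        Route newest = null;
        for (Route r : rows) {
            if (r.correlationKey.equals(key)
                    && (newest == null || r.routeId > newest.routeId)) {
                newest = r;
            }
        }
        return newest;
    }

    public static void main(String[] args) {
        String key = "103~nl.pagelink.torque.opm.ObjectenMut_20581##1188214622828283";
        List<Route> rows = Arrays.asList(
                new Route(174, key, "69"),
                new Route(302, key, "149"));

        Route hit = findNewest(rows, key);
        // Prints: route 302 -> channel 149
        System.out.println("route " + hit.routeId + " -> channel " + hit.channel);
    }
}
```

This only hides the stale row; it does not explain why the old route was left behind in the
first place.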

Thanks!!

René

-----Original Message-----
From: René Bos [mailto:r.bos@pagelink.nl] 
Sent: Saturday 15 December 2007 13:43
To: user@ode.apache.org
Subject: RE: INTERNAL ERROR: No ENTRY for RESPONSE CHANNEL 69

Yes, I also searched for a difference between the two configurations, but could not find anything.
One difference was the Java version, 5 versus 6. But it doesn't work with both of them on the
working machine. Another difference is Win 2000 vs Win XP on the test machine, but I don't think
that has to be a problem. Another thing is that the test machine is a lot faster, with more
RAM and 2 cores.

The strange thing is that I copied the entire Tomcat folder from my machine to the test machine
(to the same location) and also copied the databases in use. But the problem still exists.

I now remember something that could be useful too. When the error comes up, the message gets
the status UKNOWN_ENDPOINT in the message exchange table. But after some time (more than
half an hour), when I restarted Tomcat, some of the UKNOWN_ENDPOINT messages were processed.
Not all. That has happened to me a few times.

I'm not at work at the moment but I think we used (on both machines because they are copies):
ode-axis2.db.mode=EXTERNAL
ode-axis2.db.ext.dataSource=java:comp/env/jdbc/OdeDS

And OdeDS is configured in Tomcat 5.5.23.

At the moment I suspect a timing problem or something similar. But I find it very strange!
I have a database dump (SQL) with 3 processes deployed, but with only one process instance.
And that process instance failed with the error. Maybe you can do something with that?

Thanks!
   Rene

-----Original Message-----
From: matthieu.riou@gmail.com on behalf of Matthieu Riou
Sent: Fri 14-12-2007 18:14
To: user@ode.apache.org
Subject: Re: INTERNAL ERROR: No ENTRY for RESPONSE CHANNEL 69
 
Sounds to me like a transaction manager problem, when channels can't be
found it's usually a missing commit somewhere. Since it works on your
machine and not on the others, and also that problems with unfound channels
usually don't happen on normal configuration, I'd lean toward a
configuration problem. Which leads me to the questions: what is the
difference between your configuration and the configuration on your test
machine? Postgres? Are you running in internal, embedded or external mode?

Thanks,
Matthieu

On Dec 14, 2007 8:12 AM, René Bos <r.bos@pagelink.nl> wrote:

> Hello!
>
> I have a problem with two of my processes. I'm running Ode 1.1 with a
> PostgreSQL database. I attached one of the processes so you can see the BPEL
> code. I also attached the error.
>
> The error sometimes appears when I make the following calls:
>
> Initiate
> saveOrAanbieden with completionValue save
> saveOrAanbieden with completionValue aanbieden
>
> Or when I do:
>
> Initiate
> saveOrAanbieden with completionValue aanbieden
> approveOrDisapprove with completionValue disapprove
> saveOrAanbieden with completionValue aanbieden or save
>
> The strange thing is that the problem does not exist on my local workstation,
> but it does on another testing machine!
> On the testing machine it sometimes shows up, other times not.
>
> Can you tell me if something has been fixed in this area? Or can you help me by
> checking my process or reproducing it?
>
> Rene
>

