From: Shinsuke SUGAYA
To: Jetspeed Developers List
Date: Fri, 23 May 2003 02:04:41 +0900
Subject: Re: possible character encoding bug
Message-ID: <3ECD0329.8090803@yahoo.co.jp>
In-Reply-To: <9E035BE80785AA4EAA456959ECB4A30D035CABBF@nt036.an.sopra>

Hi,

DefaultJetspeedParameterParser applies the following priorities when
determining the encoding of the form data:

 1) the character encoding used in the body of the request
 2) the character-set parameter in media.xreg
 3) content.defaultencoding in JetspeedResources.properties
 4) US-ASCII

I had thought it better for Jetspeed to use the encoding of the
requested page. On Tomcat 3, however, it seems that ISO-8859-1 is
returned, so Jetspeed cannot handle the form data. I therefore think
that 1) should be removed to fix this issue. Please delete the
following code and check whether the problem goes away.

DefaultJetspeedParameterParser.java:

@@ -127,10 +127,6 @@
             }
         }
-        if ( req.getCharacterEncoding() != null )
-        {
-            enc = req.getCharacterEncoding();
-        }
         setCharacterEncoding( enc );
     }

Please let me know if you have any problems.

Regards,
 shinsuke

Aurelien Pernoud wrote:
> In fact I've run into trouble in my app too with UTF-8 encoding. As
> far as I got, here's the problem:
>
> When a browser sends a request to a web server, it should send the
> Content-Type header with its charset (in your case UTF-8). But
> unfortunately, most of them don't (IE 5, 6, and even Mozilla doesn't! I
> think it's more to act "like IE" but anyway...). When no encoding is
> specified, the servlet API says it's ISO-8859-1.
>
> In servlet 2.2 (tomcat 3), there is no way to specify the encoding
> of the request (in 2.3 you have request.setCharacterEncoding, which saves
> everything), so people came up with this "trick", now also included in
> Jetspeed: you get the request parameter as if it were 8859-1 (because
> browsers didn't say explicitly that it was UTF-8), and then you use the
> right character encoding (UTF-8) to decode the string.
>
> Unfortunately, tomcat 3 and tomcat 4 don't work the same way.
> Tomcat 4 handles this trick perfectly, because request.getCharacterEncoding
> returns null, but I don't know why tomcat 3 returns ISO-8859-1 there. Are
> you by any chance using tomcat 3?
> If so, you can change the parameter parser to be used in
> turbineresources.properties:
>
> services.RunDataService.default.parameter.parser=org.apache.jetspeed.util.parser.DefaultJetspeedParameterParser
> # services.RunDataService.default.parameter.parser=org.apache.turbine.util.parser.DefaultParameterParser
>
> The other one may work fine in your case. It is a real mess, and I haven't
> found any way to make it work and stay API 2.2 compatible with all webapp
> servers. The only way to get rid of this is to move to 2.3 for good :(
>
> The relevant code is not the line you quoted but this, in the setRequest
> method:
>
>     if ( req.getCharacterEncoding() != null )
>     {
>         enc = req.getCharacterEncoding();
>     }
>
> That's where the parameter parser tries to find out the request encoding.
> Under tomcat 4 this is fine; under tomcat 3, getCharacterEncoding isn't
> null, so we try to decode 8859-1 to 8859-1...
> This final test is there "in case" the browser really sent its encoding, but
> as none does (I've tested the most used ones) maybe it should be thrown
> away... I don't know. Here in my app that's what I finally did.
>
> For more info on encoding troubles with the 2.2 API, see:
> http://www.jguru.com/faq/printablefaq.jsp?topic=I18N
>
> I hope I was clear enough, but as you see this is an awful bug.
> Aurelien
>
> Joachim Müller wrote:
>
>>hi, I just want to check back before I submit this
>>to bugzilla:
>>
>>there is a possible character encoding bug in
>>
>>org.apache.jetspeed.util.parser.DefaultJetspeedParameterParser
>>
>>line 151
>>
>>return new String(str.getBytes("8859_1"), getCharacterEncoding());
>>
>>this leads to errors with German umlauts when rundata parameters
>>are encoded with UTF-8. (eg.
>>in the user name: try the user name
>>übel, create an account and try to edit the account)
>>
>>if the rundata encoding is UTF-8 this leads to errors creating the
>>string with umlauts. did somebody put the fixed encoding here on
>>purpose? If not I would propose this modification:
>>
>>return new String(str.getBytes(getCharacterEncoding()),
>>getCharacterEncoding());
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: jetspeed-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: jetspeed-dev-help@jakarta.apache.org
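
For anyone following the thread, here is a minimal Java sketch of the
ISO-8859-1 round-trip "trick" discussed above, and of why it does nothing
when the container reports ISO-8859-1 as the request encoding. The class
EncodingSketch and its method names are hypothetical and are not Jetspeed
code. chooseEncoding only mirrors priorities 2) to 4) from the list in this
message, assuming each setting arrives as a plain string.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the Jetspeed implementation.
final class EncodingSketch {

    // Simulates a servlet 2.2 container that received UTF-8 bytes from the
    // browser but decoded them as ISO-8859-1 because no charset was sent.
    static String misdecodeAsLatin1(String typedByUser) {
        byte[] wire = typedByUser.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // The "trick": ISO-8859-1 maps every byte to one char, so getBytes()
    // recovers the original bytes, which are then decoded with the charset
    // the page was really submitted in.
    static String redecode(String fromContainer, Charset actual) {
        return new String(fromContainer.getBytes(StandardCharsets.ISO_8859_1), actual);
    }

    // Assumed fallback order after removing priority 1): media charset,
    // then the configured default, then US-ASCII.
    static String chooseEncoding(String mediaCharset, String defaultEncoding) {
        if (mediaCharset != null && !mediaCharset.isEmpty()) {
            return mediaCharset;
        }
        if (defaultEncoding != null && !defaultEncoding.isEmpty()) {
            return defaultEncoding;
        }
        return "US-ASCII";
    }

    public static void main(String[] args) {
        String typed = "\u00fcbel"; // the user name from the original report
        String seen = misdecodeAsLatin1(typed);

        // Tomcat 4 case: the parser falls back to the configured UTF-8.
        System.out.println(redecode(seen, StandardCharsets.UTF_8).equals(typed)); // true

        // Tomcat 3 case: ISO-8859-1 is picked up from the request, so the
        // string is decoded 8859-1 to 8859-1 and stays garbled.
        System.out.println(redecode(seen, StandardCharsets.ISO_8859_1).equals(typed)); // false

        System.out.println(chooseEncoding(null, "UTF-8")); // UTF-8
    }
}
```

With the getCharacterEncoding() check removed, the second case would no
longer occur in this sketch, because the encoding would always come from
chooseEncoding rather than from the container.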