lucenenet-dev mailing list archives

From "Digy" <>
Subject RE: Umlauts as Char
Date Tue, 08 Feb 2011 09:49:59 GMT
Although Java doesn't write a BOM, VS is clever enough to open the file correctly.


The problem is probably that the Apache server sends the Java code with
"Content-Type: text/plain; charset=ISO-8859-1", so the receiver (possibly a
browser) incorrectly decodes the UTF-8 bytes as ISO-8859-1.

Using an svn client to download the code is one solution.
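
To illustrate the charset mismatch described above, here is a minimal Java sketch. The class and method names (MisdecodeDemo, misdecode) are made up for this example and are not part of Lucene or Lucene.Net. It encodes a single umlaut as UTF-8 and then decodes those bytes as ISO-8859-1, which is roughly what a client would do if it trusted the wrong header.

```java
import java.nio.charset.StandardCharsets;

public class MisdecodeDemo {
    // Decode the UTF-8 bytes of `text` as if they were ISO-8859-1,
    // mimicking a receiver that trusts an incorrect charset header.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u00fc"; // u with umlaut, written as an escape
        String garbled = misdecode(original);
        // One character in, two characters out.
        System.out.println(original.length() + " char -> " + garbled.length() + " chars");
        for (char ch : garbled.toCharArray()) {
            System.out.printf("U+%04X%n", (int) ch);
        }
    }
}
```

The single character comes back as two (U+00C3 followed by U+00BC). If that is what ended up in the downloaded file, it would explain the compile error, since two characters cannot form one char literal.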






-----Original Message-----
From: Stefan Bodewig [] 
Sent: Tuesday, February 08, 2011 7:10 AM
Subject: Re: Umlauts as Char


On 2011-02-08, Prescott Nasser wrote:


> in the void substitute function you'll see them:
>
>         else if ( buffer.charAt( c ) == 'ü' ) {
>           buffer.setCharAt( c, 'u' );
>         }
>
> This does not constitute a character in .NET (that I can figure out)
> and thus it doesn't compile. The .java file says encoded in UTF-8. I
> was thinking maybe I could do the same thing in VS2010, but I'm not
> finding a way, and searching on this has been difficult.


IIRC VS will recognize UTF-8 encoded files if they start with a byte
order mark (BOM), but Java usually doesn't write one.  I think I once
found the setting for reading/writing UTF-8 in VS; I will need to search
for it when at work.


If you have a JDK installed you can use its native2ascii tool to
replace non-ASCII characters with Unicode escape sequences that you
can then use in C# as well (see Nicolas' post).
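
As a rough illustration of what such a conversion does, here is a small Java sketch. It is not native2ascii itself; the class and method names (EscapeNonAscii, escape) are hypothetical, and it only handles the simple case of replacing each char above 0x7F with a four-digit hex escape.

```java
public class EscapeNonAscii {
    // Replace every char above 0x7F with a backslash-u escape
    // followed by four hex digits; ASCII passes through unchanged.
    static String escape(String source) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < source.length(); i++) {
            char ch = source.charAt(i);
            if (ch > 0x7F) {
                out.append(String.format("\\u%04x", (int) ch));
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The umlaut is written as an escape so this file is pure ASCII.
        String line = "else if ( buffer.charAt( c ) == '\u00fc' ) {";
        System.out.println(escape(line));
    }
}
```

The escaped line contains only ASCII, so it no longer depends on how the editor or compiler guesses the file encoding. Both Java and C# accept this four-digit escape form inside a char literal.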


If you have Ant installed (sorry, can't resist ;-) you can convert the
whole tree in one (untested) go with something like


<copy todir="will-hold-translated-files" encoding="UTF-8">
  <fileset dir="holds-original-files"/>
  <filterchain><escapeunicode/></filterchain>
</copy>




