lucenenet-dev mailing list archives

From Prescott Nasser <geobmx...@hotmail.com>
Subject RE: Umlauts as Char
Date Tue, 08 Feb 2011 06:13:00 GMT

I'm not sure why I didn't think about it - but there are tons of online converters...copy
and paste ftw.
 
I think dealing with other character sets is new to me, and that complicated this more than
it needed to be.
 
Thanks guys,
~P





----------------------------------------
> From: bodewig@apache.org
> To: lucene-net-dev@lucene.apache.org
> Subject: Re: Umlauts as Char
> Date: Tue, 8 Feb 2011 06:09:58 +0100
>
> On 2011-02-08, Prescott Nasser wrote:
>
> > in the void substitute function you'll see them:
>
> > else if ( buffer.charAt( c ) == 'ü' ) {
> > buffer.setCharAt( c, 'u' );
> > }
>
> > This does not constitute a character in .net (that I can figure out)
> > and thus it doesn't compile. The .java file says encoded in UTF-8. I
> > was thinking maybe I could do the same thing in VS2010, but I'm not
> > finding a way, and searching on this has been difficult.
>
> IIRC VS will recognize UTF-8 encoded files if they start with a byte
> order mark (BOM) but Java usually doesn't write one. I think I once
> found the setting for reading/writing UTF-8 in VS, will need to search
> for it when at work.
>
> If you have a JDK installed you can use its native2ascii tool to
> replace non-ASCII characters with Unicode escape sequences, which
> you can then use in C# as well (see Nicolas' post).
>
> If you have Ant installed (sorry, can't resist ;-) you can convert the
> whole tree in one (untested) go with something like
>
> > encoding="utf8">
> 
> 
> 
>
> Stefan
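
For anyone without native2ascii or an online converter to hand, the escaping step Stefan describes can be sketched in a few lines of Java. This is only a rough illustration, not the JDK tool itself: the class name UnicodeEscaper and its escape method are made up for this example, and it assumes the source text has already been read into a String with the right encoding (UTF-8 here). It rewrites every character above 0x7F as a backslash-u escape with four hex digits, a form that both Java and C# accept inside char and string literals.

```java
public final class UnicodeEscaper {
    private UnicodeEscaper() {}

    // Hypothetical helper: replaces each char above 0x7F with a
    // backslash-u escape (four hex digits); ASCII passes through unchanged.
    public static String escape(String source) {
        StringBuilder out = new StringBuilder(source.length());
        for (int i = 0; i < source.length(); i++) {
            char ch = source.charAt(i);
            if (ch > 0x7F) {
                out.append(String.format("\\u%04x", (int) ch));
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The umlaut is written as an escape here so that this file itself
        // stays pure ASCII; at runtime the string contains a real u-umlaut.
        String line = "else if ( buffer.charAt( c ) == '\u00fc' ) {";
        System.out.println(escape(line));
    }
}
```

Run against the comparison from the stemmer, this prints the same line with the umlaut replaced by its escape, and that escaped literal compiles in a C# file regardless of how VS guesses the file encoding.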