lucenenet-dev mailing list archives

From michael herndon <mhern...@michaelherndon.com>
Subject Re: Paul: why are we using sbyte?
Date Tue, 08 Apr 2014 16:43:47 GMT
We should take a vote on CLS compliance so that the PMC has an actual
stance on it for version 4 and onward.

My suggestions for making those changes:
  * keep branch_4x in a state that builds (well, for the core project
anyway).
  * write a small guide on converting Java's byte, which is signed, to
.NET's unsigned byte, including how bit shifting differs.
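
As a starting point for such a guide, here is a minimal sketch of the
conversions involved. It is an illustration only, not code from branch_4x:
the class name JavaByteInterop and its method names are hypothetical. It
assumes the Java semantics that a byte is promoted to int with sign
extension before shifting, and it uses Buffer.BlockCopy for arrays, since
Array.Copy between sbyte[] and byte[] throws (as noted further down this
thread).

```csharp
using System;

// Hypothetical helper for illustration; not part of Lucene.NET.
public static class JavaByteInterop
{
    // Reinterpret the same 8 bits: 0x80 as a .NET byte is 128, as a Java byte it is -128.
    public static sbyte ToSigned(byte value) => unchecked((sbyte)value);

    public static byte ToUnsigned(sbyte value) => unchecked((byte)value);

    // Equivalent of Java's (b & 0xFF): widen to int without sign extension.
    public static int MaskToInt(sbyte value) => value & 0xFF;

    // Equivalent of Java's (b >> n) on a byte: sign-extend to int, then arithmetic shift.
    public static int JavaShiftRight(byte value, int n) => unchecked((sbyte)value) >> n;

    // Equivalent of Java's (b >>> n) on a byte: sign-extend to int, then logical shift.
    public static int JavaUnsignedShiftRight(byte value, int n) =>
        unchecked((int)((uint)(sbyte)value >> n));

    // Copy an sbyte[] into a new byte[] bit for bit.
    public static byte[] ToByteArray(sbyte[] source)
    {
        var result = new byte[source.Length];
        Buffer.BlockCopy(source, 0, result, 0, source.Length);
        return result;
    }
}
```

For example, JavaByteInterop.ToSigned(0x80) gives -128, and
JavaByteInterop.JavaShiftRight(0x80, 1) gives -64, whereas shifting the
.NET byte 0x80 directly gives 64. Comparisons and shifts are where a
byte-for-sbyte swap changes behavior, so those are the spots a guide
would need to call out.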

I'll start a different thread for thoughts about updating the wiki,
READMEs and other things so that we can keep track of decisions and
goals, and lower the barrier to entry for those looking to make
contributions.

On Tue, Apr 8, 2014 at 9:27 AM, Itamar Syn-Hershko <itamar@code972.com> wrote:

> +1 on CLS compliance. Lucene.NET previously did sbyte computations on C#
> byte[] arrays so it is possible. We need to either wrap current sbyte usage
> with byte[] public API doing conversions on the go, or translate back to
> byte[] to the bone and make sure everything works ok
>
> --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko <https://twitter.com/synhershko>
> Freelance Developer & Consultant
> Author of RavenDB in Action <http://manning.com/synhershko/>
>
>
> On Tue, Apr 8, 2014 at 4:24 PM, michael herndon <mherndon@michaelherndon.com> wrote:
>
> > The library is not currently CLSCompliant.  There are quite a few changes
> > needed to make that happen.  It has been attempted before, so it's worth
> > digging through the previous threads to find some of the issues that occur
> > when the attribute is applied.
> >
> >
> >
> >
> > On Tue, Apr 8, 2014 at 9:09 AM, Simon Svensson <sisve@devhost.se> wrote:
> >
> > > Hi,
> > >
> > > The SByte type is not CLSCompliant. What is our stance on CLS compliance?
> > >
> > > // Simon
> > >
> > >
> > > On 08/04/14 14:35, michael herndon wrote:
> > >
> > >> Try using a projection with .Select(o => (sbyte)o).ToArray();
> > >>
> > >> public static sbyte[] GetBytes(this string str, string encoding)
> > >> {
> > >>     return Encoding.GetEncoding(encoding)
> > >>         .GetBytes(str)
> > >>         .Select(o => (sbyte)o)
> > >>         .ToArray();
> > >> }
> > >>
> > >> http://dotnetfiddle.net/rfAOeB
> > >>
> > >>
> > >> On Mon, Apr 7, 2014 at 6:20 PM, Paul Irwin <pirwin@feature23.com> wrote:
> > >>
> > >>> Yes, that works until you call Array.Copy, which Lucene does left and
> > >>> right. The call to Array.Copy will exception out if the array has been
> > >>> (what it considers) improperly cast like that.
> > >>>
> > >>> See all the references to System.arraycopy (Java equiv of Array.Copy)
> > >>> here:
> > >>>
> > >>> https://github.com/apache/lucene-solr/search?q=%22System.ArrayCopy%22&type=Code
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Mon, Apr 7, 2014 at 6:11 PM, Itamar Syn-Hershko <itamar@code972.com> wrote:
> > >>>
> > >>>> You don't need to array-copy - a simple cast should work. Can you test
> > >>>> this as well?
> > >>>>
> > >>>> I have this and it seems to work:
> > >>>>
> > >>>> public static sbyte[] getBytes(this string str, string encoding)
> > >>>>          {
> > >>>>              return
> > >>>> (sbyte[])(Array)Encoding.GetEncoding(encoding).GetBytes(str);
> > >>>>          }
> > >>>>
> > >>>> --
> > >>>>
> > >>>> Itamar Syn-Hershko
> > >>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
> > >>>> Freelance Developer & Consultant
> > >>>> Author of RavenDB in Action <http://manning.com/synhershko/>
> > >>>>
> > >>>>
> > >>>> On Tue, Apr 8, 2014 at 1:03 AM, Paul Irwin <pirwin@feature23.com> wrote:
> > >>>>
> > >>>>> There were a few spots where I added a byte[] version as well for
> > >>>>> convenience, but not everywhere. And you have to use BlockCopy... you
> > >>>>> get an exception if you try to Array.Copy a sbyte[] to byte[] or vice
> > >>>>> versa, even though the storage in memory is virtually identical.
> > >>>>>
> > >>>>> And feel free to use my code here for your project for porting Java to
> > >>>>> C#, it does pascal casing and .NET naming conventions (I for interfaces,
> > >>>>> etc). Uses Roslyn for C# generation.
> > >>>>>
> > >>>>> https://github.com/paulirwin/javatocsharp
> > >>>>>
> > >>>>
> > >>>>> On Mon, Apr 7, 2014 at 5:04 PM, Itamar Syn-Hershko <itamar@code972.com> wrote:
> > >>>>>
> > >>>>>> I'm pretty sure there's no need to BlockCopy as the underlying binary
> > >>>>>> representation is the same. I'm just wondering whether we should
> > >>>>>> change this internally or find the places where it aches and provide a
> > >>>>>> byte[] API as well
> > >>>>>>
> > >>>>>> I'm working on porting the tests now - I think we better have all
> > >>>>>> tests ported and running (and passing) and then make this kind of
> > >>>>>> decisions
> > >>>>>>
> > >>>>>> BTW it is now much easier to port tests, you basically copy-paste and
> > >>>>>> almost everything works. I'm also working with a friend to do some
> > >>>>>> Java to C# auto conversion, including camelCase to PascalCase by using
> > >>>>>> Reflection.
> > >>>>>>
> > >>>>>> --
> > >>>>>>
> > >>>>>> Itamar Syn-Hershko
> > >>>>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
> > >>>>>> Freelance Developer & Consultant
> > >>>>>> Author of RavenDB in Action <http://manning.com/synhershko/>
> > >>>>>>
> > >>>>>>
> > >>>>>> On Mon, Apr 7, 2014 at 11:49 PM, Paul Irwin <pirwin@feature23.com> wrote:
> > >>>>>>
> > >>>>>>> Hey Itamar,
> > >>>>>>>
> > >>>>>>> There was existing Lucene.net code that used sbyte, but one of the
> > >>>>>>> things I ran into while porting is that Java was heavily using
> > >>>>>>> negative constants for bytes since their bytes were signed. Also IIRC
> > >>>>>>> there were some greater-than/less-than comparisons that would break if
> > >>>>>>> wrapped around to be between 128 and 255. I tried going down the route
> > >>>>>>> of making everything byte instead of sbyte but kept running into
> > >>>>>>> incompatibilities. It was easier -- and arguably more true to the Java
> > >>>>>>> code -- to keep it sbyte. Using Buffer.BlockCopy instead of the
> > >>>>>>> Java-equivalent Array.Copy works to transform the sbyte arrays to byte
> > >>>>>>> arrays.
> > >>>>>>>
> > >>>>>>> I'm open to any suggestions, and please by all means have at trying to
> > >>>>>>> change it, but it became a royal pain and I got it to work with sbyte
> > >>>>>>> so I didn't pursue the matter further.
> > >>>>>>>
> > >>>>>>> Paul
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Mon, Apr 7, 2014 at 4:41 PM, Itamar Syn-Hershko <itamar@code972.com> wrote:
> > >>>>>>>
> > >>>>>>>> Hi Paul,
> > >>>>>>>>
> > >>>>>>>> Please refer to this commit:
> > >>>>>>>>
> > >>>>>>>> https://github.com/apache/lucene.net/commit/8c23317c905d79823fd168ede778820439c8b163
> > >>>>>>>>
> > >>>>>>>> Why have you moved to using sbyte?
> > >>>>>>>>
> > >>>>>>>> I know this is one of the differences between Java and .NET, but we
> > >>>>>>>> are on .NET and should allow using byte.
> > >>>>>>>>
> > >>>>>>>> Having Field implementation to expect sbyte[] is almost useless as
> > >>>>>>>> Encoding.GetEncoding(encoding).GetBytes(str); for example returns
> > >>>>>>>> byte[].
> > >>>>>>>>
> > >>>>>>>> Can we change it back please so it uses byte everywhere, especially
> > >>>>>>>> on the public facing API?
> > >>>>>>>>
> > >>>>>>>> --
> > >>>>>>>>
> > >>>>>>>> Itamar Syn-Hershko
> > >>>>>>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
> > >>>>>>>> Freelance Developer & Consultant
> > >>>>>>>> Author of RavenDB in Action <http://manning.com/synhershko/>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>>
> > >>>>>>> Paul Irwin
> > >>>>>>> Lead Software Engineer
> > >>>>>>> feature[23]
> > >>>>>>>
> > >>>>>>> Email: pirwin@feature23.com
> > >>>>>>> Cell: 863-698-9294
> > >>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>> --
> > >>>>>
> > >>>>> Paul Irwin
> > >>>>> Lead Software Engineer
> > >>>>> feature[23]
> > >>>>>
> > >>>>> Email: pirwin@feature23.com
> > >>>>> Cell: 863-698-9294
> > >>>>>
> > >>>>>
> > >>>
> > >>> --
> > >>>
> > >>> Paul Irwin
> > >>> Lead Software Engineer
> > >>> feature[23]
> > >>>
> > >>> Email: pirwin@feature23.com
> > >>> Cell: 863-698-9294
> > >>>
> > >>>
> > >
> >
>
