incubator-general mailing list archives

From Jeremy Carroll
Subject Re: Clerezza, Stanbol, Jena, Semantic Commons, WDYT?
Date Tue, 09 Nov 2010 16:59:32 GMT
I could be accused of having gone overboard ... each of the slightly 
different specs as an explicit representation in the code ...
Having changed jobs and looking at this from a different perspective, I am 
less convinced by the pickiness.
There is a largely unrealized goal in the IRI code to link errors right 
back to the specs so that the error messages quote chapter and verse.


On 11/9/2010 8:52 AM, Andy Seaborne wrote:
> On 09/11/10 16:22, Florent Guillaume wrote:
>> On Tue, Nov 9, 2010 at 1:19 PM, Andy Seaborne wrote:
>>> Jeremy identified the IRI library as a potential contribution to a 
>>> commons
>>> area.  It is free-standing, and does not use or call any Jena RDF 
>>> code - it
>>> depends only on ICU4J (and JUnit + Jflex in the build).
>> Please note that Abdera already has an IRI library.

> Florent,
> Thanks for pointing that out.  I see it has a test suite as well and 
> it would be good to make sure we've got things right.
> Illegal IRIs in data have been a bit of a plague in RDF data and the 
> IRI library (written by Jeremy) is a response to that.  It checks both 
> rules for specific IRI schemes and also recommended forms as IRIs are 
> often compared for equality.  The library is quite picky.  It 
> includes profiles for RDF URI references, IRI and the compromise we 
> use in Jena as a balance of legacy and spec exactness.
> There is an online test service for RDF data in non-RDF/XML formats at:
> The IRIs are checked with the IRI library.
>     Andy
> A few examples:
> http://example/a b
> Code: 17/WHITESPACE in PATH: A single whitespace character. These 
> match no grammar rules of URIs/IRIs. These characters are permitted in 
> RDF URI References, XML system identifiers, and XML Schema anyURIs.
> http://example/a[]b
> Code: 0/ILLEGAL_CHARACTER in PATH: The character violates the grammar 
> rules for URIs/IRIs.
> http://example:80/
> Code: 13/DEFAULT_PORT_SHOULD_BE_OMITTED in PORT: If the port is the 
> default one for the scheme it should be omitted.
> http://example:80/
> Code: 14/PORT_SHOULD_NOT_BE_WELL_KNOWN in PORT: 
> Ports under 1024 should be accessed using the appropriate scheme name.
> urn:xyz
> Code: 61/SCHEME_PATTERN_MATCH_FAILED in PATH: The scheme specific 
> syntax rules are violated.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
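
To make the kinds of checks in the quoted examples concrete, here is a minimal, self-contained sketch in Python. It is not the Jena IRI library and does not use its API. The function name `check_iri`, the `Violation` tuple, the small set of "illegal" path characters, the default-port table, and the simplified URN pattern are all assumptions made for illustration. Only the violation names and component names are taken from the examples in the quoted message.

```python
import re
from typing import List, NamedTuple

# Generic component splitter taken from RFC 3986, Appendix B.
_COMPONENTS = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?$", re.DOTALL
)

# Assumption: a tiny default-port table, deliberately incomplete.
_DEFAULT_PORTS = {"http": 80, "https": 443}

# Assumption: an illustrative subset of characters not allowed in a path.
_ILLEGAL_IN_PATH = set('[]<>"{}|\\^`')

# Assumption: a simplified "NID:NSS" shape for the part after "urn:".
_URN_BODY = re.compile(r"^[A-Za-z0-9][A-Za-z0-9-]{0,31}:.+$", re.DOTALL)


class Violation(NamedTuple):
    name: str       # e.g. "WHITESPACE"
    component: str  # e.g. "PATH"


def check_iri(text: str) -> List[Violation]:
    """Return the violations this toy checker finds in `text`."""
    parts = _COMPONENTS.match(text)
    scheme = (parts.group(2) or "").lower()
    authority = parts.group(4) or ""
    path = parts.group(5) or ""
    found: List[Violation] = []

    # Path checks: report each kind of problem once, not once per character.
    if any(ch.isspace() for ch in path):
        found.append(Violation("WHITESPACE", "PATH"))
    if any(ch in _ILLEGAL_IN_PATH for ch in path):
        found.append(Violation("ILLEGAL_CHARACTER", "PATH"))

    # Port checks: only applied when the text after the last ":" is numeric.
    _, sep, port_text = authority.rpartition(":")
    if sep and port_text.isdigit():
        port = int(port_text)
        if _DEFAULT_PORTS.get(scheme) == port:
            found.append(Violation("DEFAULT_PORT_SHOULD_BE_OMITTED", "PORT"))
        if port < 1024:
            found.append(Violation("PORT_SHOULD_NOT_BE_WELL_KNOWN", "PORT"))

    # Scheme-specific check, shown for "urn" only.
    if scheme == "urn" and not _URN_BODY.match(path):
        found.append(Violation("SCHEME_PATTERN_MATCH_FAILED", "PATH"))

    return found


if __name__ == "__main__":
    for iri in ["http://example/a b", "http://example/a[]b",
                "http://example:80/", "urn:xyz"]:
        for v in check_iri(iri):
            print(f"{iri}  {v.name} in {v.component}")
```

Run against the four IRIs quoted above, the sketch reports the same violation names as the examples, with both port violations for `http://example:80/`. A real checker would cover the full grammar and the per-profile differences Andy describes (RDF URI references, IRI, and the Jena compromise). This sketch applies one fixed set of rules.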
