incubator-general mailing list archives

From Andy Seaborne
Subject Re: Clerezza, Stanbol, Jena, Semantic Commons, WDYT?
Date Mon, 08 Nov 2010 23:14:47 GMT

On 08/11/10 18:32, Jeremy Carroll wrote:
> To make the commons discussion more concrete I would suggest the
> following items for the commons:
> - an IRI library
> - some code to do with vocabularies.
> - connecting to a URL and doing semweb aware content negotiation (this
> is typically done badly)
> (Actually the IRI code should probably be wider, Jena initially used the
> xerces URI code but then the needs exceeded what they supported)
> Jeremy

Good idea.  The IRI code is independent of the rest of Jena and is 
valuable in its own right.

ARP (the Jena RDF/XML parser) is also independent of the Jena code
structure and was once available on its own (is it still possible to get
just ARP?).  Only the final step of generation turns the output of
parsing into Jena-specific objects.  Might be worth splitting out if it
would be useful.

The lowest level of RIOT parsing, which defines the tokens for creating 
any of the Turtle family of languages, is not Jena dependent.  The 
actual RIOT parsers themselves are Jena dependent, as they directly generate 
Jena-specific objects to avoid the copy overhead.  It's a performance 
trade-off.

[RIOT is a set of faster parsers for non-XML serializations of RDF, 
currently part of ARQ, which should migrate to Jena core when fully 
stable.  The original need was for parsers that could deliver data to 
the TDB database at full loading speed without heavy CPU load.]
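
To illustrate the split described above, here is a minimal sketch of a
token layer that depends only on the JDK.  It is not RIOT's tokenizer:
the class TokenSketch, its token types and the tiny subset of syntax it
accepts (no escapes, no comments, no literals with datatypes or language
tags) are all assumptions made for illustration.  The point is only that
nothing at this level needs an RDF object model.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT RIOT's code: a tiny token layer for
// Turtle-like input that uses JDK types only.
final class TokenSketch {
    enum Type { IRI, STRING, WORD, DOT }

    static final class Token {
        final Type type;
        final String text;
        Token(Type type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + "[" + text + "]"; }
    }

    // Splits the input into tokens. No RDF object model is involved.
    static List<Token> tokenize(String in) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < in.length()) {
            char c = in.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '<' || c == '"') {
                // Delimited token: <iri> or "string" (escapes not handled).
                char close = (c == '<') ? '>' : '"';
                int end = in.indexOf(close, i + 1);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated token at " + i);
                }
                Type type = (c == '<') ? Type.IRI : Type.STRING;
                out.add(new Token(type, in.substring(i + 1, end)));
                i = end + 1;
            } else if (c == '.') {
                out.add(new Token(Type.DOT, "."));
                i++;
            } else {
                // Bare word such as a prefixed name; runs to whitespace or a delimiter.
                int start = i;
                while (i < in.length()
                        && !Character.isWhitespace(in.charAt(i))
                        && "<\"".indexOf(in.charAt(i)) < 0) {
                    i++;
                }
                String word = in.substring(start, i);
                // A trailing '.' ends the statement rather than belonging to the word.
                boolean endsStatement = word.endsWith(".");
                out.add(new Token(Type.WORD,
                        endsStatement ? word.substring(0, word.length() - 1) : word));
                if (endsStatement) {
                    out.add(new Token(Type.DOT, "."));
                }
            }
        }
        return out;
    }
}
```

A parser built on top of this could hand the tokens to whatever object
model it likes.  Generating Jena objects directly from the tokens, as
the RIOT parsers do, is what ties the upper layer to Jena.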

But the command line tools based on RIOT which parse or validate one 
format are reusable - they use Jena internally, but the input and output 
are completely standard.

The RDF validator Eyeball is also a useful tool in its own right.

