incubator-general mailing list archives

From Ted Dunning
Subject Re: [PROPOSAL] Apache Linda
Date Sun, 18 Nov 2012 20:55:42 GMT
On Sun, Nov 18, 2012 at 1:45 AM, Paolo Castagna wrote:

> ....

> On 17/11/12 22:49, Ted Dunning wrote:
>> Frankly, the phrase "linked data" is also so generic as to be essentially
>> meaningless outside your community.  There are many, many uses of this
>> phrase in computer science that mean something completely different from
>> what you guys seem to mean.
> Where else is the phrase "linked data" used with a different meaning?

The problem is that the phrase is generic and can arise in general speech.

Links and pointers are ubiquitous in computer parlance.  Nothing in the
phrase "linked data" constrains the meaning to *that* kind of link for
*that* kind of data other than the usage in a relatively small community.

> What 'those guys' seem to mean is well described in the Linked Data
> Wikipedia page: Linked_data
> Please, notice there isn't a disambiguation page. :-)

That is because the phrase is used as a proper noun for only one thing.
But it is commonly used as a descriptive phrase.

The comparable phrase "red flowers" doesn't need a disambiguation link in
Wikipedia either because the meaning is apparent as a compositional

> The above wiki page seems pretty short and clear to me.

But the phrase itself is so vanilla that searching on the web to find the
meaning (to a native speaker, anyway) seems kind of pointless.

My question was not "what does linked data mean?" because it seemed like I
could come up with ten meanings for the term.  The question was "which of
the many possible meanings are these people talking about?".  Note that a
web search wouldn't answer that question because the existence of a common
usage does not imply that any given community is following that common
usage pattern.

> Is there anything in your opinion which isn't clear and should be better
> explained?

I think that you are missing the point.

The problem is that the phrase itself doesn't have any signal that there is
any nominative usage going on.  If I were speaking German and used the
English phrase, there would be a very strong signal, but we aren't doing

As such, I think that most mentions of "linked data" should include some
such signal.  In a proposal aimed at people outside your community, in
particular, you need something along the lines "the phrase linked data is
used here idiosyncratically to refer to ...".  If you assume that the
reader knows what kind of link you mean between what kind of data, then the
documents you produce will tend to be impenetrable.  Assumptions like this
are common within insular communities and commonly lead to
misunderstandings like this.

> The phrase "linked data" is composed of two words and the common definition
> of 'linked' (in particular if referred to the Web) and 'data' applies here
> unchanged. If you think of the Web as a big 'library' of linked
> 'documents', can we do the same also for data, instead of documents? How?
> This is what 'linked data' is all about and what it is trying to achieve: a
> Web of data.

I get it now.  My point was that your proposal didn't convey this.

And I would contend that the common definitions of linked and data when
combined do not unambiguously come up with Linked Data(tm) as you tend to
use the phrase.  With the proper predisposition, it might, but your
predisposition is not shared universally.  I cite myself as the existence
proof of at least one experienced and active computer scientist who had no
clue what you were going on about.

>> Having a project name that memorializes a phrase that nobody is likely to
>> understand without (lots of) supporting material and which is used by other
>> projects in roughly the same domain is problematic.
> I disagree.

Well, you can't disagree that I was confused by your proposal.  I don't
think that you can disagree that a big part of the cause of the confusion
was the use of the generic phrase "linked data" in a highly specific way.

Take other terms that have succeeded easily:


    web log => blog

    web page

    atomic clock

Each of these is essentially a phrase, but one that did not have a prior
common usage.

> The 4 principles are very clear and simple:
>  1. Use URIs as names for things
>  2. Use HTTP URIs so that people can look up those names.
>  3. When someone looks up a URI, provide useful information, using the
> standards (RDF*, SPARQL)
>  4. Include links to other URIs, so that they can discover more things.
> We could debate indefinitely on the "using the standards ..." part, but
> should we do it here?

No.  You can define things any way you like.  That isn't the point.

> What isn't clear to you from the four principles above?

The clarity of the four principles isn't the point.  The clarity of the
phrase "linked data" without somewhat unusual foreknowledge and without the
definition is the point.  A phrase that has to have its definition
schlepped around with the phrase is hardly very useful.

> You already know what URIs, HTTP URIs, and links are, don't you? :-)

Yes.  But you are way out in the weeds here arguing a point that doesn't
need to be made.

> Now, I could have sympathy with you if you point your finger at RDF and
> SPARQL, but,

Ahh... but each of these has a name that is clearly not something else.
Thus, if I don't know about triples and such, I still can see that I
*don't* know what these phrases are.

With "linked data", I don't have a clue that I don't know what you are
talking about.

> ... and clear... and if you want you can always refer to the W3C
> Recommendations (those are your primary sources of information in this
> case).

But how would a person know that is where these terms are defined?
Especially when you are giving them a huge pointer toward blackboard
systems with the name Linda?

>> ...You should be aware, however, that with these defects, it seems very
>> unlikely to me that Apache would be able to help with trademark and name
>> conflict issues.  That may not seem like a big deal now, but if your
>> project really does get going and then somebody tries to take over your
>> community with a nearly identically named product, it will definitely feel
>> like a big deal.  Take a look at what happens with Open Office all the
>> time.
> Regarding the name, I have no better suggestion than dropping the 'n'?
> Linda --> Lida (but I have not done much research to see if that has
> problems or not).

How about following the tradition established by the contraction of web log
into blog?

That would give "web linked data" => Blinda

It is still a female name if you need the gender stereotyping of Linda.  It
seems to have non-English meanings, but certainly has no connotations in
English.  It also seems to have no prior technical usage.
