Posted to users@marmotta.apache.org by Fabian Cretton <Fa...@hevs.ch> on 2014/10/27 10:50:34 UTC

Rép. : Re: Java ImportClient and HTTP Authentication

Do you mean that passing the user/pwd to the ClientConfiguration should be the correct way to do it?
 
And, more precisely, where exactly is the deprecated HttpClient API used?
 
thanks
Fabian

>>> Jakob Frank <ja...@apache.org> 27.10.2014 09:38 >>>
Hi Fabian,

while looking into the code for the data-import issue, I saw that we
are using a deprecated API of HttpClient - maybe the authentication
issue is related to that.

Would be great if you could have a look into that and maybe provide a patch!

Best,
Jakob


On 24 October 2014 13:56, Fabian Cretton <Fa...@hevs.ch> wrote:
> Hi,
>
> In a Marmotta module I am developing, I used ImportClient to upload
> data, and it worked fine.
>
> However, when changing Marmotta's security from "simple" to "restricted",
> the ImportClient was failing with a 401.
> I did try to pass a user/pwd to the ClientConfiguration(), but the error
> persisted (I was passing the user and password as 'clear' strings, for
> instance 'admin' and 'pass123').
>
> To make it work, I had to make my own copy of the method
> ImportClient.uploadDataset(), and pass the headerAuth received by my own
> webservice to the post object: post.setHeader("Authorization", headerAuth);
>
> I thus have two questions:
> - is it normal that the ImportClient was failing, or did I do something
> wrong?
> - if that was normal, would you want a new version of ImportClient that
> could handle this?
>
> thank you
> Fabian
>
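
The workaround described above sets the "Authorization" header by hand. As a minimal sketch of how such a header value could be built from a user and password, assuming the server accepts HTTP Basic authentication (the class and method names below are hypothetical and not part of the Marmotta client API):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthHeader {

    /** Builds the value of an HTTP Basic "Authorization" header: "Basic " + base64(user:password). */
    public static String build(String user, String password) {
        String credentials = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Example credentials taken from the message above.
        String headerAuth = build("admin", "pass123");
        // In the copied uploadDataset() method this value would be attached with:
        //   post.setHeader("Authorization", headerAuth);
        System.out.println(headerAuth);
    }
}
```

Sending the header explicitly avoids depending on how the HTTP client negotiates credentials, at the cost of sending them with every request.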

Re: Re: LDClient

Posted by Fabian Cretton <Fa...@hevs.ch>.

Thank you Sergio for the explanations.

Regarding what you said, which I understood as "In Linked Data,
don't reuse others' URIs as the subject of your triples", could you point me
to the documentation (a best practice, maybe?) where this is stated?

This discussion is too generic for me, as there are so many ways to
design ontologies (also depending on the purpose of your design,
including speeding up SPARQL queries using materialization). Another
point is that we haven't talked so far about inference, and I don't see
how that "linked data principle" deals with owl:inverseOf
(http://www.w3.org/TR/2004/REC-owl-semantics-20040210/#owl_inverseOf).
Having worked for more than 10 years with semantic web technologies, I am
maybe mixing up the general possibilities of the "semantic web" with the
more specific "Linked Data" guidelines.

Thank you again
Fabian

>>> On 04.11.2014 at 14:48, Sergio Fernández <wi...@apache.org> wrote
in message <54...@apache.org>:

Re: LDClient

Posted by Sergio Fernández <wi...@apache.org>.

Hi guys,

On 04/11/14 11:44, Fabian Cretton wrote:
> Because in my understanding of the web of data, anyone can say anything
> about anything, isn't that correct ?

Not really if you consider trust...

> For instance, there could be a specific product sold by different
> vendors, and each vendor, publishing a catalog in RDF, will provide a
> price for that product. So, referring to that product's URI, each vendor
> will publish data with the product's URI as subject. That seems a very
> simple and realistic case, doesn't it?

No, I would not say so.

Let's use a concrete product as an example: a car, the VW Golf.

If the manufacturer publishes information about that car:

<http://www.volkswagen.com/cars/golf/gte> vso:weight "1520" .

Then if a retailer sells a model of that car, publishing the following data:

<http://www.asturwagen.es/offer1> a gr:Offering ;
   gr:includes <http://www.volkswagen.com/cars/golf/gte> .

<http://www.volkswagen.com/cars/golf/gte> vso:weight "1525" .

You should never trust the retailer's claim that the weight of the 
Golf GTE is 1,525 kg. The dealer has no authority to say that.

URIs are global identifiers built on the hierarchical DNS system. You 
cannot use URIs you do not control to make statements about things 
you do not own.
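
As a rough illustration of the authority argument above, a consumer could compare the host of the document it fetched with the host of each triple's subject. This is only a crude heuristic sketched under that assumption; the class and method names are hypothetical, and it is not how Marmotta decides what to trust:

```java
import java.net.URI;

public class AuthorityCheck {

    /** Returns true when the publishing document and the subject URI share the same host. */
    public static boolean isAuthoritative(URI source, URI subject) {
        String sourceHost = source.getHost();
        String subjectHost = subject.getHost();
        return sourceHost != null && sourceHost.equalsIgnoreCase(subjectHost);
    }

    public static void main(String[] args) {
        URI retailerDoc = URI.create("http://www.asturwagen.es/offer1");
        URI offer = URI.create("http://www.asturwagen.es/offer1");
        URI car = URI.create("http://www.volkswagen.com/cars/golf/gte");

        // The retailer describing its own offer: same host.
        System.out.println(isAuthoritative(retailerDoc, offer));
        // The retailer stating the car's weight: different host.
        System.out.println(isAuthoritative(retailerDoc, car));
    }
}
```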

Back to you your product example, the vendor

> Then can you help me better understand what Marmotta's LDClient
> "is/should be"?
> On that page [1], it is said that "LDClient is a flexible and modular
> Linked Data Client (RDFizer
> ( http://www.w3.org/wiki/ConverterToRdf) )"
> There is already something unclear to me in that sentence: RDFizing
> is, to me, the process of transforming non-RDF data to RDF.
> But if I understand correctly, LDClient is already able to import native
> RDF, for instance RDFa, Linked Data and also querying a SPARQL
> end-point.
>
> Is LDClient designed to deal only with data published from its own URL,
> where all triples have that URL as subject ?

It does RDFizing. But it also discards the triples that do not talk 
about the URI that was referenced.

> If so, what happens when LDClient is used as an RDFizer on non-RDF data?
>
> Maybe I should have a look at the RDFa client and also see how data is
> processed there.

The RDFa Data Provider is just a transformation process. The cases 
that might be harder to understand are the others, which 
use APIs to get data out of other formats (e.g., the 
transformation from the Facebook Graph).

> But here is what interests us in LDClient:
> - import RDF and non-RDF data into the triple store (even if it is an RDF
> file where the subjects don't correspond to the file's URL)
> - import first into a temporary location, in order to import only part of
> the data, and validate the data. It seems that LDClient handles this
> natively, and this feature is very interesting for us.

OK, so let's put it this way: the general purpose of LDClient is to 
respect the URI as identity for the data; but if you have a custom 
scenario where that needs to be extended, you are completely free to use 
the infrastructure provided by LDClient.

> About dealing with data update, I understand that in your use of
> LDCache/LDClient, ensuring that triples with a specific subject come
> from one data source is a way to know which triples to update when
> refreshing the data. In our case, we deal with 'contexts' (named graph)
> to deal with that.

That's the business of LDCache, which uses a fixed context for caching 
the data.

> Talking about this, I do have another question: is it a problem for
> Marmotta/KiWi to deal with a large number of contexts? I know it is
> not a problem with other triple stores such as OWLIM, for instance.

No, it is not a problem: there is no limit on the number of contexts KiWi is able to manage.

BTW, if you are interested in keeping OWLIM, it should be easy to provide 
a backend for Marmotta...

Cheers,

-- 
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 660 2747 925
e: sergio.fernandez@redlink.co
w: http://redlink.co

Re: Re: Re : Re: LDClient

Posted by Fabian Cretton <Fa...@hevs.ch>.

Hi Jakob,

Yes, sorry, I was talking about "subject" (not object).

I guess we are not having a talk about the web of data, but about
Marmotta's LDClient ?

Because in my understanding of the web of data, anyone can say anything
about anything, isn't that correct ?
For instance, there could be a specific product sold by different
vendors, and each vendor, publishing a catalog in RDF, will provide a
price for that product. So, referring to that product's URI, each vendor
will publish data with the product's URI as subject. That seems a very
simple and realistic case, doesn't it?

In the example you give "What I interpret from your message below is
that you would like to include triples like "dbpedia:Europe a
dbpedia:Continent" retrieved from a URL like http://example.com/foo - is
this correct?" 
Yes about "data", no about "data publishing". What I mean is that we
could imagine that some people dealing with tourism could have a class
that is a "TouristProvenance", and it seems totally normal to me if they
publish data saying: "dbpedia:Europe a touristOnto:TouristProvenance".
Isn't that possible in their data?
But then, the difference will be about how to publish that data: of
course they would not publish that triple as "linked data" when
dereferencing "http://example.com/foo", but they could provide an ontology
in an RDF file, or a SPARQL end-point containing such triples.

If I am wrong, thank you for the pointers, maybe I missed something and
should correct my way of thinking.

Then can you help me better understand what Marmotta's LDClient
"is/should be"?
On that page [1], it is said that "LDClient is a flexible and modular
Linked Data Client (RDFizer
( http://www.w3.org/wiki/ConverterToRdf) )"
There is already something unclear to me in that sentence: RDFizing
is, to me, the process of transforming non-RDF data to RDF.
But if I understand correctly, LDClient is already able to import native
RDF, for instance RDFa, Linked Data and also querying a SPARQL
end-point.

Is LDClient designed to deal only with data published from its own URL,
where all triples have that URL as subject ?
If so, what happens when LDClient is used as an RDFizer on non-RDF data?

Maybe I should have a look at the RDFa client and also see how data is
processed there.

But here is what interests us in LDClient:
- import RDF and non-RDF data into the triple store (even if it is an RDF
file where the subjects don't correspond to the file's URL)
- import first into a temporary location, in order to import only part of
the data, and validate the data. It seems that LDClient handles this
natively, and this feature is very interesting for us.

About dealing with data update, I understand that in your use of
LDCache/LDClient, ensuring that triples with a specific subject come
from one data source is a way to know which triples to update when
refreshing the data. In our case, we deal with 'contexts' (named graph)
to deal with that. 
Talking about this, I do have another question: is it a problem for
Marmotta/KiWi to deal with a large number of contexts? I know it is
not a problem with other triple stores such as OWLIM, for instance.

Thank you
Fabian

[1] http://marmotta.apache.org/ldclient/

>>> On 04.11.2014 at 09:45, Jakob Frank
<ja...@salzburgresearch.at> wrote in message
<54...@salzburgresearch.at>:

Re: Re : Re: LDClient

Posted by Jakob Frank <ja...@salzburgresearch.at>.

Hi Fabian,

are you sure you're not mixing up subject and object in your message?

Because LDClient will de-reference, e.g.
http://dbpedia.org/resource/Europe and add all triples with
dbpedia:Europe as *subject* to the repository.

Any other URI, e.g. http://example.com/foo, will be dereferenced and a
triple like "<http://example.com/foo> dct:about dbpedia:Europe" will be
added to the repository.


What I interpret from your message below is that you would like to
include triples like "dbpedia:Europe a dbpedia:Continent" retrieved from
a URL like http://example.com/foo - is this correct?

This introduces a big problem: provenance. How do you guarantee that the
data from http://example.com/foo about dbpedia:Europe is correct? That's
why triples with a different subject are ignored in LDClient.
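
The behaviour described above (keep only the triples whose subject is the dereferenced URI) can be sketched as follows. This is an illustration only: the Triple class is a hypothetical stand-in rather than a Sesame or Marmotta type, and the real LDClient providers are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SubjectFilter {

    /** Minimal stand-in for an RDF statement (hypothetical, for illustration only). */
    public static final class Triple {
        public final String subject;
        public final String predicate;
        public final String object;

        public Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Keeps only the triples whose subject is the resource that was dereferenced. */
    public static List<Triple> keepAboutResource(String resource, List<Triple> fetched) {
        List<Triple> kept = new ArrayList<>();
        for (Triple t : fetched) {
            if (t.subject.equals(resource)) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String resource = "http://example.com/foo";
        List<Triple> fetched = Arrays.asList(
                new Triple("http://example.com/foo", "dct:about", "dbpedia:Europe"),
                new Triple("dbpedia:Europe", "rdf:type", "dbpedia:Continent"));

        // Only the statement about http://example.com/foo itself survives.
        for (Triple t : keepAboutResource(resource, fetched)) {
            System.out.println(t.subject + " " + t.predicate + " " + t.object);
        }
    }
}
```

A data provider that skips this filter, as discussed for the RDF file import, would simply return the fetched list unchanged.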


Best,
Jakob

On 2014-11-04 09:05, Fabian Cretton wrote:
> Hello Sergio,
>  
> In this current discussion, shouldn't we distinguish between
> the linked data principles [1] (and thus the RDF graph), and how data
> are published (RDF file, linked data with content negotiation, SPARQL
> end-point, RDFa, etc.)?
>  
> About linked data principles, tell me if I am wrong, but here is what I
> understand: the goal of the first point "Use URIs as names for things"
> is to have international keys to identify things, and thus avoid data
> silos as in relational databases. The second point "Use HTTP URIs so
> that people can look up those names. " says that the URIs should be
> accessible through HTTP (i.e. URLs), and so they can be dereferenced in
> order to get SOME data about that thing (point 3 - "When someone looks
> up a URI, provide useful information, using the standards (RDF*, SPARQL)
> "). Then, this data can link to other data as stated in point 4 "Include
> links to other URIs. so that they can discover more things. "
>  
> But do the linked data principles say that triples with a specific
> object should only be served (data publishing) on that specific URI? It
> is not my understanding so far, and that's why I wrote "SOME"
> information above.
> For instance, anyone could write triples about
> <http://dbpedia.org/resource/Europe>, in any given domain (art, politics,
> etc.), using any available ontology, no?
> So triples with <http://dbpedia.org/resource/Europe> as object could
> come from any source other than dereferencing the
> "http://dbpedia.org/resource/Europe" URL. 
> And as an example, this file
> "http://www.w3.org/People/Berners-Lee/card.rdf" does contain triples
> with different resources as objects.
>  
> Placing this in the overLOD context: its goal is to provide tools to
> build an application based on distributed data, here using the Web of
> Data technologies. Different data providers provide data in different
> forms (data publishing). It could be RDF files, SPARQL end-points, or
> even data that needs to be RDFized (microdata for instance).
> Then overLOD allows us to reference those data, import them (entirely or
> partly; for instance, we usually don't need all languages of the labels
> provided by a geoname feature), and control them (as data could be wrong,
> and inferencing is not an easy way to control data). Then the data is
> available to apps built on that instance of overLOD (i.e. with the
> decisions we took, it is an instance of Marmotta).
>  
> And thus, overLOD does bring something different from LDCache, a way to
> better "control" which data is in the store, how it is updated, which
> seems to me mandatory when building a real app.
>  
> We won't have time in the overLOD project to build a fully functional
> tool, but the basics will be there.
>  
> I am not sure this discussion is of any interest for you, but thanks for
> your thoughts
> Fabian
>  
>  
>  
>  
>  
> Hi,
> 
> On 01/11/14 13:14, Fabian Cretton wrote:
>>>> Then, I did implement LDClients that can import RDF files (instead of
>>>> using the import service). They are just like the "linked data" code,
>>>> except I don't check if the subject of the triple correspond to the
>>>> URI.
>>
>> Of course we don't expect that the code we write for OverLOD will be
> appreciated by the Marmotta Team,
>> but we will just let people know it is there if needed :-)
>>
>> But actually I don't understand your point here about RDF files moving
> away from Linked Data paradigm.
>> Do you mean that Youtube, Vimeo, RDFa and SPARQL endpoints, which all
> have LDClients, follow linked data paradigm more than
>> http://sws.geonames.org/2921044/about.rdf
> 
> No no, I'm not saying that. Let me try to explain it:
> 
> If we take the Linked Data principles [1], we could say that LDClient
> extends the 3rd point ("when someone looks up a URI, provide useful
> information") beyond just "using the standards (RDF*, SPARQL)" by
> providing new methods to get RDF data out of other formats.
> 
> But LDClient does not modify the 1st principle ("use URIs as names for
> things"). And that's what I was referring to, because of the sentence "They are
> just like the "linked data" code, except I don't check if the subject of
> the triple correspond to the URI".
> 
> Maybe I got it wrong, and what you actually do is extend the 4th
> principle ("Include links to other URIs. so that they can discover more
> things"), which is of course interesting. Just needed to be explained.
> 
> BTW, hope you have in mind that if OverLOD produces new LDClient data
> providers that can be useful for a broader community, please propose
> them to be included in the main project.
> 
> Cheers,
> 
> [1] http://www.w3.org/DesignIssues/LinkedData.html
> 
> P.S.: please, configure your client to use the "Re:" prefix when replying
> to public English mailing lists
> 
> -- 
> Sergio Fernández
> Partner Technology Manager
> Redlink GmbH
> m: +43 660 2747 925
> e: sergio.fernandez@redlink.co
> w: http://redlink.co
> [1] http://www.w3.org/DesignIssues/LinkedData.html
> 

-- 
DI Jakob Frank
Knowledge and Media Technologies

Salzburg Research Forschungsgesellschaft mbH
Jakob Haringer-Strasse 5/3 | 5020 Salzburg, Austria
T: +43.662.2288-419 | F: +43.662.2288-222
jakob.frank@salzburgresearch.at
http://www.salzburgresearch.at
http://at.linkedin.com/in/jakobfrank

Re : Re: LDClient (was: Java ImportClient and HTTP Authentication)

Posted by Fabian Cretton <Fa...@hevs.ch>.

Hello Sergio,

In this current discussion, shouldn't we distinguish between the
linked data principles [1] (and thus the RDF graph), and how data are
published (RDF file, linked data with content negotiation, SPARQL
end-point, RDFa, etc.)?

About linked data principles, tell me if I am wrong, but here is what I
understand: the goal of the first point "Use URIs as names for things"
is to have international keys to identify things, and thus avoid data
silos as in relational databases. The second point "Use HTTP URIs so
that people can look up those names. " says that the URIs should be
accessible through HTTP (i.e. URLs), and so they can be dereferenced in
order to get SOME data about that thing (point 3 - "When someone looks
up a URI, provide useful information, using the standards (RDF*, SPARQL)
"). Then, this data can link to other data as stated in point 4 "Include
links to other URIs. so that they can discover more things. "

But do the linked data principles say that triples with a specific
object should only be served (data publishing) on that specific URI? It
is not my understanding so far, and that's why I wrote "SOME"
information above.
For instance, anyone could write triples about
<http://dbpedia.org/resource/Europe>, in any given domain (art, politics,
etc.), using any available ontology, no?
So triples with <http://dbpedia.org/resource/Europe> as object could
come from any source other than dereferencing the
"http://dbpedia.org/resource/Europe" URL. 
And as an example, this file
"http://www.w3.org/People/Berners-Lee/card.rdf" does contain triples
with different resources as objects.

Placing this in the overLOD context: its goal is to provide tools to
build an application based on distributed data, here using the Web of
Data technologies. Different data providers provide data in different
forms (data publishing). It could be RDF files, SPARQL end-points, or
even data that needs to be RDFized (microdata for instance).
Then overLOD allows us to reference those data, import them (entirely or
partly; for instance, we usually don't need all languages of the labels
provided by a geoname feature), and control them (as data could be wrong,
and inferencing is not an easy way to control data). Then the data is
available to apps built on that instance of overLOD (i.e. with the
decisions we took, it is an instance of Marmotta).

And thus, overLOD does bring something different from LDCache, a way to
better "control" which data is in the store, how it is updated, which
seems to me mandatory when building a real app.

We won't have time in the overLOD project to build a fully functional
tool, but the basics will be there.

I am not sure this discussion is of any interest for you, but thanks
for your thoughts
Fabian

[1] http://www.w3.org/DesignIssues/LinkedData.html

Re: LDClient (was: Java ImportClient and HTTP Authentication)

Posted by Sergio Fernández <wi...@apache.org>.

Hi,

On 01/11/14 13:14, Fabian Cretton wrote:
>>> Then, I did implement LDClients that can import RDF files (instead of
>>> using the import service). They are just like the "linked data" code,
>>> except I don't check if the subject of the triple correspond to the
>>> URI.
>
> Of course we don't expect that the code we write for OverLOD will be appreciated by the Marmotta Team,
> but we will just let people know it is there if needed :-)
>
> But actually I don't understand your point here about RDF files moving away from Linked Data paradigm.
> Do you mean that Youtube, Vimeo, RDFa and SPARQL endpoints, which all have LDClients, follow linked data paradigm more than
> http://sws.geonames.org/2921044/about.rdf

No no, I'm not saying that. Let me try to explain it:

If we take the Linked Data principles [1], we could say that LDClient 
extends the 3rd point ("when someone looks up a URI, provide useful 
information") beyond just "using the standards (RDF*, SPARQL)" by 
providing new methods to get RDF data out of other formats.

But LDClient does not modify the 1st principle ("use URIs as names for 
things"). And that's what I referred to because of the sentence "They are 
just like the "linked data" code, except I don't check if the subject of 
the triple correspond to the URI".

Maybe I got it wrong, and what you actually do is extend the 4th 
principle ("Include links to other URIs. so that they can discover more 
things"), which is of course interesting. Just needed to be explained.

BTW, please keep in mind that if OverLOD produces new LDClient data
providers that could be useful for a broader community, you are welcome
to propose them for inclusion in the main project.

Cheers,

[1] http://www.w3.org/DesignIssues/LinkedData.html

P.S.: please configure your client to use the "Re:" prefix when replying
to public English mailing lists

-- 
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 660 2747 925
e: sergio.fernandez@redlink.co
w: http://redlink.co

Rép. : Re: Rép. : Re: Java ImportClient and HTTP Authentication

Posted by Fabian Cretton <Fa...@hevs.ch>.

Hi Sergio,

>>
>> Then, I did implement LDClients that can import RDF files (instead of
>> using the import service). They are just like the "linked data" code,
>> except I don't check if the subject of the triple correspond to the
>> URI.
>> My goal is to handle all imports with LDClient, which allows me to do
>> other things on the temporary imported data.
>> If you want this code, just tell me how to do.

>Well, that moves LDClient away from the Linked Data paradigm. I 
>understand your use case, but I'd not push it upstream.
>
>Maybe writing specialized Sesame RDFHandlers could be another approach 
>for that problem.

Of course we don't expect that the code we write for OverLOD will be appreciated by the Marmotta Team,
but we will just let people know it is there if needed :-)

But actually I don't understand your point here about RDF files moving away from Linked Data paradigm.
Do you mean that Youtube, Vimeo, RDFa and SPARQL endpoints, which all have LDClients, follow linked data paradigm more than 
http://sws.geonames.org/2921044/about.rdf

I can see your point if LDClient only handles data published following the linked data principles, such as DBpedia.
But as LDClient also handles RDFa, do you mean that RDFa is more linked data than RDF files?
Not to mention that LDClient even seems to play the role of an RDFizer (Youtube and Vimeo).
Thank you for your explanation on that.

Cheers and thank you for your time
Fabian

Re: Rép. : Re: Java ImportClient and HTTP Authentication

Posted by Sergio Fernández <wi...@apache.org>.

Hi,

On 01/11/14 09:28, Fabian Cretton wrote:
> I think there is an error in your HTTPUtil.createPost(), as the
> authentication syntax is "user:pwd" and not "user;pwd"
> String.format("%s;%s"
> should be
> String.format("%s:%s"

Oops, sorry for the typo. We need tests covering that.
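
For illustration, here is a minimal sketch of how the corrected
"user:pwd" string ends up in a Basic "Authorization" header, plus the
kind of round-trip check a test could make. It assumes Java 8+
(java.util.Base64); the class name BasicAuthHeader and its methods are
hypothetical and are not Marmotta's actual HTTPUtil code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, for illustration only (not part of Marmotta).
final class BasicAuthHeader {
    private BasicAuthHeader() {}

    // Builds the value of an HTTP Basic "Authorization" header.
    // The credentials are joined with ':' (not ';') before encoding.
    static String of(String user, String password) {
        String credentials = String.format("%s:%s", user, password);
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    // Decodes a header value back to "user:password", e.g. for a test.
    static String decode(String headerValue) {
        String encoded = headerValue.substring("Basic ".length());
        return new String(Base64.getDecoder().decode(encoded),
                StandardCharsets.UTF_8);
    }
}
```

A test could then assert that decode(of("admin", "pass123")) returns
"admin:pass123". With the ';' separator the decoded value would contain
no ':' at all, which would be consistent with the 401 reported earlier
in this thread.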

> Another question/proposal:
>
> only error logs are generated on the server.
>
> if that does make sense, that could be a change in your implementation
> as well. Otherwise I will keep my version in OverLOD.

Please, file an issue in jira with the api changes you want, and we'll 
discuss it.

> Then, I did implement LDClients that can import RDF files (instead of
> using the import service). They are just like the "linked data" code,
> except I don't check if the subject of the triple correspond to the
> URI.
> My goal is to handle all imports with LDClient, which allows me to do
> other things on the temporary imported data.
> If you want this code, just tell me how to do.

Well, that moves LDClient away from the Linked Data paradigm. I 
understand your use case, but I'd not push it upstream.

Maybe writing specialized Sesame RDFHandlers could be another approach 
for that problem.
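
To make the difference concrete, here is a rough sketch of the subject
check being discussed, written with plain Java classes instead of the
Sesame API. Triple and SubjectFilter are hypothetical names and any
sample data used with them is made up; the real LDClient providers may
implement the check differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "check if the subject corresponds to the URI".
final class SubjectFilter {
    private SubjectFilter() {}

    /** Hypothetical stand-in for an RDF statement (not a Sesame type). */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // Keeps only the triples whose subject is exactly the requested resource.
    // An importer for whole RDF files would skip this step and keep everything.
    static List<Triple> aboutResource(List<Triple> triples, String resourceUri) {
        List<Triple> kept = new ArrayList<>();
        for (Triple t : triples) {
            if (t.subject.equals(resourceUri)) {
                kept.add(t);
            }
        }
        return kept;
    }
}
```

With such a check, a request for a document URL like
http://sws.geonames.org/2921044/about.rdf would drop every triple whose
subject is a different URI (for example http://sws.geonames.org/2921044/),
which is presumably why an importer for plain RDF files leaves it out.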

> I understand I can file a JIRA, but then I am not a git/github
> specialist.

Jira is the official way to receive contributions, which also keeps us
safe on IP issues. GitHub pull requests are still fine for patches, but
they should always reference a Jira issue.

Thanks.

Cheers,

-- 
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 660 2747 925
e: sergio.fernandez@redlink.co
w: http://redlink.co