You are viewing a plain text version of this content. The canonical link for it is here.

Posted to users@jena.apache.org by lookman sanni <lo...@gmail.com> on 2016/11/15 00:09:33 UTC

Duplicate entries produced by SPARQL query

Hi,

I have the following RDF:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j.0="http://www.somelink.org/ofdf#">
<rdf:Description rdf:about="
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678">
    <j.0:creditedToCust>ABCDEFG1_5#XYZ1</j.0:creditedToCust>
    <j.0:createdOnDate>ABCDEFG1_5#20151208</j.0:createdOnDate>
<j.0:consumedOnDate>ABCDEFG1_5#20151111</j.0:consumedOnDate>
<j.0:creditedFor>ABCDEFG1_5#124.50</j.0:creditedFor>
<j.0:createdOnDate>ABCDEFG2#20151104</j.0:createdOnDate>
    <j.0:creditedToCust>ABCDEFG2#XYZ2</j.0:creditedToCust>
    <j.0:creditedFor>ABCDEFG2#124.50</j.0:creditedFor>
    <j.0:consumedOnDate>ABCDEFG2#20151111</j.0:consumedOnDate>
  </rdf:Description>
</rdf:RDF>

and the following SPARQL query:

String qs1 = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
 "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> " +
 "PREFIX ticket: <http://www.lookouster.org/ofdf#> " +
 "SELECT ?lticket ?type ?bf1 ?cust1 ?tAmount1 ?dtCreation1 ?dtTravel1 ?bf2
?cust2 ?tAmount2 ?dtCreation2 ?dtTravel2 " +
 "WHERE { " +
 "?lticket ticket:creditedToCust ?_cust1       .\n" +
 "?lticket ticket:creditedToCust ?_cust2       .\n" +
 "?lticket ticket:creditedFor    ?_tAmount1    .\n" +
 "?lticket ticket:creditedFor    ?_tAmount2    .\n" +
 "?lticket ticket:createdOnDate  ?_dtCreation1 .\n" +
 "?lticket ticket:createdOnDate  ?_dtCreation2 .\n" +
 "?lticket ticket:consumedOnDate  ?_dtTravel1 .\n" +
 "?lticket ticket:consumedOnDate  ?_dtTravel2 .\n" +
 "FILTER ( strafter( ?_cust1, '#') != strafter( ?_cust2, '#') )   .\n" +
 "FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_tAmount1, '#') )   .\n" +
 "FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtCreation1, '#') )
.\n" +
 "FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtTravel1, '#') )   .\n"
+
 "FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_tAmount2, '#') )   .\n" +
 "FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtCreation2, '#') )
.\n" +
 "FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtTravel2, '#') )   .\n"
+
 "FILTER ( strafter( ?_tAmount1, '#' ) = strafter( ?_tAmount2, '#' ) )
.\n" +
 "bind(strafter( ?_cust1, '#' ) as ?cust1)  .\n" +
 "bind(strafter( ?_cust2, '#' ) as ?cust2)  .\n" +
 "bind(strbefore( ?_cust1, '#' ) as ?bf1)  .\n" +
 "bind(strbefore( ?_cust2, '#' ) as ?bf2)  .\n" +
 "bind(strafter( ?_tAmount1, '#' ) as ?tAmount1)  .\n" +
 "bind(strafter( ?_tAmount2, '#' ) as ?tAmount2)  .\n" +
 "bind(strafter( ?_dtCreation1, '#' ) as ?dtCreation1)  .\n" +
 "bind(strafter( ?_dtCreation2, '#' ) as ?dtCreation2)  .\n" +
 "bind(strafter( ?_dtTravel1, '#' ) as ?dtTravel1)  .\n" +
 "bind(strafter( ?_dtTravel2, '#' ) as ?dtTravel2)  .\n" +
 "values ?type { '2x Credit Notes' }          .\n" +
"} ORDER BY ?lticket ?dtCreation1 " ;

The query ran well, but return the following 2 lines:

lticket,type,bf1,cust1,tAmount1,dtCreation1,dtTravel1,bf2,cust2,tAmount2,dtCreation2,dtTravel2
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x Credit
Notes,ABCDEFG2,XYZ2,124.50,20151104,20151111,ABCDEFG1,XYZ1,124.50,20151208,20151111
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x Credit
Notes,ABCDEFG1,XYZ1,124.50,20151208,20151111,ABCDEFG2,XYZ2,124.50,20151104,20151111


How do I manage to only output one line, given that the two lines are just
inverted ?

Thank you for your help.

-- 
Best Regards

Lookman SANNI

Re: Duplicate entries produced by SPARQL query

Posted by "Lorenz B." <bu...@informatik.uni-leipzig.de>.

No, that's the usual way how to avoid duplicates.

One comment for the future: Please

a) show the data in Turtle format instead of RDF/XML - RDF/XML is
verbose and difficult to read and Turtle is what SPARQL BGPs are made of.
b) show the "real" SPARQL query, i.e. not as Java String concatenation 
- by the way, this can also be used to debug syntax errors more easily.

Both will help us to give faster and maybe also better support, although
in the current case it wasn't the case.


Cheers,
Lorenz

> That's what I actually implemented. I was hoping there would be something
> sexier.
>
> Thanks though Brian.
>
> On Tue, Nov 15, 2016 at 9:27 AM, Brian McBride <br...@epimorphics.com>
> wrote:
>
>> Would adding a filter
>>
>> FILTER (?bf1 <= ?bf2)
>>
>> or similar do the trick?
>>
>> Brian
>>
>>
>>
>>
>> On 15/11/16 00:09, lookman sanni wrote:
>>
>>> Hi,
>>>
>>> I have the following RDF:
>>>
>>> <rdf:RDF
>>>      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>      xmlns:j.0="http://www.somelink.org/ofdf#">
>>> <rdf:Description rdf:about="
>>> http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678">
>>>      <j.0:creditedToCust>ABCDEFG1_5#XYZ1</j.0:creditedToCust>
>>>      <j.0:createdOnDate>ABCDEFG1_5#20151208</j.0:createdOnDate>
>>> <j.0:consumedOnDate>ABCDEFG1_5#20151111</j.0:consumedOnDate>
>>> <j.0:creditedFor>ABCDEFG1_5#124.50</j.0:creditedFor>
>>> <j.0:createdOnDate>ABCDEFG2#20151104</j.0:createdOnDate>
>>>      <j.0:creditedToCust>ABCDEFG2#XYZ2</j.0:creditedToCust>
>>>      <j.0:creditedFor>ABCDEFG2#124.50</j.0:creditedFor>
>>>      <j.0:consumedOnDate>ABCDEFG2#20151111</j.0:consumedOnDate>
>>>    </rdf:Description>
>>> </rdf:RDF>
>>>
>>> and the following SPARQL query:
>>>
>>> String qs1 = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>>> " +
>>>   "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> " +
>>>   "PREFIX ticket: <http://www.lookouster.org/ofdf#> " +
>>>   "SELECT ?lticket ?type ?bf1 ?cust1 ?tAmount1 ?dtCreation1 ?dtTravel1
>>> ?bf2
>>> ?cust2 ?tAmount2 ?dtCreation2 ?dtTravel2 " +
>>>   "WHERE { " +
>>>   "?lticket ticket:creditedToCust ?_cust1       .\n" +
>>>   "?lticket ticket:creditedToCust ?_cust2       .\n" +
>>>   "?lticket ticket:creditedFor    ?_tAmount1    .\n" +
>>>   "?lticket ticket:creditedFor    ?_tAmount2    .\n" +
>>>   "?lticket ticket:createdOnDate  ?_dtCreation1 .\n" +
>>>   "?lticket ticket:createdOnDate  ?_dtCreation2 .\n" +
>>>   "?lticket ticket:consumedOnDate  ?_dtTravel1 .\n" +
>>>   "?lticket ticket:consumedOnDate  ?_dtTravel2 .\n" +
>>>   "FILTER ( strafter( ?_cust1, '#') != strafter( ?_cust2, '#') )   .\n" +
>>>   "FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_tAmount1, '#') )
>>>  .\n" +
>>>   "FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtCreation1, '#') )
>>> .\n" +
>>>   "FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtTravel1, '#') )
>>>  .\n"
>>> +
>>>   "FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_tAmount2, '#') )
>>>  .\n" +
>>>   "FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtCreation2, '#') )
>>> .\n" +
>>>   "FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtTravel2, '#') )
>>>  .\n"
>>> +
>>>   "FILTER ( strafter( ?_tAmount1, '#' ) = strafter( ?_tAmount2, '#' ) )
>>> .\n" +
>>>   "bind(strafter( ?_cust1, '#' ) as ?cust1)  .\n" +
>>>   "bind(strafter( ?_cust2, '#' ) as ?cust2)  .\n" +
>>>   "bind(strbefore( ?_cust1, '#' ) as ?bf1)  .\n" +
>>>   "bind(strbefore( ?_cust2, '#' ) as ?bf2)  .\n" +
>>>   "bind(strafter( ?_tAmount1, '#' ) as ?tAmount1)  .\n" +
>>>   "bind(strafter( ?_tAmount2, '#' ) as ?tAmount2)  .\n" +
>>>   "bind(strafter( ?_dtCreation1, '#' ) as ?dtCreation1)  .\n" +
>>>   "bind(strafter( ?_dtCreation2, '#' ) as ?dtCreation2)  .\n" +
>>>   "bind(strafter( ?_dtTravel1, '#' ) as ?dtTravel1)  .\n" +
>>>   "bind(strafter( ?_dtTravel2, '#' ) as ?dtTravel2)  .\n" +
>>>   "values ?type { '2x Credit Notes' }          .\n" +
>>> "} ORDER BY ?lticket ?dtCreation1 " ;
>>>
>>> The query ran well, but return the following 2 lines:
>>>
>>> lticket,type,bf1,cust1,tAmount1,dtCreation1,dtTravel1,bf2,
>>> cust2,tAmount2,dtCreation2,dtTravel2
>>> http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x
>>> Credit
>>> Notes,ABCDEFG2,XYZ2,124.50,20151104,20151111,ABCDEFG1,XYZ1,
>>> 124.50,20151208,20151111
>>> http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x
>>> Credit
>>> Notes,ABCDEFG1,XYZ1,124.50,20151208,20151111,ABCDEFG2,XYZ2,
>>> 124.50,20151104,20151111
>>>
>>>
>>> How do I manage to only output one line, given that the two lines are just
>>> inverted ?
>>>
>>> Thank you for your help.
>>>
>>>
>
-- 
Lorenz B�hmann
AKSW group, University of Leipzig
Group: http://aksw.org - semantic web research center

Re: Duplicate entries produced by SPARQL query

Posted by lookman sanni <lo...@gmail.com>.

That's what I actually implemented. I was hoping there would be something
sexier.

Thanks though Brian.

On Tue, Nov 15, 2016 at 9:27 AM, Brian McBride <br...@epimorphics.com>
wrote:

> Would adding a filter
>
> FILTER (?bf1 <= ?bf2)
>
> or similar do the trick?
>
> Brian
>
>
>
>
> On 15/11/16 00:09, lookman sanni wrote:
>
>> Hi,
>>
>> I have the following RDF:
>>
>> <rdf:RDF
>>      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>      xmlns:j.0="http://www.somelink.org/ofdf#">
>> <rdf:Description rdf:about="
>> http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678">
>>      <j.0:creditedToCust>ABCDEFG1_5#XYZ1</j.0:creditedToCust>
>>      <j.0:createdOnDate>ABCDEFG1_5#20151208</j.0:createdOnDate>
>> <j.0:consumedOnDate>ABCDEFG1_5#20151111</j.0:consumedOnDate>
>> <j.0:creditedFor>ABCDEFG1_5#124.50</j.0:creditedFor>
>> <j.0:createdOnDate>ABCDEFG2#20151104</j.0:createdOnDate>
>>      <j.0:creditedToCust>ABCDEFG2#XYZ2</j.0:creditedToCust>
>>      <j.0:creditedFor>ABCDEFG2#124.50</j.0:creditedFor>
>>      <j.0:consumedOnDate>ABCDEFG2#20151111</j.0:consumedOnDate>
>>    </rdf:Description>
>> </rdf:RDF>
>>
>> and the following SPARQL query:
>>
>> String qs1 = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>> " +
>>   "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> " +
>>   "PREFIX ticket: <http://www.lookouster.org/ofdf#> " +
>>   "SELECT ?lticket ?type ?bf1 ?cust1 ?tAmount1 ?dtCreation1 ?dtTravel1
>> ?bf2
>> ?cust2 ?tAmount2 ?dtCreation2 ?dtTravel2 " +
>>   "WHERE { " +
>>   "?lticket ticket:creditedToCust ?_cust1       .\n" +
>>   "?lticket ticket:creditedToCust ?_cust2       .\n" +
>>   "?lticket ticket:creditedFor    ?_tAmount1    .\n" +
>>   "?lticket ticket:creditedFor    ?_tAmount2    .\n" +
>>   "?lticket ticket:createdOnDate  ?_dtCreation1 .\n" +
>>   "?lticket ticket:createdOnDate  ?_dtCreation2 .\n" +
>>   "?lticket ticket:consumedOnDate  ?_dtTravel1 .\n" +
>>   "?lticket ticket:consumedOnDate  ?_dtTravel2 .\n" +
>>   "FILTER ( strafter( ?_cust1, '#') != strafter( ?_cust2, '#') )   .\n" +
>>   "FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_tAmount1, '#') )
>>  .\n" +
>>   "FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtCreation1, '#') )
>> .\n" +
>>   "FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtTravel1, '#') )
>>  .\n"
>> +
>>   "FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_tAmount2, '#') )
>>  .\n" +
>>   "FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtCreation2, '#') )
>> .\n" +
>>   "FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtTravel2, '#') )
>>  .\n"
>> +
>>   "FILTER ( strafter( ?_tAmount1, '#' ) = strafter( ?_tAmount2, '#' ) )
>> .\n" +
>>   "bind(strafter( ?_cust1, '#' ) as ?cust1)  .\n" +
>>   "bind(strafter( ?_cust2, '#' ) as ?cust2)  .\n" +
>>   "bind(strbefore( ?_cust1, '#' ) as ?bf1)  .\n" +
>>   "bind(strbefore( ?_cust2, '#' ) as ?bf2)  .\n" +
>>   "bind(strafter( ?_tAmount1, '#' ) as ?tAmount1)  .\n" +
>>   "bind(strafter( ?_tAmount2, '#' ) as ?tAmount2)  .\n" +
>>   "bind(strafter( ?_dtCreation1, '#' ) as ?dtCreation1)  .\n" +
>>   "bind(strafter( ?_dtCreation2, '#' ) as ?dtCreation2)  .\n" +
>>   "bind(strafter( ?_dtTravel1, '#' ) as ?dtTravel1)  .\n" +
>>   "bind(strafter( ?_dtTravel2, '#' ) as ?dtTravel2)  .\n" +
>>   "values ?type { '2x Credit Notes' }          .\n" +
>> "} ORDER BY ?lticket ?dtCreation1 " ;
>>
>> The query ran well, but return the following 2 lines:
>>
>> lticket,type,bf1,cust1,tAmount1,dtCreation1,dtTravel1,bf2,
>> cust2,tAmount2,dtCreation2,dtTravel2
>> http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x
>> Credit
>> Notes,ABCDEFG2,XYZ2,124.50,20151104,20151111,ABCDEFG1,XYZ1,
>> 124.50,20151208,20151111
>> http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x
>> Credit
>> Notes,ABCDEFG1,XYZ1,124.50,20151208,20151111,ABCDEFG2,XYZ2,
>> 124.50,20151104,20151111
>>
>>
>> How do I manage to only output one line, given that the two lines are just
>> inverted ?
>>
>> Thank you for your help.
>>
>>
>


-- 
Best Regards

Lookman SANNI

Re: Duplicate entries produced by SPARQL query

Posted by Brian McBride <br...@epimorphics.com>.

Would adding a filter

FILTER (?bf1 <= ?bf2)

or similar do the trick?

Brian



On 15/11/16 00:09, lookman sanni wrote:
> Hi,
>
> I have the following RDF:
>
> <rdf:RDF
>      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>      xmlns:j.0="http://www.somelink.org/ofdf#">
> <rdf:Description rdf:about="
> http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678">
>      <j.0:creditedToCust>ABCDEFG1_5#XYZ1</j.0:creditedToCust>
>      <j.0:createdOnDate>ABCDEFG1_5#20151208</j.0:createdOnDate>
> <j.0:consumedOnDate>ABCDEFG1_5#20151111</j.0:consumedOnDate>
> <j.0:creditedFor>ABCDEFG1_5#124.50</j.0:creditedFor>
> <j.0:createdOnDate>ABCDEFG2#20151104</j.0:createdOnDate>
>      <j.0:creditedToCust>ABCDEFG2#XYZ2</j.0:creditedToCust>
>      <j.0:creditedFor>ABCDEFG2#124.50</j.0:creditedFor>
>      <j.0:consumedOnDate>ABCDEFG2#20151111</j.0:consumedOnDate>
>    </rdf:Description>
> </rdf:RDF>
>
> and the following SPARQL query:
>
> String qs1 = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
>   "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> " +
>   "PREFIX ticket: <http://www.lookouster.org/ofdf#> " +
>   "SELECT ?lticket ?type ?bf1 ?cust1 ?tAmount1 ?dtCreation1 ?dtTravel1 ?bf2
> ?cust2 ?tAmount2 ?dtCreation2 ?dtTravel2 " +
>   "WHERE { " +
>   "?lticket ticket:creditedToCust ?_cust1       .\n" +
>   "?lticket ticket:creditedToCust ?_cust2       .\n" +
>   "?lticket ticket:creditedFor    ?_tAmount1    .\n" +
>   "?lticket ticket:creditedFor    ?_tAmount2    .\n" +
>   "?lticket ticket:createdOnDate  ?_dtCreation1 .\n" +
>   "?lticket ticket:createdOnDate  ?_dtCreation2 .\n" +
>   "?lticket ticket:consumedOnDate  ?_dtTravel1 .\n" +
>   "?lticket ticket:consumedOnDate  ?_dtTravel2 .\n" +
>   "FILTER ( strafter( ?_cust1, '#') != strafter( ?_cust2, '#') )   .\n" +
>   "FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_tAmount1, '#') )   .\n" +
>   "FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtCreation1, '#') )
> .\n" +
>   "FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtTravel1, '#') )   .\n"
> +
>   "FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_tAmount2, '#') )   .\n" +
>   "FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtCreation2, '#') )
> .\n" +
>   "FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtTravel2, '#') )   .\n"
> +
>   "FILTER ( strafter( ?_tAmount1, '#' ) = strafter( ?_tAmount2, '#' ) )
> .\n" +
>   "bind(strafter( ?_cust1, '#' ) as ?cust1)  .\n" +
>   "bind(strafter( ?_cust2, '#' ) as ?cust2)  .\n" +
>   "bind(strbefore( ?_cust1, '#' ) as ?bf1)  .\n" +
>   "bind(strbefore( ?_cust2, '#' ) as ?bf2)  .\n" +
>   "bind(strafter( ?_tAmount1, '#' ) as ?tAmount1)  .\n" +
>   "bind(strafter( ?_tAmount2, '#' ) as ?tAmount2)  .\n" +
>   "bind(strafter( ?_dtCreation1, '#' ) as ?dtCreation1)  .\n" +
>   "bind(strafter( ?_dtCreation2, '#' ) as ?dtCreation2)  .\n" +
>   "bind(strafter( ?_dtTravel1, '#' ) as ?dtTravel1)  .\n" +
>   "bind(strafter( ?_dtTravel2, '#' ) as ?dtTravel2)  .\n" +
>   "values ?type { '2x Credit Notes' }          .\n" +
> "} ORDER BY ?lticket ?dtCreation1 " ;
>
> The query ran well, but return the following 2 lines:
>
> lticket,type,bf1,cust1,tAmount1,dtCreation1,dtTravel1,bf2,cust2,tAmount2,dtCreation2,dtTravel2
> http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x Credit
> Notes,ABCDEFG2,XYZ2,124.50,20151104,20151111,ABCDEFG1,XYZ1,124.50,20151208,20151111
> http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x Credit
> Notes,ABCDEFG1,XYZ1,124.50,20151208,20151111,ABCDEFG2,XYZ2,124.50,20151104,20151111
>
>
> How do I manage to only output one line, given that the two lines are just
> inverted ?
>
> Thank you for your help.
>