Posted to users@jena.apache.org by Sarven Capadisli <in...@csarven.ca> on 2012/05/12 19:41:35 UTC
TDB Optimizer
Hi TDBsters,
I was just getting back to playing around with tdbstats to optimize some
query responses [1]. At first I couldn't get any counts, but then
discovered that "tdbstats only looks in the default graph (currently)"
[2]. So I passed one of the graph IRIs and, lo and behold, I got
some stats for that graph.
What I'm wondering is, how can I create stats for all of the graphs in
the store and possibly have them represented in stats.opt? Creating
stats for each graph and then merging them (with proper counts) into a
single stats.opt seems cumbersome and possibly incorrect. After all, I
want to be able to query the store in a way that the generated
statistics reflect the store as accurately as possible.
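To make the merging concern concrete, here is a minimal sketch, not part of tdbstats, of what summing per-graph counts might look like. The graph IRIs, predicate IRIs and counts are hypothetical, and the output layout only approximates the S-expression form of stats.opt; treat it as an assumption rather than the exact file format. A triple present in several graphs is counted once per graph here, whereas the union graph would hold it once, which is one way merged numbers could end up wrong.

```python
from collections import Counter


def merge_predicate_counts(per_graph_counts):
    """Sum per-predicate triple counts over several graphs."""
    total = Counter()
    for counts in per_graph_counts.values():
        total.update(counts)  # adds counts key by key
    return dict(total)


def format_stats(counts):
    """Render counts as a stats.opt-style S-expression (layout assumed)."""
    lines = ["(stats", "  (meta (count %d))" % sum(counts.values())]
    for predicate in sorted(counts):
        lines.append("  (<%s> %d)" % (predicate, counts[predicate]))
    lines.append(")")
    return "\n".join(lines)


# Hypothetical counts, as if collected by running tdbstats once per graph.
per_graph = {
    "http://example.org/graph/a": {
        "http://example.org/p": 10,
        "http://example.org/q": 3,
    },
    "http://example.org/graph/b": {
        "http://example.org/p": 4,
    },
}

merged = merge_predicate_counts(per_graph)
print(format_stats(merged))
```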
Also, where is the rule file in which we write the Statistics Rule
Language? I can't see how that is supposed to go into stats.opt,
because then how would we obtain the counts for the triple patterns?
Thanks!
[1] http://jena.apache.org/documentation/tdb/optimizer.html
[2] http://tech.groups.yahoo.com/group/jena-dev/message/44946
-Sarven
Re: TDB Optimizer
Posted by Sarven Capadisli <in...@csarven.ca>.
On 12-05-15 10:54 AM, Svatopluk Šperka wrote:
> Hi,
>
> what about using something like "tdbstats --loc=a/b/c --graph=urn:x-arq:UnionGraph"? tdbstats should accept urn:x-arq:UnionGraph according to http://jena.apache.org/documentation/tdb/datasets.html .
>
> Svatopluk
>
> On May 14, 2012, at 11:13 PM, Andy Seaborne wrote:
>
>> Sarven,
>>
>> There isn't a way to capture proper stats across named graphs currently. The stats are applied to every graph, so it's not sensitive to which graph (named or default).
>>
>> The stats.opt file is the rules file. Be careful not to overwrite :-(
>>
>> Andy
>
>
Thanks Svatopluk. It looks like that did the trick as far as getting the
stats for the union of all graphs. However, at the moment, I'm uncertain
what that entails.
It makes me wonder whether the application of stats can be improved.
That is, can the stats be more granular on a per graph basis? If GRAPH
is used in a SPARQL query, can the stats be better used? Would that mean
extending the Statistics Rule Language to use a graph variable as well?
Otherwise, I'm unsure how using anything but
--graph=urn:x-arq:UnionGraph or urn:x-arq:DefaultGraph is actually
helpful for obtaining a stats file: if a single stats file is
used (which comes from a single named graph), it excludes statistics
about the other graphs.
Do I understand this problem correctly?
Thanks,
-Sarven
Re: TDB Optimizer
Posted by Svatopluk Šperka <sp...@gmail.com>.
Hi,
what about using something like "tdbstats --loc=a/b/c --graph=urn:x-arq:UnionGraph"? tdbstats should accept urn:x-arq:UnionGraph according to http://jena.apache.org/documentation/tdb/datasets.html .
Svatopluk
On May 14, 2012, at 11:13 PM, Andy Seaborne wrote:
> Sarven,
>
> There isn't a way to capture proper stats across named graphs currently. The stats are applied to every graph, so it's not sensitive to which graph (named or default).
>
> The stats.opt file is the rules file. Be careful not to overwrite :-(
>
> Andy
Re: TDB Optimizer
Posted by Andy Seaborne <an...@apache.org>.
Sarven,
There isn't a way to capture proper stats across named graphs currently.
The stats are applied to every graph, so it's not sensitive to which
graph (named or default).
The stats.opt file is the rules file. Be careful not to overwrite it :-(
Andy
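Since hand-written rules and generated counts share the one stats.opt file, regenerating it can silently discard edits. The following is a minimal sketch of one way to guard against that, assuming the tdbstats output has already been captured as a string. The helper name install_stats and the .bak suffix are hypothetical, not part of TDB. It writes to a temporary file first, keeps a copy of any existing stats.opt, and only then moves the new file into place.

```python
import os
import shutil
import tempfile


def install_stats(db_dir, stats_text):
    """Write stats_text to db_dir/stats.opt, backing up any existing file."""
    target = os.path.join(db_dir, "stats.opt")
    # Write to a temporary file in the same directory so the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=db_dir, prefix="stats-", suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(stats_text)
    # Keep the previous file (possibly with hand-edited rules) as a backup.
    if os.path.exists(target):
        shutil.copy2(target, target + ".bak")
    os.replace(tmp_path, target)
    return target


# A temporary directory stands in for a real TDB database location.
with tempfile.TemporaryDirectory() as db_dir:
    install_stats(db_dir, "(stats (meta (count 1)))\n")
    install_stats(db_dir, "(stats (meta (count 2)))\n")
    print(sorted(os.listdir(db_dir)))
```

Only the most recent previous version is kept; a timestamped backup name would be needed to preserve more than one generation.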