Posted to dev@clerezza.apache.org by Reto Bachmann-Gmuer <re...@trialox.org> on 2011/05/26 20:31:45 UTC

[VOTE] Accept the proposed patch of CLEREZZA-540

With CLEREZZA-540 I suggest a GraphNodeProvider service that returns a
GraphNode given a named resource. This is mainly code that used to be in
DiscoBitsTypeHandler and has now been generalized.

The issue is described as:
"Implement a platform service that returns GraphNodes for URIs. The
GraphNode is the resource identified by that uri with as BaseGraph sources
considered authoritative for that resource. "

Of course "considered authoritative" is not a very sharp description. The
issue is labeled with "platform" which implies it is not a generic utility
of clerezza.rdf but that it relies on platform default graphs.

The solution proposed in commits
#1125477 <http://svn.apache.org/viewvc?view=rev&rev=1125477> and
#1125652 <http://svn.apache.org/viewvc?view=rev&rev=1125652> sets the
baseGraph as follows:
- always trust the content graph
- for remote resources trust the graph you get by dereferencing the uri
- for resources in the user-uri space trust that user
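
The three rules can be sketched roughly as follows. This is only a toy
illustration in plain Java, not the code from the commits: the class name,
the method name, the source labels and the two URI prefixes are all
assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy illustration of the three baseGraph rules. Not Clerezza code:
 * every name and URI below is hypothetical.
 */
final class BaseGraphSketch {

    // Assumed layout of the local instance; both prefixes are made up.
    static final String LOCAL_PREFIX = "http://localhost:8080/";
    static final String USER_PREFIX = LOCAL_PREFIX + "user/";

    /** Returns labels for the sources that would make up the baseGraph. */
    static List<String> trustedSources(String uri) {
        List<String> sources = new ArrayList<>();
        // Rule 1: the content graph is always trusted.
        sources.add("content-graph");
        if (!uri.startsWith(LOCAL_PREFIX)) {
            // Rule 2: remote resource, trust what dereferencing the URI yields.
            sources.add("dereferenced:" + uri);
        } else if (uri.startsWith(USER_PREFIX)) {
            // Rule 3: resource in a user's URI space, trust that user.
            String rest = uri.substring(USER_PREFIX.length());
            int slash = rest.indexOf('/');
            String user = slash < 0 ? rest : rest.substring(0, slash);
            if (!user.isEmpty()) {
                sources.add("user-graph:" + user);
            }
        }
        return sources;
    }

    public static void main(String[] args) {
        System.out.println(trustedSources("http://www.w3.org/People/Berners-Lee/card"));
        System.out.println(trustedSources(USER_PREFIX + "reto/profile"));
        System.out.println(trustedSources(LOCAL_PREFIX + "index"));
    }
}
```

In this sketch a plain local resource gets only the content graph, while
remote and user-space resources each get one additional source.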

This might not match an intuitive understanding of "authoritative" and I'm
happy to redefine the issue so that no confusion arises.

What I do strongly believe is that the proposed patch offers a major and
very useful new functionality. Especially as it allows the following
features to be implemented:
- Thanks to CLEREZZA-544 one can call the render-method to delegate the
rendering of a resource with a UriRef instead of a resource; in this case the
resource is rendered using its own baseGraph rather than the one of the
calling template. An example use case for this is rendering the author of a
comment: the whole profile of the (possibly remote) commenter isn't and
shall not be part of the baseGraph of the GraphNode returned by the jax-rs
resource method, yet for rendering the comment-author infobox it might be
beneficial to render a GraphNode with a baseGraph containing all of the
information in the user's profile-document.
- With CLEREZZA-541 the GraphNodeService is accessed from TypeHandler. I
posted a resolution to this issue because it was already nearly done on my
local machine when Henry reopened CLEREZZA-540; to respect the reopening of
the issue I didn't mark the dependent issue as resolved. I will of course
revert the changes if requested to do so by a qualifying -1.

I'm not arguing that my patches solve all issues one might have around
getting resource descriptions, but I do think the service is very valuable,
and to allow other work to be based on it I would like the issue to be
closed. As Henry reopened the issue twice, I don't want to close it
again without a broader discussion. Yet as many things depend on the
issue, leaving it open doesn't seem an option to me.

Future enhancements might include:
- manually forcing a refresh of caches for graphs related to a requested
resource
- forcing an alternative set of baseGraphs to be used (e.g. only local or only
remote sources)

So I'm asking you to kindly review the proposed code and vote on closing
CLEREZZA-540.

[ ] +1, I agree with accepting the proposed code into trunk
[ ] 0, I don't care
[ ] -1, I don't want this code in trunk (must specify a technical
explanation, please also specify what would have to be changed for the patch
to be acceptable to you)

Cheers,
Reto

Re: Issues of trust -- Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Reto Bachmann-Gmuer <re...@trialox.org>.

On Mon, Jun 6, 2011 at 11:55 AM, Henry Story <he...@bblfish.net> wrote:

> Sorry for not being able to reply earlier. The Federated Social Web
> Conference in Berlin
> co-occurred with the WebId get together and so I was extremely busy. There
> were a lot of
> discussions on the Social Web, and I even presented Clerezza.
>
>   http://d-cent.org/fsw2011/
>
> I read through the first part of Reto's reply and answered that here. The
> issues were grouped together all around the theme of trust, the complexity
> of determining it, and how that clashes with the idea of being able to
> determine trust by default for other people. It turns out, I argue, that
> adding the content graph to a remote graph can reduce trust in the result
> rather than enhance it. It furthermore can lead to information leakage. It
> also makes many use cases impossible to implement as well as making choices
> that will make efficient reasoning in the future difficult.
>
> On 3 Jun 2011, at 13:29, Reto Bachmann-Gmuer wrote:
> >>
> >> Also there will be contradictions in the information on the web. Some
> >> people may trust some graphs, other trust others.
> >
> > Right, that's why the GraphNodeProvider trusts only the content-graph,
> which
> > is trusted qua being a platform service) and the graph resulting from
> > dereferencing the resource (trusted by conventional web-trust)
>
> As we saw the content graph returned by the Provider is a complex union of
> other graphs, of which a documentation, config, web-resources, and
> enrichment graph, plus some logic regarding WebIdGraphService, and other
> services.
>
> Combine that with the fact that judgements of trust are context dependent,
> user dependent and task dependent among other things, and one can predict
> that making simplified trust decisions for others will lead to security
> holes, and many other issues.
>
> So as I understood Clerezza is built so as to make it possible for users to
> upload new packages. These may contain documentation, which could contain
> relations that are perhaps out of date, or are just hypothetical, and so
> start interfering in odd ways with other applications... It would be a pity
> to lose the ability to give rights to one's friends to add limited new
> features to Clerezza for fear that they may interfere with one's work in
> other unrelated areas.
>
The ability for users to upload bundles running with their permissions is a
feature that should be used with the greatest care. In fact there's no
protection against DoS attacks; also, the user bundles might start services
other than jax-rs resources scoped to their bundle prefix. You correctly
note that via the documentation graph they are able to add content to the
virtual content graph without having explicit permission to do so. This
is certainly an issue.

However, this is not related to CLEREZZA-540. The content graph is a feature
that exists independently of CLEREZZA-540. It is a fact that - as
you point out - users that have the (in any case very high) privilege to upload
bundles can add content to the content graph, but this already means that they
can, for example, add arbitrary content to the start page of the local clerezza
instance. Annotating remote resources doesn't seem to be something that
should be more restricted than resources served by the local instance.



>
> Consider that in the previous post you wrote:
>
> > Modules can write to the content graph or add temporary additions to it.
> > Actually writing to the content graph should happen when public and
> trusted
> > information is added.
>
> Who decides when information is public and trusted? This is not something
> that comes all at once. It is not furthermore something that is determined
> for ever. Marriages would not break up so frequently were it otherwise. And
> most companies have more than 2 people working for them, with constantly
> changing trust relations, roles, etc...
>
You pointed out an issue above, but basically it's whoever has read/write
permission on the content-graph who decides which triples are public and
trusted.


>
> > An information is considered trusted when added by a
> > user with respective permission or verified by privileged code (e.g. that
> > allows the public to add see-also references).
>
> Yes, this is one potential method for determining trust. It will work for
> some apps, but for many others this will capture a woefully inadequate
> notion of trust. Consider that the above will mean that the aggregate trust
> one can put in the content graph, will depend on the verification ability of
> the weakest code installed in Clerezza.
>
Yes, as of now the weakest code can do bad things. Clerezza does not (yet)
have a powerful sandboxing mechanism. We should add warnings about this, but
this doesn't alter the soundness of having a place for instance-wide trusted
and public data. This place (the content graph) is not introduced by the
issue you are vetoing.



>
> So now the trust one can have in the result returned by Clerezza when
> asking for
>
>   <http://www.w3.org/People/Berners-Lee/card>
>
> Is not the trust one has in the above resource,

You mean in the infrastructure used for getting a representation for that
resource, i.e. the server and the network.

> but the trust one has in the weakest link of a specific installation of
> Clerezza.
>

Which makes perfect sense, as you wouldn't otherwise ask a specific
clerezza instance but directly dereference the resource.

[...]

> >
> > With the current service you have what TimBL says plus the platform-wide
> > truths of the content-graph, this may contain things like a link back to
> you
> > (the owner of the platform instance) or a statement like : TimBL rdf:type
> > ex: Spammer which might not be published in TimBL's profile
>
> Yes, that is great. I would love to be able to have that information when I
> need
> it. But is the content graph the right place to put information about
> spammers?
>
It seems like a non-classified and trusted (for the medium operation trust
level of the content graph) information.

> Is a simple ssp application that publishes a graph of relations on TimBL
> also going
> to now publish that he is a spammer?
>
ssp is a rendering mechanism; here we are talking about a service that gives
descriptions of resources using the content graph and, for remote resources,
the web. This service is not remotely accessible by default.


>
> >
> >> When I get Dan Brickley's graph I may want to know all the people he
> >> mentions in his foaf profile - even if he does not mention them as
> >> foaf:knows related to him.
> >
> > does this provide a new point?
>
> yes. It shows that your merger of a graph with a whole bunch of other
> information  makes certain types of queries impossible.
>
Ok, I think you illustrated this with the TimBL example too.


>        Use Case: When people type or drag and drop URLs in a form  they
> will usually not be precise with their URLs. So it will be important to find
> the minimum published context, find the people in that context, to be able
> to help them select a person. Presumably they meant some person in that
> page. So one needs to search the people in that page to help them select the
> right one.
>
Yes. Apart from the fact that (as Tommaso pointed out) nobody is forced to
use the new service, it seems that the service could be used here, and that it
might even be useful to give the platform owner the ability to add triples.


>
> We have similar issues with reasoning. Reasoning over small graphs is going
> to be a lot more efficient than over the whole database. So one may be also
> interested just in minimal reasoning, adding all foaf:Person to all
> foaf:knows subjects and objects in a remote foaf - in order to locate that
> person the user is interested in.
>
Yes, and reasoning can also be done when handling the query; the described use
case is only a problem with big graphs when you do forward chaining.


> When receiving a new graph one may also be first interested to see if it by
> itself does not contain any contradictions,
> before merging it - even if only virtually - with the rest of the DB.
>
and when installing a bundle one might be interested to see that all methods
it contains actually terminate...

We don't know the meaning of all the terms used, so checking for contradictions
would necessarily be limited. It would be expensive to do and I don't see
exactly the benefit. If there is a contradiction between local knowledge and
what the web says, you might have a contradiction in the triples accessible
from the GraphNode. This happens; if for some application this is a
catastrophe they shouldn't use the service. (And probably not the web either.)



>
> >
> >> If the GraphNodeProvider returns a union graph of the documentation
> graph,
> >
> > Again no, we're not returning a union graph we're returning a GraphNode,
> the
> > underlying graph is an implementation detail (was think if the
> > getGraph-method could be made less visible (protected or private) to
> avoid
> > this confusion)
> >
> >
> >> content graph,... and his foaf profile then when searching for all the
> >> foaf:Person
> >
> > You don't search a GraphNode for all foaf:Person but the GraphNode
> > represents the foaf:Person you asked for.
>
> Well I could also ask for the GraphNode of the foaf:Person class and then
>
>    people/-RDF.`type`
>
> would return all instances.
>
The resource you get "contains" the description it gets from the foaf
ontology as well as any person the platform knows about. This seems to be
useful in many situations, but you were talking about getting persons, not
classes; your usage of "the" in your sentence indicates you refer to an
instance of foaf:Person.



>
>
> >
> >> I will get the documentation writers too, the writers of content in the
> >> content graph, and who knows what else...
> >
> > you will have properties pointing from that persons to all the comments
> he
> > left on the local instance, which can be quite handy (and which are from
> the
> > underlying content graph as they are probably not also contained in the
> > remote foaf:profile)
>
> That can be quite handy in SOME circumstances, and not handy at all in
> others.
>
A truism: every service offered by the platform is useful for some purposes
and useless for others.


>
> So in summary, the "trust" decisions made by the GraphNodeProvider do not
> increase
> trust but may well reduce it,

The trust decision is not made by the GNP but by its client. If I could ask xy
directly, I wouldn't ask you to tell me about xy, except to give you the
opportunity to tell me something that xy might not tell me.


> they could end up creating information leakage,

not seeing this.


> they reduce
> the number of things that can be done,

How can an added feature reduce the number of things that can be done?


> make some very arbitrary trust decisions,

The "trust decisions" (referent of "they") "make some very arbitrary trust
decisions" ?


> and make reasoning more difficult.
>
With the current blocking of features and length of discussions it seems
unlikely clerezza will ever be close to doing reasoning :(



>
> The decision to return these union of graphs by default is thus unintuitive
> and unhelpful.

Which seems somewhat at odds with the fact that we have 5 explicit +1s for
the feature.


> It packs a huge amount of decisions that are not evident when asking for a
> GraphNode.  These decisions are not of course bad in all circumstances.
> There  are many cases where it may be very good. But it is better to have
> those decisions be clear and not tie them into the core of Clerezza

Since having a common trust basis is the distinguishing feature of the
clerezza.platform bundles, this seems like an argument against the whole
platform.



> where the returning of a simple UriRef requires one to be aware of all
> these decisions.
>
No, please see my previous mail on the UriRef/GraphNode distinction.


Reto

Current mood: :(

Issues of trust -- Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Henry Story <he...@bblfish.net>.

Sorry for not being able to reply earlier. The Federated Social Web Conference in Berlin 
co-occurred with the WebId get together and so I was extremely busy. There were a lot of
discussions on the Social Web, and I even presented Clerezza.

   http://d-cent.org/fsw2011/

I read through the first part of Reto's reply and answered that here. The issues were grouped together all around the theme of trust, the complexity of determining it, and how that clashes with the idea of being able to determine trust by default for other people. It turns out, I argue, that adding the content graph to a remote graph can reduce trust in the result rather than enhance it. It furthermore can lead to information leakage. It also makes many use cases impossible to implement as well as making choices that will make efficient reasoning in the future difficult. 

On 3 Jun 2011, at 13:29, Reto Bachmann-Gmuer wrote:
>> 
>> Also there will be contradictions in the information on the web. Some
>> people may trust some graphs, other trust others.
> 
> Right, that's why the GraphNodeProvider trusts only the content-graph, which
> is trusted qua being a platform service) and the graph resulting from
> dereferencing the resource (trusted by conventional web-trust)

As we saw the content graph returned by the Provider is a complex union of other graphs, of which a documentation, config, web-resources, and enrichment graph, plus some logic regarding WebIdGraphService, and other services. 

Combine that with the fact that judgements of trust are context dependent, user dependent and task dependent among other things, and one can predict that making simplified trust decisions for others will lead to security holes, and many other issues.

So as I understood Clerezza is built so as to make it possible for users to upload new packages. These may contain documentation, which could contain relations that are perhaps out of date, or are just hypothetical, and so start interfering in odd ways with other applications... It would be a pity to lose the ability to give rights to one's friends to add limited new features to Clerezza for fear that they may interfere with one's work in other unrelated areas.

Consider that in the previous post you wrote:

> Modules can write to the content graph or add temporary additions to it.
> Actually writing to the content graph should happen when public and trusted
> information is added. 

Who decides when information is public and trusted? This is not something that comes all at once. It is not furthermore something that is determined for ever. Marriages would not break up so frequently were it otherwise. And most companies have more than 2 people working for them, with constantly changing trust relations, roles, etc...

> An information is considered trusted when added by a
> user with respective permission or verified by privileged code (e.g. that
> allows the public to add see-also references).

Yes, this is one potential method for determining trust. It will work for some apps, but for many others this will capture a woefully inadequate notion of trust. Consider that the above will mean that the aggregate trust one can put in the content graph, will depend on the verification ability of the weakest code installed in Clerezza.

So now the trust one can have in the result returned by Clerezza when asking for

   <http://www.w3.org/People/Berners-Lee/card>

Is not the trust one has in the above resource, but the trust one has in the weakest link of a specific installation of Clerezza.

>> Graphs can be merged easily in RDF - IF they  are believed both to be true.
>> But what is believed to be true will depend on what possible world you
>> believe yourself to be in. I argued this in "Beatnik: change your mind"
>> in more detail, if that helps for people following this discussion
>> 
>> http://blogs.oracle.com/bblfish/entry/beatnik_change_your_mind
>> 
>> From the point of view of WebID and security I want to be able to tell WHO
>> said what. In many applications being able to be very clear about where
>> something was said is going to  be essential to giving good feedback. Some
>> example coming from the field I am working on below.
>> 
> 
>> So for a foaf-browser, I want to know when TimBl declares someone to be a
>> friend, and differentiate that from when someone declares himself to be a
>> friend of TimBL, which is a very different thing.
> 
> With the current service you have what TimBL says plus the platform-wide
> truths of the content-graph, this may contain things like a link back to you
> (the owner of the platform instance) or a statement like : TimBL rdf:type
> ex: Spammer which might not be published in TimBL's profile

Yes, that is great. I would love to be able to have that information when I need
it. But is the content graph the right place to put information about spammers?
Is a simple ssp application that publishes a graph of relations on TimBL also going
to now publish that he is a spammer? 

> 
>> When I get Dan Brickley's graph I may want to know all the people he
>> mentions in his foaf profile - even if he does not mention them as
>> foaf:knows related to him.
> 
> does this provide a new point?

yes. It shows that your merger of a graph with a whole bunch of other information  makes certain types of queries impossible. 
	Use Case: When people type or drag and drop URLs in a form  they will usually not be precise with their URLs. So it will be important to find the minimum published context, find the people in that context, to be able to help them select a person. Presumably they meant some person in that page. So one needs to search the people in that page to help them select the right one.

We have similar issues with reasoning. Reasoning over small graphs is going to be a lot more efficient than over the whole database. So one may be also interested just in minimal reasoning, adding all foaf:Person to all foaf:knows subjects and objects in a remote foaf - in order to locate that person the user is interested in.

When receiving a new graph one may also be first interested to see if it by itself does not contain any contradictions,
before merging it - even if only virtually - with the rest of the DB.

> 
>> If the GraphNodeProvider returns a union graph of the documentation graph,
> 
> Again no, we're not returning a union graph we're returning a GraphNode, the
> underlying graph is an implementation detail (was think if the
> getGraph-method could be made less visible (protected or private) to avoid
> this confusion)
> 
> 
>> content graph,... and his foaf profile then when searching for all the
>> foaf:Person
> 
> You don't search a GraphNode for all foaf:Person but the GraphNode
> represents the foaf:Person you asked for.

Well I could also ask for the GraphNode of the foaf:Person class and then

    people/-RDF.`type` 

would return all instances.
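
To illustrate the concern, here is a toy sketch in plain Java. It uses no
Clerezza classes: the Triple class, the helper methods and all the ex: and
local: names are made up for the example. It only shows that the same
inverse rdf:type lookup answers a different question on a remote profile
alone than on its union with a local graph.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class UnionQuerySketch {

    /** Minimal stand-in for an RDF triple; not a Clerezza class. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Made-up triples standing in for a dereferenced foaf profile. */
    static List<Triple> remoteProfile() {
        return Arrays.asList(
            new Triple("ex:dan", "rdf:type", "foaf:Person"),
            new Triple("ex:libby", "rdf:type", "foaf:Person"),
            new Triple("ex:dan", "foaf:name", "Dan"));
    }

    /** Made-up triples standing in for a local content graph. */
    static List<Triple> contentGraph() {
        return Arrays.asList(
            new Triple("local:docWriter", "rdf:type", "foaf:Person"),
            new Triple("local:commenter", "rdf:type", "foaf:Person"));
    }

    /** Concatenates two triple lists; enough of a union for this example. */
    static List<Triple> union(List<Triple> a, List<Triple> b) {
        List<Triple> all = new ArrayList<>(a);
        all.addAll(b);
        return all;
    }

    /** Subjects typed as rdfClass, i.e. the inverse rdf:type lookup. */
    static Set<String> instancesOf(String rdfClass, List<Triple> graph) {
        Set<String> result = new TreeSet<>();
        for (Triple t : graph) {
            if (t.predicate.equals("rdf:type") && t.object.equals(rdfClass)) {
                result.add(t.subject);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // People mentioned in the remote document only.
        System.out.println(instancesOf("foaf:Person", remoteProfile()));
        // People in the union: the local ones show up as well.
        System.out.println(instancesOf("foaf:Person",
            union(remoteProfile(), contentGraph())));
    }
}
```

A client that only has access to the union cannot recover the first answer
from the second without knowing which triples came from where.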

> 
>> I will get the documentation writers too, the writers of content in the
>> content graph, and who knows what else...
> 
> you will have properties pointing from that persons to all the comments he
> left on the local instance, which can be quite handy (and which are from the
> underlying content graph as they are probably not also contained in the
> remote foaf:profile)

That can be quite handy in SOME circumstances, and not handy at all in others.

So in summary, the "trust" decisions made by the GraphNodeProvider do not increase
trust but may well reduce it, they could end up creating information leakage, they reduce 
the number of things that can be done, make some very arbitrary trust decisions, and 
make reasoning more difficult. 

The decision to return this union of graphs by default is thus unintuitive and unhelpful. It packs a huge amount of decisions that are not evident when asking for a GraphNode.  These decisions are not of course bad in all circumstances. There  are many cases where it may be very good. But it is better to have those decisions be clear and not tie them into the core of Clerezza where the returning of a simple UriRef requires one to be aware of all these decisions.

Henry

Re: sketch of a compromise solution -- Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Tommaso Teofili <to...@gmail.com>.

Hello guys,
I am ok to give the GraphNodeProviderService a wider scope than a
platform-only service but I don't think every JSR311 class has to use it, so
for example if one wants to implement a service which is more restrictive
well, then one can simply not use it, no? It seems to me this is also the
spirit of Henry's compromise solution. Maybe I am just making it too simple,
so please correct me if I am wrong here.
My opinion is that also CLEREZZA-544 corrects a previous issue with the
context used, the plan could be to add an optional Filter to restrict what
remains exposed in the context.
My 2 cents,
Tommaso



2011/6/3 Reto Bachmann-Gmuer <re...@trialox.org>

> On Wed, Jun 1, 2011 at 7:05 PM, Henry Story <he...@bblfish.net>
> wrote:
> [...]
>
> > > Yes, TcManager is the main entry point to the rdf data. I don't see any
> > code
> > > smell here. In a classical RDBMS java application a component will
> > typically
> > > use jdbc and rely on other components that use jdbc, here it's
> TcManager
> > > instead of jdbc,
> >
> > The thing to do would be to look at the number of  connections that are
> > generated to the
> > DB. My feeling is that one could do the same and have only 1 or 2
> > connections to the DB.
> > Here we could quickly end up with 10 times more....
> >
> I don't know what you mean by DB connections and where you see this factor
> 10.
>
> [...]
>
> > >
> > >>  On my fresh install of ZZ that is 20 times more information than the
> > >> initial graph.
> > >>
> > > - What is 20 times more?
> > > - What do you mean by "get"?
> > >
> > > A graphnode points to a resource and is designed for browsing from
> > resource
> > > to resource. It is not a graph but a node in a graph. The object is
> > > associated with a base-graph which is used to identify the properties and
> > > instantiate another graphnode when hopping to a property value. The
> > > underlying graph could for instance be Timbl's GGG (giant global graph
> > aka
> > > the web).
> >
> > yes  ((but clearly you don't want to dereference the whole web when
> > working...))
> >
> no, and nobody is doing this. If I include the uri pattern
> <http(s)://.*/(.*/)*> I'm not blasting this mail to the size of the web.
>
>
> >
> > Also there will be contradictions in the information on the web. Some
> > people may trust some graphs, other trust others.
>
> Right, that's why the GraphNodeProvider trusts only the content-graph,
> which
> is trusted qua being a platform service) and the graph resulting from
> dereferencing the resource (trusted by conventional web-trust)
>
>
> > Graphs can be merged easily in RDF - IF they  are believed both to be
> true.
> > But what is believed to be true will depend on what possible world you
> > believe yourself to be in. I argued this in "Beatnik: change your mind"
> > in more detail, if that helps for people following this discussion
> >
> >  http://blogs.oracle.com/bblfish/entry/beatnik_change_your_mind
> >
> > From the point of view of WebID and security I want to be able to tell
> WHO
> > said what. In many applications being able to be very clear about where
> > something was said is going to  be essential to giving good feedback.
> Some
> > example coming from the field I am working on below.
> >
>
> > So for a foaf-browser, I want to know when TimBl declares someone to be a
> > friend, and differentiate that from when someone declares himself to be a
> > friend of TimBL, which is a very different thing.
>
> With the current service you have what TimBL says plus the platform-wide
> truths of the content-graph, this may contain things like a link back to
> you
> (the owner of the platform instance) or a statement like : TimBL rdf:type
> ex: Spammer which might not be published in TimBL's profile
>
>
> > When I get Dan Brickley's graph I may want to know all the people he
> > mentions in his foaf profile - even if he does not mention them as
> > foaf:knows related to him.
>
> does this provide a new point?
>
>
> > If the GraphNodeProvider returns a union graph of the documentation
> graph,
>
> Again no, we're not returning a union graph we're returning a GraphNode,
> the
> underlying graph is an implementation detail (was think if the
> getGraph-method could be made less visible (protected or private) to avoid
> this confusion)
>
>
> > content graph,... and his foaf profile then when searching for all the
> > foaf:Person
>
> You don't search a GraphNode for all foaf:Person but the GraphNode
> represents the foaf:Person you asked for.
>
>
> > I will get the documentation writers too, the writers of content in the
> > content graph, and who knows what else...
>
> you will have properties pointing from that persons to all the comments he
> left on the local instance, which can be quite handy (and which are from
> the
> underlying content graph as they are probably not also contained in the
> remote foaf:profile)
>
>
> > many people will have no direct relation to Dan at all. People can say
> true
> > things about Dan but those not be things Dan himself would say.
> >
> Yes, we only consider as true what we say ourselves (i.e. the content graph)
> and in particular circumstances also what Dan says.
>
>
> >
> > I believe these use cases are not limited to the foaf browser but to a
> very
> > large category of semantic web applications. Give me some linked data
> > application, and I will easily come up with use cases of the same kind.
> >
> That's why graphnodeprovider is a generic service and it's not true that it
> was designed with a particular and very specific application of mine in
> mind.
> [...]
>
> > >
> > > I thought I had heard people mention issues with speed on this list.
> > >>
> > > You may check archives of this list at:
> > > http://mail-archives.apache.org/mod_mbox/incubator-clerezza-dev/
> >
> > Do you have a precise thread?
> >
> No, it's you who thought he heard people mention issues with speed. Check the
> archive and construct an argument if those speed issues relate to the issue
> at hand, otherwise your remark " I thought I had heard people mention
> issues
> with speed on this list" is purely demagogic and a hindrance to an
> effective
> and fruitful discussion. (like saying "I've heard people finding the
> clerezza
> code hard to read" when justifying a -1 against some code)
>
> > [...]
> > >>
> > >> What I am wondering is in what cases is this needed? It seems like
> this
> > may
> > >> indeed what a particular application may require, but does it have to
> be
> > >> a general service? The name certainly suggests a very general service,
> > not
> > >> one required for a particular application.
> > >>
> > > This is about ContentGraphProvider then, not about issue 540. It's the
> > > ContentGraphProvider which provides the graph of instance-wide and
> public
> > > information for the platform
> >
> > 540 the GraphNodeProvider delegates decisions to the ContentGraphProvider
> > 544 uses the GraphNodeProvider that delegates to the ContentGraphProvider
> > and ties it into the core,
> >    so that when a JSR311 class requests a named graph it then gets
> whatever
> > the ContentGraphProvider decides
> >    is trusted content.
> >
> I don't see any link to JSR311 but yes, the ContentGraphProvider provides
> content that is  public and platform-wide trusted by default (the system
> graph has higher trust).
>
>
> >
> > SKETCH OF A COMPROMISE SOLUTION
> > ===============================
> >
> > Perhaps there is a way to make things more transparent, by for
> > example having JSR311 classes that really
> > want the full union returned by the  current ContentGraphProvider to have
> > that, and other applications to get something more limited.
> >
> Again see no link to Jsr311. But I think this might be the mentioned
> possible future enhancement to allow clients to specify the trust
> boundaries.
>
>
> > I suggest that we think of naming the content graph or at least build
> > something so that those who need the content graph can ask for it
> clearly,
> > and those who don't can make sure they don't get it.
> >
> The content graph is named, the virtual content graph isn't, but we could
> give the virtual content graph a name thus making it accessible in sparql
> queries (using a TcProvider) but this seems completely unrelated to the
> issue at hand.
>
>
> >
> >  - It may be useful then to name the content graph - it would be a union
> > graph that could be specified by a SPARQL UNION
> >    query for example or the equivalent.
> >
> Yes, if we give the virtual content graph a name it can be used by the sparql
> endpoint.
>
>
> >
> >  - have a JSR311 class return a NamedGraphNode (or something like that)
> > which can then call the CallbackRenderer.
> >
> What should a NamedGraphNode be? A GraphNode can wrap a named or an anonymous
> resource, so I see no need for a subclass for UriRef-GraphNodes. But we have
> been discussing this in another thread, and I don't see a relation to the
> issue at hand.
>
> I don't know what you mean by the NamedGraphNode calling CallbackRenderer,
> the CallbackRenderer is called by the renderlet.
>
>   The NamedGraphNode could name the union graph, or could be a union graph
> > - by reference. So all the code would do is
> >
> >    return new UnionNamedGraphNode(new
> > UriRef("urn:x-localhost/contentGraph"),new UriRef("
> > http://remote.example/resource/"))
> >
>
> I don't understand which one is the resource and which one is the name of the
> underlying graph. What's the difference to current GraphNodes?
>
>
> >
> >    or some nice syntactic sugar for that. For example for apps requiring
> > the union of Content + other  that you need,  something like the
> following
> > would be neat
> >
> >    return new ContentGraphNodePlus(UriRef("
> http://remote.example/resource/
> > "))
> >
>
> If you're talking about GraphNodes (but I'm not sure where you in fact mean
> graphs) then I don't see what you're introducing that is new
>
> return new GraphNode(new UriRef("http://remote.example/resource/"), new
> UnionMGraph(contentGraph, fooGraph, barGraph))
>
>
> >
> >    These objects would just be holders of the graph name(s), which the
> > TcManagers can then hook up into the underlying triple store. Something
> > along those lines would be very nice. One could easily write applications
> > that get union of contents as you wish, and I could easily get very
> > precisely defined graphs for security based application, or more flexible
> > linked data graphs too.
>
> You can do this (see example return statement above)
>
>
> > It could also avoid the iterative way the GraphNodeProvider currently
> > works.
> >
> Is this a reference to the if-then statements you criticized without ever
> telling me what you mean, despite my repeatedly asking? Or what "iterative
> way" are you referring to?
>
>
> >
> > Having something like that would mean  that perhaps the  addition of the
> > new method in CallbackRenderer
> >
> >   public void render(UriRef resource, GraphNode context, String mode,
> >                        OutputStream os) throws IOException;
> >
> A GraphNode is resource+context, where the context is a graph. Now the
> renderlet gets a GraphNode to render; it shouldn't get any context from
> anywhere else. The new render method exists exactly to render a resource with
> a different context not available directly to the renderlet. If the outer
> renderlet already has (or can generate) the context for the nested rendering,
> then this can be done with the existing (pre-540) infrastructure.
>
>
> >
> >
> > would no longer be needed, or would be adapted somewhat.
>
> what?
>
>
> > It could also mean that the GraphNodeProvider could be a lot more
> general,
> > as its name indicates it should be. The information about graphs hard
> coded
> > into the provider could then be moved to a Graph (or GraphNode or
> NamedGraph
> > or NamedGraphNode object). It would then be a lot clearer when looking at
> > JSR311 code what was being returned.
> >
> Again this is not specific to jsr311 code. I think my proposed
> GraphNodeProvider is quite generic but that additional features could be
> added.
>
>
> >
> > [[ ps: a thought
> > One could perhaps write implementations of such a NamedGraph that would
> > perhaps allow links to be followed outward (from accepted named graphs to
> > other graphs they link to, up to a certain number of hops).
> > ]]
> >
> Which seems to be exactly the context-switch allowed by ZZ-544
>
>
> >
> > >> Perhaps changing the name from GraphNodeProvider to
> > >> ContentGraphPlusOtherProvider would make more sense.
> > > It's a platform service that provides GraphNodes. Being a platform
> > > service implies it uses the platform means of getting trusted content.
> > > If it would just dereference URIs then it would probably be placed in a
> > > subpackage of clerezza.rdf.
> >
> > perhaps. But why not make things nice and general as explained above?
> >
> Where do you make something nice and more general? You're describing how
> clients can do stuff without GraphNodeProvider what they of course can do.
> And you're proposing new classes for what seems they can do as easily (but
> more consistently and thus more elegantly) with the existing classes.
>
>
>
> > Currently with changes to 544 and in particular the render method
> >
> >  public void render(UriRef resource, GraphNode context, String mode,
> >                        OutputStream os) throws IOException;
> >
> >
> > when a JSR311 class returns a URI,
>
> A jsr311 returns a GraphNode, if it returns a URI then type-rendering is
> not
> used (but another MessageBodyWriter, if available)
>
>
> > the renderer does not get the graph named by that
> > URI
>
> No renderlets get invoked, but if it gets a GraphNode the renderlet gets
> that GraphNode, which allows exploring the resource with whatever graph the
> jax-rs resource method chose to use. Choosing this graph is the business of
> the application logic and certainly does not belong in the renderlet.
>
>
> > but that graph and something else, defined in some unrelated package. For
> > me this
> > does not make it easy to understand the code.
> >
> Obviously you don't. It would be good if we found a way to improve
> understanding of the clerezza architecture without blocking its evolution by
> casting a -1
>
>
> >
> >
> > >>> This might not match an intuitive understanding of "authoritative"
> and
> > >> I'm
> > >>> happy to redefine the issue so that no confusion arises.
> > >>
> > >> One thing I am not quite clear about yet, is who writes to the content
> > >> graph? I see a lot of modules use it.
> > >>
> > > Modules can write to the content graph or add temporary additions to
> it.
> > > Actually writing to the content graph should happen when public and
> > trusted
> > > information is added. An information is considered trusted when added
> by
> > a
> > > user with respective permission or verified by privileged code (e.g.
> that
> > > allows the public to add see-also references).
> >
> > Good so say a trusted user of mine :joe truthfully says
> >
> What do you mean by "trusted user"? Trust with no limits? (admin rights?)
>
>
>
> >
> >  b:danbri foaf:knows :joe .
> >
> > then currently when I ask for http://danbri.org/foaf.rdf#danbri
> > I will get a graph that contains the above triple even if danbri does not
> > make that
> > claim. Sometimes that is good, and sometimes not.
>
> Sometimes it's good to use clerezza, sometimes a hammer is more appropriate
> ;)
>
>
> > In many cases as I have argued it will be
> > important for me to know what danbri claims. Perhaps so I can ping him to
> > tell him about
> > my desire for him to claim friendship with me.
> >
> There's nothing to prevent you from writing such an application, or that
> would make it hard; it's just not what the GraphNodeProvider is for, and it
> definitely doesn't belong in the renderlet
>
>
> >
> > In the current API changes it won't be clear at all why when I ask for
> >
> >    <http://danbri.org/foaf.rdf#danbri>
> >
> > I get <http://danbri.org/foaf.rdf#danbri>  + 5 other graphs.
>
> graph/resource distinction, what does the addition of a person and a graph
> result in?
>
>
> > Or it will require the developer
> > to know the internals of clerezza to work this out, as I have just had to
> > do myself.
> >
> It may well be that clerezza will support sophisticated provenance
> mechanisms in future. I'm not sure, however, that blocking patches for the
> existing base architecture fosters this development.
>
> [...]
> >
>
> > >
> > >> 4. But instead of just having a GraphNodeProvider that just returns
> the
> > >> graph, you have added some twists to
> > >>  it and return more than just the named graph. There is nothing to say
> > that
> > >> a named graph cannot be the union
> > >>  of many other graphs, but it seems really arbitrary for me to get the
> > >> documentation of clerezza along with the
> > >>  triples of Tim Berners Lee's graph.
> > >>
> > >
> > >>  Somehow things have gone a bit haywire at the end here.
> > >
> > > If you call getGraph on a GraphNode you're leaving the scope of the
> > > GraphNode. Probably all this discussion would not be necessary if you
> > > had been using getNodeContext instead of getGraph. The NodeContext is
> > > what relates to the node. Using getGraph is a bit like doing the
> > > following:
> > >
> > > File file = SomeService.getFileDescribing("Tim Berners Lee")
> > > file.getParent().getParent().getParent().listChildrenRecursively()
> >
> > I don't think that is a good way of looking at what graphs are useful
> for.
> > Graphs are more
> > like bubbles in a comic strip.
> >
> Yes, but here it's not about graphs but resources (interpreted in a huge
> universe of beliefs)
>
>
> >
> > I argue this very carefully in "Are OO languages Autistic?"
> >
> >  http://blogs.oracle.com/bblfish/entry/are_oo_languages_autistic
> >
> > This is a fundamental new programming element provided in the semantic
> web.
> >
> > So the context as you are defining it is not what I am looking for. I am
> > really looking for the named graph - the entire claim made by a resource.
>
> We don't have the notion of claims made by a resource. But it would be easy
> to add a method to GraphNodeProvider returning only what the web offers as
> context of a resource
>
>
> > This can be seen by considering the example I gave above where someone
> adds
> > to the content graph information about Dan Brickley
> >
> >    b:danbri foaf:knows :joe .
> >
> > If I only get Dan Brickley's graph back that triple will not be there. If
> I
> > get Dan Brickley's  + the content graph, then that information will
> appear
> > even if I just ask for dan's node context. Also there may be information
> > about
> > people appearing in Dan Brickley's profile that are not directly linked
> by
> > him, that I will
> > also be interested in retrieving.
> >
> Use the render(uriRef) method to have those people rendered in their context.
>
>
> > So the context is not the tool I need - and I don't think my use cases
> are
> > special.
> >
> In the usecase of telling Dan that a true statement is missing, indeed what
> is provided by ZZ-540 is probably not what you need. But I think this
> usecase is more special than seeing all the comments a person posted and
> other facts which are assumed to be true (by platform trust boundaries)
> about a person. But this discussion is pointless as one feature doesn't
> prevent the other from being implemented.
>
>
> >
> > >
> > > The listed files can contain things that are completely unrelated to Tim
> > > Berners Lee
> > >
> > >
> > >
> > >> And I think this is due to a bit of confusion of the needs
> > >>  of your application with trying to keep the general architecture
> clean.
> > >>
> > > As I said, I did not make this particularly for an application, my wall
> > > application is merely a demo. When we want to do something like a
> > > foaf-browser we want to be able to display the resource in their
> context,
> > > just a usecase.
> >
> > Ok, so that is where our disagreement lies. The node context is in many
> > case not
> > at all what we want. It both adds too much information and not enough.
> >
> Who is "we"? You have one usecase where one should have less information
> accessible via the graphnode; there are other usecases (and imho more) where
> we want all the information we trust.
>
> When do we have not enough information?
>
>
> >
> > It may be that in the wall demo that is not visible. But in security
> > matters and
> > trust matters it will make a big difference.
> >
> Sorry, this seems like a demagogic null-sentence. Yes, we do care about
> speed, we do care about trust and we do care about security. And the
> proposed resolution of ZZ-540 and 544 brings an improvement, as it prevents
> data from other trust boundaries having to be part of the base graph for
> the
> graphnode returned by a root-resource method.
>
> >
> > >>
> > >>  Now on the whole I have learnt a lot about Clerezza by following
> this,
> > >> but I just can't say that this looks like
> > >> a good long term solution.  We are constantly moving around and around
> > >> something.
> > >>
> > > This is your impression. I hope my explanations to the concrete points
> > you
> > > mention could help changing this impression.
> >
> > I think it should now be clear how we can come to a solution that
> satisfies
> > both
> > our needs.
> >
>
> Yes: you revoke your -1 and you raise an issue for getting a resource
> description only from the web for your particular usecase.
> [...]
>
>
> >
> > > Would the rename be okay for you to accept the proposed patch? (I really
> > > would like to go back to productive work, so I rather have a horrible
> > name
> > > than seeing the project stalled by your veto).
> >
> > Well then the issue would be why this class should appear in the
> > CallbackRenderer.
> > No I think there should be a way from JSR311 code to ask precisely
> > for the
> > type of GraphNode it wants with very little coding. So that for the use
> > cases
> > where walking the content graph is the right thing to do it is one line
> of
> > code,
> > and for cases where something more precise is needed it is also just one
> > line of
> > code. In any case it should be easy when reading the code to understand
> > what is going
> > to be displayed.
> >
> I don't think this is particularly hard to do, and with the issue I
> proposed
> you raise above even easier.
>
>
> >
> > I hope this helps,
>
> Maybe this thread helps understanding the clerezza architecture better. Yet
> blocking development with a -1 seems quite a high price for this.
>
> Reto
>
>
> >
> >        Henry
> >
> > >
> > > Reto
> > >
> > > PS: You seem to be extensively using your right to veto while
> ignoring
> > > other's veto on your code, looking at
> > > https://issues.apache.org/jira/browse/CLEREZZA-515 I see that the
> > commits
> > > have not been reverted even more than one week after my veto and
> request
> > to
> > > revert.
> >
> > Hmm, I did revert that using git. But I am not sure why that does not
> > appear in the
> > commits for that issue.... I see you brought that up in another thread.
> >
> >
> > Social Web Architect
> > http://bblfish.net/
> >
> >
>

Re: sketch of a compromise solution -- Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Tommaso Teofili <to...@gmail.com>.

2011/6/7 Reto Bachmann-Gmuer <re...@trialox.org>

> On Mon, Jun 6, 2011 at 8:58 PM, Tsuyoshi Ito <ts...@trialox.org> wrote:
>
> > Hi Henry, hi Reto
> >
> > Can I remind you that we are working towards a release and IMO we should
> > not
> > change APIs anymore. You have now been discussing the GraphNodeProvider
> > for over 2 weeks and IMO your discussion isn't very constructive.
> >
> > I think there is some potential threat that bundles can add documentation
> > to the contentgraph as addition, as Henry mentioned. Therefore we should
> > create an issue.
> >
> > But as Tommaso mentioned if you don't trust GraphNodeProvider or
> > ContentGraphProvider don't use it. I have developed a lot of
> applications
> > (e.g. Quiz, Poll, Feed Manager) where I don't use the
> ContentGraphProvider
> > because I don't want to share the information or I don't trust it.
> >
> > I think we could rename the package of the GraphNodeProvider to make
> clear
> > that it depends on the contentgraph and its additions. So I suggest to
> > rename the package of the GraphNodeProvider to
> >
> > platform.content.graphnodeprovider
> >
> > Would be cool if we could find a solution.
> >
> Indeed, I'm happy with the renaming if Henry can withdraw his -1 and accept
> the proposed resolutions to the issues being discussed.
>

At this point I think we do need to find a solution to move forward and this
is what seems good to me.

I don't want to find ourselves in the situation Lucene/Solr project faced
lately with people vetoing and reverting each other's commit requiring a
formal report on how to deal with those issues from the ASF Board [1].
Being an Apache project consists also of building a community able to behave
in a smart and positive way even when there are controversial opinions and
it seems to me we have to learn a lot here, see one example:

> Most of any attempts I have made at closing an issue have been blocked by
> you. True my
> code is not perfect, but neither is yours. And that is a little bit why I
> am giving you
> a bit of heat here.


so where do we end up with this? This sounds like "I am annoying you since
you've been annoying me", is that positive?

I do think we should forget for a moment the recent controversies and keep
up the good work we've been doing for more than a year and a half with
Clerezza. Can we?

Tommaso

[1] :
https://svn.apache.org/repos/asf/lucene/board-reports/2011/special-board-report-may.txt



>
> Reto
>
>
>
> >
> > Cheers
> > Tsuy
> >
> >
> >
> >
> >
> >
> >
> > On Mon, Jun 6, 2011 at 7:28 PM, Henry Story <he...@bblfish.net>
> > wrote:
> >
> > >
> > > On 6 Jun 2011, at 18:07, Reto Bachmann-Gmuer wrote:
> > >
> > > > -1 for the moment on closing the issue. (not on removing the code)
> > > >>  Please answer the above points carefully.
> > > >
> > > > I can of course remove the code. I understood the statement above as
> > > > you explicitly not asking me to do so. The point is that it makes little
> > > > difference (apart from the couple of minutes needed for the revert):
> > your
> > > -1
> > > > is blocking further development.
> > > >
> > > > To your claim that I did not provide an explanation for my recent -1
> to
> > > your
> > > > resolution of CLEREZZA-515: A -1 without technical reasons is not
> > valid,
> > > I
> > > > provided 5 technical reasons with my -1. I refused to give further
> > > > explanations and enter discussion before you removed the
> compatibility
> > > and
> > > > api-description breaking patch. It took you more than a week to
> revert
> > > this
> > > > change, this was a serious impediment on using the code in trunk.
> > > >
> > > > May I ask you to be explicit:
> > > >
> > > > [ ] I stick to my -1, but I don't mind the code staying there as long
> > as
> > > no
> > > > new code is added depending on it
> > > > [ ] I want the patch for CLEREZZA-540 reverted
> > > > [ ] I withdraw my -1
> > >
> > > I have also provided ample technical reasons. But I am willing to look
> at
> > > your arguments (unlike your -1 on my code). The discussion seems to be
> > > evolving quite a lot. I want to look at this relation between JSR311
> code
> > > and the
> > >
> > > If I may say: adding code quickly to ZZ and then closing  issues
> quickly
> > > seems like a way to bypass scrutiny.
> > >
> > > Reviewing code as you mentioned recently in CLEREZZA-516 is a lot of
> work
> > > (indeed you asked me there to do more work refactoring things, to avoid
> > you
> > > having to do such reviewing). I am sure you can make a branch, like my
> > > bblfish branch, and work on that in the mean time.
> > >
> > > I'll be looking at your criticism of my JSR311 points and your
> > explanation
> > > for why you need this next. You should be happy that you get this free
> > > reviewing. Criticism is expensive to purchase.
> > >
> > > Henry
> > >
> > > Social Web Architect
> > > http://bblfish.net/
> > >
> > >
> >
>

Re: sketch of a compromise solution -- Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Reto Bachmann-Gmuer <re...@trialox.org>.

On Mon, Jun 6, 2011 at 8:58 PM, Tsuyoshi Ito <ts...@trialox.org> wrote:

> Hi Henry, hi Reto
>
> Can I remind you that we are working towards a release and IMO we should
> not
> change APIs anymore. You have now been discussing the GraphNodeProvider
> for over 2 weeks and IMO your discussion isn't very constructive.
>
> I think there is some potential threat that bundles can add documentation
> to the contentgraph as addition, as Henry mentioned. Therefore we should
> create an issue.
>
> But as Tommaso mentioned if you don't trust GraphNodeProvider or
> ContentGraphProvider don't use it. I have developed a lot of applications
> (e.g. Quiz, Poll, Feed Manager) where I don't use the ContentGraphProvider
> because I don't want to share the information or I don't trust it.
>
> I think we could rename the package of the GraphNodeProvider to make clear
> that it depends on the contentgraph and its additions. So I suggest to
> rename the package of the GraphNodeProvider to
>
> platform.content.graphnodeprovider
>
> Would be cool if we could find a solution.
>
Indeed, I'm happy with the renaming if Henry can withdraw his -1 and accept
the proposed resolutions to the issues being discussed.

Reto



>
> Cheers
> Tsuy
>
>
>
>
>
>
>
> On Mon, Jun 6, 2011 at 7:28 PM, Henry Story <he...@bblfish.net>
> wrote:
>
> >
> > On 6 Jun 2011, at 18:07, Reto Bachmann-Gmuer wrote:
> >
> > > -1 for the moment on closing the issue. (not on removing the code)
> > >>  Please answer the above points carefully.
> > >
> > > I can of course remove the code. I understood the statement above as
> > > you explicitly not asking me to do so. The point is that it makes little
> > > difference (apart from the couple of minutes needed for the revert):
> your
> > -1
> > > is blocking further development.
> > >
> > > To your claim that I did not provide an explanation for my recent -1 to
> > your
> > > resolution of CLEREZZA-515: A -1 without technical reasons is not
> valid,
> > I
> > > provided 5 technical reasons with my -1. I refused to give further
> > > explanations and enter discussion before you removed the compatibility
> > and
> > > api-description breaking patch. It took you more than a week to revert
> > this
> > > change, this was a serious impediment on using the code in trunk.
> > >
> > > May I ask you to be explicit:
> > >
> > > [ ] I stick to my -1, but I don't mind the code staying there as long
> as
> > no
> > > new code is added depending on it
> > > [ ] I want the patch for CLEREZZA-540 reverted
> > > [ ] I withdraw my -1
> >
> > I have also provided ample technical reasons. But I am willing to look at
> > your arguments (unlike your -1 on my code). The discussion seems to be
> > evolving quite a lot. I want to look at this relation between JSR311 code
> > and the
> >
> > If I may say: adding code quickly to ZZ and then closing  issues quickly
> > seems like a way to bypass scrutiny.
> >
> > Reviewing code as you mentioned recently in CLEREZZA-516 is a lot of work
> > (indeed you asked me there to do more work refactoring things, to avoid
> you
> > having to do such reviewing). I am sure you can make a branch, like my
> > bblfish branch, and work on that in the mean time.
> >
> > I'll be looking at your criticism of my JSR311 points and your
> explanation
> > for why you need this next. You should be happy that you get this free
> > reviewing. Criticism is expensive to purchase.
> >
> > Henry
> >
> > Social Web Architect
> > http://bblfish.net/
> >
> >
>

Re: sketch of a compromise solution -- Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Tsuyoshi Ito <ts...@trialox.org>.

Hi Henry, hi Reto

Can I remind you that we are working towards a release and IMO we should not
change APIs anymore. You have now been discussing the GraphNodeProvider
for over 2 weeks and IMO your discussion isn't very constructive.

I think there is some potential threat that bundles can add documentation
to the contentgraph as addition, as Henry mentioned. Therefore we should
create an issue.

But as Tommaso mentioned if you don't trust GraphNodeProvider or
ContentGraphProvider don't use it. I have developed a lot of applications
(e.g. Quiz, Poll, Feed Manager) where I don't use the ContentGraphProvider
because I don't want to share the information or I don't trust it.

I think we could rename the package of the GraphNodeProvider to make clear
that it depends on the contentgraph and its additions. So I suggest to
rename the package of the GraphNodeProvider to

platform.content.graphnodeprovider

Would be cool if we could find a solution.

Cheers
Tsuy

On Mon, Jun 6, 2011 at 7:28 PM, Henry Story <he...@bblfish.net> wrote:

>
> On 6 Jun 2011, at 18:07, Reto Bachmann-Gmuer wrote:
>
> > -1 for the moment on closing the issue. (not on removing the code)
> >>  Please answer the above points carefully.
> >
> > I can of course remove the code. I understood the statement above as you
> > explicitly not asking me to do so. The point is that it makes little
> > difference (apart from the couple of minutes needed for the revert): your
> -1
> > is blocking further development.
> >
> > To your claim that I did not provide an explanation for my recent -1 to
> your
> > resolution of CLEREZZA-515: A -1 without technical reasons is not valid,
> I
> > provided 5 technical reasons with my -1. I refused to give further
> > explanations and enter discussion before you removed the compatibility
> and
> > api-description breaking patch. It took you more than a week to revert
> this
> > change, this was a serious impediment on using the code in trunk.
> >
> > May I ask you to be explicit:
> >
> > [ ] I stick to my -1, but I don't mind the code staying there as long as
> no
> > new code is added depending on it
> > [ ] I want the patch for CLEREZZA-540 reverted
> > [ ] I withdraw my -1
>
> I have also provided ample technical reasons. But I am willing to look at
> your arguments (unlike your -1 on my code). The discussion seems to be
> evolving quite a lot. I want to look at this relation between JSR311 code
> and the
>
> If I may say: adding code quickly to ZZ and then closing  issues quickly
> seems like a way to bypass scrutiny.
>
> Reviewing code as you mentioned recently in CLEREZZA-516 is a lot of work
> (indeed you asked me there to do more work refactoring things, to avoid you
> having to do such reviewing). I am sure you can make a branch, like my
> bblfish branch, and work on that in the mean time.
>
> I'll be looking at your criticism of my JSR311 points and your explanation
> for why you need this next. You should be happy that you get this free
> reviewing. Criticism is expensive to purchase.
>
> Henry
>
> Social Web Architect
> http://bblfish.net/
>
>

Re: sketch of a compromise solution -- Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Henry Story <he...@bblfish.net>.

On 6 Jun 2011, at 18:07, Reto Bachmann-Gmuer wrote:

> -1 for the moment on closing the issue. (not on removing the code)
>>  Please answer the above points carefully.
> 
> I can of course remove the code. I understood the statement above as you
> explicitly not asking me to do so. The point is that it makes little
> difference (apart from the couple of minutes needed for the revert): your -1
> is blocking further development.
> 
> To your claim that I did not provide an explanation for my recent -1 to your
> resolution of CLEREZZA-515: A -1 without technical reasons is not valid, I
> provided 5 technical reasons with my -1. I refused to give further
> explanations and enter discussion before you removed the compatibility and
> api-description breaking patch. It took you more than a week to revert this
> change, this was a serious impediment on using the code in trunk.
> 
> May I ask you to be explicit:
> 
> [ ] I stick to my -1, but I don't mind the code staying there as long as no
> new code is added depending on it
> [ ] I want the patch for CLEREZZA-540 reverted
> [ ] I withdraw my -1

I have also provided ample technical reasons. But I am willing to look at your arguments (unlike your -1 on my code). The discussion seems to be evolving quite a lot. I want to look at this relation between JSR311 code and the 

If I may say: adding code quickly to ZZ and then closing  issues quickly seems like a way to bypass scrutiny. 

Reviewing code as you mentioned recently in CLEREZZA-516 is a lot of work (indeed you asked me there to do more work refactoring things, to avoid you having to do such reviewing). I am sure you can make a branch, like my bblfish branch, and work on that in the mean time. 

I'll be looking at your criticism of my JSR311 points and your explanation for why you need this next. You should be happy that you get this free reviewing. Criticism is expensive to purchase.

Henry

Social Web Architect
http://bblfish.net/

Re: sketch of a compromise solution -- Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Reto Bachmann-Gmuer <re...@trialox.org>.

On Mon, Jun 6, 2011 at 5:34 PM, Henry Story <he...@bblfish.net> wrote:

>
> On 3 Jun 2011, at 13:29, Reto Bachmann-Gmuer wrote:
>
> >>
> >> but that graph and something else, defined in some unrelated package.
> For
> >> me this does not make it easy to understand the code.
> >>
> > Obviously you don't. It would be good if we found a way to improve
> > understanding of the clerezza architecture without blocking its evolution
> > by casting a -1
>
> If I had cast a -1 then you would have had to remove the code without my
> needing to give you an explanation, as you did recently to my code - only
> providing an explanation a week after the fact. So please don't exaggerate.
> If not closing an issue is the equivalent to a -1, then you are the biggest
> disher-out of -1 on this list.
>

You wrote (a few hours before the end of the 72h voting period):

-1 for the moment on closing the issue. (not on removing the code)
>   Please answer the above points carefully.

I can of course remove the code. I understood the statement above as you
explicitly not asking me to do so. The point is that it makes little
difference (apart from the couple of minutes needed for the revert): your -1
is blocking further development.

To your claim that I did not provide an explanation for my recent -1 to your
resolution of CLEREZZA-515: A -1 without technical reasons is not valid, I
provided 5 technical reasons with my -1. I refused to give further
explanations and enter discussion before you removed the compatibility and
api-description breaking patch. It took you more than a week to revert this
change, this was a serious impediment on using the code in trunk.

May I ask you to be explicit:

[ ] I stick to my -1, but I don't mind the code staying there as long as no
new code is added depending on it
[ ] I want the patch for CLEREZZA-540 reverted
[ ] I withdraw my -1

> You added a GraphNodeProvider. I tried to use it to get GraphNodes for
> remote URIs, and discovered that I got the whole content graph,

documentation graph, and a number of other graphs with it too.

It returns a GraphNode, which is something other than a bunch of graphs as you
repeatedly claim. I wrote in my very first reply (and reiterated a few times
since) that you're invoking the wrong method to discover things about the
resource.

I wrote

> getGraph returns "the graph the node represented by this instance is in",
> as I mentioned before this could be GGG. On a GraphNode you should usually
> not invoke getGraph, the object exists to hop from resource to resource.

Henry Story <he...@bblfish.net> wrote:

> The reasons for introducing this were never explained - you only just did
> it now, with reference to a discobits application that is not available in
> ZZ trunk.
>

The DiscobitsTypeHandler is quite a crucial component in Clerezza and it is
indeed part of trunk. CLEREZZA-541 was created together with issue 540 and
by being declared as a dependency of CLEREZZA it should have been visible to
you. The DiscobitsTypeHandler is the default type-handler returning a
GraphNode as response to HTTP GET requests. I would expect you to examine
the situation more carefully before vetoing changes.

Reto

> I am trying to read your explanation there and am not finding it easy to
> understand....
>
> Henry
>
>
> Social Web Architect
> http://bblfish.net/
>
>

Re: sketch of a compromise solution -- Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Henry Story <he...@bblfish.net>.

On 3 Jun 2011, at 13:29, Reto Bachmann-Gmuer wrote:

>> 
>> but that graph and something else, defined in some unrelated package. For
>> me this does not make it easy to understand the code.
>> 
> Obviously you don't. Would be good we find way to improve understanding of
> the clerezza architecture without requiring blocking the evolution by
> casting -1

If I had cast a -1 then you would have had to remove the code without my needing to give you an explanation, as you did recently to my code - only providing an explanation a week after the fact. So please don't exaggerate. If not closing an issue is the equivalent to a -1, then you are the biggest disher-out of -1 on this list. 

You added a GraphNodeProvider. I tried to use it to get GraphNodes for remote URIs, and discovered that I got the whole content graph, documentation graph, and a number of other graphs with it too. The reasons for introducing this were never explained - you only just did it now, with reference to a discobits application that is not available in ZZ trunk. I am trying to read your explanation there and am not finding it easy to understand....

Henry

Social Web Architect
http://bblfish.net/

Re: sketch of a compromise solution -- Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Reto Bachmann-Gmuer <re...@trialox.org>.

On Wed, Jun 1, 2011 at 7:05 PM, Henry Story <he...@bblfish.net> wrote:
[...]

> > Yes, TcManager is the main entry point to the rdf data. I don't see any
> code
> > smell here. I a classical RDBMS java application a component will
> typically
> > use jdbc and rely on other components that use jdbc, here it's TcManager
> > instead of jdbc,
>
> The thing to do would be to look at the number of  connections that are
> generated to the
> DB. My feeling is that one could do the same and have only 1 or 2
> connections to the DB.
> Here we could quickly end up with 10 times more....
>
I don't know what you mean by DB connections and where you see this factor
of 10.

[...]

> >
> >>  On my fresh install of ZZ that is 20 times more information than the
> >> initial graph.
> >>
> > - What is 20 times more?
> > - What do you mean by "get"?
> >
> > A GraphNode points to a resource and is designed for browsing from
> > resource to resource. It is not a graph but a node in a graph. The object
> > is associated with a base graph, which is used to identify the properties
> > and instantiate another GraphNode when hopping to a property value. The
> > underlying graph could for instance be TimBL's GGG (giant global graph,
> > aka the web).
>
> yes  ((but clearly you don't want to dereference the whole web when
> working...))
>
No, and nobody is doing this. If I include the URI pattern
<http(s)://.*/(.*/)*> I'm not blowing this mail up to the size of the web.


>
> Also there will be contradictions in the information on the web. Some
> people may trust some graphs, other trust others.

Right, that's why the GraphNodeProvider trusts only the content graph (which
is trusted qua being a platform service) and the graph resulting from
dereferencing the resource (trusted by conventional web trust).


> Graphs can be merged easily in RDF - IF they  are believed both to be true.
> But what is believed to be true will depend on what possible world you
> believe yourself to be in. I argued this in "Beatnik: change your mind"
> in more detail, if that helps for people following this discussion
>
>  http://blogs.oracle.com/bblfish/entry/beatnik_change_your_mind
>
> From the point of view of WebID and security I want to be able to tell WHO
> said what. In many applications being able to be very clear about where
> something was said is going to  be essential to giving good feedback. Some
> example coming from the field I am working on below.
>

> So for a foaf-browser, I want to know when TimBl declares someone to be a
> friend, and differentiate that from when someone declares himself to be a
> friend of TimBL, which is a very different thing.

With the current service you have what TimBL says plus the platform-wide
truths of the content graph. This may contain things like a link back to you
(the owner of the platform instance) or a statement like "TimBL rdf:type
ex:Spammer", which might not be published in TimBL's profile.


> When I get Dan Brickley's graph I may want to know all the people he
> mentions in his foaf profile - even if he does not mention them as
> foaf:knows related to him.

does this provide a new point?


> If the GraphNodeProvider returns a union graph of the documentation graph,

Again no: we're not returning a union graph, we're returning a GraphNode; the
underlying graph is an implementation detail. (I was wondering whether the
getGraph method could be made less visible (protected or private) to avoid
this confusion.)


> content graph,... and his foaf profile then when searching for all the
> foaf:Person

You don't search a GraphNode for all foaf:Person but the GraphNode
represents the foaf:Person you asked for.


> I will get the documentation writers too, the writers of content in the
> content graph, and who knows what else...

You will have properties pointing from that person to all the comments he
left on the local instance, which can be quite handy (and which come from the
underlying content graph, as they are probably not also contained in the
remote foaf profile).


> many people will have no direct relation to Dan at all. People can say true
> things about Dan but those not be things Dan himself would say.
>
Yes, we only consider as true what we say ourselves (i.e. the content graph)
and in particular circumstances also what Dan says.


>
> I believe these use cases are not limited to the foaf browser but to a very
> large category of semantic web applications. Give me some linked data
> application, and I will easily come up with use cases of the same kind.
>
That's why GraphNodeProvider is a generic service, and it's not true that it
was designed with a particular and very specific application of mine in
mind.
[...]

> >
> > I thought I had heard people mention issues with speed on this list.
> >>
> > You may check archives of this list at:
> > http://mail-archives.apache.org/mod_mbox/incubator-clerezza-dev/
>
> Do you have a precise thread?
>
No, it's you who thought you had heard people mention issues with speed. Check
the archive and construct an argument if those speed issues relate to the issue
at hand; otherwise your remark "I thought I had heard people mention issues
with speed on this list" is purely demagogic and a hindrance to an effective
and fruitful discussion (like saying "I've heard people finding the clerezza
code hard to read" when justifying a -1 against some code).

> [...]
> >>
> >> What I am wondering is in what cases is this needed? It seems like this
> may
> >> indeed what a particular application may require, but does it have to be
> >> a general service? The name certainly suggests a very general service,
> not
> >> one required for a particular application.
> >>
> > This is about ContentGraphProvider then, not about issue 540. It's the
> > ContentGraphProvider which provides the graph of instance-wide and public
> > information for the platform
>
> 540 the GraphNodeProvider delegates decisions to the ContentGraphProvider
> 544 uses the GraphNodeProvider that delegates to the ContentGraphProvider
> and ties it into the the core,
>    so that when a JSR311 class requests a named graph it then gets whatever
> the ContentGraphProvider decides
>    is trusted content.
>
I don't see any link to JSR311 but yes, the ContentGraphProvider provides
content that is  public and platform-wide trusted by default (the system
graph has higher trust).


>
> SKETCH OF A COMPROMISE SOLUTION
> ===============================
>
> Perhaps there is a way to allow make things more transparent, by for
> example having JSR311 classes that really
> want the full union returned by the  current ContentGraphProvider to have
> that, and other applications to get something more limited.
>
Again, I see no link to JSR311. But I think this might be the mentioned
possible future enhancement to allow clients to specify the trust
boundaries.


> I suggest that we think of naming the content graph or at least build
> something so that those who need the content graph can ask for it clearly,
> and those who don't can make sure they don't get it.
>
The content graph is named; the virtual content graph isn't. We could
give the virtual content graph a name, thus making it accessible in SPARQL
queries (using a TcProvider), but this seems completely unrelated to the
issue at hand.


>
>  - It may be useful then to name the content graph - it would be a union
> graph that could be specified by a SPARQL UNION
>    query for example or the equivalent.
>
Yes, if we give the virtual content graph a name it can be used by the SPARQL
endpoint.


>
>  - have a JSR311 class return a NamedGraphNode (or something like that)
> which can then call the CallbackRenderer.
>
What should a NamedGraphNode be? A GraphNode can wrap a named or an anonymous
resource; I see no need for a subclass for UriRef-GraphNodes. But we have
been discussing this in another thread, and I don't see a relation to the
issue at hand.

I don't know what you mean by the NamedGraphNode calling CallbackRenderer;
the CallbackRenderer is called by the renderlet.

>    The NamedGraphNode could name the union graph, or could be a union graph
> - by reference. So all the code would do is
>
>    return new UnionNamedGraphNode(new
> UriRef("urn:x-localhost/contentGraph"),new UriRef("
> http://remote.example/resource/"))
>

I don't get which one is the resource and which one is the name of the
underlying graph. What's the difference from the current GraphNodes?


>
>    or some nice syntactic sugar for that. For example for apps requiring
> the union of Content + other  that you need,  something like the following
> would be neat
>
>    return new ContentGraphNodePlus(UriRef("http://remote.example/resource/
> "))
>

If you're talking about GraphNodes (but I'm not sure whether you in fact mean
graphs) then I don't see what you're introducing that is new:

return new GraphNode(new UriRef("http://remote.example/resource/"), new
UnionMGraph(contentGraph, fooGraph, barGraph))
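
The return statement above can be illustrated with a small, self-contained sketch. The classes below (Triple, UnionView, Node) are toy stand-ins invented for this example and are not the Clerezza API; the "w3:timbl" triple is made-up sample data, while the ":joe" triple is the one used as an example elsewhere in this thread. The sketch only shows how a node paired with a union of two graphs exposes properties from both when hopping from resource to resource.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy stand-in for an RDF triple (hypothetical, not a Clerezza class).
final class Triple {
    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

// Read-only union over several graphs; the member graphs are referenced, not copied.
final class UnionView {
    private final List<Set<Triple>> parts;

    UnionView(List<Set<Triple>> parts) {
        this.parts = parts;
    }

    // All distinct objects of (subject, predicate, ?) across the member graphs.
    List<String> objects(String subject, String predicate) {
        List<String> result = new ArrayList<>();
        for (Set<Triple> graph : parts) {
            for (Triple t : graph) {
                boolean matches = t.subject.equals(subject) && t.predicate.equals(predicate);
                if (matches && !result.contains(t.object)) {
                    result.add(t.object);
                }
            }
        }
        return result;
    }
}

// A resource together with the base graph used to look up its properties.
final class Node {
    final String resource;
    final UnionView base;

    Node(String resource, UnionView base) {
        this.resource = resource;
        this.base = base;
    }

    // Hop to the values of a property; each target keeps the same base graph.
    List<Node> hop(String predicate) {
        List<Node> targets = new ArrayList<>();
        for (String object : base.objects(resource, predicate)) {
            targets.add(new Node(object, base));
        }
        return targets;
    }
}

final class UnionNodeDemo {
    static Node danbri() {
        Set<Triple> contentGraph = new LinkedHashSet<>();
        contentGraph.add(new Triple("b:danbri", "foaf:knows", ":joe"));
        Set<Triple> remoteGraph = new LinkedHashSet<>();
        remoteGraph.add(new Triple("b:danbri", "foaf:knows", "w3:timbl")); // made-up data
        return new Node("b:danbri", new UnionView(Arrays.asList(contentGraph, remoteGraph)));
    }

    public static void main(String[] args) {
        for (Node friend : danbri().hop("foaf:knows")) {
            System.out.println(friend.resource);
        }
    }
}
```

In this toy model the choice of member graphs is made by whoever constructs the node, which is the point of the return statement: the caller decides what goes into the base graph.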


>
>    These objects would just be holders of the graph name(s), which the
> TcManagers can then hook up into the underlying triple store. Something
> along those lines would be very nice. One could easily write applications
> that get union of contents as you wish, and I could easily get very
> precisely defined graphs for security based application, or more flexible
> linked data graphs too.

You can do this (see example return statement above)


> It could also avoid the iterative way the GraphNodeProvider currently
> works.
>
Is this a reference to the if-then statements you criticized without ever
telling me what you meant, despite my repeatedly asking? Or what "iterative
way" are you referring to?


>
> Having something like that would mean  that perhaps the  addition of the
> new method in CallbackRenderer
>
>   public void render(UriRef resource, GraphNode context, String mode,
>                        OutputStream os) throws IOException;
>
A GraphNode is resource+context, where the context is a graph. Now the
renderlet gets a GraphNode to render; it shouldn't get any context from
anywhere else. The new render method is there exactly to render a resource
with a different context not available directly to the renderlet. If the outer
renderlet already has (or can generate) the context for the nested rendering
then this can be done with the existing (pre-540) infrastructure.


>
>
> would no longer be needed, or would be adapted somewhat.

what?


> It could also mean that the GraphNodeProvider could be a lot more general,
> as its name indicates it should be. The information about graphs hard coded
> into the provider could then be moved to a Graph (or GraphNode or NamedGraph
> or NamedGraphNode object). It would then be a lot clearer when looking at
> JSR311 code what was being returned.
>
Again, this is not specific to JSR311 code. I think my proposed
GraphNodeProvider is quite generic, but additional features could be
added.


>
> [[ ps: a thought
> One could perhpas write implementations of such a NamedGraph that would
> perhaps allow links to be followed outward (from accepted named graphs to
> others graphs it links to, up to a certain number of hops).
> ]]
>
Which seems to be exactly the context-switch allowed by ZZ-544


>
> >> Perhaps changing the name from GraphNodeProvider to
> >> ContentGraphPlusOtherProvider would make more sense.
> > It's a platform service that provides GraphNodes. Being a platform
> service
> > implies it usesthe platform means of getting trusted content. If it would
> > just dereference URIs the it would probably be placed in a subpackage of
> > clerezza.rdf.
>
> perhaps. But why not make things nice and general as explained above?
>
Where do you make something nice and more general? You're describing how
clients can do stuff without GraphNodeProvider, which they of course can do.
And you're proposing new classes for what it seems they can do as easily (but
more consistently and thus more elegantly) with the existing classes.



> Currently with changes to 544 and in particular the render method
>
>  public void render(UriRef resource, GraphNode context, String mode,
>                        OutputStream os) throws IOException;
>
>
> when a JSR311 class returns a URI,

A JSR311 resource method returns a GraphNode; if it returns a URI then
type-rendering is not used (but another MessageBodyWriter, if available).


> the renderer does not get the graph named by that
> URI

No renderlets get invoked, but if it gets a GraphNode the renderlet gets
that GraphNode, which allows exploring the resource with whatever graph the
JAX-RS resource method chose to use. Choosing this graph is the business of
the application logic and certainly does not belong in the renderlet.


> but that graph and something else, defined in some unrelated package. For
> me this
> does not make it easy to understand the code.
>
Obviously you don't. It would be good if we found a way to improve
understanding of the Clerezza architecture without blocking its evolution by
casting a -1.


>
>
> >>> This might not match an intuitive understanding of "authoritative" and
> >> I'm
> >>> happy to redefine the issue so that no confusion arises.
> >>
> >> One thing I am not quite clear about yet, is who writes to the content
> >> graph? I see a lot of modules use it.
> >>
> > Modules can write to the content graph or add temporary additions to it.
> > Actually writing to the content graph should happen when public and
> trusted
> > information is added. An information is considered trusted when added by
> a
> > user with respective permission or verified by privileged code (e.g. that
> > allows the public to add see-also references).
>
> Good so say a trusted user of mine :joe truthfully says
>
What do you mean by "trusted user"? Trust with no limits? (Admin rights?)



>
>  b:danbri foaf:knows :joe .
>
> then currently when I ask for http://danbri.org/foaf.rdf#danbri
> I will get a graph that contains the above triple even if danbri does not
> make that
> claim. Sometimes that is good, and sometimes not.

Sometimes it's good to use clerezza, sometimes a hammer is more appropriate
;)


> In many cases as I have argued it will be
> important for me to know what danbri claims. Perhaps so I can ping him to
> tell him about
> my desire for him to claim friendship with me.
>
There's nothing that would prevent you from writing such an application or
make it hard; it's just not what the GraphNodeProvider is for, and it
definitely doesn't belong in the renderlet.


>
> In the current API changes it won't be clear at all why when I ask for
>
>    <http://danbri.org/foaf.rdf#danbri>
>
> I get <http://danbri.org/foaf.rdf#danbri>  + 5 other graphs.

graph/resource distinction, what does the addition of a person and a graph
result in?


> Or it will require the developer
> to know the internals of clerezza to work this out, as I have just had to
> do myself.
>
It may well be that Clerezza will support sophisticated provenance
mechanisms in future. I'm not sure, however, that blocking patches for the
existing base architecture fosters this development.

[...]
>

> >
> >> 4. But instead of just having a GraphNodeProvider that just returns the
> >> graph, you have added some twists to
> >>  it and return more than jut the named graph. There is nothing to say
> that
> >> a named graph cannot be the union
> >>  of many other graphs, but it seems really arbitrary for me to get the
> >> documentation of clerezza along with the
> >>  triples of Tim Berners Lee's graph.
> >>
> >
> >>  Somehow things have gone a bit haywire at the end here.
> >
> > If you call getGraph on a GraphNode you're leaving the scope of the
> > GraphNode. Probably all this discussion would not be necessary if had
> been
> > using getNodeContext instead of getGraph. The NodeContext is what related
> to
> > the node. Using getGraph is a bit like doing the following:
> >
> > File file = SomeService.getFileDescribing("Tim Berners Lee")
> > file.getParent().getParent().getParent().listChildrenRecursively()
>
> I don't think that is a good way of looking at what graphs are useful for.
> Graphs are more
> like bubbles in a comic strip.
>
Yes, but here it's not about graphs but resources (interpreted in a huge
universe of beliefs).


>
> I argue this very carefully in "Are OO languages Autistic?"
>
>  http://blogs.oracle.com/bblfish/entry/are_oo_languages_autistic
>
> This is a fundamental new programming element provided in the semantic web.
>
> So the context as you are defining it is not what I am looking for. I am
> really looking for the named graph - the entire claim made by a resource.

We don't have the notion of claims made by a resource. But it would be easy
to add a method to GraphNodeProvider returning only what the web offers as
context of a resource.


> This can be seen by considering the example I gave above where someone adds
> to the content graph information about Dan Brickley
>
>    b:danbri foaf:knows :joe .
>
> If I only get Dan Brickely's graph back that triple will not be there. If I
> get Dan Brickley's  + the content graph, then that information will appear
> even if I just ask for dan's node context. Also there may be information
> about
> people appearing in Dan Brickley's profile that are not directly linked by
> him, that I will
> also be interested in retrieving.
>
Use the render(uriRef) method to have those people rendered in their context.


> So the context is not the tool I need - and I don't think my use cases are
> special.
>
In the use case of telling Dan that a true statement is missing, what is
provided by ZZ-540 is indeed probably not what you need. But I think this
use case is more special than seeing all the comments a person posted and
other facts about a person which are assumed to be true (by platform trust
boundaries). But this discussion is pointless, as one feature doesn't
prevent the other from being implemented.


>
> >
> > The listed files can contain thigs that are completely unrelated to Tim
> > Berners Lee
> >
> >
> >
> >> And I think this is due to a bit of confusion of the needs
> >>  of your application with trying to keep the general architecture clean.
> >>
> > As I said, I did not made this particularly for an application, my wall
> > application is merely a demo. When we want to do something like a
> > foaf-browser we want to be able to display the resource in their context,
> > just a usecase.
>
> Ok, so that is where our disagreement lies. The node context is in many
> case not
> at all what we want. It both adds too much information and not enough.
>
Who is "we"? You have one use case where one should have less information
accessible via the GraphNode; there are other use cases (and IMHO more) where
we want all the information we trust.

When do we not have enough information?


>
> It may be that in the wall demo that is not visible. But in security
> matters and
> trust matters it will make a big difference.
>
Sorry, this seems like a demagogic null-sentence. Yes, we do care about
speed, we do care about trust and we do care about security. And the
proposed resolution of ZZ-540 and 544 brings an improvement, as it prevents
data from other trust boundaries having to be part of the base graph for the
graphnode returned by a root-resource method.

>
> >>
> >>  Now on the whole I have learnt a lot about Clerezza by following this,
> >> but I just can't say that this looks like
> >> a good long term solution.  We are constantly moving around and around
> >> something.
> >>
> > This is your impression. I hope my explanations to the concrete points
> you
> > mention could help changing this impression.
>
> I think it should now be clear how we can come to a solution that satisfies
> both
> our needs.
>

Yes: you revoke your -1 and you raise an issue for getting a resource
description only from the web for your particular use case.
[...]


>
> > Would the rename be okay for you to accept the proposed path? (I really
> > would like to go back to productive work, so I rather have a horrible
> name
> > than seeing the project stalled by your veto).
>
> Well then the issue would be why this class should appear in the
> CallbackRenderer.
> No I think there should be a way from JSR311 code to ask to ask precisely
> for the
> type of GraphNode it wants with very little coding. So that for the use
> cases
> where walking the content graph is the right thing to do it is one line of
> code,
> and for cases where something more precise is needed it is also just one
> line of
> code. In any case it should be easy when reading the code to understand
> what is going
> to be displayed.
>
I don't think this is particularly hard to do, and with the issue I proposed
you raise above it becomes even easier.


>
> I hope this helps,

Maybe this thread helps understanding the clerezza architecture better. Yet
blocking development with a -1 seems quite a high price for this.

Reto


>
>        Henry
>
> >
> > Reto
> >
> > PS: You seem to be extensively using you're right to veto while ignoring
> > other's veto on your code, looking at
> > https://issues.apache.org/jira/browse/CLEREZZA-515 I see that the
> commits
> > have not been reverted even more than one week after my veto and request
> to
> > revert.
>
> Hmm, I did revert that using git. But I am not sure why that does not
> appear in the
> commits for that issue.... I see you brought that up in another thread.
>
>
> Social Web Architect
> http://bblfish.net/
>
>

sketch of a compromise solution -- Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Henry Story <he...@bblfish.net>.

Ok, so this discussion is starting to be fruitful. I am learning things about how clerezza works, and the reasons for the decisions made, and I understand your point of view. This has led me to a proposal below that I think could satisfy both your requirements and mine in a way that is flexible and extensible for all future uses.

On 1 Jun 2011, at 01:52, Reto Bachmann-Gmuer wrote:

> On Sun, May 29, 2011 at 7:06 PM, Henry Story <he...@bblfish.net>wrote:
> 
>> 
>> On 26 May 2011, at 20:31, Reto Bachmann-Gmuer wrote:
>> 
>>> With CLEREZZA-540 I suggest a GraphNodeProvider-Service that returns a
>>> GraphNode given a named resource. Mainly this code that used to be in
>>> DiscoBitsTypeHandler which has been generalized.
>>> 
>>> The issue is described as:
>>> "Implement a platform service that returns GraphNodes for URIs. The
>>> GraphNode is the resource identified by that uri with as BaseGraph sources
>>> considered authoritative for that resource. "
>>> 
>>> Of course "considered authoritative" it not a very sharp description. The
>>> issue is labeled with "platform" which implies it is not a generic utility
>>> of clerezza.rdf but that it relies on platform default graphs.
>>> 
>>> The solution proposed in commit
>>> #1125477<http://svn.apache.org/viewvc?view=rev&rev=1125477>and
>>> #1125652 <http://svn.apache.org/viewvc?view=rev&rev=1125652> sets the
>>> basegarph as follows:
>>> - always trust the content graph
>>> - for remote resource trust the graph you get by dereferencing the uri
>>> - for resources in the user-uri space trust that user
>> 
>> Of course what one thinks of this patch depends completely on how this
>> provider gets used. It has quite a lot of limitations it seems to me as implemented
>> currently - which is of course not a final failing in a developing project,
>> but it does seem like a good discussion would help us narrow perhaps on
>> better solutions or on fixes, which is part of the reason I thought the ISSUE
>> should not yet be closed.
>> 
> Keeping the issue open blocks development, an issue is associated to a patch
> as long as an issue is open other code should not depend on code committed
> under this issue.

yes. I understand this now. Sorry for putting code into clerezza earlier and not
trying to fix it immediately. I have open issues in other open source projects that
have been worked on for over a year. For example the issue in Chromium
  http://code.google.com/p/chromium/issues/detail?id=29784


> This is way an issue should be closed shortly after the
> patch has been committed, and conversely why code shouldn't be kept in trunk
> it the issue cannot be closed. An open issue is more than a discussion
> reminder.

Yes. But one has to also leave enough time for debate on an issue before it gets closed, if there
is an issue that is raised. Here I was arguing that the combination of 540 and 544 is not right. 544 confirms my doubts about 540 because it shows how it is  intended to be used. The aim of having a discussion is to help find the best answers to a problem. 

>> 
>> The failings I mentioned in the ISSUE-540, and develop below are:
>> 
>> 1. it to relies on more and more Services that each require TcProviders.
>> This
>> feels very ad-hoc. The ones I mentioned are
>> 
>>   - UserManager: requires a TcManager
>>   - WebIdGraphsService: also requires a TcManager
>>   - PlatformConfig: requires TcManager
>>   - ContentGraphProvider: requires TcManager
>>   - TcManager
>> 
>>  Each of these is used, and makes a call to the database. And the
>> TcManager itself each time
>> iterates through a number of TcProviders.
>> 
> Yes, TcManager is the main entry point to the rdf data. I don't see any code
> smell here. I a classical RDBMS java application a component will typically
> use jdbc and rely on other components that use jdbc, here it's TcManager
> instead of jdbc,

The thing to do would be to look at the number of  connections that are generated to the
DB. My feeling is that one could do the same and have only 1 or 2 connections to the DB. 
Here we could quickly end up with 10 times more.... 

But even if I am wrong about this, I think there is a better solution for how we can 
do what you want in a way that would satisfy both of us. I detail that below.


>> 
>> 2. when asking for an external URI, you get the whole content graph too
>> 
> You get a GraphNode
> 
> 
>>  On my fresh install of ZZ that is 20 times more information than the
>> initial graph.
>> 
> - What is 20 times more?
> - What do you mean by "get"?
> 
> A GraphNode points to a resource and is designed for browsing from resource
> to resource. It is not a graph but a node in a graph. The object is
> associated with a base graph, which is used to identify the properties and
> instantiate another GraphNode when hopping to a property value. The
> underlying graph could for instance be TimBL's GGG (giant global graph, aka
> the web).

yes  ((but clearly you don't want to dereference the whole web when working...))

Also there will be contradictions in the information on the web. Some people may trust some graphs, others trust others. Graphs can be merged easily in RDF - IF they are both believed to be true. But what is believed to be true will depend on what possible world you believe yourself to be in. I argued this in "Beatnik: change your mind"
in more detail, if that helps for people following this discussion

  http://blogs.oracle.com/bblfish/entry/beatnik_change_your_mind

From the point of view of WebID and security I want to be able to tell WHO said what. In many applications being able to be very clear about where something was said is going to be essential to giving good feedback. Some examples from the field I am working on follow.

So for a foaf-browser, I want to know when TimBL declares someone to be a friend, and differentiate that from when someone declares himself to be a friend of TimBL, which is a very different thing. When I get Dan Brickley's graph I may want to know all the people he mentions in his foaf profile - even if he does not mention them as foaf:knows related to him. If the GraphNodeProvider returns a union graph of the documentation graph, content graph,... and his foaf profile then when searching for all the foaf:Person I will get the documentation writers too, the writers of content in the content graph, and who knows what else... Many of those people will have no direct relation to Dan at all. People can say true things about Dan, but those need not be things Dan himself would say.

I believe these use cases are not limited to the foaf browser but to a very large category of semantic web applications. Give me some linked data application, and I will easily come up with use cases of the same kind. 


>> How big is that going to become as one's content graph grows over time?
> 
> The underlying graph would be the size of the virtual content graph + the
> size of the remote graph - duplicate triples
> 
>> Is this not going to create a huge bottleneck very quickly?
> 
> No, the time complexity for accessing a property of graphnode grows linearly
> to the number of graphs in the graph-union (which in this case is 2) but is
> only O(log n) for the number of triples  in the indexed triple store.
> 
>> I thought I had heard people mention issues with speed on this list.
>> 
> You may check archives of this list at:
> http://mail-archives.apache.org/mod_mbox/incubator-clerezza-dev/

Do you have a precise thread?

> 
> I mentioned performance concern directly to you when I saw that your
> EasyGraph class copies to memory all triples of a TripleCollection rather
> than accessing them in the original graph. So yes, the performance
> implications of our design choices should always be in our minds.

That is perfectly good criticism. I think you may have patched it even...
If not I should definitely fix that.

> 
>> 
>> So to verify this do the following:
>> 
>> zz> import org.apache.clerezza.platform.graphnodeprovider._
>> zz> val gnp = $[GraphNodeProvider]
>> zz> val tbl = gnp.get(new UriRef("
>> http://www.w3.org/People/Berners-Lee/card#i"))
>> zz> tbl.getGraph.size
>> res0: Int = 1878
>> 
> getGraph returns "the graph the node represented by this instance is in",
> as I mentioned before this could be GGG. On a GraphNode you should usually
> not invoke getGraph, the object exists to hop from resource to resource.

(Well, I thought it revealed how much information is being shipped around, but apparently
not, as you argue below.)

> 
>> If I get Tim Berners Lee's Graph on the command line I find
>> 
>> $ rapper http://www.w3.org/People/Berners-Lee/card | wc
>> rapper: Parsing URI http://www.w3.org/People/Berners-Lee/card with parser rdfxml
>> rapper: Serializing with serializer ntriples
>> rapper: Parsing returned 78 triples
>>     78     380    9978
>> 
>> So here we have 78 triples, but the resulting answer schlepped around is 20
>> times bigger - on a new installation!
>> 
> If by schlepped around you mean that they are copied in memory or something
> like that this is not the case. It's a reference, that's all.

Ah ok. Some graphs copy all the information into a set, others work by reference...
I had not cottoned onto that. The ones I have been using were based on Hashsets.

So that is less of a problem then. 
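
The difference between copying triples into a set and working by reference can be sketched as follows. This is only an illustration under stated assumptions: `UnionView` is a hypothetical class written with plain java.util collections, not part of the Clerezza API, and strings stand in for triples. It holds references to two underlying sets and computes its size as "first + second - duplicates", as described above.

```java
import java.util.AbstractSet;
import java.util.HashSet;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Set;

/** Hypothetical read-only union of two sets; it keeps references and copies nothing. */
final class UnionView<T> extends AbstractSet<T> {
    private final Set<T> first;
    private final Set<T> second;

    UnionView(Set<T> first, Set<T> second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean contains(Object o) {
        // One membership test per underlying set: linear in the number of sets.
        return first.contains(o) || second.contains(o);
    }

    @Override
    public int size() {
        // |first| + |second| - duplicates
        int n = first.size();
        for (T t : second) {
            if (!first.contains(t)) {
                n++;
            }
        }
        return n;
    }

    @Override
    public Iterator<T> iterator() {
        final Iterator<T> a = first.iterator();
        final Iterator<T> b = second.iterator();
        return new Iterator<T>() {
            private T pending;
            private boolean ready;

            @Override
            public boolean hasNext() {
                if (ready) {
                    return true;
                }
                if (a.hasNext()) {
                    pending = a.next();
                    ready = true;
                    return true;
                }
                // Skip elements of the second set already seen in the first.
                while (b.hasNext()) {
                    T candidate = b.next();
                    if (!first.contains(candidate)) {
                        pending = candidate;
                        ready = true;
                        return true;
                    }
                }
                return false;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                return pending;
            }
        };
    }
}

final class UnionViewDemo {
    public static void main(String[] args) {
        Set<String> content = new HashSet<>();
        content.add("a");
        content.add("b");
        Set<String> remote = new HashSet<>();
        remote.add("b");
        remote.add("c");

        Set<String> union = new UnionView<>(content, remote);
        System.out.println(union.size()); // 3: "b" is counted once

        // The view follows later changes because nothing was copied.
        content.add("d");
        System.out.println(union.size()); // 4
    }
}
```

The cost of each lookup here depends entirely on the underlying sets; the O(log n) figure quoted above applies to an indexed triple store, not to this toy.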

> 
> 
>> 
>> I wondered what was going on, because to my surprise on a new installation
>> the Content Graph contains only 1 triple.
>> So I looked into a running instance of ContentGraphProvider and found that
>> the additions array contained the following graphs in addition to the
>> content graph:
>> 
>> - <urn:x-localinstance:/documentation.graph>   1002 triples
>> - <urn:x-localinstance:/config.graph>           176 triples
>> - <urn:x-localinstance:/web-resources.graph>    621 triples
>> - <urn:x-localinstance:/enrichment.graph>         0 triples
>> 
>> So that does then indeed add up to the number.
>> 
> Great, you found the answer :)
> 
> 
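
As a quick check of the figures quoted above, the per-graph counts can be added up. This sketch assumes that no triple occurs in more than one of the graphs (otherwise the union would be smaller than the sum); `TripleCountCheck` is a made-up helper name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TripleCountCheck {
    /** Triple counts as reported in this thread, keyed by graph name. */
    static Map<String, Integer> reportedCounts() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("content graph", 1);
        counts.put("urn:x-localinstance:/documentation.graph", 1002);
        counts.put("urn:x-localinstance:/config.graph", 176);
        counts.put("urn:x-localinstance:/web-resources.graph", 621);
        counts.put("urn:x-localinstance:/enrichment.graph", 0);
        counts.put("http://www.w3.org/People/Berners-Lee/card", 78);
        return counts;
    }

    /** Sums the counts; equals the union size only if the graphs share no triples. */
    static int total(Map<String, Integer> counts) {
        int sum = 0;
        for (int n : counts.values()) {
            sum += n;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(total(reportedCounts())); // 1878
    }
}
```

The local graphs contribute 1800 triples and the remote document 78, which matches the 1878 reported by tbl.getGraph.size.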
>> 
>> What I am wondering is in what cases is this needed? It seems like this may
>> indeed what a particular application may require, but does it have to be
>> a general service? The name certainly suggests a very general service, not
>> one required for a particular application.
>> 
> This is about ContentGraphProvider then, not about issue 540. It's the
> ContentGraphProvider which provides the graph of instance-wide and public
> information for the platform

540: the GraphNodeProvider delegates decisions to the ContentGraphProvider.
544: uses the GraphNodeProvider, which delegates to the ContentGraphProvider, and ties it into the core,
    so that when a JSR311 class requests a named graph it gets whatever the ContentGraphProvider decides
    is trusted content.

SKETCH OF A COMPROMISE SOLUTION 
===============================

Perhaps there is a way to make things more transparent, for example by letting JSR311 classes that really
want the full union returned by the current ContentGraphProvider have that, and other applications get something more limited.

I suggest that we think of naming the content graph or at least build something so that those who need the content graph can ask for it clearly, and those who don't can make sure they don't get it.

  - It may be useful then to name the content graph - it would be a union graph that could be specified by a SPARQL UNION
    query for example or the equivalent.

  - have a JSR311 class return a NamedGraphNode (or something like that) which can then call the CallbackRenderer.
    The NamedGraphNode could name the union graph, or could be a union graph - by reference. So all the code would do is
    
    return new UnionNamedGraphNode(new UriRef("urn:x-localhost/contentGraph"),new UriRef("http://remote.example/resource/"))
    
    or some nice syntactic sugar for that. For example, for apps that need the union of the content graph and some other graph, something like the following would be neat

    return new ContentGraphNodePlus(new UriRef("http://remote.example/resource/"))

    These objects would just be holders of the graph name(s), which the TcManagers can then hook up into the underlying triple store. Something along those lines would be very nice. One could easily write applications that get the union of contents as you wish, and I could easily get very precisely defined graphs for security-based applications, or more flexible linked data graphs too. It could also avoid the iterative way the GraphNodeProvider currently works.
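
A minimal sketch of such a holder, to make the idea concrete. Everything here is hypothetical: `NamedGraphSelection` and its factory methods do not exist in Clerezza, plain strings stand in for UriRef, and the sketch assumes that the resource URI is also the name of its graph (which does not hold for hash URIs or redirected resources).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical holder that only names the graphs a resource should be resolved against. */
final class NamedGraphSelection {
    /** Content graph name as used in the example in this thread. */
    static final String CONTENT_GRAPH = "urn:x-localhost/contentGraph";

    private final String resource;
    private final List<String> graphNames;

    private NamedGraphSelection(String resource, List<String> graphNames) {
        this.resource = resource;
        this.graphNames = Collections.unmodifiableList(new ArrayList<>(graphNames));
    }

    /** Precisely the graph named by the resource, nothing else. */
    static NamedGraphSelection only(String resource) {
        return new NamedGraphSelection(resource, Collections.singletonList(resource));
    }

    /** The content graph plus the graph named by the resource. */
    static NamedGraphSelection contentGraphPlus(String resource) {
        return new NamedGraphSelection(resource, Arrays.asList(CONTENT_GRAPH, resource));
    }

    String resource() {
        return resource;
    }

    List<String> graphNames() {
        return graphNames;
    }
}

final class NamedGraphSelectionDemo {
    public static void main(String[] args) {
        String remote = "http://remote.example/resource/";
        // Prints [http://remote.example/resource/]
        System.out.println(NamedGraphSelection.only(remote).graphNames());
        // Prints [urn:x-localhost/contentGraph, http://remote.example/resource/]
        System.out.println(NamedGraphSelection.contentGraphPlus(remote).graphNames());
    }
}
```

With something of this shape, the code returning the object states in one line which graphs it wants, and the component that resolves the names into actual graphs stays free of hard-coded choices.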

Having something like that would perhaps mean that the addition of the new method in CallbackRenderer

   public void render(UriRef resource, GraphNode context, String mode,
			OutputStream os) throws IOException;
 

would no longer be needed, or would be adapted somewhat. It could also mean that the GraphNodeProvider could be a lot more general, as its name indicates it should be. The information about graphs hard coded into the provider could then be moved to a Graph (or GraphNode or NamedGraph or NamedGraphNode object). It would then be a lot clearer when looking at JSR311 code what was being returned.

[[ ps: a thought
One could perhaps write implementations of such a NamedGraph that would allow links to be followed outward (from accepted named graphs to the other graphs they link to, up to a certain number of hops).
]]

>> Perhaps changing the name from GraphNodeProvider to
>> ContentGraphPlusOtherProvider would make more sense.
> It's a platform service that provides GraphNodes. Being a platform service
> implies it uses the platform means of getting trusted content. If it would
> just dereference URIs then it would probably be placed in a subpackage of
> clerezza.rdf.

perhaps. But why not make things nice and general as explained above?
Currently with changes to 544 and in particular the render method

  public void render(UriRef resource, GraphNode context, String mode,
			OutputStream os) throws IOException;


when a JSR311 class returns a URI, the renderer does not get the graph named by that
URI but that graph and something else, defined in some unrelated package. For me this
does not make it easy to understand the code.


>>> This might not match an intuitive understanding of "authoritative" and
>> I'm
>>> happy to redefine the issue so that no confusion arises.
>> 
>> One thing I am not quite clear about yet, is who writes to the content
>> graph? I see a lot of modules use it.
>> 
> Modules can write to the content graph or add temporary additions to it.
> Actually writing to the content graph should happen when public and trusted
> information is added. An information is considered trusted when added by a
> user with respective permission or verified by privileged code (e.g. that
> allows the public to add see-also references).

Good. So say a trusted user of mine, :joe, truthfully says

  b:danbri foaf:knows :joe .

then currently when I ask for http://danbri.org/foaf.rdf#danbri
I will get a graph that contains the above triple even if danbri does not make that
claim. Sometimes that is good, and sometimes not. In many cases as I have argued it will be
important for me to know what danbri claims. Perhaps so I can ping him to tell him about
my desire for him to claim friendship with me.

In the current API changes it won't be clear at all why when I ask for 

    <http://danbri.org/foaf.rdf#danbri>

I get <http://danbri.org/foaf.rdf#danbri>  + 5 other graphs. Or it will require the developer
to know the internals of clerezza to work this out, as I have just had to do myself.
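
The distinction between "what is said about danbri" and "what danbri's own document says" can be sketched by keeping the source graph next to each triple. The classes below are hypothetical and not Clerezza API; the rdf:type triple is an invented placeholder for something danbri's document might contain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical quad: a triple together with the name of the graph asserting it. */
final class SourcedTriple {
    final String subject;
    final String predicate;
    final String object;
    final String graph;

    SourcedTriple(String subject, String predicate, String object, String graph) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
        this.graph = graph;
    }
}

final class ClaimFilter {
    /** Keeps only the triples asserted by the named graph. */
    static List<SourcedTriple> claimedBy(List<SourcedTriple> union, String graph) {
        List<SourcedTriple> result = new ArrayList<>();
        for (SourcedTriple t : union) {
            if (t.graph.equals(graph)) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String danbriDoc = "http://danbri.org/foaf.rdf";
        String danbri = danbriDoc + "#danbri";
        List<SourcedTriple> union = Arrays.asList(
            // Asserted locally by :joe and stored in the content graph.
            new SourcedTriple(danbri, "foaf:knows", ":joe", "urn:x-localhost/contentGraph"),
            // Invented placeholder for a statement made in danbri's own document.
            new SourcedTriple(danbri, "rdf:type", "foaf:Person", danbriDoc));

        System.out.println(union.size());                       // 2
        System.out.println(claimedBy(union, danbriDoc).size()); // 1
    }
}
```

Once the triples are merged into a plain union without the graph name, the second query can no longer be answered, which is the concern raised here.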

> 
> 
>> 
>>> 
>>> What I do strongly believe is that the proposed patch offers a major and
>>> very useful new functionality. Especially as it allows the following
>>> features to be implemented:
>>> - Thanks to CLEREZZA-544 one can call the render-method to delegate the
>>> rendering of resources with a UriRef instead of a resource,
>> 
>> I think you mean a "UriRef instead of a Graph".
>> 
> "UriRef instead of a GraphNode" (GraphNode is rougly what is called
> "Resource" in the jena api)
> 
>> 
>> Yes, that makes sense. But why does the GraphNodeProvider have to cast
>> such a wide net to catch so many triples? It seems to me that if one
>> is to use a URI then it would be better that the URI refer precisely to
>> that named graph (or to a node it it). One could use other tools to create
>> virtual graphs, like Simon Schenk's Networked Graphs I mentioned
>> 
>> http://blogs.oracle.com/bblfish/entry/opening_sesame_with_networked_graphs
>> 
>> These allow one to have virtual graphs depending on a SPARQL query pattern.
>> There it would be easy for different services to specify different ones.
>> And I think something like that would be really good to have.
>> 
> 
> The possible GGG base graph could be implemented exactly as you describe,
> but using sparql would probably be quite an inefficient approach.

Good, so I have no problem in naming such a base graph, as long as the JSR311 code
can clearly specify to the CallbackRenderer what it really wants.

> 
> 
> 
>> 
>>> in this case the
>>> resource is rendered using its own baseGraph rather than the one of the
>>> calling template. An example usecase for this is rendering the author of
>> a
>>> comment, the whole profile of the (possibly remote) commenter isn't and
>>> shall not be part of the baseGraph of the GraphNode returned by the
>> jax-rs
>>> resource method, yet for rendering the comment-author infobox it might be
>>> beneficial to render a GraphNode with a baseGraph containing all of the
>>> information in the users profile-document
>> 
>> But why also all the information from the documentation and the config
>> graphs?
>> It may be useful in some very limited cases, but it may mostly not be. It
>> seems that
>> some use cases would be useful to help describe this in more detail.
>> 
> 
> The config graph will seldom be in the reachability graph of a resource,
> but if it is, it probably makes sense to have it there.



> 
> 
> 
>> 
>>> - With CLEREZZA-541 the GraphNodeService is accessed from TypeHandler, I
>>> posted a resolution to this issue because it was already quite there on
>> my
>>> local machine when Henry reopened CLEREZZA-540, to respect the reopening
>> of
>>> the issue I didn't mark the dependent issue as resolved. I will of course
>>> revert the changes if requested to do so by a qualifying -1.
>>> 
>>> I'm not arguing that my patches solve all issues one might have around
>>> getting resource descriptions but I do think it is very valuable and to
>>> allow to base other stuff on this service I would like the issue to be
>>> closed. As Henry reopened the issue twice and I don't want to close the
>>> issue again without a broader discussion. Yet as many thing depend on the
>>> issue leaving it open doesn't seem an option to me.
>> 
>> What depends on it is something you are wanting to do in your projects it
>> seems to me, and that is not that clearly laid out.
> 
> Also because of the negative experience with the monster issues recently I'm
> trying to be very explicit on dependent issues. A motivator for this issue
> was also your code around browsing remote foaf-profiles.

yes, the foaf browser is a very simple example linked data application. I
thought it would be good to have one in zz to help bring out these issues.


> 
> 
>> Because it does not seem obvious to
>> me why a service should make the decisions this one does about what is
>> authoritative.
>> 
> I mentioned earlier that other trust settings might be added with other
> issues. You seem to agree that it is useful to assume the graph resulting
> from dereferencing the resource as authoritative. The content graph is
> authoritative by platform conventions. There is no specific decision of the
> service here

Ok. Again I think it will be very confusing to bake this into clerezza the way
it is done now. It confused me a lot, and this thread is a testament to that.
> 
>>> 
>>> Future enhancement might include:
>> 
>>> - manually force refresh of caches for graphs related to a requested
>>> resource
>> 
>> Yes, indeed. But why here, when it is not in the WebProxy? You would think
>> cache
>> update functionality should go in the WebProxy right?
>> 
> Because when you access resources you don't (usually) care about the
> underlying graphs. You can already force cache-refreshes when using the
> WebProxy.

Ah ok. So this is not the core issue here.

> 
> 
> 
>> 
>> 
>>> - force an alternative set of baseGraphs to be used (e.g. Only local or
>> only
>>> remote sources)
>> 
>> What I am wondering is why all this is done like this? If I go over the
>> changes of the
>> past few weeks this is what I see:
>> 
>> So if we go over the history of refactorings that led us here.
>> 
>> 1. You did not like the initial WebProxy I wrote by refactoring your
>> WebIdGraphsService.
>>  Neither did I in fact - but it did work  at least and added minimum
>> change - being new to ZZ
>> I did not want to play around too much in the internals.
>> 
> well your code massively expanded the rdf api. Compare this with the length
> of discussion which is just about one method and one interface in the higher
> level platform api.

Ok. In any case it led to the WebProxy being a core component, which is good.
It was a mistake for me to sidetrack this discussion there too.

> 
> 
> 
>> 2. You moved the old WebProxy to what seemed like a nicer interface: the
>> TcProvider interface. And
>>  indeed that does look a lot better. BUT this interface is really
>> meant for direct, no interpretation
>>  access to the database and so lacks
>>  - key notions of caching (well I suppose they could make sense even for
>> other sesame or jena graphs?)
>> 
> No I don't think this would make sense for jena or sesame graphs. I think
> that most clients don't need to force custom caching/update policies, but if
> they do they can access the WebProxy services which offers methods not
> available via TcManager

ok

> 
> 
>>  - does not provide a method for returning the final name of the graph
>> (for redirected resources, or foaf:knows),
>>    when the WebProxy gets called
>>     (since this the TcProvider assumes you give it exactly the correct
>> name of the graph)
>>  => So really it is quite uncomfortable there somehow.
>> 
> This is something you could open an issue about, something like "Graph
> aliasing in TcManager/TcProvider"

ok

> 
> 
> 
> 
>> 3. This led you then to move to this GraphNodeProvider in order get a graph
>> from a URI - which is very similar to the TcProvider in many ways, right?
> 
> No, it's something fundamentally different
> 
>> It even uses the code of the original WebProxy to do a HEAD
>>  on a remote resource to find the graph name  (and which one would assume
>> would be part of the WebProxy
>>  code since it  will be making the real HTTP Connection, and so can follow
>> the changes of the graph names.)
>> 
> So what? (do you want to hear that your name is in the class comment or that
> this particular code comes from the WebId-service which you extracted when
> introducing your WebProxy)

No. But I'll leave this discussion here as this is turning out to be a side issue.
> 
> 
> 
>>  But because the TcManager interface is really a database layer interface,
>> that cannot be placed there, and
>>  so is now placed into something outside - this class you have now
>> written.
>> 
> I don't understand. I think the possibilities the new storage.web service
> offers are quite cool. For example that you can do sparql-queries on remote
> graphs using the clerezza endpoint (if you have appropriate privileges). You
> can browse resource in remote graphs as described here:
> http://mail-archives.apache.org/mod_mbox/incubator-clerezza-dev/201105.mbox/%3CBANLkTinFnYRFNkje0HmdH--VsBidLgpr2g@mail.gmail.com%3E

yes, that is cool. The issue here is related to CLEREZZA-538
"cached graphs can have names with #", so that if someone writes a 

val planetRdf = $[TcManager].getGraph("http://planetrdf.com/index.rdf".uri)

which leads him to follow a hash URI, an exception could be thrown.
Anyway, let's move that discussion to CLEREZZA-538.
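
For reference, a conventional way to get from a hash URI to the document it lives in is to drop the fragment. The sketch below uses only java.net.URI; `DocumentUri` is a made-up helper name. Whether CLEREZZA-538 should be resolved this way is not settled in this thread, and fragment stripping does not cover redirects, where the graph name is only known after the HTTP request.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class DocumentUri {
    /** Returns the given URI without its fragment part, if any. */
    static URI withoutFragment(URI uri) {
        try {
            // Rebuild the URI from scheme and scheme-specific part, with no fragment.
            return new URI(uri.getScheme(), uri.getSchemeSpecificPart(), null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI node = URI.create("http://www.w3.org/People/Berners-Lee/card#i");
        // Prints http://www.w3.org/People/Berners-Lee/card
        System.out.println(withoutFragment(node));
    }
}
```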

> 
> 
>> 4. But instead of just having a GraphNodeProvider that just returns the
>> graph, you have added some twists to
>>  it and return more than just the named graph. There is nothing to say that
>> a named graph cannot be the union
>>  of many other graphs, but it seems really arbitrary for me to get the
>> documentation of clerezza along with the
>>  triples of Tim Berners Lee's graph.
>> 
> 
>>  Somehow things have gone a bit haywire at the end here.
> 
> If you call getGraph on a GraphNode you're leaving the scope of the
> GraphNode. Probably all this discussion would not be necessary if you had been
> using getNodeContext instead of getGraph. The NodeContext is what is related to
> the node. Using getGraph is a bit like doing the following:
> 
> File file = SomeService.getFileDescribing("Tim Berners Lee")
> file.getParent().getParent().getParent().listChildrenRecursively()

I don't think that is a good way of looking at what graphs are useful for. Graphs are more
like bubbles in a comic strip.

I argue this very carefully in "Are OO languages Autistic?"

 http://blogs.oracle.com/bblfish/entry/are_oo_languages_autistic

This is a fundamental new programming element provided in the semantic web.

So the context as you are defining it is not what I am looking for. I am really looking for the named graph - the entire claim made by a resource. This can be seen by considering the example I gave above where someone adds to the content graph information about Dan Brickley 

    b:danbri foaf:knows :joe .

If I only get Dan Brickley's graph back that triple will not be there. If I get Dan Brickley's graph + the content graph, then that information will appear even if I just ask for Dan's node context. Also there may be information about
people appearing in Dan Brickley's profile that are not directly linked by him, that I will
also be interested in retrieving. 

So the context is not the tool I need - and I don't think my use cases are special.

> 
> The listed files can contain things that are completely unrelated to Tim
> Berners Lee
> 
> 
> 
>> And I think this is due to a bit of confusion of the needs
>>  of your application with trying to keep the general architecture clean.
>> 
> As I said, I did not make this particularly for an application, my wall
> application is merely a demo. When we want to do something like a
> foaf-browser we want to be able to display the resource in their context,
> just a usecase.

Ok, so that is where our disagreement lies. The node context is in many cases not
at all what we want. It both adds too much information and not enough.

It may be that in the wall demo that is not visible. But in security matters and
trust matters it will make a big difference.

> 
> 
>> 
>>  Now on the whole I have learnt a lot about Clerezza by following this,
>> but I just can't say that this looks like
>> a good long term solution.  We are constantly moving around and around
>> something.
>> 
> This is your impression. I hope my explanations to the concrete points you
> mention could help changing this impression.

I think it should now be clear how we can come to a solution that satisfies both
our needs.

> 
> 
> 
>> 
>>  Would any of the following work?
>> 
>>  - TcProvider extended to specify caching options?
>> 
> Unrelated to the issue. But no, I think only storage.web needs those options

ok.

> 
> 
>>  - Graph to be extended so that it can contain its name (so that one can
>> ask for a resource in a TcProvider,
>>   and find out what its name really was by inspecting the resulting graph)
>> 
> Again, this is unrelated to the issue at hand.
> 
> Having triple collection-aliases with primary names accessible via TcManager
> would be fine, but I would not agree at having them as part of the
> TripleCollection

ok, so perhaps we should focus on this and on the proposed solution above.

> 
> 
>>   -> if not, should WebProxy really be a TcProvider?
>> 
>>      Since there is no way of knowing ahead of time what the name
>>      of a graph for a resource is, given that redirects can occur at any
>> time.
>> 
>>   The WebProxy as TcProvider mostly makes sense otherwise, so it does feel
>> like the above two things would help.
>> 
>>> 
>>> So I'm asking you to kindly review the proposed code and vote about
>> closing
>>> CLEREZZA-540
>>> 
>>> [ ] +1, I agree with accepting the proposed code into trunk
>>> [ ] 0, I don't care
>>> [ ] -1, I don't want this code in trunk (must specify a technical
>>> explanation, please also specify what would have to be changed for the
>> patch
>>> to be acceptable to you.
>> 
>> -1 for the moment on closing the issue. (not on removing the code)
>>  Please answer the above points carefully.
>> 
> -1 are against code, keeping the code in trunk if you can't accept makes
> little sense to me

> 
> 
> Okay, I see two reasons which could qualify as technical reasons:
> - The service returns huge amount of triples: this is just wrong as it
> returns graphnodes

ok.

> - The class should be named ContentGraphPlusOtherProvider instead of
> GraphNodeProvider: As It doesn't provide a Graph but a GraphNode your name
> seems wrong rather than just imprecise. A precise name might be
> GraphNodeBasedOnContentGraphPlusOtherProvider.
> 
> Would the rename be okay for you to accept the proposed patch? (I really
> would like to go back to productive work, so I rather have a horrible name
> than seeing the project stalled by your veto).

Well, then the issue would be why this class should appear in the CallbackRenderer.
No, I think there should be a way for JSR311 code to ask precisely for the
type of GraphNode it wants with very little coding. So that for the use cases
where walking the content graph is the right thing to do it is one line of code,
and for cases where something more precise is needed it is also just one line of
code. In any case it should be easy when reading the code to understand what is going
to be displayed.

I hope this helps,

	Henry

> 
> Reto
> 
> PS: You seem to be extensively using your right to veto while ignoring
> other's veto on your code, looking at
> https://issues.apache.org/jira/browse/CLEREZZA-515 I see that the commits
> have not been reverted even more than one week after my veto and request to
> revert.

Hmm, I did revert that using git. But I am not sure why that does not appear in the
commits for that issue.... I see you brought that up in another thread.


Social Web Architect
http://bblfish.net/

Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Reto Bachmann-Gmuer <re...@trialox.org>.

On Sun, May 29, 2011 at 7:06 PM, Henry Story <he...@bblfish.net>wrote:

>
> On 26 May 2011, at 20:31, Reto Bachmann-Gmuer wrote:
>
> > With CLEREZZA-540 I suggest a GraphNodeProvider-Service that returns a
> > GraphNode given a named resource. Mainly this code that used to be in
> > DiscoBitsTypeHandler which has been generalized.
> >
> > The issue is described as:
> > "Implement a platform service that returns GraphNodes for URIs. The
> > GraphNode is the resource identified by that uri with as BaseGraph
> sources
> > considered authoritative for that resource. "
> >
> > Of course "considered authoritative" is not a very sharp description. The
> > issue is labeled with "platform" which implies it is not a generic
> utility
> > of clerezza.rdf but that it relies on platform default graphs.
> >
> > The solution proposed in commit
> > #1125477<http://svn.apache.org/viewvc?view=rev&rev=1125477>and
> > #1125652 <http://svn.apache.org/viewvc?view=rev&rev=1125652> sets the
> > baseGraph as follows:
> > - always trust the content graph
> > - for remote resource trust the graph you get by dereferencing the uri
> > - for resources in the user-uri space trust that user
>
> Of course what one thinks of this patch depends completely on how this
> provider
> gets used. It has quite a lot of limitations it seems to me as implemented
> currently - which is of course not a final failing in a developing project,
> but it does seem like a good discussion would help us narrow perhaps on
> better
> solutions or on fixes, which is part of the reason I thought the ISSUE
> should not
> yet be closed.
>
Keeping the issue open blocks development. An issue is associated with a patch:
as long as an issue is open, other code should not depend on code committed
under this issue. This is why an issue should be closed shortly after the
patch has been committed, and conversely why code shouldn't be kept in trunk
if the issue cannot be closed. An open issue is more than a discussion
reminder.



>
> The failings I mentioned in the ISSUE-540, and develop below are:
>
> 1. it relies on more and more Services that each require TcProviders.
> This
>  feels very ad-hoc. The ones I mentioned are
>
>    - UserManager: requires a TcManager
>    - WebIdGraphsService: also requires a TcManager
>    - PlatformConfig: requires TcManager
>    - ContentGraphProvider: requires TcManager
>    - TcManager
>
>   Each of these is used, and makes a call to the database. And the
> TcManager itself each time
> iterates through a number of TcProviders.
>
Yes, TcManager is the main entry point to the RDF data. I don't see any code
smell here. In a classical RDBMS Java application a component will typically
use JDBC and rely on other components that use JDBC; here it's TcManager
instead of JDBC.



>
> 2. when asking for an external URI, you get the whole content graph too
>
You get a GraphNode


>   On my fresh install of ZZ that is 20 times more information than the
> initial graph.
>
- What is 20 times more?
- What do you mean by "get"?

A GraphNode points to a resource and is designed for browsing from resource
to resource. It is not a graph but a node in a graph. The object is
associated with a base graph, which is used to identify the properties and
instantiate another GraphNode when hopping to a property value. The
underlying graph could for instance be TimBL's GGG (giant global graph aka
the web).
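
The "node plus base graph" idea can be illustrated with a toy class. This is not the Clerezza GraphNode API: `SketchNode` and `hop` are made-up names, triples are plain string arrays, and the lookup is a linear scan where a real store would use an index.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical node: a resource name plus a reference to the base graph it is browsed in. */
final class SketchNode {
    private final String resource;
    private final List<String[]> baseGraph; // each entry: {subject, predicate, object}

    SketchNode(String resource, List<String[]> baseGraph) {
        this.resource = resource;
        this.baseGraph = baseGraph; // kept by reference, not copied
    }

    String resource() {
        return resource;
    }

    /** Hops to the values of a property; each value is a node on the same base graph. */
    List<SketchNode> hop(String predicate) {
        List<SketchNode> result = new ArrayList<>();
        for (String[] t : baseGraph) {
            if (t[0].equals(resource) && t[1].equals(predicate)) {
                result.add(new SketchNode(t[2], baseGraph));
            }
        }
        return result;
    }
}

final class SketchNodeDemo {
    public static void main(String[] args) {
        List<String[]> graph = new ArrayList<>();
        graph.add(new String[] {"a", "foaf:knows", "b"});
        graph.add(new String[] {"b", "foaf:knows", "c"});

        SketchNode a = new SketchNode("a", graph);
        // Two hops from "a" along foaf:knows; prints c
        System.out.println(a.hop("foaf:knows").get(0).hop("foaf:knows").get(0).resource());
    }
}
```

However large the base graph is, the node itself only carries a name and a reference, which is the point made above about what "getting" a GraphNode involves.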


> How big is that going to become as one's content graph grows over time?

The underlying graph would be the size of the virtual content graph + the
size of the remote graph - duplicate triples


> Is this not going to create a huge bottleneck very quickly?

No, the time complexity for accessing a property of graphnode grows linearly
to the number of graphs in the graph-union (which in this case is 2) but is
only O(log n) for the number of triples  in the indexed triple store.

> I thought I had heard people mention issues with speed on this list.
>
You may check archives of this list at:
http://mail-archives.apache.org/mod_mbox/incubator-clerezza-dev/

I mentioned performance concern directly to you when I saw that your
EasyGraph class copies to memory all triples of a TripleCollection rather
than accessing them in the original graph. So yes, the performance
implications of our design choices should always be in our minds.


>
>  So to verify this do the following:
>
> zz> import org.apache.clerezza.platform.graphnodeprovider._
> zz> val gnp = $[GraphNodeProvider]
> zz> val tbl = gnp.get(new UriRef("http://www.w3.org/People/Berners-Lee/card#i"))
> zz> tbl.getGraph.size
> res0: Int = 1878
>
getGraph returns "the graph the node represented by this instance is in",
as I mentioned before this could be GGG. On a GraphNode you should usually
not invoke getGraph, the object exists to hop from resource to resource.


> If I get Tim Berners Lee's Graph on the command line I find
>
> $ rapper http://www.w3.org/People/Berners-Lee/card | wc
> rapper: Parsing URI http://www.w3.org/People/Berners-Lee/card with parser rdfxml
> rapper: Serializing with serializer ntriples
> rapper: Parsing returned 78 triples
>      78     380    9978
>
> So here we have 78 triples, but the resulting answer schlepped around is 20
> times bigger - on a new installation!
>
If by schlepped around you mean that they are copied in memory or something
like that this is not the case. It's a reference, that's all.



>
> I wondered what was going on, because to my surprise on a new installation
> the Content Graph contains only 1 triple.
> So I looked into a running instance of ContentGraphProvider and found that
> the additions array contained the following graphs in addition to the
> content graph:
>
>  - <urn:x-localinstance:/documentation.graph>   1002 triples
>  - <urn:x-localinstance:/config.graph>           176 triples
>  - <urn:x-localinstance:/web-resources.graph>    621 triples
>  - <urn:x-localinstance:/enrichment.graph>         0 triples
>
> So that does then indeed add up to the number.
>
Great, you found the answer :)


>
> What I am wondering is in what cases is this needed? It seems like this may
> indeed what a particular application may require, but does it have to be
> a general service? The name certainly suggests a very general service, not
> one required for a particular application.
>
This is about ContentGraphProvider then, not about issue 540. It's the
ContentGraphProvider which provides the graph of instance-wide and public
information for the platform



>
> Perhaps changing the name from GraphNodeProvider to
> ContentGraphPlusOtherProvider
> would make more sense.
>
It's a platform service that provides GraphNodes. Being a platform service
implies it uses the platform means of getting trusted content. If it would
just dereference URIs then it would probably be placed in a subpackage of
clerezza.rdf.



>
> > This might not match an intuitive understanding of "authoritative" and
> I'm
> > happy to redefine the issue so that no confusion arises.
>
> One thing I am not quite clear about yet, is who writes to the content
> graph?
> I see a lot of modules use it.
>
Modules can write to the content graph or add temporary additions to it.
Actually writing to the content graph should happen when public and trusted
information is added. An information is considered trusted when added by a
user with respective permission or verified by privileged code (e.g. that
allows the public to add see-also references).


>
> >
> > What I do strongly believe is that the proposed patch offers a major and
> > very useful new functionality. Especially as it allows the following
> > features to be implemented:
> > - Thanks to CLEREZZA-544 one can call the render-method to delegate the
> > rendering of resources with a UriRef instead of a resource,
>
> I think you mean a "UriRef instead of a Graph".
>
"UriRef instead of a GraphNode" (GraphNode is rougly what is called
"Resource" in the jena api)

>
> Yes, that makes sense. But why does the GraphNodeProvider have to cast
> such a wide net to catch so many triples? It seems to me that if one
> is to use a URI then it would be better that the URI refer precisely to
> that named graph (or to a node it it). One could use other tools to create
> virtual graphs, like Simon Schenk's Networked Graphs I mentioned
>
> http://blogs.oracle.com/bblfish/entry/opening_sesame_with_networked_graphs
>
> These allow one to have virtual graphs depending on a SPARQL query pattern.
> There it would be easy for different services to specify different ones.
> And I think something like that would be really good to have.
>

The possible GGG base graph could be implemented exactly as you describe,
but using sparql would probably be quite an inefficient approach.



>
> > in this case the
> > resource is rendered using its own baseGraph rather than the one of the
> > calling template. An example usecase for this is rendering the author of
> a
> > comment, the whole profile of the (possibly remote) commenter isn't and
> > shall not be part of the baseGraph of the GraphNode returned by the
> jax-rs
> > resource method, yet for rendering the comment-author infobox it might be
> > beneficial to render a GraphNode with a baseGraph containing all of the
> > information in the users profile-document
>
> But why also all the information from the documentation and the config
> graphs?
> It may be useful in some very limited cases, but it may mostly not be. It
> seems that
> some use cases would be useful to help describe this in more detail.
>

The config graph will seldom be in the reachability graph of a resource,
but if it is, it probably makes sense to have it there.



>
> > - With CLEREZZA-541 the GraphNodeService is accessed from TypeHandler, I
> > posted a resolution to this issue because it was already quite there on
> my
> > local machine when Herny reopened CLEREZZA-540, to respect the reopening
> of
> > the issue I didn't mark the dependent issue as resolved. I will of course
> > revert the changes if requested to do so by a qualifying -1.
> >
> > I'm not arguing that my patches solve all issues one might have around
> > getting resource descriptions but I do think it is very valuable and to
> > allow to base other stuff on this service I would like the issue to be
> > closed. As Henry reopened the issue twice and I don't want to close the
> > issue again without a broader discussion. Yet as many thing depend on the
> > issue leaving it open doesn't seem an option to me.
>
> What depends on it is something you are wanting to do in your projects it
> seems
> to me, and that is not that clearly laid out.

Also because of the negative experience with the monster issues recently, I'm
trying to be very explicit about dependent issues. A motivator for this issue
was also your code around browsing remote foaf-profiles.


> Because it does not seem obvious to
> me why a service should make the decisions this one does about what is
> authoritative.
>
I mentioned earlier that other trust settings might be added with other
issues. You seem to agree that it is useful to assume the graph resulting
from dereferencing the resource as authoritative. The content graph is
authoritative by platform conventions. There is no specific decision of the
service here.



> >
> > Future enhancement might include:
>
> > - manually force refresh of caches for graphs related to a requested
> > resource
>
> Yes, indeed. But why here, when it is not in the WebProxy? You would think
> cache
> update functionality should go in the WebProxy right?
>
Because when you access resources you don't (usually) care about the
underlying graphs. You can already force cache refreshes when using the
WebProxy.



>
>
> > - force an alternative set of baseGraphs to be used (e.g. Only local or
> only
> > remote sources)
>
> What I am wondering is why all this is done like this? If I go over the
> changes of the
> past few weeks this is what I see:
>
> So if we go over the history of refactorings that led us here.
>
> 1. You did not like the initial WebProxy you I wrote by refactoring your
> WebIdGraphsService.
>   Neither did I in fact - but it did work  at least and added minimum
> change - being new to ZZ
>  I did not want to play around too much in the internals.
>
Well, your code massively expanded the RDF API. Compare this with the length
of this discussion, which is just about one method and one interface in the
higher-level platform API.



> 2. You moved the old WebProxy to what seemed like a nicer interface: the
> TcProvider interface. And
>   indeed that does look a lot better. BUT but this interface is really
> meant for direct, no interpretation
>   access to the database and so lack
>   - key notions of caching (well I suppose they could make sense even for
> other sesame or jena graphs?)
>
No, I don't think this would make sense for Jena or Sesame graphs. I think
that most clients don't need to force custom caching/update policies, but if
they do they can access the WebProxy service, which offers methods not
available via TcManager.


>   - does not provide a method for returning the final name of the graph
> (for redireted resources, or foaf:knows),
>     when the WebProxy gets called
>      (since this the TcProvider assumes you give it exactly the correct
> name of the graph)
>   => So really it is quite uncomfortable there somehow.
>
This is something you could open an issue about, something like "Graph
aliasing in TcManager/TcProvider"
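
To make the aliasing idea a little more concrete, here is a minimal sketch of a name-alias table that resolves a requested graph name to a primary name. Nothing here is Clerezza API: the class `GraphAliasSketch`, its methods and the example.org URIs are hypothetical, and the thread does not say how such a feature would actually be designed.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical alias table: maps requested graph names to a primary name. */
public class GraphAliasSketch {

    // alias -> target; a target may itself be an alias (e.g. a chain of redirects)
    private final Map<String, String> aliasToTarget = new HashMap<>();

    /** Records that requests for 'alias' should be answered from 'target'. */
    public void addAlias(String alias, String target) {
        aliasToTarget.put(alias, target);
    }

    /** Follows the alias chain; the 'seen' set stops the loop if aliases form a cycle. */
    public String primaryName(String name) {
        Set<String> seen = new HashSet<>();
        String current = name;
        while (aliasToTarget.containsKey(current) && seen.add(current)) {
            current = aliasToTarget.get(current);
        }
        return current;
    }

    public static void main(String[] args) {
        GraphAliasSketch aliases = new GraphAliasSketch();
        // Assumed example: a profile URI that redirects to the document holding the triples.
        aliases.addAlias("http://example.org/people/tim", "http://example.org/people/tim/card");
        System.out.println(aliases.primaryName("http://example.org/people/tim"));
    }
}
```

A client could then ask for the primary name after fetching a graph, without the name having to live inside the triple collection itself.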




> 3. This led you then to move to this GraphNodeProvider in order get a graph
> from a URI -



> which is very similar
>   to the TcProvider in many ways, right?

No, it's something fundamentally different.

> It even uses the code of the original WebProxy to do a HEAD
>   on a remote resource to find the graph name  (and which one would assume
> would be part of the WebProxy
>   code since it  will be making the real HTTP Connection, and so can follow
> the changes of the graph names.)
>
So what? (Do you want to hear that your name is in the class comment, or that
this particular code comes from the WebId-service which you extracted when
introducing your WebProxy?)



>   But because the TcManager interface is really a database layer interface,
> that cannot be placed there, and
>   so is now placed into something outside - this class you have now
> written.
>
I don't understand. I think the possibilities the new storage.web service
offers are quite cool. For example, you can do SPARQL queries on remote
graphs using the Clerezza endpoint (if you have appropriate privileges). You
can browse resources in remote graphs as described here:
http://mail-archives.apache.org/mod_mbox/incubator-clerezza-dev/201105.mbox/%3CBANLkTinFnYRFNkje0HmdH--VsBidLgpr2g@mail.gmail.com%3E


> 4. But instead of just having a GraphNodeProvider that just returns the
> graph, you have added some twists to
>   it and return more than jut the named graph. There is nothing to say that
> a named graph cannot be the union
>   of many other graphs, but it seems really arbitrary for me to get the
> documentation of clerezza along with the
>   triples of Tim Berners Lee's graph.
>

>   Somehow things have gone a bit haywire at the end here.

If you call getGraph on a GraphNode you're leaving the scope of the
GraphNode. Probably all this discussion would not be necessary if you had
been using getNodeContext instead of getGraph. The NodeContext is what relates
to the node. Using getGraph is a bit like doing the following:

File file = SomeService.getFileDescribing("Tim Berners Lee");
file.getParent().getParent().getParent().listChildrenRecursively();

The listed files can contain things that are completely unrelated to Tim
Berners Lee.
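
The difference between the whole base graph and the part that concerns one node can be sketched with a toy model. This is not the Clerezza API: `Triple`, `contextOf` and the sample data are made up for illustration, and "context" is simplified here to the triples that mention the node directly, which is less than what the real getNodeContext does.

```java
import java.util.ArrayList;
import java.util.List;

public class NodeContextSketch {

    /** A toy triple made of plain strings (hypothetical, not a Clerezza type). */
    public static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        public Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Simplified "context": only the triples with the node as subject or object. */
    public static List<Triple> contextOf(String node, List<Triple> baseGraph) {
        List<Triple> context = new ArrayList<>();
        for (Triple t : baseGraph) {
            if (t.subject.equals(node) || t.object.equals(node)) {
                context.add(t);
            }
        }
        return context;
    }

    /** A small base graph: two triples about ex:tim and two unrelated ones. */
    public static List<Triple> sampleBaseGraph() {
        List<Triple> g = new ArrayList<>();
        g.add(new Triple("ex:tim", "foaf:name", "Tim Berners-Lee"));
        g.add(new Triple("ex:alice", "foaf:knows", "ex:tim"));
        g.add(new Triple("ex:doc1", "dc:title", "Platform documentation"));
        g.add(new Triple("ex:config", "ex:port", "8080"));
        return g;
    }

    public static void main(String[] args) {
        List<Triple> baseGraph = sampleBaseGraph();
        System.out.println("base graph size: " + baseGraph.size());
        System.out.println("context size:    " + contextOf("ex:tim", baseGraph).size());
    }
}
```

Counting the base graph measures everything the node could be looked up in; counting the context measures what actually relates to the node.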



> And I think this is due to a bit of confusion of the needs
>   of your application with trying to keep the general architecture clean.
>
As I said, I did not make this for a particular application; my wall
application is merely a demo. When we want to do something like a
foaf-browser we want to be able to display resources in their context.
That is just one use case.


>
>   Now on the whole I have learnt a lot about Clerezza by following this,
> but I just can't say that this looks like
>  a good long term solution.  We are constantly moving around and around
> something.
>
This is your impression. I hope my explanations of the concrete points you
mention help change this impression.



>
>   Would any of the following work?
>
>   - TcProvider extended to specify caching options?
>
Unrelated to the issue. But no, I think only storage.web needs those options.


>   - Graph to be extended so that it can contain its name (so that one can
> ask for a resource in a TcProvider,
>    and find out what its name really was by inspecting the resulting graph)
>
Again, this is unrelated to the issue at hand.

Having triple-collection aliases with primary names accessible via TcManager
would be fine, but I would not agree to having them as part of the
TripleCollection.


>    -> if not, should WebProxy really be a TcProvider?
>
> Since there is no way of knowing ahead of time what the name
>       of a graph for a resource is, given that redirects can occur at any
> time.
>
>    The WebProxy as TcProvider mostly makes sense otherwise, so it does feel
> like the above two things would help.
>
> >
> > So I'm asking you to kindly review the proposed code and vote about
> closing
> > CLEREZZA-540
> >
> > [ ] +1, I agree with accepting the proposed code into trunk
> > [ ] 0, I don't care
> > [ ] -1, I don't want this code in trunk (must specify a technical
> > explanation, please also specify what would have to be changed for the
> patch
> > to be acceptable to you.
>
> -1 for the moment on closing the issue. (not on removing the code)
>   Please answer the above points carefully.
>
A -1 is a vote against code; keeping the code in trunk if you can't accept it
makes little sense to me.


Okay, I see two reasons which could qualify as technical reasons:
- The service returns a huge amount of triples: this is just wrong, as it
returns GraphNodes.
- The class should be named ContentGraphPlusOtherProvider instead of
GraphNodeProvider: as it doesn't provide a Graph but a GraphNode, your name
seems wrong rather than just imprecise. A precise name might be
GraphNodeBasedOnContentGraphPlusOtherProvider.

Would the rename be enough for you to accept the proposed patch? (I really
would like to go back to productive work, so I'd rather have a horrible name
than see the project stalled by your veto.)

Reto

PS: You seem to be extensively using your right to veto while ignoring
others' vetoes on your code. Looking at
https://issues.apache.org/jira/browse/CLEREZZA-515 I see that the commits
have not been reverted even more than one week after my veto and request to
revert.

Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Henry Story <he...@bblfish.net>.

On 26 May 2011, at 20:31, Reto Bachmann-Gmuer wrote:

> With CLEREZZA-540 I suggest a GraphNodeProvider-Service that returns a
> GraphNode given a named resource. Mainly this code that used to be in
> DiscoBitsTypeHandler which has been generalized.
> 
> The issue is described as:
> "Implement a platform service that returns GraphNodes for URIs. The
> GraphNode is the resource identified by that uri with as BaseGraph sources
> considered authoritative for that resource. "
> 
> Of course "considered authoritative" it not a very sharp description. The
> issue is labeled with "platform" which implies it is not a generic utility
> of clerezza.rdf but that it relies on platform default graphs.
> 
> The solution proposed in commit
> #1125477<http://svn.apache.org/viewvc?view=rev&rev=1125477>and
> #1125652 <http://svn.apache.org/viewvc?view=rev&rev=1125652> sets the
> basegarph as follows:
> - always trust the content graph
> - for remote resource trust the graph you get by dereferencing the uri
> - for resources in the user-uri space trust that user

Of course what one thinks of this patch depends completely on how this provider
gets used. It has quite a lot of limitations, it seems to me, as currently
implemented. That is of course not a final failing in a developing project,
but it does seem like a good discussion would help us narrow in on better
solutions or fixes, which is part of the reason I thought the issue should not
yet be closed.

The failings I mentioned in ISSUE-540, and develop below, are:

1. It relies on more and more services that each require TcProviders. This
  feels very ad hoc. The ones I mentioned are

    - UserManager: requires a TcManager 
    - WebIdGraphsService: also requires a TcManager 
    - PlatformConfig: requires TcManager 
    - ContentGraphProvider: requires TcManager 
    - TcManager 

   Each of these is used, and makes a call to the database. And the TcManager itself each time 
iterates through a number of TcProviders.

2. when asking for an external URI, you get the whole content graph too
   On my fresh install of ZZ that is 20 times more information than the initial graph. 
How big is that going to become as one's content graph grows over time? Is this not going 
to create a huge bottleneck very quickly? I thought I had heard people mention issues
with speed on this list. 

  So to verify this do the following:

zz> import org.apache.clerezza.platform.graphnodeprovider._
zz> val gnp = $[GraphNodeProvider]
zz> val tbl = gnp.get(new UriRef("http://www.w3.org/People/Berners-Lee/card#i"))
zz> tbl.getGraph.size
res0: Int = 1878

If I get Tim Berners Lee's Graph on the command line I find

$ rapper http://www.w3.org/People/Berners-Lee/card | wc
rapper: Parsing URI http://www.w3.org/People/Berners-Lee/card with parser rdfxml
rapper: Serializing with serializer ntriples
rapper: Parsing returned 78 triples
      78     380    9978

So here we have 78 triples, but the resulting answer schlepped around is 20 times bigger - on a new installation!

I wondered what was going on, because to my surprise on a new installation the Content Graph contains only 1 triple.
So I looked into a running instance of ContentGraphProvider and found that the additions array contained the following graphs in addition to the content graph:

  - <urn:x-localinstance:/documentation.graph>   1002 triples
  - <urn:x-localinstance:/config.graph>           176 triples
  - <urn:x-localinstance:/web-resources.graph>    621 triples
  - <urn:x-localinstance:/enrichment.graph>         0 triples

So that does then indeed add up to the number.
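
As a quick check of that sum, the sizes reported above can be added up in a few lines. This is only arithmetic on the numbers quoted in this message; the class and method names are made up, and it assumes the graphs share no triples, since a union would count a shared triple only once.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BaseGraphSizeCheck {

    /** Triple counts as reported in the message (rapper output and the additions array). */
    public static Map<String, Integer> reportedSizes() {
        Map<String, Integer> sizes = new LinkedHashMap<>();
        sizes.put("remote profile document (rapper)", 78);
        sizes.put("content graph", 1);
        sizes.put("urn:x-localinstance:/documentation.graph", 1002);
        sizes.put("urn:x-localinstance:/config.graph", 176);
        sizes.put("urn:x-localinstance:/web-resources.graph", 621);
        sizes.put("urn:x-localinstance:/enrichment.graph", 0);
        return sizes;
    }

    /** Sums the sizes; valid as a union size only if the graphs do not overlap. */
    public static int total(Map<String, Integer> sizes) {
        int sum = 0;
        for (int size : sizes.values()) {
            sum += size;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Compare with the value of tbl.getGraph.size shown in the shell session.
        System.out.println("total triples: " + total(reportedSizes()));
    }
}
```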

What I am wondering is in what cases this is needed. It seems like this may
indeed be what a particular application requires, but does it have to be
a general service? The name certainly suggests a very general service, not
one required for a particular application.

Perhaps changing the name from GraphNodeProvider to ContentGraphPlusOtherProvider
would make more sense.

> This might not match an intuitive understanding of "authoritative" and I'm
> happy to redefine the issue so that no confusion arises.

One thing I am not quite clear about yet, is who writes to the content graph?
I see a lot of modules use it.

> 
> What I do strongly believe is that the proposed patch offers a major and
> very useful new functionality. Especially as it allows the following
> features to be implemented:
> - Thanks to CLEREZZA-544 one can call the render-method to delegate the
> rendering of resources with a UriRef instead of a resource,

I think you mean a "UriRef instead of a Graph".

Yes, that makes sense. But why does the GraphNodeProvider have to cast
such a wide net to catch so many triples? It seems to me that if one
is to use a URI then it would be better that the URI refer precisely to
that named graph (or to a node in it). One could use other tools to create
virtual graphs, like Simon Schenk's Networked Graphs I mentioned

http://blogs.oracle.com/bblfish/entry/opening_sesame_with_networked_graphs

These allow one to have virtual graphs depending on a SPARQL query pattern.
There it would be easy for different services to specify different ones. 
And I think something like that would be really good to have.

> in this case the
> resource is rendered using its own baseGraph rather than the one of the
> calling template. An example usecase for this is rendering the author of a
> comment, the whole profile of the (possibly remote) commenter isn't and
> shall not be part of the baseGraph of the GraphNode returned by the jax-rs
> resource method, yet for rendering the comment-author infobox it might be
> beneficial to render a GarphNode with a baseGraph containing all of the
> information in the users profile-document

But why also all the information from the documentation and the config graphs?
It may be useful in some very limited cases, but it may mostly not be. It seems that
some use cases would be useful to help describe this in more detail. 

> - With CLEREZZA-541 the GraphNodeService is accessed from TypeHandler, I
> posted a resolution to this issue because it was already quite there on my
> local machine when Herny reopened CLEREZZA-540, to respect the reopening of
> the issue I didn't mark the dependent issue as resolved. I will of course
> revert the changes if requested to do so by a qualifying -1.
> 
> I'm not arguing that my patches solve all issues one might have around
> getting resource descriptions but I do think it is very valuable and to
> allow to base other stuff on this service I would like the issue to be
> closed. As Henry reopened the issue twice and I don't want to close the
> issue again without a broader discussion. Yet as many thing depend on the
> issue leaving it open doesn't seem an option to me.

What depends on it is something you are wanting to do in your projects it seems
to me, and that is not that clearly laid out. Because it does not seem obvious to
me why a service should make the decisions this one does about what is authoritative.

> 
> Future enhancement might include:

> - manually force refresh of caches for graphs related to a requested
> resource

Yes, indeed. But why here, when it is not in the WebProxy? You would think cache
update functionality should go in the WebProxy right? 

> - force an alternative set of baseGraphs to be used (e.g. Only local or only
> remote sources)

What I am wondering is why all this is done like this. If I go over the history of
refactorings of the past few weeks that led us here, this is what I see:

1. You did not like the initial WebProxy I wrote by refactoring your WebIdGraphsService.
   Neither did I, in fact, but it did work at least and added minimal change. Being new to ZZ,
   I did not want to play around too much in the internals.
2. You moved the old WebProxy to what seemed like a nicer interface: the TcProvider interface. And
   indeed that does look a lot better. But this interface is really meant for direct, no-interpretation
   access to the database and so
   - lacks key notions of caching (well, I suppose they could make sense even for other Sesame or Jena graphs?)
   - does not provide a method for returning the final name of the graph (for redirected resources, or foaf:knows)
     when the WebProxy gets called
     (since the TcProvider assumes you give it exactly the correct name of the graph)
   => So really it is quite uncomfortable there somehow.
3. This led you then to move to this GraphNodeProvider in order to get a graph from a URI, which is very similar
   to the TcProvider in many ways, right? It even uses the code of the original WebProxy to do a HEAD
   on a remote resource to find the graph name (which one would assume would be part of the WebProxy
   code, since it will be making the real HTTP connection and so can follow the changes of the graph names).
   But because the TcManager interface is really a database layer interface, that cannot be placed there, and
   so is now placed into something outside: this class you have now written.
4. But instead of having a GraphNodeProvider that just returns the graph, you have added some twists to
   it and return more than just the named graph. There is nothing to say that a named graph cannot be the union
   of many other graphs, but it seems really arbitrary to me to get the documentation of Clerezza along with the
   triples of Tim Berners-Lee's graph.

   Somehow things have gone a bit haywire at the end here. And I think this is due to a bit of confusion of the needs
   of your application with trying to keep the general architecture clean.

   Now on the whole I have learnt a lot about Clerezza by following this, but I just can't say that this looks like
 a good long term solution.  We are constantly moving around and around something.

   Would any of the following work?

   - TcProvider extended to specify caching options? 
   - Graph to be extended so that it can contain its name (so that one can ask for a resource in a TcProvider, 
    and find out what its name really was by inspecting the resulting graph)
    -> if not, should WebProxy really be a TcProvider? Since there is no way of knowing ahead of time what the name
       of a graph for a resource is, given that redirects can occur at any time. 

    The WebProxy as TcProvider mostly makes sense otherwise, so it does feel like the above two things would help.

> 
> So I'm asking you to kindly review the proposed code and vote about closing
> CLEREZZA-540
> 
> [ ] +1, I agree with accepting the proposed code into trunk
> [ ] 0, I don't care
> [ ] -1, I don't want this code in trunk (must specify a technical
> explanation, please also specify what would have to be changed for the patch
> to be acceptable to you.

-1 for the moment on closing the issue. (not on removing the code)
   Please answer the above points carefully.

Henry

> 
> Cheers,
> Reto

Social Web Architect
http://bblfish.net/

Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Hasan Hasan <ha...@trialox.org>.

+1

please see my comments inline

On Thu, May 26, 2011 at 8:31 PM, Reto Bachmann-Gmuer <
reto.bachmann@trialox.org> wrote:

> With CLEREZZA-540 I suggest a GraphNodeProvider-Service that returns a
> GraphNode given a named resource. Mainly this code that used to be in
> DiscoBitsTypeHandler which has been generalized.
>

I support generalization of functionality to allow flexible reuses.

>
> The issue is described as:
> "Implement a platform service that returns GraphNodes for URIs. The
> GraphNode is the resource identified by that uri with as BaseGraph sources
> considered authoritative for that resource. "
>
> Of course "considered authoritative" it not a very sharp description.


This is one main point of the debate between Henry and Reto as can be seen
by following their comments on jira. Since the javadoc defines what that
means I can live with that.



> The
> issue is labeled with "platform" which implies it is not a generic utility
> of clerezza.rdf but that it relies on platform default graphs.
>
> The solution proposed in commit
> #1125477<http://svn.apache.org/viewvc?view=rev&rev=1125477>and
> #1125652 <http://svn.apache.org/viewvc?view=rev&rev=1125652> sets the
> basegarph as follows:
> - always trust the content graph
> - for remote resource trust the graph you get by dereferencing the uri
> - for resources in the user-uri space trust that user
>
> This might not match an intuitive understanding of "authoritative" and I'm
> happy to redefine the issue so that no confusion arises.
>
> What I do strongly believe is that the proposed patch offers a major and
> very useful new functionality. Especially as it allows the following
> features to be implemented:
> - Thanks to CLEREZZA-544 one can call the render-method to delegate the
> rendering of resources with a UriRef instead of a resource, in this case
> the
> resource is rendered using its own baseGraph rather than the one of the
> calling template. An example usecase for this is rendering the author of a
> comment, the whole profile of the (possibly remote) commenter isn't and
> shall not be part of the baseGraph of the GraphNode returned by the jax-rs
> resource method, yet for rendering the comment-author infobox it might be
> beneficial to render a GarphNode with a baseGraph containing all of the
> information in the users profile-document
>

I think the described use case is plausible. Besides, I don't see that the
new service causes changes to any existing API (please correct me if this is
wrong) except adding new ones, e.g., in CallbackRendererImpl. Thus, existing
applications are not affected and other or new applications can benefit from
the new service.


> - With CLEREZZA-541 the GraphNodeService is accessed from TypeHandler, I
> posted a resolution to this issue because it was already quite there on my
> local machine when Herny reopened CLEREZZA-540, to respect the reopening of
> the issue I didn't mark the dependent issue as resolved. I will of course
> revert the changes if requested to do so by a qualifying -1.
>
> I'm not arguing that my patches solve all issues one might have around
> getting resource descriptions but I do think it is very valuable and to
> allow to base other stuff on this service I would like the issue to be
> closed. As Henry reopened the issue twice and I don't want to close the
> issue again without a broader discussion. Yet as many thing depend on the
> issue leaving it open doesn't seem an option to me.
>
> Future enhancement might include:
> - manually force refresh of caches for graphs related to a requested
> resource
> - force an alternative set of baseGraphs to be used (e.g. Only local or
> only
> remote sources)
>

Just eager to know: will the second enhancement help Henry to solve his
problem that could be solved with the "old" webproxy code?


> So I'm asking you to kindly review the proposed code and vote about closing
> CLEREZZA-540
>
> [ ] +1, I agree with accepting the proposed code into trunk
> [ ] 0, I don't care
> [ ] -1, I don't want this code in trunk (must specify a technical
> explanation, please also specify what would have to be changed for the
> patch
> to be acceptable to you)
>
> Cheers,
> Reto
>

Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Reto Bachmann-Gmuer <re...@trialox.org>.

+1

On Thu, May 26, 2011 at 8:31 PM, Reto Bachmann-Gmuer <
reto.bachmann@trialox.org> wrote:

> With CLEREZZA-540 I suggest a GraphNodeProvider-Service that returns a
> GraphNode given a named resource. Mainly this code that used to be in
> DiscoBitsTypeHandler which has been generalized.
>
> The issue is described as:
> "Implement a platform service that returns GraphNodes for URIs. The
> GraphNode is the resource identified by that uri with as BaseGraph sources
> considered authoritative for that resource. "
>
> Of course "considered authoritative" it not a very sharp description. The
> issue is labeled with "platform" which implies it is not a generic utility
> of clerezza.rdf but that it relies on platform default graphs.
>
> The solution proposed in commit #1125477<http://svn.apache.org/viewvc?view=rev&rev=1125477>and
> #1125652 <http://svn.apache.org/viewvc?view=rev&rev=1125652> sets the
> basegarph as follows:
> - always trust the content graph
> - for remote resource trust the graph you get by dereferencing the uri
> - for resources in the user-uri space trust that user
>
> This might not match an intuitive understanding of "authoritative" and I'm
> happy to redefine the issue so that no confusion arises.
>
> What I do strongly believe is that the proposed patch offers a major and
> very useful new functionality. Especially as it allows the following
> features to be implemented:
> - Thanks to CLEREZZA-544 one can call the render-method to delegate the
> rendering of resources with a UriRef instead of a resource, in this case the
> resource is rendered using its own baseGraph rather than the one of the
> calling template. An example usecase for this is rendering the author of a
> comment, the whole profile of the (possibly remote) commenter isn't and
> shall not be part of the baseGraph of the GraphNode returned by the jax-rs
> resource method, yet for rendering the comment-author infobox it might be
> beneficial to render a GarphNode with a baseGraph containing all of the
> information in the users profile-document
> - With CLEREZZA-541 the GraphNodeService is accessed from TypeHandler, I
> posted a resolution to this issue because it was already quite there on my
> local machine when Herny reopened CLEREZZA-540, to respect the reopening of
> the issue I didn't mark the dependent issue as resolved. I will of course
> revert the changes if requested to do so by a qualifying -1.
>
> I'm not arguing that my patches solve all issues one might have around
> getting resource descriptions but I do think it is very valuable and to
> allow to base other stuff on this service I would like the issue to be
> closed. As Henry reopened the issue twice and I don't want to close the
> issue again without a broader discussion. Yet as many thing depend on the
> issue leaving it open doesn't seem an option to me.
>
> Future enhancement might include:
> - manually force refresh of caches for graphs related to a requested
> resource
> - force an alternative set of baseGraphs to be used (e.g. Only local or
> only remote sources)
>
> So I'm asking you to kindly review the proposed code and vote about closing
> CLEREZZA-540
>
> [ ] +1, I agree with accepting the proposed code into trunk
> [ ] 0, I don't care
> [ ] -1, I don't want this code in trunk (must specify a technical
> explanation, please also specify what would have to be changed for the patch
> to be acceptable to you)
>
> Cheers,
> Reto
>

Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Tsuyoshi Ito <ts...@trialox.org>.

Hi Reto

Thank you for the detailed explanations.

Cheers
Tsuy


On Wed, Jun 1, 2011 at 2:08 AM, Reto Bachmann-Gmuer
<re...@trialox.org> wrote:
> On Sun, May 29, 2011 at 8:16 PM, Henry Story <he...@bblfish.net>wrote:
>
>>
>> On 28 May 2011, at 08:55, Tsuyoshi Ito wrote:
>>
>> > Dear all
>> >
>> > If no existing API will be changed:
>>
>> Well the closing of 540 would also officially close 544 (which is still
>> closed waiting for 540 to be closed
>> even though officially this is not a legal Apache maneuvre).
>
> No, CLEREZZA-544 is not closed. I explained why I already committed code for
> that issue. But closing the issue this thread is about, would not prevent
> you from vetoing against 544.
>
>
>> And 544 does changes quite an important api.
>>
> It adds a method, it doesn't change neither signature nor behaviour.
>
>>
>> It adds a new method to CallbackRenderer
>>
>>  public void render(UriRef resource, GraphNode context, String mode,
>>                     OutputStream os) throws IOException;
>>
>> I am not against such an addition were it not for the way GraphNodeProvider
>> is implemented currently.
>>
>
> [...]
>
>>
>>
>> Now this means that the GraphNodeProvider is not just a package for some of
>> Reto's pet projects, but
>> will be central to the working of Clerezza.  In which case the issue of the
>> efficiency of it and the
>> decisions it makes on what is authoritative should be considered more
>> intently it seems to me.
>>
> I think that the Trunk is not a place for Pet projects anyway.
>
> Reto
>

Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Reto Bachmann-Gmuer <re...@trialox.org>.

On Sun, May 29, 2011 at 8:16 PM, Henry Story <he...@bblfish.net>wrote:

>
> On 28 May 2011, at 08:55, Tsuyoshi Ito wrote:
>
> > Dear all
> >
> > If no existing API will be changed:
>
> Well the closing of 540 would also officially close 544 (which is still
> closed waiting for 540 to be closed
> even though officially this is not a legal Apache maneuvre).

No, CLEREZZA-544 is not closed. I explained why I already committed code for
that issue. But closing the issue this thread is about would not prevent
you from vetoing 544.


> And 544 does changes quite an important api.
>
It adds a method; it changes neither signature nor behaviour.

>
> It adds a new method to CallbackRenderer
>
>  public void render(UriRef resource, GraphNode context, String mode,
>                     OutputStream os) throws IOException;
>
> I am not against such an addition were it not for the way GraphNodeProvider
> is implemented currently.
>

[...]

>
>
> Now this means that the GraphNodeProvider is not just a package for some of
> Reto's pet projects, but
> will be central to the working of Clerezza.  In which case the issue of the
> efficiency of it and the
> decisions it makes on what is authoritative should be considered more
> intently it seems to me.
>
I think that trunk is not a place for pet projects anyway.

Reto

Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Henry Story <he...@bblfish.net>.

On 28 May 2011, at 08:55, Tsuyoshi Ito wrote:

> Dear all
> 
> If no existing API will be changed:

Well, the closing of 540 would also officially close 544 (which is still closed waiting for 540 to be closed
even though officially this is not a legal Apache manoeuvre). And 544 does change quite an important API.

It adds a new method to CallbackRenderer

  public void render(UriRef resource, GraphNode context, String mode,
                     OutputStream os) throws IOException;

I am not against such an addition were it not for the way GraphNodeProvider is implemented currently.
So if you look you will find that CallbackRendererImpl.java  has implemented the above method like this

+    @Override
+    public void render(final UriRef resource, GraphNode context, String mode,
+            OutputStream os) throws IOException {
+        final GraphNode resourceNode = AccessController.doPrivileged( new PrivilegedAction<GraphNode>() {
+                    @Override
+                    public GraphNode run() {
+                        return graphNodeProvider.get(resource);
+                    }
+                });
+        render(resourceNode, context, mode, os);
+    }

Now this means that the GraphNodeProvider is not just a package for some of Reto's pet projects, but
will be central to the working of Clerezza. In that case, it seems to me that its efficiency and the
decisions it makes on what is authoritative should be considered more carefully.
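
To make the dependency concrete, here is a minimal self-contained sketch of the delegation shown in the diff above. It is not the actual Clerezza code: UriRef, GraphNode, GraphNodeProvider and CallbackRenderer are reduced to hypothetical stand-ins, the "infobox" mode and the base graph label are invented for illustration, and the provider is a stub. The only point it shows is that every call to the UriRef variant of render goes through whatever GraphNodeProvider decides.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.AccessController;
import java.security.PrivilegedAction;

public class RenderSketch {

    /** Hypothetical stand-in for Clerezza's UriRef. */
    static final class UriRef {
        final String uri;
        UriRef(String uri) { this.uri = uri; }
    }

    /** Hypothetical stand-in for GraphNode: a resource plus a label for its base graph. */
    static final class GraphNode {
        final UriRef node;
        final String baseGraphLabel;
        GraphNode(UriRef node, String baseGraphLabel) {
            this.node = node;
            this.baseGraphLabel = baseGraphLabel;
        }
    }

    /** Assumed shape of the service: it alone decides which base graph a resource gets. */
    interface GraphNodeProvider {
        GraphNode get(UriRef resource);
    }

    /** Reduced to the two render variants discussed in this thread. */
    interface CallbackRenderer {
        void render(GraphNode resource, GraphNode context, String mode,
                    OutputStream os) throws IOException;
        void render(UriRef resource, GraphNode context, String mode,
                    OutputStream os) throws IOException;
    }

    static final class SketchRenderer implements CallbackRenderer {
        private final GraphNodeProvider graphNodeProvider;

        SketchRenderer(GraphNodeProvider graphNodeProvider) {
            this.graphNodeProvider = graphNodeProvider;
        }

        @Override
        public void render(GraphNode resource, GraphNode context, String mode,
                           OutputStream os) throws IOException {
            // Placeholder rendering: just report which base graph was used.
            String line = mode + ": <" + resource.node.uri + "> rendered with "
                    + resource.baseGraphLabel + "\n";
            os.write(line.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        @SuppressWarnings("removal") // AccessController is deprecated in recent JDKs
        public void render(final UriRef resource, GraphNode context, String mode,
                           OutputStream os) throws IOException {
            // Same pattern as in the diff: privileged lookup, then delegate.
            final GraphNode resourceNode = AccessController.doPrivileged(
                    new PrivilegedAction<GraphNode>() {
                        @Override
                        public GraphNode run() {
                            return graphNodeProvider.get(resource);
                        }
                    });
            render(resourceNode, context, mode, os);
        }
    }

    /** Renders the given URI through a stub provider and returns the output. */
    static String demo(String uri) {
        GraphNodeProvider provider =
                resource -> new GraphNode(resource, "provider-chosen base graph");
        CallbackRenderer renderer = new SketchRenderer(provider);
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try {
            renderer.render(new UriRef(uri), null, "infobox", os);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(os.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.print(demo("http://example.org/profile#me"));
    }
}
```

In this sketch the template that calls render with a UriRef has no say over the base graph; swapping the stub provider for a slow or overly trusting one changes every such call, which is why the provider's behaviour matters here.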

Henry

> 
> +1
> 
> Cheers
> Tsuy

Social Web Architect
http://bblfish.net/

Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Tsuyoshi Ito <ts...@trialox.org>.

Dear all

If no existing API will be changed:

+1

Cheers
Tsuy

On Fri, May 27, 2011 at 3:15 PM, Manuel Innerhofer <s0...@gmail.com> wrote:
> +1
>
> Cheers,
> Manuel
>
> On Thu, May 26, 2011 at 8:31 PM, Reto Bachmann-Gmuer <
> reto.bachmann@trialox.org> wrote:
>
>> With CLEREZZA-540 I suggest a GraphNodeProvider-Service that returns a
>> GraphNode given a named resource. This is mainly code that used to be in
>> DiscoBitsTypeHandler and has been generalized.
>>
>> The issue is described as:
>> "Implement a platform service that returns GraphNodes for URIs. The
>> GraphNode is the resource identified by that uri with as BaseGraph sources
>> considered authoritative for that resource. "
>>
>> Of course "considered authoritative" is not a very sharp description. The
>> issue is labeled with "platform" which implies it is not a generic utility
>> of clerezza.rdf but that it relies on platform default graphs.
>>
>> The solution proposed in commits
>> #1125477 <http://svn.apache.org/viewvc?view=rev&rev=1125477> and
>> #1125652 <http://svn.apache.org/viewvc?view=rev&rev=1125652> sets the
>> baseGraph as follows:
>> - always trust the content graph
>> - for remote resources trust the graph you get by dereferencing the URI
>> - for resources in the user-uri space trust that user
>>
>> This might not match an intuitive understanding of "authoritative" and I'm
>> happy to redefine the issue so that no confusion arises.
>>
>> What I do strongly believe is that the proposed patch offers a major and
>> very useful new functionality. Especially as it allows the following
>> features to be implemented:
>> - Thanks to CLEREZZA-544 one can call the render-method to delegate the
>> rendering of resources with a UriRef instead of a resource. In this case
>> the resource is rendered using its own baseGraph rather than the one of the
>> calling template. An example use case for this is rendering the author of a
>> comment: the whole profile of the (possibly remote) commenter isn't and
>> shall not be part of the baseGraph of the GraphNode returned by the jax-rs
>> resource method, yet for rendering the comment-author infobox it might be
>> beneficial to render a GraphNode with a baseGraph containing all of the
>> information in the user's profile document.
>> - With CLEREZZA-541 the GraphNodeService is accessed from TypeHandler. I
>> posted a resolution to this issue because it was already nearly done on my
>> local machine when Henry reopened CLEREZZA-540; to respect the reopening of
>> the issue I didn't mark the dependent issue as resolved. I will of course
>> revert the changes if requested to do so by a qualifying -1.
>>
>> I'm not arguing that my patches solve all issues one might have around
>> getting resource descriptions, but I do think the service is very valuable,
>> and to allow other work to be based on it I would like the issue to be
>> closed. As Henry reopened the issue twice, I don't want to close the
>> issue again without a broader discussion. Yet as many things depend on the
>> issue, leaving it open doesn't seem an option to me.
>>
>> Future enhancements might include:
>> - manually forcing a refresh of caches for graphs related to a requested
>> resource
>> - forcing an alternative set of baseGraphs to be used (e.g. only local or
>> only remote sources)
>>
>> So I'm asking you to kindly review the proposed code and vote on closing
>> CLEREZZA-540.
>>
>> [ ] +1, I agree with accepting the proposed code into trunk
>> [ ] 0, I don't care
>> [ ] -1, I don't want this code in trunk (must specify a technical
>> explanation, please also specify what would have to be changed for the
>> patch to be acceptable to you)
>>
>> Cheers,
>> Reto
>>
>



-- 
--trialox ag--------------------------------------

Tsuyoshi Ito
Binzmuehlestrasse 14
CH-8050 Zürich
Tel. +41 44 635 75 77
URL: http://trialox.org

Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Posted by Manuel Innerhofer <s0...@gmail.com>.

+1

Cheers,
Manuel

On Thu, May 26, 2011 at 8:31 PM, Reto Bachmann-Gmuer <
reto.bachmann@trialox.org> wrote:

> With CLEREZZA-540 I suggest a GraphNodeProvider-Service that returns a
> GraphNode given a named resource. This is mainly code that used to be in
> DiscoBitsTypeHandler and has been generalized.
>
> The issue is described as:
> "Implement a platform service that returns GraphNodes for URIs. The
> GraphNode is the resource identified by that uri with as BaseGraph sources
> considered authoritative for that resource. "
>
> Of course "considered authoritative" is not a very sharp description. The
> issue is labeled with "platform" which implies it is not a generic utility
> of clerezza.rdf but that it relies on platform default graphs.
>
> The solution proposed in commits
> #1125477 <http://svn.apache.org/viewvc?view=rev&rev=1125477> and
> #1125652 <http://svn.apache.org/viewvc?view=rev&rev=1125652> sets the
> baseGraph as follows:
> - always trust the content graph
> - for remote resources trust the graph you get by dereferencing the URI
> - for resources in the user-uri space trust that user
>
> This might not match an intuitive understanding of "authoritative" and I'm
> happy to redefine the issue so that no confusion arises.
>
> What I do strongly believe is that the proposed patch offers a major and
> very useful new functionality. Especially as it allows the following
> features to be implemented:
> - Thanks to CLEREZZA-544 one can call the render-method to delegate the
> rendering of resources with a UriRef instead of a resource. In this case
> the resource is rendered using its own baseGraph rather than the one of the
> calling template. An example use case for this is rendering the author of a
> comment: the whole profile of the (possibly remote) commenter isn't and
> shall not be part of the baseGraph of the GraphNode returned by the jax-rs
> resource method, yet for rendering the comment-author infobox it might be
> beneficial to render a GraphNode with a baseGraph containing all of the
> information in the user's profile document.
> - With CLEREZZA-541 the GraphNodeService is accessed from TypeHandler. I
> posted a resolution to this issue because it was already nearly done on my
> local machine when Henry reopened CLEREZZA-540; to respect the reopening of
> the issue I didn't mark the dependent issue as resolved. I will of course
> revert the changes if requested to do so by a qualifying -1.
>
> I'm not arguing that my patches solve all issues one might have around
> getting resource descriptions, but I do think the service is very valuable,
> and to allow other work to be based on it I would like the issue to be
> closed. As Henry reopened the issue twice, I don't want to close the
> issue again without a broader discussion. Yet as many things depend on the
> issue, leaving it open doesn't seem an option to me.
>
> Future enhancements might include:
> - manually forcing a refresh of caches for graphs related to a requested
> resource
> - forcing an alternative set of baseGraphs to be used (e.g. only local or
> only remote sources)
>
> So I'm asking you to kindly review the proposed code and vote on closing
> CLEREZZA-540.
>
> [ ] +1, I agree with accepting the proposed code into trunk
> [ ] 0, I don't care
> [ ] -1, I don't want this code in trunk (must specify a technical
> explanation, please also specify what would have to be changed for the
> patch to be acceptable to you)
>
> Cheers,
> Reto
>