Posted to dev@sling.apache.org by Dishara Wijewardana <dd...@gmail.com> on 2013/06/18 21:49:02 UTC

[Status Update] Apache Cassandra backend for Sling

Hi Ian,
I am starting this thread to keep track of milestone status updates and
related discussions for the GSoC project.
As per the GSoC proposal, an overview of the first task is as follows.

1. Implementing a CassandraResourceProvider  to READ from Cassandra.
Implementation Details [1]



[1] : Implementation Details:

1.A) Write a CassandraResourceProviderUtil, which is basically a Cassandra
client that will facilitate all Cassandra-related operations required by
other modules (CassandraResourceProvider and CassandraResourceResolver).

1.B) Implementation of  CassandraResourceProvider

1.C)  Implementation of CassandraResourceResolver

1.D) Implementation of CassandraResource


I will start by writing the CassandraResourceProviderUtil class, which will
do basic add and get operations using the Hector API. Please provide any
feedback that would be useful for accomplishing this task.
For this, how should path mapping be done? For example, the path of a
Cassandra node will not be the same as the JCR node path, i.e. the provider
will be asked for a node path such as /system/myapps/test/foo; where in
Cassandra should we return it from? Don't we have to consider the WRITE
aspect of Cassandra first?
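
To make the path-mapping question concrete, here is a minimal sketch of one
possible scheme; it is not the mapping the project settled on. It assumes the
provider is mounted at a hypothetical root (/content/cassandra), that the first
path segment below that root names a column family, and that the full Sling
path is used as the row key. The class name, the Location type and the root are
all made up for illustration.

```java
import java.util.Optional;

/** Hypothetical mapper from Sling resource paths to Cassandra locations. */
final class CassandraPathMapperSketch {

    /** Assumed storage location: a column family plus a row key. */
    static final class Location {
        final String columnFamily;
        final String rowKey;

        Location(String columnFamily, String rowKey) {
            this.columnFamily = columnFamily;
            this.rowKey = rowKey;
        }
    }

    // Assumed mount point of the provider in the resource tree.
    private final String providerRoot;

    CassandraPathMapperSketch(String providerRoot) {
        this.providerRoot = providerRoot;
    }

    /** Maps a Sling path to a location, or returns empty if the path is not ours. */
    Optional<Location> map(String slingPath) {
        String prefix = providerRoot + "/";
        if (slingPath == null || !slingPath.startsWith(prefix)) {
            return Optional.empty();
        }
        String relative = slingPath.substring(prefix.length());
        int slash = relative.indexOf('/');
        // The first segment below the root is treated as the column family.
        String columnFamily = slash < 0 ? relative : relative.substring(0, slash);
        if (columnFamily.isEmpty()) {
            return Optional.empty();
        }
        // The full Sling path doubles as the row key in this sketch.
        return Optional.of(new Location(columnFamily, slingPath));
    }

    public static void main(String[] args) {
        CassandraPathMapperSketch mapper =
                new CassandraPathMapperSketch("/content/cassandra");
        mapper.map("/content/cassandra/movies/xmen").ifPresent(loc ->
                System.out.println(loc.columnFamily + " -> " + loc.rowKey));
    }
}
```

With a scheme like this, a path outside the mount point (for example
/system/myapps/test/foo) simply yields no mapping, so the provider can decline
it, and the WRITE side only has to store rows under the same keys.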


-- 
Thanks
/Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Ian Boston <ie...@tfd.co.uk>.
Hi,
I'll start a new thread on this for you.
Ian


Re: [Status Update] Apache Cassandra backend for Sling

Posted by Dishara Wijewardana <dd...@gmail.com>.
On Mon, Aug 5, 2013 at 5:50 AM, Dishara Wijewardana <ddwijewardana@gmail.com> wrote:

> Can you explain in a  bit more detail how  I should map a cassandra
> resources from sling browser  http://localhost:8080/.explorer.html ? . I
> can add different JCR node types. How my cassandra node type get registered
> here. Does it load all resource providers from reflection when I
> implemented the provider interface.
>

Hi Ian,
Could you please respond to the question above? It relates to this thread,
and I thought you might have missed it.





-- 
Thanks
/Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Dishara Wijewardana <dd...@gmail.com>.
On Tue, Jul 23, 2013 at 5:32 PM, Dishara Wijewardana <
ddwijewardana@gmail.com> wrote:

>
>
> On Mon, Jul 22, 2013 at 1:56 PM, Ian Boston <ie...@tfd.co.uk> wrote:
>
>> Hi Dishara,
>>
>> The Unit test coverage sounds great. I will pull the code and review
>> today.
>>
>>
>> Have you tried loading the bundle into a running Sling instance ?
>
> Not yet. I will try that out and let you know in one of following ways
> Thanks.
>
>>
>> Once you have built it you can load in 2 ways:
>>
>> method A
>>
>> mvn clean install sling:install
>>
>> The sling:install will post the Jar into the OSGi container over HTTP and
>> cause it to start.
>>
> Hi Ian

I started the Sling launchpad and the Cassandra server, and uploaded the
bundle through the management console at http://localhost:8080/system/console.
When I try to start the bundle, nothing happens and there are no error logs
in the back end either. Then I noticed the following: the me.prettyprint.*
classes cannot be loaded, and hence my bundle cannot start. As built, my
bundle contains only itself, not its dependencies. I feel one option is to
get a me.prettyprint.* jar and install it as a bundle (I am not sure whether
that would also fail for a similar reason). What is the best approach to
this?


Symbolic Name: org.apache.sling.cassandra
Version: 0.0.1.SNAPSHOT
Bundle Location: inputstream:org.apache.sling.cassandra-0.0.1-SNAPSHOT.jar
Last Modification: Mon Aug 05 05:26:16 IST 2013
Description: Provides a ResourceProvider implementation supporting Apache
  Cassandra based resources.
Start Level: 20

Exported Packages:
  org.apache.sling.cassandra.resource.provider,version=0.0.1.SNAPSHOT
  org.apache.sling.cassandra.resource.provider.mapper,version=0.0.1.SNAPSHOT
  org.apache.sling.cassandra.resource.provider.util,version=0.0.1.SNAPSHOT

Imported Packages:
  com.sun.org.apache.xerces.internal.impl.dv.util -- Cannot be resolved and
    overwritten by Boot Delegation
  javax.servlet.http from org.apache.felix.http.jetty (1)
  me.prettyprint.cassandra.model,version=[1.0,2) -- Cannot be resolved
  me.prettyprint.cassandra.serializers,version=[1.0,2) -- Cannot be resolved
  me.prettyprint.hector.api,version=[1.0,2) -- Cannot be resolved
  me.prettyprint.hector.api.beans,version=[1.0,2) -- Cannot be resolved
  me.prettyprint.hector.api.ddl,version=[1.0,2) -- Cannot be resolved
  me.prettyprint.hector.api.factory,version=[1.0,2) -- Cannot be resolved
  me.prettyprint.hector.api.query,version=[1.0,2) -- Cannot be resolved
  org.apache.sling.api.resource,version=[2.3,3) from org.apache.sling.api (98)
  org.slf4j,version=[1.5,2) from slf4j.api (6)

Manifest Headers:
  Bnd-LastModified: 1375238009819
  Build-Jdk: 1.6.0_26
  Built-By: dishara
  Bundle-Description: Provides a ResourceProvider implementation supporting
    Apache Cassandra based resources.
  Bundle-ManifestVersion: 2
  Bundle-Name: Apache Sling Cassandra Resource Provider
  Bundle-SymbolicName: org.apache.sling.cassandra
  Bundle-Version: 0.0.1.SNAPSHOT
  Created-By: Apache Maven Bundle Plugin
  Export-Package: org.apache.sling.cassandra.resource.provider.mapper;
    version="0.0.1.SNAPSHOT", org.apache.sling.cassandra.resource.provider;
    uses:="javax.servlet.http, me.prettyprint.hector.api,
    org.apache.sling.api.resource,
    org.apache.sling.cassandra.resource.provider.mapper";
    version="0.0.1.SNAPSHOT",
    org.apache.sling.cassandra.resource.provider.util;
    uses:="me.prettyprint.cassandra.model,
    me.prettyprint.cassandra.serializers, me.prettyprint.hector.api,
    me.prettyprint.hector.api.query, org.apache.sling.api.resource,
    org.apache.sling.cassandra.resource.provider"; version="0.0.1.SNAPSHOT"
  Import-Package: com.sun.org.apache.xerces.internal.impl.dv.util,
    javax.servlet.http, me.prettyprint.cassandra.model; version="[1.0, 2)",
    me.prettyprint.cassandra.serializers; version="[1.0, 2)",
    me.prettyprint.hector.api; version="[1.0, 2)",
    me.prettyprint.hector.api.beans; version="[1.0, 2)",
    me.prettyprint.hector.api.ddl; version="[1.0, 2)",
    me.prettyprint.hector.api.factory; version="[1.0, 2)",
    me.prettyprint.hector.api.query; version="[1.0, 2)",
    org.apache.sling.api.resource; version="[2.3, 3)", org.slf4j;
    version="[1.5, 2)"
  Manifest-Version: 1.0
  Tool: Bnd-2.1.0.20130426-122213



>
>>
>> method B
>>
>> Goto http://localhost:8080/system/console
>>
>> select the bundle tab and install the bundle by uploading.
>>
>> If you monitor the logs you should see no errors, the bundle should
>> register and you should be able to map cassandra read only to somewhere in
>> the resource tree.
>>
>
Can you explain in a bit more detail how I should map Cassandra
resources from the Sling browser at http://localhost:8080/.explorer.html?
I can add different JCR node types. How does my Cassandra node type get
registered here? Does Sling load all resource providers by reflection once I
have implemented the provider interface?
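
As background for the question above: to my understanding, a Sling
ResourceProvider is not found by reflection or by a JCR node type. It is
registered as an OSGi service with a "provider.roots" service property, and the
resource resolver hands a path to the provider whose root matches it. The
sketch below is only a simplified, stdlib-only model of that root lookup, not
Sling's actual code; the class name, the provider names and the
/content/cassandra root are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified, hypothetical model of choosing a provider by its root path. */
final class ProviderRootLookupSketch {

    // root path -> provider name; stands in for OSGi services that were
    // registered with a "provider.roots" property.
    private final Map<String, String> providersByRoot = new LinkedHashMap<>();

    void register(String root, String providerName) {
        providersByRoot.put(root, providerName);
    }

    /** Returns the provider whose root is the longest prefix of the path, or null. */
    String providerFor(String path) {
        String bestRoot = null;
        for (String root : providersByRoot.keySet()) {
            // A root matches the path itself or anything below it; "/" matches all.
            boolean matches = root.equals("/")
                    || path.equals(root)
                    || path.startsWith(root + "/");
            if (matches && (bestRoot == null || root.length() > bestRoot.length())) {
                bestRoot = root;
            }
        }
        return bestRoot == null ? null : providersByRoot.get(bestRoot);
    }

    public static void main(String[] args) {
        ProviderRootLookupSketch lookup = new ProviderRootLookupSketch();
        lookup.register("/", "jcr");
        lookup.register("/content/cassandra", "cassandra");
        System.out.println(lookup.providerFor("/content/cassandra/movies/xmen"));
        System.out.println(lookup.providerFor("/apps/myapp"));
    }
}
```

Under this model, "mapping Cassandra into the resource tree" amounts to
choosing the root under which the provider is registered; paths outside that
root keep going to whichever provider owns them.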





>
>> Best Regards
>> Ian
>>
>>
>> On 21 July 2013 13:08, Dishara Wijewardana <dd...@gmail.com>
>> wrote:
>>
>> > On Sun, Jul 21, 2013 at 5:32 PM, Dishara Wijewardana <
>> > ddwijewardana@gmail.com> wrote:
>> >
>> > >
>> > >
>> > > On Fri, Jul 19, 2013 at 2:35 PM, Ian Boston <ie...@tfd.co.uk> wrote:
>> > >
>> > >> Hi,
>> > >>
>> > >>
>> > >> On 19 July 2013 03:20, Dishara Wijewardana <dd...@gmail.com>
>> > >> wrote:
>> > >>
>> > >> > Hi Ian
>> > >> > This is regarding the sub tasks completion  of the project
>> according
>> > to
>> > >> the
>> > >> > time line.
>> > >> >
>> > >> > I have my code locally, but yet to do some completion with some
>> > >> stuff(one
>> > >> > of them is the ongoing discussion on listChildren).
>> > >> > After that from API point of view implementation around Cassandra
>> > >> Resource
>> > >> > Provider and Resource will be finish. And I have to add some more
>> > JUnit
>> > >> > tests(have local code in to some extent already, will commit them
>> once
>> > >> all
>> > >> > done around this).
>> > >> >
>> > >> > So after that (after mid term)  what I have to do is  enhance the
>> > >> provider
>> > >> > implementation  to  do READ operations with access control. My
>> idea is
>> > >> to
>> > >> > finish that also before the mid term. I was kind of got stuck in
>> some
>> > >> OSGi
>> > >> > stuff last days ;-).  And I will make sure I will have JUnit tests
>> to
>> > >> cover
>> > >> > all the implementations before the midterm.
>> > >> >
>> > >>
>> > >> Yes that is fine.
>> > >> Before adding ACLs I would like to have the code running inside
>> Sling,
>> > >> inside OSGi connected to a Cassandra instance so that we can do some
>> > basic
>> > >> tests over http using Curl.
>> > >>
>> > >> Perhaps you are there already? Let me know when you are ready and
>> I'll
>> > >> give
>> > >> it a go ?
>> > >>
>> > >
>> > > Hi Ian,
>> > > That is great and +1. In fact now the code is there and I have
>> completed,
>> > > rest of the implementation on normal READ and commited.
>> > >
>> > > I have added 3 more tests to cover the core implementation.  The tests
>> > are
>> > > running which covers add/read nodes, list children,iterate/iterable
>> > > children and get parent related stuff.
>> > >
>> >
>> > Correction . It should be 4 more tests. Now we have 5 tests all together
>> > which covers above aspects. We should write more test around Cassandra
>> > Resource. Will work on that as well.
>> >
>> > Please let me know if any issues come across when you run this in the
>> sling
>> > container. Excited to see whether it works in there :-).
>> >
>> >
>> > > And I have made the changes you mentioned. So now resources will not
>> get
>> > > loaded unless we do a API call with the resource.
>> > >
>> > >
>> > >>
>> > >>
>> > >> > If you would like this approach (or unless please provide your
>> > feedback
>> > >> on
>> > >> > what needs to be done before midterm), please advice me on how to
>> > >> approach
>> > >> > on READ with access control. Can I do it and test it with keeping
>> my
>> > >> code
>> > >> > in Google Code still ?  And please add if I have missed anything.
>> > >> >
>> > >> > On Thu, Jul 4, 2013 at 11:13 PM, Dishara Wijewardana <
>> > >> > ddwijewardana@gmail.com> wrote:
>> > >> >
>> > >> > > Hi Ian,
>> > >> > > I have refactored almost all the code review changes requested.
>> In
>> > the
>> > >> > > process of the rest of the implementation.
>> > >> > >
>> > >> > > On Wed, Jul 3, 2013 at 12:52 PM, Bertrand Delacretaz <
>> > >> > > bdelacretaz@apache.org> wrote:
>> > >> > >
>> > >> > >> Hi Dishara,
>> > >> > >>
>> > >> > >> On Tue, Jul 2, 2013 at 6:51 PM, Dishara Wijewardana
>> > >> > >> <dd...@gmail.com> wrote:
>> > >> > >> > ...Does each bundle in sling made to run their junit tests
>> > >> separately
>> > >> > >> at build
>> > >> > >> > time...
>> > >> > >>
>> > >> > >> Could you start new threads with new subject lines when you
>> start a
>> > >> > >> new question or discussion?
>> > >> > >>
>> > >> > >> Otherwise it's very hard to find things in our mailing list
>> > archives.
>> > >> > >>
>> > >> > >> OK. That's a good idea. I will follow that.
>> > >> > >
>> > >> > >
>> > >> > >> Thanks, and keep up the good work!
>> > >> > >> -Bertrand
>> > >> > >>
>> > >> > >
>> > >> > >
>> > >> > >
>> > >> > > --
>> > >> > > Thanks
>> > >> > > /Dishara
>> > >> > >
>> > >> >
>> > >> >
>> > >> >
>> > >> > --
>> > >> > Thanks
>> > >> > /Dishara
>> > >> >
>> > >>
>> > >
>> > >
>> > >
>> > > --
>> > > Thanks
>> > > /Dishara
>> > >
>> >
>> >
>> >
>> > --
>> > Thanks
>> > /Dishara
>> >
>>
>
>
>
> --
> Thanks
> /Dishara
>



-- 
Thanks
/Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Dishara Wijewardana <dd...@gmail.com>.
On Mon, Jul 22, 2013 at 1:56 PM, Ian Boston <ie...@tfd.co.uk> wrote:

> Hi Dishara,
>
> The Unit test coverage sounds great. I will pull the code and review today.
>
>
> Have you tried loading the bundle into a running Sling instance ?
>
Not yet. I will try it out in one of the following ways and let you know.
Thanks.

>
> Once you have built it you can load in 2 ways:
>
> method A
>
> mvn clean install sling:install
>
> The sling:install will post the Jar into the OSGi container over HTTP and
> cause it to start.
>
>
> method B
>
> Goto http://localhost:8080/system/console
>
> select the bundle tab and install the bundle by uploading.
>
> If you monitor the logs you should see no errors, the bundle should
> register and you should be able to map cassandra read only to somewhere in
> the resource tree.
>
> Best Regards
> Ian
>
>
> On 21 July 2013 13:08, Dishara Wijewardana <dd...@gmail.com>
> wrote:
>
> > On Sun, Jul 21, 2013 at 5:32 PM, Dishara Wijewardana <
> > ddwijewardana@gmail.com> wrote:
> >
> > >
> > >
> > > On Fri, Jul 19, 2013 at 2:35 PM, Ian Boston <ie...@tfd.co.uk> wrote:
> > >
> > >> Hi,
> > >>
> > >>
> > >> On 19 July 2013 03:20, Dishara Wijewardana <dd...@gmail.com>
> > >> wrote:
> > >>
> > >> > Hi Ian
> > >> > This is regarding the sub tasks completion  of the project according
> > to
> > >> the
> > >> > time line.
> > >> >
> > >> > I have my code locally, but yet to do some completion with some
> > >> stuff(one
> > >> > of them is the ongoing discussion on listChildren).
> > >> > After that from API point of view implementation around Cassandra
> > >> Resource
> > >> > Provider and Resource will be finish. And I have to add some more
> > JUnit
> > >> > tests(have local code in to some extent already, will commit them
> once
> > >> all
> > >> > done around this).
> > >> >
> > >> > So after that (after mid term)  what I have to do is  enhance the
> > >> provider
> > >> > implementation  to  do READ operations with access control. My idea
> is
> > >> to
> > >> > finish that also before the mid term. I was kind of got stuck in
> some
> > >> OSGi
> > >> > stuff last days ;-).  And I will make sure I will have JUnit tests
> to
> > >> cover
> > >> > all the implementations before the midterm.
> > >> >
> > >>
> > >> Yes that is fine.
> > >> Before adding ACLs I would like to have the code running inside Sling,
> > >> inside OSGi connected to a Cassandra instance so that we can do some
> > basic
> > >> tests over http using Curl.
> > >>
> > >> Perhaps you are there already? Let me know when you are ready and I'll
> > >> give
> > >> it a go ?
> > >>
> > >
> > > Hi Ian,
> > > That is great and +1. In fact now the code is there and I have
> completed,
> > > rest of the implementation on normal READ and commited.
> > >
> > > I have added 3 more tests to cover the core implementation.  The tests
> > are
> > > running which covers add/read nodes, list children,iterate/iterable
> > > children and get parent related stuff.
> > >
> >
> > Correction . It should be 4 more tests. Now we have 5 tests all together
> > which covers above aspects. We should write more test around Cassandra
> > Resource. Will work on that as well.
> >
> > Please let me know if any issues come across when you run this in the
> sling
> > container. Excited to see whether it works in there :-).
> >
> >
> > > And I have made the changes you mentioned. So now resources will not
> get
> > > loaded unless we do a API call with the resource.
> > >
> > >
> > >>
> > >>
> > >> > If you would like this approach (or unless please provide your
> > feedback
> > >> on
> > >> > what needs to be done before midterm), please advice me on how to
> > >> approach
> > >> > on READ with access control. Can I do it and test it with keeping my
> > >> code
> > >> > in Google Code still ?  And please add if I have missed anything.
> > >> >
> > >> > On Thu, Jul 4, 2013 at 11:13 PM, Dishara Wijewardana <
> > >> > ddwijewardana@gmail.com> wrote:
> > >> >
> > >> > > Hi Ian,
> > >> > > I have refactored almost all the code review changes requested. In
> > the
> > >> > > process of the rest of the implementation.
> > >> > >
> > >> > > On Wed, Jul 3, 2013 at 12:52 PM, Bertrand Delacretaz <
> > >> > > bdelacretaz@apache.org> wrote:
> > >> > >
> > >> > >> Hi Dishara,
> > >> > >>
> > >> > >> On Tue, Jul 2, 2013 at 6:51 PM, Dishara Wijewardana
> > >> > >> <dd...@gmail.com> wrote:
> > >> > >> > ...Does each bundle in sling made to run their junit tests
> > >> separately
> > >> > >> at build
> > >> > >> > time...
> > >> > >>
> > >> > >> Could you start new threads with new subject lines when you
> start a
> > >> > >> new question or discussion?
> > >> > >>
> > >> > >> Otherwise it's very hard to find things in our mailing list
> > archives.
> > >> > >>
> > >> > >> OK. That's a good idea. I will follow that.
> > >> > >
> > >> > >
> > >> > >> Thanks, and keep up the good work!
> > >> > >> -Bertrand
> > >> > >>
> > >> > >
> > >> > >
> > >> > >
> > >> > > --
> > >> > > Thanks
> > >> > > /Dishara
> > >> > >
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > Thanks
> > >> > /Dishara
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > > Thanks
> > > /Dishara
> > >
> >
> >
> >
> > --
> > Thanks
> > /Dishara
> >
>



-- 
Thanks
/Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Ian Boston <ie...@tfd.co.uk>.
Hi Dishara,

The Unit test coverage sounds great. I will pull the code and review today.


Have you tried loading the bundle into a running Sling instance ?

Once you have built it you can load it in two ways:

Method A

mvn clean install sling:install

The sling:install goal posts the JAR to the OSGi container over HTTP and
causes it to start.


Method B

Go to http://localhost:8080/system/console

Select the Bundles tab and install the bundle by uploading it.

If you monitor the logs you should see no errors, the bundle should
register, and you should be able to map Cassandra read-only to somewhere in
the resource tree.

Best Regards
Ian


On 21 July 2013 13:08, Dishara Wijewardana <dd...@gmail.com> wrote:

> On Sun, Jul 21, 2013 at 5:32 PM, Dishara Wijewardana <
> ddwijewardana@gmail.com> wrote:
>
> >
> >
> > On Fri, Jul 19, 2013 at 2:35 PM, Ian Boston <ie...@tfd.co.uk> wrote:
> >
> >> Hi,
> >>
> >>
> >> On 19 July 2013 03:20, Dishara Wijewardana <dd...@gmail.com>
> >> wrote:
> >>
> >> > Hi Ian
> >> > This is regarding the sub tasks completion  of the project according
> to
> >> the
> >> > time line.
> >> >
> >> > I have my code locally, but yet to do some completion with some
> >> stuff(one
> >> > of them is the ongoing discussion on listChildren).
> >> > After that from API point of view implementation around Cassandra
> >> Resource
> >> > Provider and Resource will be finish. And I have to add some more
> JUnit
> >> > tests(have local code in to some extent already, will commit them once
> >> all
> >> > done around this).
> >> >
> >> > So after that (after mid term)  what I have to do is  enhance the
> >> provider
> >> > implementation  to  do READ operations with access control. My idea is
> >> to
> >> > finish that also before the mid term. I was kind of got stuck in some
> >> OSGi
> >> > stuff last days ;-).  And I will make sure I will have JUnit tests to
> >> cover
> >> > all the implementations before the midterm.
> >> >
> >>
> >> Yes that is fine.
> >> Before adding ACLs I would like to have the code running inside Sling,
> >> inside OSGi connected to a Cassandra instance so that we can do some
> basic
> >> tests over http using Curl.
> >>
> >> Perhaps you are there already? Let me know when you are ready and I'll
> >> give
> >> it a go ?
> >>
> >
> > Hi Ian,
> > That is great and +1. In fact now the code is there and I have completed,
> > rest of the implementation on normal READ and commited.
> >
> > I have added 3 more tests to cover the core implementation.  The tests
> are
> > running which covers add/read nodes, list children,iterate/iterable
> > children and get parent related stuff.
> >
>
> Correction . It should be 4 more tests. Now we have 5 tests all together
> which covers above aspects. We should write more test around Cassandra
> Resource. Will work on that as well.
>
> Please let me know if any issues come across when you run this in the sling
> container. Excited to see whether it works in there :-).
>
>
> > And I have made the changes you mentioned. So now resources will not get
> > loaded unless we do a API call with the resource.
> >
> >
> >>
> >>
> >> > If you would like this approach (or unless please provide your
> feedback
> >> on
> >> > what needs to be done before midterm), please advice me on how to
> >> approach
> >> > on READ with access control. Can I do it and test it with keeping my
> >> code
> >> > in Google Code still ?  And please add if I have missed anything.
> >> >
> >> > On Thu, Jul 4, 2013 at 11:13 PM, Dishara Wijewardana <
> >> > ddwijewardana@gmail.com> wrote:
> >> >
> >> > > Hi Ian,
> >> > > I have refactored almost all the code review changes requested. In
> the
> >> > > process of the rest of the implementation.
> >> > >
> >> > > On Wed, Jul 3, 2013 at 12:52 PM, Bertrand Delacretaz <
> >> > > bdelacretaz@apache.org> wrote:
> >> > >
> >> > >> Hi Dishara,
> >> > >>
> >> > >> On Tue, Jul 2, 2013 at 6:51 PM, Dishara Wijewardana
> >> > >> <dd...@gmail.com> wrote:
> >> > >> > ...Does each bundle in sling made to run their junit tests
> >> separately
> >> > >> at build
> >> > >> > time...
> >> > >>
> >> > >> Could you start new threads with new subject lines when you start a
> >> > >> new question or discussion?
> >> > >>
> >> > >> Otherwise it's very hard to find things in our mailing list
> archives.
> >> > >>
> >> > >> OK. That's a good idea. I will follow that.
> >> > >
> >> > >
> >> > >> Thanks, and keep up the good work!
> >> > >> -Bertrand
> >> > >>

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Dishara Wijewardana <dd...@gmail.com>.
On Sun, Jul 21, 2013 at 5:32 PM, Dishara Wijewardana <
ddwijewardana@gmail.com> wrote:

>
>
> On Fri, Jul 19, 2013 at 2:35 PM, Ian Boston <ie...@tfd.co.uk> wrote:
>
>> Hi,
>>
>>
>> On 19 July 2013 03:20, Dishara Wijewardana <dd...@gmail.com>
>> wrote:
>>
>> > Hi Ian
>> > This is regarding the sub tasks completion  of the project according to
>> the
>> > time line.
>> >
>> > I have my code locally, but yet to do some completion with some
>> stuff(one
>> > of them is the ongoing discussion on listChildren).
>> > After that from API point of view implementation around Cassandra
>> Resource
>> > Provider and Resource will be finish. And I have to add some more JUnit
>> > tests(have local code in to some extent already, will commit them once
>> all
>> > done around this).
>> >
>> > So after that (after mid term)  what I have to do is  enhance the
>> provider
>> > implementation  to  do READ operations with access control. My idea is
>> to
>> > finish that also before the mid term. I was kind of got stuck in some
>> OSGi
>> > stuff last days ;-).  And I will make sure I will have JUnit tests to
>> cover
>> > all the implementations before the midterm.
>> >
>>
>> Yes that is fine.
>> Before adding ACLs I would like to have the code running inside Sling,
>> inside OSGi connected to a Cassandra instance so that we can do some basic
>> tests over http using Curl.
>>
>> Perhaps you are there already? Let me know when you are ready and I'll
>> give
>> it a go ?
>>
>
> Hi Ian,
> That is great and +1. In fact now the code is there and I have completed,
> rest of the implementation on normal READ and commited.
>
> I have added 3 more tests to cover the core implementation.  The tests are
> running which covers add/read nodes, list children,iterate/iterable
> children and get parent related stuff.
>

Correction: it should be 4 more tests. Now we have 5 tests altogether,
which cover the above aspects. We should write more tests around Cassandra
Resource. I will work on that as well.

Please let me know if you come across any issues when you run this in the
Sling container. Excited to see whether it works in there :-).


> And I have made the changes you mentioned. So now resources will not get
> loaded unless we do a API call with the resource.
>
>
>>
>>
>> > If you would like this approach (or unless please provide your feedback
>> on
>> > what needs to be done before midterm), please advice me on how to
>> approach
>> > on READ with access control. Can I do it and test it with keeping my
>> code
>> > in Google Code still ?  And please add if I have missed anything.
>> >
>> > On Thu, Jul 4, 2013 at 11:13 PM, Dishara Wijewardana <
>> > ddwijewardana@gmail.com> wrote:
>> >
>> > > Hi Ian,
>> > > I have refactored almost all the code review changes requested. In the
>> > > process of the rest of the implementation.
>> > >
>> > > On Wed, Jul 3, 2013 at 12:52 PM, Bertrand Delacretaz <
>> > > bdelacretaz@apache.org> wrote:
>> > >
>> > >> Hi Dishara,
>> > >>
>> > >> On Tue, Jul 2, 2013 at 6:51 PM, Dishara Wijewardana
>> > >> <dd...@gmail.com> wrote:
>> > >> > ...Does each bundle in sling made to run their junit tests
>> separately
>> > >> at build
>> > >> > time...
>> > >>
>> > >> Could you start new threads with new subject lines when you start a
>> > >> new question or discussion?
>> > >>
>> > >> Otherwise it's very hard to find things in our mailing list archives.
>> > >>
>> > >> OK. That's a good idea. I will follow that.
>> > >
>> > >
>> > >> Thanks, and keep up the good work!
>> > >> -Bertrand
>> > >>



-- 
Thanks
/Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Dishara Wijewardana <dd...@gmail.com>.
On Fri, Jul 19, 2013 at 2:35 PM, Ian Boston <ie...@tfd.co.uk> wrote:

> Hi,
>
>
> On 19 July 2013 03:20, Dishara Wijewardana <dd...@gmail.com>
> wrote:
>
> > Hi Ian
> > This is regarding the sub tasks completion  of the project according to
> the
> > time line.
> >
> > I have my code locally, but yet to do some completion with some stuff(one
> > of them is the ongoing discussion on listChildren).
> > After that from API point of view implementation around Cassandra
> Resource
> > Provider and Resource will be finish. And I have to add some more JUnit
> > tests(have local code in to some extent already, will commit them once
> all
> > done around this).
> >
> > So after that (after mid term)  what I have to do is  enhance the
> provider
> > implementation  to  do READ operations with access control. My idea is to
> > finish that also before the mid term. I was kind of got stuck in some
> OSGi
> > stuff last days ;-).  And I will make sure I will have JUnit tests to
> cover
> > all the implementations before the midterm.
> >
>
> Yes that is fine.
> Before adding ACLs I would like to have the code running inside Sling,
> inside OSGi connected to a Cassandra instance so that we can do some basic
> tests over http using Curl.
>
> Perhaps you are there already? Let me know when you are ready and I'll give
> it a go ?
>

Hi Ian,
That is great, and +1. In fact the code is now there: I have completed the
rest of the implementation of the normal READ and committed it.

I have added 3 more tests to cover the core implementation. The tests are
running and cover adding/reading nodes, listing children, iterating over
children (Iterator/Iterable) and getting the parent.
And I have made the changes you mentioned, so resources will now not get
loaded unless we make an API call on the resource.
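
The lazy loading described here can be sketched roughly as follows. This is only an illustration under assumptions: LazyResource and its loader function are hypothetical names, not classes from the project or from the Sling API, and the loader stands in for whatever call reads a row from Cassandra.

```java
import java.util.Map;
import java.util.function.Function;

/** Hypothetical resource that defers the backend read until a property is requested. */
final class LazyResource {
    private final String path;
    // Stand-in for the Cassandra read; maps a path to its column name/value pairs.
    private final Function<String, Map<String, Object>> loader;
    private Map<String, Object> properties; // stays null until the first property access

    LazyResource(String path, Function<String, Map<String, Object>> loader) {
        this.path = path;
        this.loader = loader;
    }

    /** Returning the path never touches the backend. */
    String getPath() {
        return path;
    }

    /** The first call triggers exactly one load; later calls reuse the cached map. */
    synchronized Object getProperty(String name) {
        if (properties == null) {
            properties = loader.apply(path);
        }
        return properties.get(name);
    }

    synchronized boolean isLoaded() {
        return properties != null;
    }
}
```

Creating many such objects (for example while listing children) then costs no Cassandra reads until a caller actually asks one of them for a property.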


>
>
> > If you would like this approach (or unless please provide your feedback
> on
> > what needs to be done before midterm), please advice me on how to
> approach
> > on READ with access control. Can I do it and test it with keeping my code
> > in Google Code still ?  And please add if I have missed anything.
> >
> > On Thu, Jul 4, 2013 at 11:13 PM, Dishara Wijewardana <
> > ddwijewardana@gmail.com> wrote:
> >
> > > Hi Ian,
> > > I have refactored almost all the code review changes requested. In the
> > > process of the rest of the implementation.
> > >
> > > On Wed, Jul 3, 2013 at 12:52 PM, Bertrand Delacretaz <
> > > bdelacretaz@apache.org> wrote:
> > >
> > >> Hi Dishara,
> > >>
> > >> On Tue, Jul 2, 2013 at 6:51 PM, Dishara Wijewardana
> > >> <dd...@gmail.com> wrote:
> > >> > ...Does each bundle in sling made to run their junit tests
> separately
> > >> at build
> > >> > time...
> > >>
> > >> Could you start new threads with new subject lines when you start a
> > >> new question or discussion?
> > >>
> > >> Otherwise it's very hard to find things in our mailing list archives.
> > >>
> > >> OK. That's a good idea. I will follow that.
> > >
> > >
> > >> Thanks, and keep up the good work!
> > >> -Bertrand
> > >>
> > >
> > >
> > >
> > > --
> > > Thanks
> > > /Dishara
> > >
> >
> >
> >
> > --
> > Thanks
> > /Dishara
> >
>



-- 
Thanks
/Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Ian Boston <ie...@tfd.co.uk>.
Hi,


On 19 July 2013 03:20, Dishara Wijewardana <dd...@gmail.com> wrote:

> Hi Ian
> This is regarding the sub tasks completion  of the project according to the
> time line.
>
> I have my code locally, but yet to do some completion with some stuff(one
> of them is the ongoing discussion on listChildren).
> After that from API point of view implementation around Cassandra Resource
> Provider and Resource will be finish. And I have to add some more JUnit
> tests(have local code in to some extent already, will commit them once all
> done around this).
>
> So after that (after mid term)  what I have to do is  enhance the provider
> implementation  to  do READ operations with access control. My idea is to
> finish that also before the mid term. I was kind of got stuck in some OSGi
> stuff last days ;-).  And I will make sure I will have JUnit tests to cover
> all the implementations before the midterm.
>

Yes, that is fine.
Before adding ACLs I would like to have the code running inside Sling,
inside OSGi, connected to a Cassandra instance, so that we can do some basic
tests over HTTP using curl.

Perhaps you are there already? Let me know when you are ready and I'll give
it a go.
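
For anyone who prefers to script such a check rather than use curl, here is a minimal Java sketch of the same kind of basic HTTP test. It assumes a Sling instance on http://localhost:8080 and a provider mounted under /content/cassandra (both taken from examples elsewhere in this thread); SlingSmokeCheck is a hypothetical name. Sling's default GET servlet normally renders a resource as JSON when the URL ends in .json.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Hypothetical smoke check: request a resource as JSON and report the HTTP status. */
final class SlingSmokeCheck {

    /** Build the .json URL for a resource path below the given base URL. */
    static String jsonUrl(String baseUrl, String resourcePath) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        String path = resourcePath.startsWith("/") ? resourcePath : "/" + resourcePath;
        return base + path + ".json";
    }

    /** Issue a GET and return the status code (200 is expected if the resource resolves). */
    static int status(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Assumed base URL and path; adjust both to the local setup.
        String url = jsonUrl("http://localhost:8080", "/content/cassandra/foo/bar");
        try {
            System.out.println(url + " -> HTTP " + status(url));
        } catch (IOException e) {
            System.out.println(url + " -> request failed: " + e.getMessage());
        }
    }
}
```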


> If you would like this approach (or unless please provide your feedback on
> what needs to be done before midterm), please advice me on how to approach
> on READ with access control. Can I do it and test it with keeping my code
> in Google Code still ?  And please add if I have missed anything.
>
> On Thu, Jul 4, 2013 at 11:13 PM, Dishara Wijewardana <
> ddwijewardana@gmail.com> wrote:
>
> > Hi Ian,
> > I have refactored almost all the code review changes requested. In the
> > process of the rest of the implementation.
> >
> > On Wed, Jul 3, 2013 at 12:52 PM, Bertrand Delacretaz <
> > bdelacretaz@apache.org> wrote:
> >
> >> Hi Dishara,
> >>
> >> On Tue, Jul 2, 2013 at 6:51 PM, Dishara Wijewardana
> >> <dd...@gmail.com> wrote:
> >> > ...Does each bundle in sling made to run their junit tests separately
> >> at build
> >> > time...
> >>
> >> Could you start new threads with new subject lines when you start a
> >> new question or discussion?
> >>
> >> Otherwise it's very hard to find things in our mailing list archives.
> >>
> >> OK. That's a good idea. I will follow that.
> >
> >
> >> Thanks, and keep up the good work!
> >> -Bertrand
> >>
> >
> >
> >
> > --
> > Thanks
> > /Dishara
> >
>
>
>
> --
> Thanks
> /Dishara
>

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Dishara Wijewardana <dd...@gmail.com>.
Hi Ian,
This is regarding the completion of the project's sub-tasks according to the
timeline.

I have my code locally, but I have yet to complete a few things (one
of them is the ongoing discussion on listChildren).
After that, from an API point of view, the implementation of the Cassandra
Resource Provider and Resource will be finished. I also have to add some more
JUnit tests (I already have some locally and will commit them once all of
this is done).

After that (after the midterm), what I have to do is enhance the provider
implementation to do READ operations with access control. My idea is to
finish that before the midterm as well. I got stuck on some OSGi
stuff over the last few days ;-). I will make sure I have JUnit tests that
cover all of the implementation before the midterm.

If you are happy with this approach (otherwise please give me your feedback on
what needs to be done before the midterm), please advise me on how to approach
READ with access control. Can I do it and test it while still keeping my code
in Google Code? And please add anything I have missed.

On Thu, Jul 4, 2013 at 11:13 PM, Dishara Wijewardana <
ddwijewardana@gmail.com> wrote:

> Hi Ian,
> I have refactored almost all the code review changes requested. In the
> process of the rest of the implementation.
>
> On Wed, Jul 3, 2013 at 12:52 PM, Bertrand Delacretaz <
> bdelacretaz@apache.org> wrote:
>
>> Hi Dishara,
>>
>> On Tue, Jul 2, 2013 at 6:51 PM, Dishara Wijewardana
>> <dd...@gmail.com> wrote:
>> > ...Does each bundle in sling made to run their junit tests separately
>> at build
>> > time...
>>
>> Could you start new threads with new subject lines when you start a
>> new question or discussion?
>>
>> Otherwise it's very hard to find things in our mailing list archives.
>>
>> OK. That's a good idea. I will follow that.
>
>
>> Thanks, and keep up the good work!
>> -Bertrand
>>



-- 
Thanks
/Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Dishara Wijewardana <dd...@gmail.com>.
Hi Ian,
I have made almost all of the changes requested in the code review and am
now working on the rest of the implementation.

On Wed, Jul 3, 2013 at 12:52 PM, Bertrand Delacretaz <bdelacretaz@apache.org> wrote:

> Hi Dishara,
>
> On Tue, Jul 2, 2013 at 6:51 PM, Dishara Wijewardana
> <dd...@gmail.com> wrote:
> > ...Does each bundle in sling made to run their junit tests separately at build
> > time...
>
> Could you start new threads with new subject lines when you start a
> new question or discussion?
>
> Otherwise it's very hard to find things in our mailing list archives.
>
OK. That's a good idea. I will follow that.


> Thanks, and keep up the good work!
> -Bertrand
>



-- 
Thanks
/Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Bertrand Delacretaz <bd...@apache.org>.
Hi Dishara,

On Tue, Jul 2, 2013 at 6:51 PM, Dishara Wijewardana
<dd...@gmail.com> wrote:
> ...Does each bundle in sling made to run their junit tests separately at build
> time...

Could you start new threads with new subject lines when you start a
new question or discussion?

Otherwise it's very hard to find things in our mailing list archives.

Thanks, and keep up the good work!
-Bertrand

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Ian Boston <ie...@tfd.co.uk>.
Hi Dishara,


On 3 July 2013 02:51, Dishara Wijewardana <dd...@gmail.com> wrote:

> Hi Ian,
> Does each bundle in sling made to run their junit tests separately at build
> time  ?


Yes.



> If so each pom should have configured to junit test cases.
> Where and how to define them ?
>

All you need to do is depend on the Sling parent, as you have done, include
JUnit 4, SLF4J Simple (and Mockito) as dependencies, and put your unit tests
in src/test/java. IIRC Maven's default build configuration runs unit tests.

Have a look at [1].

It has:

        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-simple</artifactId>
        </dependency>

Best Regards

Ian


[1] http://svn.apache.org/repos/asf/sling/trunk/bundles/commons/json/



>
>
> On Mon, Jul 1, 2013 at 5:32 AM, Ian Boston <ie...@tfd.co.uk> wrote:
>
> > Hi Dishara,
> >
> > I've taken the liberty of creating a code review at [1]. This covers all
> > commits. I've emailed you separately with the comments. I think it would be
> > good if we can get into the habit of looking at the code in this way, as it
> > often removes confusion introduced by the English language (which has many
> > compilers ;), mine has been known to be buggy at times.).
> >
> >
> > More comments inline below: (BTW, excellent progress!)
> >
> > Best Regards
> > Ian
> >
> >
> > [1] https://codereview.appspot.com/10811044/
> >
> >
> >
> > On 30 June 2013 22:52, Dishara Wijewardana <dd...@gmail.com>
> > wrote:
> >
> > > On Fri, Jun 28, 2013 at 4:37 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> > >
> > > > Hi,
> > > > Have you tried the TypeInferringSerializer for the value serializer?
> > > > It claims to detect what the column value is based on the byte array.
> > > >
> > > > Failing that, I would consider making everything byte[] and using your own
> > > > serializer that writes and reads values to a byte[] using DataInputStream
> > > > and DataOutputStream.
> > > >
> > > > [2] is an example of a serializer written for that purpose that was used
> > > > with Cassandra over raw Thrift. It's not easy to read what it outputs to the
> > > > storage layer, but it is compact and efficient. I would not use it directly
> > > > as it does some very specific things like slicing large byte[]s into 1MB
> > > > chunks and bypassing the 64K limit on reading and writing UTF8 strings with
> > > > DataInputStream.
> > > >
> > > > Try the TypeInferringSerializer first. If it works great, no need to do
> > > > anything more complex.
> > > >
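
The fallback mentioned above (everything stored as byte[], written and read with DataOutputStream and DataInputStream) could look roughly like the sketch below. It is an illustration under assumptions, not the serializer referenced as [2] and not part of Hector: ValueCodec and its type tags are hypothetical. Strings are written as length-prefixed UTF-8 rather than with writeUTF, because writeUTF rejects strings whose encoded form is longer than 65535 bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical codec: one type tag byte followed by the encoded value. */
final class ValueCodec {
    private static final byte TYPE_STRING = 1;
    private static final byte TYPE_LONG = 2;
    private static final byte TYPE_BOOLEAN = 3;

    static byte[] encode(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (value instanceof String) {
                // Length-prefixed UTF-8 avoids the 64K limit of writeUTF.
                byte[] utf8 = ((String) value).getBytes(StandardCharsets.UTF_8);
                out.writeByte(TYPE_STRING);
                out.writeInt(utf8.length);
                out.write(utf8);
            } else if (value instanceof Long) {
                out.writeByte(TYPE_LONG);
                out.writeLong((Long) value);
            } else if (value instanceof Boolean) {
                out.writeByte(TYPE_BOOLEAN);
                out.writeBoolean((Boolean) value);
            } else {
                throw new IllegalArgumentException("Unsupported type: " + value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            byte type = in.readByte();
            switch (type) {
                case TYPE_STRING:
                    byte[] utf8 = new byte[in.readInt()];
                    in.readFully(utf8);
                    return new String(utf8, StandardCharsets.UTF_8);
                case TYPE_LONG:
                    return in.readLong();
                case TYPE_BOOLEAN:
                    return in.readBoolean();
                default:
                    throw new IOException("Unknown type tag: " + type);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The type tag is what lets a reader recover the original Java type without a schema, which is the same job TypeInferringSerializer claims to do out of the box.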
> > >
> > > Hi,
> > > In fact I was able to add as many params as I wanted with the same
> > > configurations. But TypeInferringSerializer is a useful one too, which we
> > > might need in future.
> > > Also I was thinking, rather than storing resource metadata as String
> > > values, how about storing a serialized object as you mentioned?
> >
> >
> > I suspect that TypeInferringSerializer will do a better job of serializing
> > than the approach I mentioned. Only consider writing your own if there is
> > a real and demonstrated need for it.
> >
> >
> > > It will be
> > > clear. But I am not sure about the performance, because when we have
> > > multi-valued columns like metadata we have to insert them in a single String
> > > as comma-separated values. Is it scalable if we have a Bean for Cassandra
> > > Resource? What do you think?
> > >
> >
> > Put one property per column in Cassandra if possible. IIRC it does a good
> > job of serializing data, and doesn't need a pre-defined schema as
> > traditional RDBMSs do. The serialisation I mentioned was mostly used to
> > get schemaless storage into an RDBMS.
> >
> >
> >
> > >
> > > And I did a first cut of this, but with many TODOs ;-), where the getResource
> > > method is implemented and currently all the content is printed, but I have
> > > not implemented the methods in CassandraResource yet. This is just a POC to
> > > test whether the proposed model works. Apparently it works [1].
> >
> >
> > Yes, this is a great start! I didn't find too many issues with the approach,
> > as you will see from the comments on the code review.
> >
> >
> >
> >
> > > See the CassandraDataPopulator class, which is a plain Java test class added
> > > for the moment to test the POC. (I am moving this to a proper JUnit test.)
> > >
> >
> > Good.
> >
> >
> > >
> > > TODOs
> > > - I am in the process of finishing the implementation of CassandraResource,
> > > CassandraResourceProvider, etc. end to end.
> > > - Move to the JUnit test framework and write more tests for each scenario,
> > > which I can extend with Mockito in the near future (I am still not clear how
> > > Mockito comes into the picture).
> > >
> >
> > When you write the unit tests, if you find that you need to mock anything
> > (e.g. ResourceResolver) to make your unit tests work, don't write it by hand.
> > Use mocks. You can even mock concrete classes, so you could mock the behaviour
> > of the Hector API to respond in a pre-defined way to certain CQL queries. This
> > will eliminate the need to have a real Cassandra server present when doing the
> > basic unit tests.
> >
> >
> >
> >
> > > - Change the implementation based on the feedback from the community.
> > > - Parameterize the constants as much as possible to read from a property
> > > file.
> > >
> >
> > These should come from OSGi properties. See the comments on
> > CassandraResourceProvider.
> >
> >
> >
> >
> >
> >
> > >
> > >
> > > [1] -
> > > https://cassandra-backend-for-sling.googlecode.com/svn/trunk/main/cassandra
> > >
> > > Thanks
> > >
> >
> > Excellent progress, thank you!
> > Ian
> >
> >
> > >
> > > >
> > > >
> > > > Ian
> > > >
> > > > [1] http://hector-client.github.io/hector/source/content/API/core/0.8.0-2/me/prettyprint/cassandra/serializers/TypeInferringSerializer.html
> > > >
> > > > [2] https://github.com/ieb/sparsemapcontent/tree/master/core/src/main/java/org/sakaiproject/nakamura/lite/storage/spi/types
> > > >
> > > >
> > > > On 28 June 2013 05:14, Dishara Wijewardana <dd...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Ian,
> > > > > I am having a problem with CQL..
> > > > >
> > > > > For example:
> > > > > >         CqlQuery<String,String,Long> cqlQuery = new CqlQuery<String,String,Long>(
> > > > > >             keyspace, new StringSerializer(), new StringSerializer(), new LongSerializer());
> > > > > >         cqlQuery.setQuery("insert into mytable (KEY,password,gender,userid) values (3,'pass1','male',34);");
> > > > > >         QueryResult<CqlRows<String,String,Long>> result = cqlQuery.execute();
> > > > >
> > > > > > This will successfully insert the row with the values pass1, male and 34
> > > > > > under rowId=3.
> > > > > >
> > > > > > But in the Sling scenario we need more serializers for a query, as
> > > > > > follows, since we have more columns.
> > > > > > i.e.
> > > > > >         CqlQuery<String,String,String,String> cqlQuery = new CqlQuery<String,String,String,String>(
> > > > > >             keyspace, new StringSerializer(), new StringSerializer(),
> > > > > >             new StringSerializer(), new StringSerializer());
> > > > > >         cqlQuery.setQuery("insert into mytable (KEY,path,resourceType,resourceSuperType,metadata) values (3,'/content/cassandra/foo/bar','nt:cassandra','nt:super','metadata');");
> > > > > >         QueryResult<CqlRows<String,String,Long>> result = cqlQuery.execute();
> > > > > >
> > > > > > Here I am using the me.prettyprint.cassandra.model.CqlQuery class. Any idea
> > > > > > how to proceed with this?
> > > > > >
> > > > > > Am I doing something wrong, or is this a limitation of the API I am using?
> > > > >
> > > > >
> > > > > On Thu, Jun 27, 2013 at 7:41 AM, Dishara Wijewardana <
> > > > > ddwijewardana@gmail.com> wrote:
> > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Jun 27, 2013 at 4:26 AM, Ian Boston <ie...@tfd.co.uk>
> wrote:
> > > > > >
> > > > > >> On 27 June 2013 02:34, Dishara Wijewardana <
> > ddwijewardana@gmail.com
> > > >
> > > > > >> wrote:
> > > > > >>
> > > > > >> > On Tue, Jun 25, 2013 at 4:52 AM, Ian Boston <ie...@tfd.co.uk>
> > > wrote:
> > > > > >> >
> > > > > >> > > Hi,
> > > > > >> > >
> > > > > >> > > (I might have errors in the CQL, Cassandra schema and the
> > > > functions
> > > > > >> need
> > > > > >> > > proper escaping)
> > > > > >> > >
> > > > > >> > >
> > > > > > >> > > Example 1:
> > > > > > >> > > Zero depth tree with UUID as the rowid or key.
> > > > > > >> > >
> > > > > > >> > > URL /content/cassandra/pictures/13f58d5c95c70b6f
> > > > > > >> > >
> > > > > > >> > > then the column family is pictures and the URL -> ROWID function just
> > > > > > >> > > results in the ROWID being 13f58d5c95c70b6f and
> > > > > > >> > >
> > > > > > >> > > String cql = mapOfCassandraMappers.get("pictures").getCQL("pictures", "13f58d5c95c70b6f");
> > > > > > >> > > System.err.println(cql);
> > > > > >> > >
> > > > > >> > > where
> > > > > >> > > String getCQL(String cf, String path) {
> > > > > >> > >     return "select * from "+cf+" where rowid = '"+path+"'";
> > > > > >> > > }
> > > > > >> > >
> > > > > >> > > yields:
> > > > > >> > > select * from pictures where rowid = '13f58d5c95c70b6f'
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > 13f58d5c95c70b6f would be generated by the application when
> > the
> > > > user
> > > > > >> > > created a new picture (by upload).
> > > > > >> > >
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > Example 2:
> > > > > >> > > User specified
> > > > > >> > >
> > > > > >> > > URL
> > > > > >>
> > /content/cassandra/catalogue/capacitors/electrolytic/axial/16v/10uf
> > > > > >> > >
> > > > > >> > > String cql =
> > > > > >> mapOfCassandraMappers.get("catalogue").getCQL("catalogue", "
> > > > > >> > > capacitors/electrolytic/axial/16v/10uf")
> > > > > >> > > System.err.println(cql);
> > > > > >> > >
> > > > > >> > > where
> > > > > >> > > String getCQL(String cf, String path) {
> > > > > >> > >     MessageDigest md = MessageDigest.getInstance("SHA1");
> > > > > >> > >     // digest() hashes the bytes; MessageDigest has no finish() method
> > > > > >> > >     String rowID = Base64.encode(md.digest(path.getBytes("UTF-8")));
> > > > > >> > >     return "select * from "+cf+" where rowid = '"+rowID+"'";
> > > > > >> > > }
> > > > > >> > >
> > > > > >> > > yields
> > > > > >> > >
> > > > > >> > > select * from catalogue where rowid = 'NzdlZmU4OTZmNGM4MzMwYzZ'
> > > > > >> > >
> > > > > >> > > If you want to find the parent then
> > > > > >> > >
> > > > > >> > > mapOfCassandraMappers.get("catalogue")
> > > > > >> > >     .getCQL("catalogue", "capacitors/electrolytic/axial/16v")
> > > > > >> > >
> > > > > >> > > select * from catalogue where rowid = 'ZGFzZGZzZnNkYWZzYWRmc2R'
> > > > > >> > >
> > > > > >> > > And if the parent is stored in the property parent then
> > > > > >> > >
> > > > > >> > > select * from catalogue where parent = 'ZGFzZGZzZnNkYWZzYWRmc2R'
> > > > > >> > >
> > > > > >> > > will generate a list of children. (Not sure about
> performance)
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > Example 3:
> > > > > >> > > User is allowed to enter the RowID directly (identical to
> > > Example
> > > > 1
> > > > > >> > > URL
> > > > > >> > >
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > >
> > > >
> > >
> >
> /content/cassandra/cannesfilmfestival/TomCruiseCassino-20130402112345-ieb.jpg
> > > > > >> > >
> > > > > >> > > where
> > > > > >> > > String getCQL(String cf, String path) {
> > > > > >> > >     return "select * from "+cf+" where rowid = '"+path+"'";
> > > > > >> > > }
> > > > > >> > >
> > > > > >> > > yields:
> > > > > >> > > select * from pictures where rowid = '
> > > > > >> > > TomCruiseCassino-20130402112345-ieb.jpg'
> > > > > >> > >
> > > > > >> >
> > > > > >> > This should be corrected as
> > > > > >> > select * from cannesfilmfestival where rowid = '
> > > > > >> > TomCruiseCassino-20130402112345-ieb.jpg'
> > > > > >> >
> > > > > >> >
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > Does that make sense ?
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > > >> Hi
> > > > > >>
> > > > > >>
> > > > > >> > Hi Ian,
> > > > > >> > I was in fact practicing some cql stuff in related to this
> > > response
> > > > > >> (with
> > > > > >> > cassandra cql terminal). This is quite a wonderful explanation
> > > for a
> > > > > new
> > > > > >> > comer like me. Thank you very much for the explanation again.
> > Now
> > > it
> > > > > >> really
> > > > > >> > makes sense.
> > > > > >> >
> > > > > >>
> > > > > >> excellent!
> > > > > >>
> > > > > >>
> > > > > >> >
> > > > > >> > Other than the zero depth approach, I believe users will be
> more
> > > > > >> > comfortable with Example 2 approach.
> > > > > >> > Shall we go ahead with it ?
> > > > > >> >
> > > > > >>
> > > > > >>
> > > > > >> Yes, go for it. It will be interesting to see how hard it is to
> > > > > >> implement and how well (or not) it works. Remember, keep it as simple
> > > > > >> as possible and don't try to cover every use case at the expense of
> > > > > >> getting a PoC working.
> > > > > >>
> > > > > > +1.
> > > > > >
> > > > > >>
> > > > > >> However, dont forget, Unit tests mocked with Mockito are a
> quicker
> > > way
> > > > > of
> > > > > >> getting to working code, than no unit test coverage.
> > > > > >>
> > > > > >> Best Regards
> > > > > >> Ian
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> >
> > > > > >> >
> > > > > >> > > Ian
> > > > > >> > >
> > > > > >> > >
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > On 25 June 2013 05:29, Dishara Wijewardana <
> > > > ddwijewardana@gmail.com
> > > > > >
> > > > > >> > > wrote:
> > > > > >> > >
> > > > > >> > > > On Mon, Jun 24, 2013 at 4:02 AM, Ian Boston <
> ieb@tfd.co.uk>
> > > > > wrote:
> > > > > >> > > >
> > > > > >> > > > > Hi Dishara,
> > > > > >> > > > > Yes. 1 resource == 1 row.
> > > > > >> > > > > The columns within that row represent the properties of
> > the
> > > > > >> resource.
> > > > > >> > > > > I suggest that you use standard property names where
> > > > appropriate
> > > > > >> (eg
> > > > > >> > > > > sling:resourceType is the Resource.resourceType etc)
> > > > > >> > > > >
> > > > > >> > > > > The Resource itself should be adaptable to a generic
> > > > > >> > CassandraResource
> > > > > >> > > > > (which will probably implement Resource) which will
> have a
> > > map
> > > > > of
> > > > > >> > > > > properties containing all the columns of the cassandra
> > row.
> > > > > >> (optimise
> > > > > >> > > > > later) A CassandraResource might look and feel like a
> > > > > Map<String,
> > > > > >> > > Object>
> > > > > >> > > > > or it might have a Map<String, Object> getProperties()
> > > method,
> > > > > or
> > > > > >> > > better
> > > > > >> > > > > still be adaptable to a Map. The essential think is dont
> > > hard
> > > > > code
> > > > > >> > the
> > > > > >> > > > > property names in the interface of CassandraResource for
> > the
> > > > > >> moment.
> > > > > >> > ie
> > > > > >> > > > no
> > > > > >> > > > > getContentType() and no getMimeType(), as we dont really
> > > know
> > > > > >> what a
> > > > > >> > > > > CassandraResource will store.
> > > > > >> > > > >
> > > > > >> > > > > ResourceMetadata should be built from a subset of the
> > > > > >> > CassandraResource
> > > > > >> > > > > properties.
> > > > > >> > > > >
> > > > > >> > > > > You won't need to implement a ResourceResolver, only a
> > > > > >> > ResourceProvider
> > > > > >> > > > > (and Factory). I would use CQL in preference to other
> API
> > > > > methods.
> > > > > >> > > > >
> > > > > >> > > > > There is one thing that hasnt been mentioned, and thats
> > the
> > > > URL
> > > > > ->
> > > > > >> > > > > Cassandra Row mapping.
> > > > > >> > > > > There are several ways of doing this.
> > > > > >> > > > >
> > > > > >> > > > > eg:
> > > > > >> > > > > URL = /content/cassandra/<columnFamily>/<rowID>
> > > > > >> > > > >  Cassandra Column Family = columnFamily
> > > > > >> > > > >  Cassandra RowID = rowID
> > > > > >> > > > > or
> > > > > >> > > > > URL =
> > > > > >> /content/cassandra/<columnFamilySelector>/remainder/of/the/path
> > > > > >> > > > >  Cassandra  Cassandra Column Family =
> > > > > >> > > > > mapOfColumnFamilies.get(columnFamilySelector)
> > > > > >> > > > >  Cassandra  RowID = function(/remainder/of/the/path)
> > > > > >> > > > >
> > > > > >> > > > > or to take that one stage further
> > > > > >> > > > >
> > > > > >> > > > > public interface CassandraMapper {
> > > > > >> > > > >       String getCQL(String columnFamilySelector, String
> > > path);
> > > > > >> > > > > }
> > > > > >> > > > >
> > > > > >> > > > Hi Ian
> > > > > >> > > > Thank you for the detailed explanation.
> > > > > >> > > >
> > > > > >> > > > OK. +1 for this approach with the mentioned
> flexibility.But
> >  I
> > > > > need
> > > > > >> a
> > > > > >> > > small
> > > > > >> > > > clarification. With this approach,
> > > > > >> > > >
> > > > > >> > > > URL = /content/cassandra/<columnFamilySelector>ROW-ID
> > > > > >> > > > ROW-ID - function(/remainder/of/the/path).
> > > > > >> > > > So you mean ROW-ID is something we have to programatically
> > > > > uniquely
> > > > > >> > > create
> > > > > >> > > >  right ? like a UUID.
> > > > > >> > > >
> > > > > >> > > > What is this "/remainder/of/the/path" means ? Can you give
> > an
> > > > > >> example
> > > > > >> > > with
> > > > > >> > > > real values in the context of a user who want to obtain a
> > > > resource
> > > > > >> from
> > > > > >> > > > cassandra.
> > > > > >> > > > This is just for my understanding.
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > > >
> > > > > >> > > > > URL =
> > > > > /content/cassandra/<columnFamilySelector>/<remainderOfPath>
> > > > > >> > > > >
> > > > > >> > > > >  String cqlQuery =
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > >
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > >
> > > >
> > >
> >
> mapOfCassandraMappers.get(columnFamilySelector).getCQL(columnFamilySelector,
> > > > > >> > > > > remainderOfPath);
> > > > > >> > > > >
> > > > > >> > > > > Which would allow us provided one or more
> implementations
> > of
> > > > > >> > > > > CassandraMapper to map between URL and CQL.
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > > HTH
> > > > > >> > > > > Ian
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > > On 23 June 2013 19:29, Dishara Wijewardana <
> > > > > >> ddwijewardana@gmail.com>
> > > > > >> > > > > wrote:
> > > > > >> > > > >
> > > > > >> > > > > > Hi Ian,
> > > > > >> > > > > >
> > > > > >> > > > > > What is the data mapping should be between Cassandra
> and
> > > > Sling
> > > > > >> > > > resource.
> > > > > >> > > > > I
> > > > > >> > > > > > mean is a Sling Resource maps to a Cassandra Column ?
> Or
> > > > > Column
> > > > > >> > > Family
> > > > > >> > > > ?
> > > > > >> > > > > >
> > > > > >> > > > > > Because to get this Cassandra and Sling story correct
> we
> > > > need
> > > > > to
> > > > > >> > > > finalize
> > > > > >> > > > > > this.
> > > > > >> > > > > > For an example what we eventually returns is a Sling
> > > > resource.
> > > > > >> > > > Everything
> > > > > >> > > > > > that needs to fill in to create Sling resource should
> be
> > > > > stored
> > > > > >> in
> > > > > >> > > > > > Cassandra.
> > > > > >> > > > > > In a Sling resource,
> > > > > >> > > > > >
> > > > > >> > > > > >    - Path - direct sling resource path
> > > > > >> > > > > >    - ResourceType - nt:cassandra
> > > > > >> > > > > >    - ResourceSuperType - ?
> > > > > >> > > > > >    - ResourceMetadata - we can create this on the fly
> > with
> > > > the
> > > > > >> data
> > > > > >> > > > from
> > > > > >> > > > > >    the corresponding column. At insertion, those need
> to
> > > be
> > > > > >> stored.
> > > > > >> > > > > > Following
> > > > > >> > > > > >    are the ones which I thought might be useful by
> > default
> > > > to
> > > > > be
> > > > > >> > set
> > > > > >> > > > for
> > > > > >> > > > > > any
> > > > > >> > > > > >    node. Please add if we need anything more.
> > > > > >> > > > > >       - ContentType
> > > > > >> > > > > >       - ContentLength
> > > > > >> > > > > >       - CreationTime
> > > > > >> > > > > >       - ModificationTime
> > > > > >> > > > > >    - ResourceResolver -  Do we need a resolver in this
> > > case
> > > > ?
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > >  So I believe in CQL context, one ROW should
> represent a
> > > > Sling
> > > > > >> > > > resource.
> > > > > >> > > > > If
> > > > > >> > > > > > that is the case for ResourceMetadata we might need a
> > > > separate
> > > > > >> > column
> > > > > >> > > > to
> > > > > >> > > > > > store it since it has multiple values. I am not sure
> > > whether
> > > > > we
> > > > > >> can
> > > > > >> > > do
> > > > > >> > > > it
> > > > > >> > > > > > with CQL, but it should be possible with hector APIs
> may
> > > be.
> > > > > >> > > > > >
> > > > > >> > > > > > Appreciate your thoughts ?
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > > On Wed, Jun 19, 2013 at 1:19 AM, Dishara Wijewardana <
> > > > > >> > > > > > ddwijewardana@gmail.com> wrote:
> > > > > >> > > > > >
> > > > > >> > > > > > > Hi Ian,
> > > > > >> > > > > > > I am starting this thread to keep track on things
> > > related
> > > > to
> > > > > >> the
> > > > > >> > > GSoC
> > > > > >> > > > > > > project related milestone status updates and related
> > > > > >> discussions.
> > > > > >> > > > > > > So the first task over view will be as follows as
> per
> > > GSoC
> > > > > >> > proposal
> > > > > >> > > > > > > provided.
> > > > > >> > > > > > >
> > > > > >> > > > > > > 1. Implementing a CassandraResourceProvider  to READ
> > > from
> > > > > >> > > Cassandra.
> > > > > >> > > > > > > Implementation Details [1]
> > > > > >> > > > > > >
> > > > > >> > > > > > >
> > > > > >> > > > > > >
> > > > > >> > > > > > > [1] : Implementation Details:
> > > > > >> > > > > > >
> > > > > >> > > > > > >  1.A) Write a CassanrdaResourceProviderUtil  which
> is
> > > > > >> basically a
> > > > > >> > > > > > > cassendra client which will facilitate all cassandra
> > > > related
> > > > > >> > > > operations
> > > > > >> > > > > > > required by other modules (CassandraResourceProvider
> > and
> > > > > >> > > > > > > CassandraResourceResolver).
> > > > > >> > > > > > >
> > > > > >> > > > > > > 1.B) Implementation of  CassandraResourceProvider
> > > > > >> > > > > > >
> > > > > >> > > > > > > 1.C)  Implementation of CassandraResourceResolver
> > > > > >> > > > > > >
> > > > > >> > > > > > > 1.D) Implementation of CassandraResource
> > > > > >> > > > > > >
> > > > > >> > > > > > >
> > > > > >> > > > > > > And I will start writing the
> > > CassanrdaResourceProviderUtil
> > > > > >> class
> > > > > >> > > > which
> > > > > >> > > > > > > will do basic add and get using hector API. Please
> > > provide
> > > > > any
> > > > > >> > > > feedback
> > > > > >> > > > > > > that will be useful to accomplish this task.
> > > > > >> > > > > > > So for this how does path mapping should be done.
> > > Because
> > > > > for
> > > > > >> > > > example,
> > > > > >> > > > > > the
> > > > > >> > > > > > > path of the cassendra node will not be same as the
> jcr
> > > > node
> > > > > >> path.
> > > > > >> > > i.e
> > > > > >> > > > > > > provider will ask a node path
> /system/myapps/test/foo
> > > and
> > > > > >> where
> > > > > >> > > > should
> > > > > >> > > > > we
> > > > > >> > > > > > > return it from Cassandra. Aren't we have to first
> > > consider
> > > > > the
> > > > > >> > > WRITE
> > > > > >> > > > > > aspect
> > > > > >> > > > > > > to Cassandra ?
> > > > > >> > > > > > >
> > > > > >> > > > > > >
> > > > > >> > > > > > > --
> > > > > >> > > > > > > Thanks
> > > > > >> > > > > > > /Dishara
> > > > > >> > > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > > --
> > > > > >> > > > > > Thanks
> > > > > >> > > > > > /Dishara
> > > > > >> > > > > >
> > > > > >> > > > >
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > > --
> > > > > >> > > > Thanks
> > > > > >> > > > /Dishara
> > > > > >> > > >
> > > > > >> > >
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> > --
> > > > > >> > Thanks
> > > > > >> > /Dishara
> > > > > >> >
> > > > > >>
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Thanks
> > > > > > /Dishara
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Thanks
> > > > > /Dishara
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Thanks
> > > /Dishara
> > >
> >
>
>
>
> --
> Thanks
> /Dishara
>

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Dishara Wijewardana <dd...@gmail.com>.
Hi Ian,
Is each bundle in Sling set up to run its own JUnit tests separately at build
time ? If so, each pom should be configured to run the JUnit test cases.
Where and how should they be defined ?


On Mon, Jul 1, 2013 at 5:32 AM, Ian Boston <ie...@tfd.co.uk> wrote:

> Hi Dishara,
>
> I've taken the liberty of creating a code review at [1]. This is all
> commits. I've emailed you separately with the comments. I think it would be
> good if we can get into the habit of looking at the code in this way as it
> often removes confusion introduced by the English language (which has many
> compilers ;), mine has been known to be buggy at times.).
>
>
> More comments inline below: (BTW, excellent progress!)
>
> Best Regards
> Ian
>
>
> 1 https://codereview.appspot.com/10811044/
>
>
>
> On 30 June 2013 22:52, Dishara Wijewardana <dd...@gmail.com>
> wrote:
>
> > On Fri, Jun 28, 2013 at 4:37 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> >
> > > Hi,
> > > Have you tried the TypeInferringSerializer for the value serializer ?
> > > That claims to detect what the column value is based on the byte array.
> > >
> > > Failing that, I would consider making everything byte[] and using your
> > > own serializer that writes and reads values to and from a byte[] using
> > > DataInputStream and DataOutputStream.
> > >
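
A minimal sketch of the byte[] approach described above, for illustration only. The class name PropertyValueCodec and the one-byte type tags are assumptions made here; they are not taken from Hector or from the serializer linked as [2], and only String and Long values are handled.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper: stores a String or Long as a type tag followed by the value. */
final class PropertyValueCodec {
    private static final byte TYPE_STRING = 1;
    private static final byte TYPE_LONG = 2;

    static byte[] encode(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (value instanceof String) {
                out.writeByte(TYPE_STRING);
                // writeUTF rejects strings whose encoded form exceeds 65535 bytes
                out.writeUTF((String) value);
            } else if (value instanceof Long) {
                out.writeByte(TYPE_LONG);
                out.writeLong((Long) value);
            } else {
                throw new IllegalArgumentException("Unsupported type: " + value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            byte type = in.readByte();
            switch (type) {
                case TYPE_STRING:
                    return in.readUTF();
                case TYPE_LONG:
                    return in.readLong();
                default:
                    throw new IOException("Unknown type tag: " + type);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling decode(encode(value)) should give back an equal String or Long. The writeUTF size limit is the 64K restriction mentioned above, which is one reason to try TypeInferringSerializer first.
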
> > > [2] is an example of a serializer written for that purpose that was used
> > > with Cassandra over raw Thrift. It's not easy to read what it outputs to
> > > the storage layer, but it is compact and efficient. I would not use it
> > > directly as it does some very specific things like slicing large byte[]s
> > > into 1MB chunks and bypassing the 64K limit on reading and writing UTF8
> > > strings with DataInputStream.
> > >
> > > Try the TypeInferringSerializer first. If it works great, no need to do
> > > anything more complex.
> > >
> >
> > Hi,
> > In fact I was able to add as many params as I wanted with the same
> > configurations. But TypeInferringSerializer is a useful one too, which we
> > might need in future.
> > Also I was thinking rather than storing resource metadata as String
> > values, how about storing a serialized object as you mentioned ?
>
>
> I suspect that TypeInferringSerializer will do a better job of serializing
> than the approach I mentioned. Only consider writing your own, if there is
> a real and demonstrated need for it.
>
>
> > It will be clear. But I am not sure about the performance, because when
> > we have multi-valued columns like metadata we have to insert them in a
> > single String as comma-separated values. Is it scalable if we have a Bean
> > for Cassandra Resource ? What do you think ?
> >
>
> Put one property per column in Cassandra if possible. IIRC it does a good
> job of serializing data, and doesn't need a pre-defined schema as
> traditional RDBMSs do. The serialisation I mentioned was mostly used to
> get schemaless storage into an RDBMS.
>
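
To illustrate "one property per column", here is a rough sketch that turns a property map into a CQL insert in the same shape as the statements quoted further down this thread. InsertCqlBuilder is a hypothetical name; the sketch assumes column names are plain identifiers and only escapes single quotes in values.

```java
import java.util.Map;

/** Hypothetical helper: builds one CQL insert with one column per property. */
final class InsertCqlBuilder {

    /** Doubles single quotes so the value can sit inside a CQL string literal. */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Column names are used as-is; this sketch assumes they are simple identifiers. */
    static String buildInsert(String columnFamily, String rowId, Map<String, String> properties) {
        StringBuilder columns = new StringBuilder("KEY");
        StringBuilder values = new StringBuilder(quote(rowId));
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            columns.append(',').append(entry.getKey());
            values.append(',').append(quote(entry.getValue()));
        }
        return "insert into " + columnFamily + " (" + columns + ") values (" + values + ");";
    }
}
```

Each property stays individually readable this way, instead of being packed into one comma-separated String that has to be parsed again on every read.
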
>
>
> >
> > And I did a first cut of this, but with many TODOs ;-), where the
> > getResource method is implemented and currently all the content is
> > printed, but I have not implemented the methods in CassandraResource yet.
> > This is just a POC to test whether the proposed model works. Apparently
> > it works [1].
>
>
> Yes, this is a great start! I didn't find too many issues with the approach,
> as you will see from the comments on the code review.
>
>
>
>
> >  See the CassandraDataPopulator class, which is a plain Java test class
> > added for the moment to test the POC. (I am moving this to a proper JUnit
> > test.)
> >
>
> Good.
>
>
> >
> > TODOs
> > - I am in the process of finishing the implementation of CassandraResource,
> > CassandraResourceProvider etc. end to end.
> > - Move to the JUnit test framework and write more tests for each scenario,
> > which I can extend with Mockito in the near future (I am still not clear
> > how Mockito comes into the picture).
> >
>
> When you write the unit tests, if you find that you need to stand in for
> anything (e.g. ResourceResolver) to make your unit tests work, don't
> implement it by hand. Use mocks. You can even mock concrete classes, so you
> could mock the behaviour of the Hector API to respond in a pre-defined way
> to certain CQL queries. This will eliminate the need to have a real
> Cassandra server present when doing the basic unit tests.
>
>
>
>
> > - Change the implementation based on the feedback from the community.
> > - Parameterize the constants as much as possible to read from a property
> > file.
> >
>
> These should come from OSGi properties. See the comments on
> CassandraResourceProvider.
>
>
>
>
>
>
> >
> >
> > [1] - https://cassandra-backend-for-sling.googlecode.com/svn/trunk/main/cassandra
> >
> > Thanks
> >
>
> Excellent progress, thank you!
> Ian
>
>
> >
> > >
> > >
> > > Ian
> > >
> > > 1 http://hector-client.github.io/hector/source/content/API/core/0.8.0-2/me/prettyprint/cassandra/serializers/TypeInferringSerializer.html
> > >
> > > 2 https://github.com/ieb/sparsemapcontent/tree/master/core/src/main/java/org/sakaiproject/nakamura/lite/storage/spi/types
> > >
> > >
> > > On 28 June 2013 05:14, Dishara Wijewardana <dd...@gmail.com>
> > > wrote:
> > >
> > > > Hi Ian,
> > > > I am having a problem with CQL..
> > > >
> > > > For example:
> > > >         CqlQuery<String,String,Long> cqlQuery = new
> > > > CqlQuery<String,String,Long>(keyspace, new StringSerializer(), new
> > > > StringSerializer(), new LongSerializer());
> > > >         cqlQuery.setQuery("insert into mytable (KEY,password,gender,userid)
> > > > values (3,'pass1','male',34);");
> > > >         QueryResult<CqlRows<String,String,Long>> result =
> > > > cqlQuery.execute();
> > > >
> > > > This will successfully insert the row with pass1,male and 34 values
> > under
> > > > rowId=3.
> > > >
> > > > But in the Sling scenario we have more columns, so we need more
> > > > serializers for a query, as follows:
> > > >         CqlQuery<String,String,String,String> cqlQuery = new
> > > > CqlQuery<String,String,String,String>(keyspace, new StringSerializer(),
> > > > new StringSerializer(), new StringSerializer(), new StringSerializer());
> > > >         cqlQuery.setQuery("insert into mytable
> > > > (KEY,path,resourceType,resourceSuperType,metadata) values
> > > > (3,'/content/cassandra/foo/bar','nt:cassandra','nt:super','metadata');");
> > > >         QueryResult<CqlRows<String,String,Long>> result =
> > > > cqlQuery.execute();
> > > >
> > > > Here I am using me.prettyprint.cassandra.model.CqlQuery class. Any idea
> > > > how to proceed with this?
> > > >
> > > > Am I doing something wrong or is this a limitation of the API I am
> > > > using ?
> > > >
> > > >
> > > > On Thu, Jun 27, 2013 at 7:41 AM, Dishara Wijewardana <
> > > > ddwijewardana@gmail.com> wrote:
> > > >
> > > > >
> > > > >
> > > > > On Thu, Jun 27, 2013 at 4:26 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> > > > >
> > > > >> On 27 June 2013 02:34, Dishara Wijewardana <
> ddwijewardana@gmail.com
> > >
> > > > >> wrote:
> > > > >>
> > > > >> > On Tue, Jun 25, 2013 at 4:52 AM, Ian Boston <ie...@tfd.co.uk>
> > wrote:
> > > > >> >
> > > > >> > > Hi,
> > > > >> > >
> > > > >> > > (I might have errors in the CQL, Cassandra schema and the
> > > functions
> > > > >> need
> > > > >> > > proper escaping)
> > > > >> > >
> > > > >> > >
> > > > >> > > Example 1:
> > > > >> > > Zero depth tree with UUID as the rowid or key.
> > > > >> > >
> > > > >> > > URL /content/cassandra/pictures/13f58d5c95c70b6f
> > > > >> > >
> > > > >> > > then the column family is pictures and the URL -> ROWID function
> > > > >> > > just results in the ROWID being 13f58d5c95c70b6f and
> > > > >> > >
> > > > >> > > String cql = mapOfCassandraMappers.get("pictures")
> > > > >> > >     .getCQL("pictures", "13f58d5c95c70b6f");
> > > > >> > > System.err.println(cql);
> > > > >> > >
> > > > >> > > where
> > > > >> > > String getCQL(String cf, String path) {
> > > > >> > >     return "select * from "+cf+" where rowid = '"+path+"'";
> > > > >> > > }
> > > > >> > >
> > > > >> > > yields:
> > > > >> > > select * from pictures where rowid = '13f58d5c95c70b6f'
> > > > >> > >
> > > > >> > >
> > > > >> > > 13f58d5c95c70b6f would be generated by the application when the
> > > > >> > > user created a new picture (by upload).
> > > > >> > >
> > > > >> > >
> > > > >> > >
> > > > >> > > Example 2:
> > > > >> > > User specified
> > > > >> > >
> > > > >> > > URL /content/cassandra/catalogue/capacitors/electrolytic/axial/16v/10uf
> > > > >> > >
> > > > >> > > String cql = mapOfCassandraMappers.get("catalogue")
> > > > >> > >     .getCQL("catalogue", "capacitors/electrolytic/axial/16v/10uf");
> > > > >> > > System.err.println(cql);
> > > > >> > >
> > > > >> > > where
> > > > >> > > String getCQL(String cf, String path) {
> > > > >> > >     MessageDigest md = MessageDigest.getInstance("SHA1");
> > > > >> > >     // digest() hashes the bytes; MessageDigest has no finish() method
> > > > >> > >     String rowID = Base64.encode(md.digest(path.getBytes("UTF-8")));
> > > > >> > >     return "select * from "+cf+" where rowid = '"+rowID+"'";
> > > > >> > > }
> > > > >> > >
> > > > >> > > yields
> > > > >> > >
> > > > >> > > select * from catalogue where rowid = 'NzdlZmU4OTZmNGM4MzMwYzZ'
> > > > >> > >
> > > > >> > > If you want to find the parent then
> > > > >> > >
> > > > >> > > mapOfCassandraMappers.get("catalogue")
> > > > >> > >     .getCQL("catalogue", "capacitors/electrolytic/axial/16v")
> > > > >> > >
> > > > >> > > select * from catalogue where rowid = 'ZGFzZGZzZnNkYWZzYWRmc2R'
> > > > >> > >
> > > > >> > > And if the parent is stored in the property parent then
> > > > >> > >
> > > > >> > > select * from catalogue where parent = 'ZGFzZGZzZnNkYWZzYWRmc2R'
> > > > >> > >
> > > > >> > > will generate a list of children. (Not sure about performance)
> > > > >> > >
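
A runnable sketch of the Example 2 mapping, written under some assumptions: it uses java.util.Base64 (Java 8 or later) in its URL-safe, unpadded form, which is a choice made here and not something the thread specifies, and the names HashedPathMapper, parentPath and childrenCQL are hypothetical. The row IDs shown in the quoted example are placeholders, so this code will not reproduce them.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

/** Hypothetical mapper: the row ID is the Base64 form of the SHA-1 of the path. */
final class HashedPathMapper {

    static String rowId(String path) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(path.getBytes(StandardCharsets.UTF_8));
            // The URL-safe alphabet has no quote characters, so the ID is safe in a CQL literal
            return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    /** Drops the last path segment; returns "" for a single-segment path. */
    static String parentPath(String path) {
        int lastSlash = path.lastIndexOf('/');
        return lastSlash < 0 ? "" : path.substring(0, lastSlash);
    }

    static String getCQL(String cf, String path) {
        return "select * from " + cf + " where rowid = '" + rowId(path) + "'";
    }

    /** Assumes each row stores its parent's row ID in a column named parent. */
    static String childrenCQL(String cf, String parentPath) {
        return "select * from " + cf + " where parent = '" + rowId(parentPath) + "'";
    }
}
```

Because the row ID is a hash, the parent cannot be derived from a child's row ID alone; it has to be recomputed from the parent path, which is why the sketch keeps the path handling and the hashing separate.
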
> > > > >> > >
> > > > >> > > Example 3:
> > > > >> > > User is allowed to enter the RowID directly (identical to
> > > > >> > > Example 1)
> > > > >> > > URL /content/cassandra/cannesfilmfestival/TomCruiseCassino-20130402112345-ieb.jpg
> > > > >> > >
> > > > >> > > where
> > > > >> > > String getCQL(String cf, String path) {
> > > > >> > >     return "select * from "+cf+" where rowid = '"+path+"'";
> > > > >> > > }
> > > > >> > >
> > > > >> > > yields:
> > > > >> > > select * from pictures where rowid = '
> > > > >> > > TomCruiseCassino-20130402112345-ieb.jpg'
> > > > >> > >
> > > > >> >
> > > > >> > This should be corrected as
> > > > >> > select * from cannesfilmfestival where rowid = '
> > > > >> > TomCruiseCassino-20130402112345-ieb.jpg'
> > > > >> >
> > > > >> >
> > > > >> > >
> > > > >> > >
> > > > >> > > Does that make sense ?
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >> Hi
> > > > >>
> > > > >>
> > > > >> > Hi Ian,
> > > > >> > I was in fact practicing some CQL related to this response (with
> > > > >> > the Cassandra CQL terminal). This is quite a wonderful explanation
> > > > >> > for a newcomer like me. Thank you very much for the explanation
> > > > >> > again. Now it really makes sense.
> > > > >> >
> > > > >>
> > > > >> excellent!
> > > > >>
> > > > >>
> > > > >> >
> > > > >> > Other than the zero depth approach, I believe users will be more
> > > > >> > comfortable with Example 2 approach.
> > > > >> > Shall we go ahead with it ?
> > > > >> >
> > > > >>
> > > > >>
> > > > >> Yes, go for it. It will be interesting to see how hard it is to
> > > > >> implement and how well (or not) it works. Remember, keep it as simple
> > > > >> as possible and don't try to cover every use case at the expense of
> > > > >> getting a PoC working.
> > > > > +1.
> > > > >
> > > > >>
> > > > >> However, don't forget, unit tests mocked with Mockito are a quicker
> > > > >> way of getting to working code than no unit test coverage.
> > > > >>
> > > > >> Best Regards
> > > > >> Ian
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >> >
> > > > >> >
> > > > >> > > Ian
> > > > >> > >
> > > > >> > >
> > > > >> > >
> > > > >> > >
> > > > >> > > On 25 June 2013 05:29, Dishara Wijewardana <
> > > ddwijewardana@gmail.com
> > > > >
> > > > >> > > wrote:
> > > > >> > >
> > > > >> > > > On Mon, Jun 24, 2013 at 4:02 AM, Ian Boston <ie...@tfd.co.uk>
> > > > wrote:
> > > > >> > > >
> > > > >> > > > > Hi Dishara,
> > > > >> > > > > Yes. 1 resource == 1 row.
> > > > >> > > > > The columns within that row represent the properties of the
> > > > >> > > > > resource.
> > > > >> > > > > I suggest that you use standard property names where appropriate
> > > > >> > > > > (e.g. sling:resourceType is the Resource.resourceType etc.)
> > > > >> > > > >
> > > > >> > > > > The Resource itself should be adaptable to a generic
> > > > >> > > > > CassandraResource (which will probably implement Resource) which
> > > > >> > > > > will have a map of properties containing all the columns of the
> > > > >> > > > > Cassandra row (optimise later). A CassandraResource might look and
> > > > >> > > > > feel like a Map<String, Object>, or it might have a
> > > > >> > > > > Map<String, Object> getProperties() method, or better still be
> > > > >> > > > > adaptable to a Map. The essential thing is: don't hard code the
> > > > >> > > > > property names in the interface of CassandraResource for the
> > > > >> > > > > moment, i.e. no getContentType() and no getMimeType(), as we don't
> > > > >> > > > > really know what a CassandraResource will store.
> > > > >> > > > >
> > > > >> > > > > ResourceMetadata should be built from a subset of the
> > > > >> > > > > CassandraResource properties.
> > > > >> > > > >
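
A small sketch of the generic property map idea, assuming nothing about the Sling API: CassandraRowProperties is a hypothetical plain class, not the real CassandraResource, and it does not implement Resource. It only shows a row exposed as Map<String, Object> with no hard-coded property getters, plus a subset that could feed ResourceMetadata.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for the columns of one Cassandra row. */
final class CassandraRowProperties {
    private final Map<String, Object> properties;

    CassandraRowProperties(Map<String, Object> columns) {
        // Defensive copy so later changes to the caller's map do not leak in
        this.properties = Collections.unmodifiableMap(new LinkedHashMap<String, Object>(columns));
    }

    /** All columns of the row, keyed by column name. */
    Map<String, Object> getProperties() {
        return properties;
    }

    /** The named properties that are present; missing names are skipped. */
    Map<String, Object> subset(String... names) {
        Map<String, Object> result = new LinkedHashMap<String, Object>();
        for (String name : names) {
            if (properties.containsKey(name)) {
                result.put(name, properties.get(name));
            }
        }
        return result;
    }
}
```

For example, subset("ContentType", "ContentLength") returns only those of the two columns that the row actually has, so the caller decides which names count as metadata and the class itself knows none of them.
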
> > > > >> > > > > You won't need to implement a ResourceResolver, only a
> > > > >> > > > > ResourceProvider (and Factory). I would use CQL in preference to
> > > > >> > > > > other API methods.
> > > > >> > > > >
> > > > >> > > > > There is one thing that hasn't been mentioned, and that's the URL
> > > > >> > > > > -> Cassandra Row mapping.
> > > > >> > > > > There are several ways of doing this.
> > > > >> > > > >
> > > > >> > > > > eg:
> > > > >> > > > > URL = /content/cassandra/<columnFamily>/<rowID>
> > > > >> > > > >  Cassandra Column Family = columnFamily
> > > > >> > > > >  Cassandra RowID = rowID
> > > > >> > > > > or
> > > > >> > > > > URL = /content/cassandra/<columnFamilySelector>/remainder/of/the/path
> > > > >> > > > >  Cassandra Column Family =
> > > > >> > > > > mapOfColumnFamilies.get(columnFamilySelector)
> > > > >> > > > >  Cassandra RowID = function(/remainder/of/the/path)
> > > > >> > > > >
> > > > >> > > > > or to take that one stage further
> > > > >> > > > >
> > > > >> > > > > public interface CassandraMapper {
> > > > >> > > > >       String getCQL(String columnFamilySelector, String path);
> > > > >> > > > > }
> > > > >> > > > >
> > > > >> > > > Hi Ian
> > > > >> > > > Thank you for the detailed explanation.
> > > > >> > > >
> > > > >> > > > OK. +1 for this approach with the mentioned flexibility. But I need
> > > > >> > > > a small clarification. With this approach,
> > > > >> > > >
> > > > >> > > > URL = /content/cassandra/<columnFamilySelector>ROW-ID
> > > > >> > > > ROW-ID - function(/remainder/of/the/path).
> > > > >> > > > So you mean ROW-ID is something we have to create programmatically
> > > > >> > > > and uniquely, right ? Like a UUID.
> > > > >> > > >
> > > > >> > > > What does this "/remainder/of/the/path" mean ? Can you give an
> > > > >> > > > example with real values in the context of a user who wants to
> > > > >> > > > obtain a resource from Cassandra ?
> > > > >> > > > This is just for my understanding.
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > >
> > > > >> > > > > URL =
> > > > /content/cassandra/<columnFamilySelector>/<remainderOfPath>
> > > > >> > > > >
> > > > >> > > > >  String cqlQuery =
> > > > >> > > > > mapOfCassandraMappers.get(columnFamilySelector).getCQL(columnFamilySelector,
> > > > >> > > > > remainderOfPath);
> > > > >> > > > >
> > > > >> > > > > Which would allow us provided one or more implementations
> of
> > > > >> > > > > CassandraMapper to map between URL and CQL.
> > > > >> > > > >
> > > > >> > > > >
> > > > >> > > > > HTH
> > > > >> > > > > Ian
> > > > >> > > > >
> > > > >> > > > > On 23 June 2013 19:29, Dishara Wijewardana <
> > > > >> ddwijewardana@gmail.com>
> > > > >> > > > > wrote:
> > > > >> > > > >
> > > > >> > > > > > Hi Ian,
> > > > >> > > > > >
> > > > >> > > > > > What is the data mapping should be between Cassandra and
> > > Sling
> > > > >> > > > resource.
> > > > >> > > > > I
> > > > >> > > > > > mean is a Sling Resource maps to a Cassandra Column ? Or
> > > > Column
> > > > >> > > Family
> > > > >> > > > ?
> > > > >> > > > > >
> > > > >> > > > > > Because to get this Cassandra and Sling story correct we
> > > need
> > > > to
> > > > >> > > > finalize
> > > > >> > > > > > this.
> > > > >> > > > > > For an example what we eventually returns is a Sling
> > > resource.
> > > > >> > > > Everything
> > > > >> > > > > > that needs to fill in to create Sling resource should be
> > > > stored
> > > > >> in
> > > > >> > > > > > Cassandra.
> > > > >> > > > > > In a Sling resource,
> > > > >> > > > > >
> > > > >> > > > > >    - Path - direct sling resource path
> > > > >> > > > > >    - ResourceType - nt:cassandra
> > > > >> > > > > >    - ResourceSuperType - ?
> > > > >> > > > > >    - ResourceMetadata - we can create this on the fly
> with
> > > the
> > > > >> data
> > > > >> > > > from
> > > > >> > > > > >    the corresponding column. At insertion, those need to
> > be
> > > > >> stored.
> > > > >> > > > > > Following
> > > > >> > > > > >    are the ones which I thought might be useful by
> default
> > > to
> > > > be
> > > > >> > set
> > > > >> > > > for
> > > > >> > > > > > any
> > > > >> > > > > >    node. Please add if we need anything more.
> > > > >> > > > > >       - ContentType
> > > > >> > > > > >       - ContentLength
> > > > >> > > > > >       - CreationTime
> > > > >> > > > > >       - ModificationTime
> > > > >> > > > > >    - ResourceResolver -  Do we need a resolver in this
> > case
> > > ?
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > >  So I believe in CQL context, one ROW should represent a
> > > Sling
> > > > >> > > > resource.
> > > > >> > > > > If
> > > > >> > > > > > that is the case for ResourceMetadata we might need a
> > > separate
> > > > >> > column
> > > > >> > > > to
> > > > >> > > > > > store it since it has multiple values. I am not sure
> > whether
> > > > we
> > > > >> can
> > > > >> > > do
> > > > >> > > > it
> > > > >> > > > > > with CQL, but it should be possible with hector APIs may
> > be.
> > > > >> > > > > >
> > > > >> > > > > > Appreciate your thoughts ?
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > > On Wed, Jun 19, 2013 at 1:19 AM, Dishara Wijewardana <
> > > > >> > > > > > ddwijewardana@gmail.com> wrote:
> > > > >> > > > > >
> > > > >> > > > > > > Hi Ian,
> > > > >> > > > > > > I am starting this thread to keep track on things
> > related
> > > to
> > > > >> the
> > > > >> > > GSoC
> > > > >> > > > > > > project related milestone status updates and related
> > > > >> discussions.
> > > > >> > > > > > > So the first task over view will be as follows as per
> > GSoC
> > > > >> > proposal
> > > > >> > > > > > > provided.
> > > > >> > > > > > >
> > > > >> > > > > > > 1. Implementing a CassandraResourceProvider  to READ
> > from
> > > > >> > > Cassandra.
> > > > >> > > > > > > Implementation Details [1]
> > > > >> > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > > > [1] : Implementation Details:
> > > > >> > > > > > >
> > > > >> > > > > > >  1.A) Write a CassanrdaResourceProviderUtil  which is
> > > > >> basically a
> > > > >> > > > > > > cassendra client which will facilitate all cassandra
> > > related
> > > > >> > > > operations
> > > > >> > > > > > > required by other modules (CassandraResourceProvider
> and
> > > > >> > > > > > > CassandraResourceResolver).
> > > > >> > > > > > >
> > > > >> > > > > > > 1.B) Implementation of  CassandraResourceProvider
> > > > >> > > > > > >
> > > > >> > > > > > > 1.C)  Implementation of CassandraResourceResolver
> > > > >> > > > > > >
> > > > >> > > > > > > 1.D) Implementation of CassandraResource
> > > > >> > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > > > And I will start writing the
> > CassanrdaResourceProviderUtil
> > > > >> class
> > > > >> > > > which
> > > > >> > > > > > > will do basic add and get using hector API. Please
> > provide
> > > > any
> > > > >> > > > feedback
> > > > >> > > > > > > that will be useful to accomplish this task.
> > > > >> > > > > > > So for this how does path mapping should be done.
> > Because
> > > > for
> > > > >> > > > example,
> > > > >> > > > > > the
> > > > >> > > > > > > path of the cassendra node will not be same as the jcr
> > > node
> > > > >> path.
> > > > >> > > i.e
> > > > >> > > > > > > provider will ask a node path /system/myapps/test/foo
> > and
> > > > >> where
> > > > >> > > > should
> > > > >> > > > > we
> > > > >> > > > > > > return it from Cassandra. Aren't we have to first
> > consider
> > > > the
> > > > >> > > WRITE
> > > > >> > > > > > aspect
> > > > >> > > > > > > to Cassandra ?
> > > > >> > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > > > --
> > > > >> > > > > > > Thanks
> > > > >> > > > > > > /Dishara



-- 
Thanks
/Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Ian Boston <ie...@tfd.co.uk>.
Hi Dishara,

I've taken the liberty of creating a code review at [1]. It covers all
commits. I've emailed you separately with the comments. I think it would be
good if we can get into the habit of looking at the code in this way, as it
often removes confusion introduced by the English language (which has many
compilers ;), and mine has been known to be buggy at times).


More comments inline below: (BTW, excellent progress!)

Best Regards
Ian


[1] https://codereview.appspot.com/10811044/



On 30 June 2013 22:52, Dishara Wijewardana <dd...@gmail.com> wrote:

> On Fri, Jun 28, 2013 at 4:37 AM, Ian Boston <ie...@tfd.co.uk> wrote:
>
> > Hi,
> > Have you tried the TypeInferringSerializer for the value serializer?
> > It claims to detect what the column value is based on the byte array.
> >
> > Failing that, I would consider making everything byte[] and using your own
> > serializer that writes and reads values to a byte[] using DataInputStream
> > and DataOutputStream.
> >
> > [2] is an example of a serializer written for that purpose that was used
> > with Cassandra over raw Thrift. It's not easy to read what it outputs to the
> > storage layer, but it is compact and efficient. I would not use it directly,
> > as it does some very specific things like slicing large byte[]s into 1MB
> > chunks and bypassing the 64K limit on reading and writing UTF8 strings with
> > DataInputStream.
> >
> > Try the TypeInferringSerializer first. If it works, great, no need to do
> > anything more complex.
> >
>
> Hi,
> In fact I was able to add as many params as I wanted with the same
> configuration. But TypeInferringSerializer is a useful one too, which we might
> need in the future.
> Also, rather than storing resource metadata as String values, how about
> storing a serialized object as you mentioned?


I suspect that TypeInferringSerializer will do a better job of serializing
than the approach I mentioned. Only consider writing your own, if there is
a real and demonstrated need for it.
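
For illustration only, the byte[] fallback mentioned above could be sketched
roughly as follows. This is a minimal sketch using only java.io: the class
name TaggedValueCodec and the one-byte type tags are assumptions made up for
this example, not part of Hector, Sling, or the serializer linked at [2].

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper: encodes a value as a one-byte type tag followed by the value.
final class TaggedValueCodec {
    private static final byte TYPE_STRING = 1;
    private static final byte TYPE_LONG = 2;
    private static final byte TYPE_BOOLEAN = 3;

    // Serializes a String, Long or Boolean into a byte[] suitable for a column value.
    static byte[] toBytes(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            if (value instanceof String) {
                out.writeByte(TYPE_STRING);
                // writeUTF is limited to 65535 encoded bytes; longer strings need another encoding.
                out.writeUTF((String) value);
            } else if (value instanceof Long) {
                out.writeByte(TYPE_LONG);
                out.writeLong((Long) value);
            } else if (value instanceof Boolean) {
                out.writeByte(TYPE_BOOLEAN);
                out.writeBoolean((Boolean) value);
            } else {
                throw new IllegalArgumentException("Unsupported value: " + value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the type tag first, then decodes the value that follows it.
    static Object fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            byte type = in.readByte();
            switch (type) {
                case TYPE_STRING:
                    return in.readUTF();
                case TYPE_LONG:
                    return in.readLong();
                case TYPE_BOOLEAN:
                    return in.readBoolean();
                default:
                    throw new IOException("Unknown type tag: " + type);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The type tag is what lets the reader recover the Java type without a schema.
It costs one extra byte per value, and the stored bytes are not readable
from a Cassandra shell.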


> It will be
> clearer, but I am not sure about the performance. When we have multi-valued
> columns like metadata, we have to insert them in a single String as
> comma-separated values. Would it be more scalable if we had a Bean for
> CassandraResource? What do you think?
>

Put one property per column in Cassandra if possible. IIRC it does a good
job of serializing data, and doesn't need a pre-defined schema as
traditional RDBMSs do. The serialisation I mentioned was mostly used to
get schemaless storage into an RDBMS.
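
As a rough illustration of one property per column: if each column of a row
arrives as an entry in a Map<String, Object>, the metadata can be picked out
by column name rather than parsed from a comma-separated String. This is a
minimal sketch. The class name RowPropertiesSketch and the column names
(taken from the metadata list earlier in this thread) are assumptions, not an
agreed schema.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: treats each Cassandra column as one resource property.
final class RowPropertiesSketch {
    // Assumed column names; the real set would be decided in the code review.
    static final List<String> METADATA_COLUMNS =
            Arrays.asList("contentType", "contentLength", "creationTime", "modificationTime");

    // Builds the metadata view from whichever metadata columns the row actually has.
    static Map<String, Object> metadataFrom(Map<String, Object> columns) {
        Map<String, Object> metadata = new LinkedHashMap<>();
        for (String name : METADATA_COLUMNS) {
            if (columns.containsKey(name)) {
                metadata.put(name, columns.get(name));
            }
        }
        return metadata;
    }
}
```

Missing columns are simply skipped, so rows do not all need the same set of
properties.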



>
> And I did a first cut of this, but with many TODOs ;-), where the getResource
> method is implemented and currently all the content is printed, but I have
> not implemented the methods in CassandraResource yet. This is just a POC to
> test whether the proposed model works. Apparently it works [1].


Yes, this is a great start! I didn't find too many issues with the approach,
as you will see from the comments on the code review.
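
For reference, the path-based mapping discussed earlier in this thread
(Example 2: column family from the first path segment, row id from a hash of
the remainder) could be sketched roughly like this. It is a minimal sketch
under assumptions: the class name PathToRowSketch, the fixed
/content/cassandra/ root and the column family name check are illustrative
and not taken from the POC code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical mapper from a Sling path to a CQL select statement.
final class PathToRowSketch {
    static final String ROOT = "/content/cassandra/";

    // Splits a URL into { columnFamilySelector, remainderOfPath }.
    static String[] split(String url) {
        if (!url.startsWith(ROOT)) {
            throw new IllegalArgumentException("Not under " + ROOT + ": " + url);
        }
        String rest = url.substring(ROOT.length());
        int slash = rest.indexOf('/');
        if (slash <= 0 || slash == rest.length() - 1) {
            throw new IllegalArgumentException("Expected <columnFamily>/<path>: " + url);
        }
        return new String[] { rest.substring(0, slash), rest.substring(slash + 1) };
    }

    // Row id = Base64(SHA-1(remainder of path)), so any path length maps to a fixed-size key.
    static String rowId(String remainderOfPath) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] hash = md.digest(remainderOfPath.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Builds the select statement; the column family is checked because it is
    // concatenated into the query, while the Base64 row id cannot contain a quote.
    static String getCQL(String columnFamily, String remainderOfPath) {
        if (!columnFamily.matches("[A-Za-z][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Bad column family: " + columnFamily);
        }
        return "select * from " + columnFamily + " where rowid = '" + rowId(remainderOfPath) + "'";
    }
}
```

Finding the parent row is the same call with the last path segment removed,
as in the earlier example.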




> See the CassandraDataPopulator class, which is a plain Java test class added for
> the moment to test the POC. (I am moving this to a proper JUnit test.)
>

Good.


>
> TODOs
> - I am in the process of finishing the implementation of CassandraResource,
> CassandraResourceProvider, etc. end to end.
> - Move to the JUnit test framework and write more tests for each scenario,
> which I can extend with Mockito (I am still not clear how Mockito comes
> into the picture) in the near future.
>

When you write the unit tests, if you find that you need a hand-written
stand-in for anything (e.g. ResourceResolver) to make your unit tests work,
don't write one. Use mocks. You can even mock concrete classes, so you could
mock the behaviour of the Hector API to respond in a pre-defined way to
certain CQL queries. This will eliminate the need to have a real Cassandra
server present when running the basic unit tests.




> - Change the implementation based on feedback from the community.
> - Parameterize the constants as much as possible to read from a property
> file.
>

These should come from OSGi properties. See the comments on
CassandraResourceProvider.
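
As a rough sketch of what that could look like: an OSGi component typically
receives its configuration as a map of properties when it is activated, and
the constants become defaults. The class name ProviderConfigSketch and the
property names cassandra.host and cassandra.port are assumptions for
illustration only. 9160 is Cassandra's default Thrift port.

```java
import java.util.Map;

// Hypothetical holder for provider settings read from component properties.
final class ProviderConfigSketch {
    // Assumed property names; the real ones would be defined on the provider component.
    static final String PROP_HOST = "cassandra.host";
    static final String PROP_PORT = "cassandra.port";

    final String host;
    final int port;

    // In a real component this map would be the argument of the activate method.
    ProviderConfigSketch(Map<String, Object> properties) {
        this.host = asString(properties.get(PROP_HOST), "localhost");
        this.port = asInt(properties.get(PROP_PORT), 9160);
    }

    // Falls back to the default when the property is not configured.
    static String asString(Object value, String defaultValue) {
        return value == null ? defaultValue : value.toString();
    }

    // Accepts both numeric and textual values, since configuration sources differ.
    static int asInt(Object value, int defaultValue) {
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        if (value != null) {
            try {
                return Integer.parseInt(value.toString().trim());
            } catch (NumberFormatException e) {
                return defaultValue;
            }
        }
        return defaultValue;
    }
}
```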






>
>
> [1] -
> https://cassandra-backend-for-sling.googlecode.com/svn/trunk/main/cassandra
>
> Thanks
>

Excellent progress, thank you!
Ian


>
> >
> >
> > Ian
> >
> > 1
> >
> >
> http://hector-client.github.io/hector/source/content/API/core/0.8.0-2/me/prettyprint/cassandra/serializers/TypeInferringSerializer.html
> >
> > 2
> >
> >
> https://github.com/ieb/sparsemapcontent/tree/master/core/src/main/java/org/sakaiproject/nakamura/lite/storage/spi/types
> >
> >
> > On 28 June 2013 05:14, Dishara Wijewardana <dd...@gmail.com>
> > wrote:
> >
> > > Hi Ian,
> > > I am having a problem with CQL..
> > >
> > > For example:
> > >         CqlQuery*<String,String,Long>* cqlQuery = new CqlQuery*
> > > <String,String,Long>*(keyspace, new StringSerializer(),new
> > > StringSerializer(), new LongSerializer();
> > >         cqlQuery.setQuery("insert into mytable
> > (KEY,password,gender,userid)
> > > values (3,'pass1','male',34);");
> > >         QueryResult<CqlRows<String,String,Long>> result =
> > > cqlQuery.execute();
> > >
> > > This will successfully insert the row with pass1,male and 34 values
> under
> > > rowId=3.
> > >
> > > But in sling scenario, we need to have more serializers for a query as
> > > follows. Since we have more columns.
> > > i.e
> > >         CqlQuery*<String,String,String,String> *cqlQuery = new
> CqlQuery*
> > > <String,String,String,String>*(keyspace, new StringSerializer(),new
> > > StringSerializer(),new       StringSerializer(),new
> StringSerializer());
> > >         cqlQuery.setQuery("insert into mytable
> > > (KEY,path,resourceType,resourceSuperType,metadata) values
> > > (3,'/content/cassandra/foo/bar','nt:cassandra','nt:super','metadata');
> > >         QueryResult<CqlRows<String,String,Long>> result =
> > > cqlQuery.execute();
> > >
> > > Here I am using me.prettyprint.cassandra.model.CqlQuery class. Any idea
> > how
> > > to proceed with this.
> > >
> > > > Am I doing something wrong, or is this a limitation of the API I am using?
> > >
> > >
> > > On Thu, Jun 27, 2013 at 7:41 AM, Dishara Wijewardana <
> > > ddwijewardana@gmail.com> wrote:
> > >
> > > >
> > > >
> > > > On Thu, Jun 27, 2013 at 4:26 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> > > >
> > > >> On 27 June 2013 02:34, Dishara Wijewardana <ddwijewardana@gmail.com
> >
> > > >> wrote:
> > > >>
> > > >> > On Tue, Jun 25, 2013 at 4:52 AM, Ian Boston <ie...@tfd.co.uk>
> wrote:
> > > >> >
> > > >> > > Hi,
> > > >> > >
> > > >> > > (I might have errors in the CQL, Cassandra schema and the
> > functions
> > > >> need
> > > >> > > proper escaping)
> > > >> > >
> > > >> > >
> > > >> > > Example 1:
> > > >> > > Zero depth tree with UUID as the rowid or key.
> > > >> > >
> > > >> > > URL /content/cassandra/pictures/13f58d5c95c70b6f
> > > >> > >
> > > >> > > then the column family is pictures and the URL -> ROWID function
> > > just
> > > >> > > results in the ROWID being 13f58d5c95c70b6f and
> > > >> > >
> > > >> > > String cql =
> > > mapOfCassandraMappers.get("pictures").getCQL("pictures",
> > > >> "
> > > >> > > 13f58d5c95c70b6f")
> > > >> > > System.err.println(cql);
> > > >> > >
> > > >> > > where
> > > >> > > String getCQL(String cf, String path) {
> > > >> > >     return "select * from "+cf+" where rowid = '"+path+"'";
> > > >> > > }
> > > >> > >
> > > >> > > yields:
> > > >> > > select * from pictures where rowid = '13f58d5c95c70b6f'
> > > >> > >
> > > >> > >
> > > >> > > 13f58d5c95c70b6f would be generated by the application when the
> > user
> > > >> > > created a new picture (by upload).
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > Example 2:
> > > >> > > User specified
> > > >> > >
> > > >> > > URL
> > > >> /content/cassandra/catalogue/capacitors/electrolytic/axial/16v/10uf
> > > >> > >
> > > >> > > String cql =
> > > >> mapOfCassandraMappers.get("catalogue").getCQL("catalogue", "
> > > >> > > capacitors/electrolytic/axial/16v/10uf")
> > > >> > > System.err.println(cql);
> > > >> > >
> > > >> > > where
> > > >> > > String getCQL(String cf, String path) {
> > > >> > >     MessageDigest md = MessageDigest.getInstance("SHA1");
> > > >> > >     String rowID = Base64.encode(md.digest(path.getBytes("UTF-8")));
> > > >> > >     return "select * from "+cf+" where rowid = '"+rowID+"'";
> > > >> > > }
> > > >> > >
> > > >> > > yields
> > > >> > >
> > > >> > > select * from catalogue where rowid = 'NzdlZmU4OTZmNGM4MzMwYzZ'
> > > >> > >
> > > >> > > If you want to find the parent then
> > > >> > >
> > > >> > > mapOfCassandraMappers.get("catalogue").getCQL("catalogue", "capacitors/electrolytic/axial/16v")
> > > >> > >
> > > >> > > select * from catalogue where rowid = 'ZGFzZGZzZnNkYWZzYWRmc2R'
> > > >> > >
> > > >> > > And if the parent is stored in the property parent then
> > > >> > >
> > > >> > > select * from catalogue where parent = 'ZGFzZGZzZnNkYWZzYWRmc2R'
> > > >> > >
> > > >> > > will generate a list of children. (Not sure about performance)
> > > >> > >
> > > >> > >
> > > >> > > Example 3:
> > > >> > > User is allowed to enter the RowID directly (identical to
> Example
> > 1
> > > >> > > URL
> > > >> > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> /content/cassandra/cannesfilmfestival/TomCruiseCassino-20130402112345-ieb.jpg
> > > >> > >
> > > >> > > where
> > > >> > > String getCQL(String cf, String path) {
> > > >> > >     return "select * from "+cf+" where rowid = '"+path+"'";
> > > >> > > }
> > > >> > >
> > > >> > > yields:
> > > >> > > select * from pictures where rowid = '
> > > >> > > TomCruiseCassino-20130402112345-ieb.jpg'
> > > >> > >
> > > >> >
> > > >> > This should be corrected as
> > > >> > select * from cannesfilmfestival where rowid = '
> > > >> > TomCruiseCassino-20130402112345-ieb.jpg'
> > > >> >
> > > >> >
> > > >> > >
> > > >> > >
> > > >> > > Does that make sense ?
> > > >> > >
> > > >> >
> > > >>
> > > >> Hi
> > > >>
> > > >>
> > > >> > Hi Ian,
> > > >> > I was in fact practicing some cql stuff in related to this
> response
> > > >> (with
> > > >> > cassandra cql terminal). This is quite a wonderful explanation
> for a
> > > new
> > > >> > comer like me. Thank you very much for the explanation again. Now
> it
> > > >> really
> > > >> > makes sense.
> > > >> >
> > > >>
> > > >> excellent!
> > > >>
> > > >>
> > > >> >
> > > >> > Other than the zero depth approach, I believe users will be more
> > > >> > comfortable with Example 2 approach.
> > > >> > Shall we go ahead with it ?
> > > >> >
> > > >>
> > > >>
> > > >> Yes, go for it. It will be interesting to see how hard it is to
> > > implement
> > > >> and how well (or not) it works. Remember, keep it as simple as
> > possible
> > > >> and
> > > >> dont try and and cover every use case at the expense of getting a
> PoC
> > > >> working.
> > > >>
> > > > +1.
> > > >
> > > >>
> > > >> However, dont forget, Unit tests mocked with Mockito are a quicker
> way
> > > of
> > > >> getting to working code, than no unit test coverage.
> > > >>
> > > >> Best Regards
> > > >> Ian
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> >
> > > >> >
> > > >> > > Ian
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > On 25 June 2013 05:29, Dishara Wijewardana <
> > ddwijewardana@gmail.com
> > > >
> > > >> > > wrote:
> > > >> > >
> > > >> > > > On Mon, Jun 24, 2013 at 4:02 AM, Ian Boston <ie...@tfd.co.uk>
> > > wrote:
> > > >> > > >
> > > >> > > > > Hi Dishara,
> > > >> > > > > Yes. 1 resource == 1 row.
> > > >> > > > > The columns within that row represent the properties of the
> > > >> resource.
> > > >> > > > > I suggest that you use standard property names where
> > appropriate
> > > >> (eg
> > > >> > > > > sling:resourceType is the Resource.resourceType etc)
> > > >> > > > >
> > > >> > > > > The Resource itself should be adaptable to a generic
> > > >> > CassandraResource
> > > >> > > > > (which will probably implement Resource) which will have a
> map
> > > of
> > > >> > > > > properties containing all the columns of the cassandra row.
> > > >> (optimise
> > > >> > > > > later) A CassandraResource might look and feel like a
> > > Map<String,
> > > >> > > Object>
> > > >> > > > > or it might have a Map<String, Object> getProperties()
> method,
> > > or
> > > >> > > better
> > > >> > > > > still be adaptable to a Map. The essential think is dont
> hard
> > > code
> > > >> > the
> > > >> > > > > property names in the interface of CassandraResource for the
> > > >> moment.
> > > >> > ie
> > > >> > > > no
> > > >> > > > > getContentType() and no getMimeType(), as we dont really
> know
> > > >> what a
> > > >> > > > > CassandraResource will store.
> > > >> > > > >
> > > >> > > > > ResourceMetadata should be built from a subset of the
> > > >> > CassandraResource
> > > >> > > > > properties.
> > > >> > > > >
> > > >> > > > > You won't need to implement a ResourceResolver, only a
> > > >> > ResourceProvider
> > > >> > > > > (and Factory). I would use CQL in preference to other API
> > > methods.
> > > >> > > > >
> > > >> > > > > There is one thing that hasnt been mentioned, and thats the
> > URL
> > > ->
> > > >> > > > > Cassandra Row mapping.
> > > >> > > > > There are several ways of doing this.
> > > >> > > > >
> > > >> > > > > eg:
> > > >> > > > > URL = /content/cassandra/<columnFamily>/<rowID>
> > > >> > > > >  Cassandra Column Family = columnFamily
> > > >> > > > >  Cassandra RowID = rowID
> > > >> > > > > or
> > > >> > > > > URL =
> > > >> /content/cassandra/<columnFamilySelector>/remainder/of/the/path
> > > >> > > > >  Cassandra  Cassandra Column Family =
> > > >> > > > > mapOfColumnFamilies.get(columnFamilySelector)
> > > >> > > > >  Cassandra  RowID = function(/remainder/of/the/path)
> > > >> > > > >
> > > >> > > > > or to take that one stage further
> > > >> > > > >
> > > >> > > > > public interface CassandraMapper {
> > > >> > > > >       String getCQL(String columnFamilySelector, String
> path);
> > > >> > > > > }
> > > >> > > > >
> > > >> > > > Hi Ian
> > > >> > > > Thank you for the detailed explanation.
> > > >> > > >
> > > >> > > > OK. +1 for this approach with the mentioned flexibility.But  I
> > > need
> > > >> a
> > > >> > > small
> > > >> > > > clarification. With this approach,
> > > >> > > >
> > > >> > > > URL = /content/cassandra/<columnFamilySelector>ROW-ID
> > > >> > > > ROW-ID - function(/remainder/of/the/path).
> > > >> > > > So you mean ROW-ID is something we have to programatically
> > > uniquely
> > > >> > > create
> > > >> > > >  right ? like a UUID.
> > > >> > > >
> > > >> > > > What is this "/remainder/of/the/path" means ? Can you give an
> > > >> example
> > > >> > > with
> > > >> > > > real values in the context of a user who want to obtain a
> > resource
> > > >> from
> > > >> > > > cassandra.
> > > >> > > > This is just for my understanding.
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > > >
> > > >> > > > > URL =
> > > /content/cassandra/<columnFamilySelector>/<remainderOfPath>
> > > >> > > > >
> > > >> > > > >  String cqlQuery =
> > > >> > > > >
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> mapOfCassandraMappers.get(columnFamilySelector).getCQL(columnFamilySelector,
> > > >> > > > > remainderOfPath);
> > > >> > > > >
> > > >> > > > > Which would allow us provided one or more implementations of
> > > >> > > > > CassandraMapper to map between URL and CQL.
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > HTH
> > > >> > > > > Ian
> > > >> > > > >
> > > >> > > > > On 23 June 2013 19:29, Dishara Wijewardana <
> > > >> ddwijewardana@gmail.com>
> > > >> > > > > wrote:
> > > >> > > > >
> > > >> > > > > > Hi Ian,
> > > >> > > > > >
> > > >> > > > > > What is the data mapping should be between Cassandra and
> > Sling
> > > >> > > > resource.
> > > >> > > > > I
> > > >> > > > > > mean is a Sling Resource maps to a Cassandra Column ? Or
> > > Column
> > > >> > > Family
> > > >> > > > ?
> > > >> > > > > >
> > > >> > > > > > Because to get this Cassandra and Sling story correct we
> > need
> > > to
> > > >> > > > finalize
> > > >> > > > > > this.
> > > >> > > > > > For an example what we eventually returns is a Sling
> > resource.
> > > >> > > > Everything
> > > >> > > > > > that needs to fill in to create Sling resource should be
> > > stored
> > > >> in
> > > >> > > > > > Cassandra.
> > > >> > > > > > In a Sling resource,
> > > >> > > > > >
> > > >> > > > > >    - Path - direct sling resource path
> > > >> > > > > >    - ResourceType - nt:cassandra
> > > >> > > > > >    - ResourceSuperType - ?
> > > >> > > > > >    - ResourceMetadata - we can create this on the fly with
> > the
> > > >> data
> > > >> > > > from
> > > >> > > > > >    the corresponding column. At insertion, those need to
> be
> > > >> stored.
> > > >> > > > > > Following
> > > >> > > > > >    are the ones which I thought might be useful by default
> > to
> > > be
> > > >> > set
> > > >> > > > for
> > > >> > > > > > any
> > > >> > > > > >    node. Please add if we need anything more.
> > > >> > > > > >       - ContentType
> > > >> > > > > >       - ContentLength
> > > >> > > > > >       - CreationTime
> > > >> > > > > >       - ModificationTime
> > > >> > > > > >    - ResourceResolver -  Do we need a resolver in this
> case
> > ?
> > > >> > > > > >
> > > >> > > > > >
> > > >> > > > > >  So I believe in CQL context, one ROW should represent a
> > Sling
> > > >> > > > resource.
> > > >> > > > > If
> > > >> > > > > > that is the case for ResourceMetadata we might need a
> > separate
> > > >> > column
> > > >> > > > to
> > > >> > > > > > store it since it has multiple values. I am not sure
> whether
> > > we
> > > >> can
> > > >> > > do
> > > >> > > > it
> > > >> > > > > > with CQL, but it should be possible with hector APIs may
> be.
> > > >> > > > > >
> > > >> > > > > > Appreciate your thoughts ?
> > > >> > > > > >
> > > >> > > > > >
> > > >> > > > > > On Wed, Jun 19, 2013 at 1:19 AM, Dishara Wijewardana <
> > > >> > > > > > ddwijewardana@gmail.com> wrote:
> > > >> > > > > >
> > > >> > > > > > > Hi Ian,
> > > >> > > > > > > I am starting this thread to keep track on things
> related
> > to
> > > >> the
> > > >> > > GSoC
> > > >> > > > > > > project related milestone status updates and related
> > > >> discussions.
> > > >> > > > > > > So the first task over view will be as follows as per
> GSoC
> > > >> > proposal
> > > >> > > > > > > provided.
> > > >> > > > > > >
> > > >> > > > > > > 1. Implementing a CassandraResourceProvider  to READ
> from
> > > >> > > Cassandra.
> > > >> > > > > > > Implementation Details [1]
> > > >> > > > > > >
> > > >> > > > > > >
> > > >> > > > > > >
> > > >> > > > > > > [1] : Implementation Details:
> > > >> > > > > > >
> > > >> > > > > > >  1.A) Write a CassanrdaResourceProviderUtil  which is
> > > >> basically a
> > > >> > > > > > > cassendra client which will facilitate all cassandra
> > related
> > > >> > > > operations
> > > >> > > > > > > required by other modules (CassandraResourceProvider and
> > > >> > > > > > > CassandraResourceResolver).
> > > >> > > > > > >
> > > >> > > > > > > 1.B) Implementation of  CassandraResourceProvider
> > > >> > > > > > >
> > > >> > > > > > > 1.C)  Implementation of CassandraResourceResolver
> > > >> > > > > > >
> > > >> > > > > > > 1.D) Implementation of CassandraResource
> > > >> > > > > > >
> > > >> > > > > > >
> > > >> > > > > > > And I will start writing the
> CassanrdaResourceProviderUtil
> > > >> class
> > > >> > > > which
> > > >> > > > > > > will do basic add and get using hector API. Please
> provide
> > > any
> > > >> > > > feedback
> > > >> > > > > > > that will be useful to accomplish this task.
> > > >> > > > > > > So for this how does path mapping should be done.
> Because
> > > for
> > > >> > > > example,
> > > >> > > > > > the
> > > >> > > > > > > path of the cassendra node will not be same as the jcr
> > node
> > > >> path.
> > > >> > > i.e
> > > >> > > > > > > provider will ask a node path /system/myapps/test/foo
> and
> > > >> where
> > > >> > > > should
> > > >> > > > > we
> > > >> > > > > > > return it from Cassandra. Aren't we have to first
> consider
> > > the
> > > >> > > WRITE
> > > >> > > > > > aspect
> > > >> > > > > > > to Cassandra ?
> > > >> > > > > > >
> > > >> > > > > > >
> > > >> > > > > > > --
> > > >> > > > > > > Thanks
> > > >> > > > > > > /Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Dishara Wijewardana <dd...@gmail.com>.
On Fri, Jun 28, 2013 at 4:37 AM, Ian Boston <ie...@tfd.co.uk> wrote:

> Hi,
> Have you tried the TypeInferringSerializer for the value serializer?
> It claims to detect what the column value is based on the byte array.
>
> Failing that, I would consider making everything byte[] and using your own
> serializer that writes and reads values to a byte[] using DataInputStream
> and DataOutputStream.
>
> [2] is an example of a serializer written for that purpose that was used
> with Cassandra over raw Thrift. It's not easy to read what it outputs to the
> storage layer, but it is compact and efficient. I would not use it directly,
> as it does some very specific things like slicing large byte[]s into 1MB
> chunks and bypassing the 64K limit on reading and writing UTF8 strings with
> DataInputStream.
>
> Try the TypeInferringSerializer first. If it works, great, no need to do
> anything more complex.
>

Hi,
In fact I was able to add as many params as I wanted with the same
configuration. But TypeInferringSerializer is a useful one too, which we might
need in the future.
Also, rather than storing resource metadata as String values, how about
storing a serialized object as you mentioned? It will be clearer, but I am
not sure about the performance. When we have multi-valued columns like
metadata, we have to insert them in a single String as comma-separated
values. Would it be more scalable if we had a Bean for CassandraResource?
What do you think?

And I did a first cut of this, but with many TODOs ;-), where the getResource
method is implemented and currently all the content is printed, but I have
not implemented the methods in CassandraResource yet. This is just a POC to
test whether the proposed model works. Apparently it works [1]. See the
CassandraDataPopulator class, which is a plain Java test class added for
the moment to test the POC. (I am moving this to a proper JUnit test.)

TODOs
- I am in the process of finishing the implementation of CassandraResource,
CassandraResourceProvider, etc. end to end.
- Move to the JUnit test framework and write more tests for each scenario,
which I can extend with Mockito (I am still not clear how Mockito comes
into the picture) in the near future.
- Change the implementation based on feedback from the community.
- Parameterize the constants as much as possible to read from a property
file.


[1] -
https://cassandra-backend-for-sling.googlecode.com/svn/trunk/main/cassandra

Thanks

>
>
> Ian
>
> 1
>
> http://hector-client.github.io/hector/source/content/API/core/0.8.0-2/me/prettyprint/cassandra/serializers/TypeInferringSerializer.html
>
> 2
>
> https://github.com/ieb/sparsemapcontent/tree/master/core/src/main/java/org/sakaiproject/nakamura/lite/storage/spi/types
>
>
> On 28 June 2013 05:14, Dishara Wijewardana <dd...@gmail.com>
> wrote:
>
> > Hi Ian,
> > I am having a problem with CQL..
> >
> > For example:
> >         CqlQuery<String,String,Long> cqlQuery = new CqlQuery<String,String,Long>(
> >             keyspace, new StringSerializer(), new StringSerializer(), new LongSerializer());
> >         cqlQuery.setQuery("insert into mytable (KEY,password,gender,userid) "
> >             + "values (3,'pass1','male',34);");
> >         QueryResult<CqlRows<String,String,Long>> result = cqlQuery.execute();
> >
> > This will successfully insert the row with pass1,male and 34 values under
> > rowId=3.
> >
> > But in sling scenario, we need to have more serializers for a query as
> > follows. Since we have more columns.
> > i.e
> >         CqlQuery<String,String,String,String> cqlQuery = new CqlQuery<String,String,String,String>(
> >             keyspace, new StringSerializer(), new StringSerializer(),
> >             new StringSerializer(), new StringSerializer());
> >         cqlQuery.setQuery("insert into mytable "
> >             + "(KEY,path,resourceType,resourceSuperType,metadata) values "
> >             + "(3,'/content/cassandra/foo/bar','nt:cassandra','nt:super','metadata');");
> >         QueryResult<CqlRows<String,String,Long>> result = cqlQuery.execute();
> >
> > Here I am using me.prettyprint.cassandra.model.CqlQuery class. Any idea
> how
> > to proceed with this.
> >
> > Am I doing something wring or is this a limitation of the API I am using
> ?
> >
> >
> > On Thu, Jun 27, 2013 at 7:41 AM, Dishara Wijewardana <
> > ddwijewardana@gmail.com> wrote:
> >
> > >
> > >
> > > On Thu, Jun 27, 2013 at 4:26 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> > >
> > >> On 27 June 2013 02:34, Dishara Wijewardana <dd...@gmail.com>
> > >> wrote:
> > >>
> > >> > On Tue, Jun 25, 2013 at 4:52 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> > >> >
> > >> > > Hi,
> > >> > >
> > >> > > (I might have errors in the CQL, Cassandra schema and the
> functions
> > >> need
> > >> > > proper escaping)
> > >> > >
> > >> > >
> > >> > > Example 1:
> > >> > > Zero depth tree wiht UUID as the rowid or key.
> > >> > >
> > >> > > URL /content/cassandra/pictures/13f58d5c95c70b6f
> > >> > >
> > >> > > then the column family is pictures and the URL -> ROWID function
> > just
> > >> > > results in the ROWID being 13f58d5c95c70b6f and
> > >> > >
> > >> > > String cql =
> > mapOfCassandraMappers.get("pictures").getCQL("pictures",
> > >> "
> > >> > > 13f58d5c95c70b6f")
> > >> > > System.err.println(cql);
> > >> > >
> > >> > > where
> > >> > > String getCQL(String cf, String path) {
> > >> > >     return "select * from "+cf+" where rowid = '"+path+"'";
> > >> > > }
> > >> > >
> > >> > > yields:
> > >> > > select * from pictures where rowid = '13f58d5c95c70b6f'
> > >> > >
> > >> > >
> > >> > > 13f58d5c95c70b6f would be generated by the application when the
> user
> > >> > > created a new picture (by upload).
> > >> > >
> > >> > >
> > >> > >
> > >> > > Example 2:
> > >> > > User specified
> > >> > >
> > >> > > URL
> > >> /content/cassandra/catalogue/capacitors/electrolytic/axial/16v/10uf
> > >> > >
> > >> > > String cql =
> > >> mapOfCassandraMappers.get("catalogue").getCQL("catalogue", "
> > >> > > capacitors/electrolytic/axial/16v/10uf")
> > >> > > System.err.println(cql);
> > >> > >
> > >> > > where
> > >> > > String getCQL(String cf, String path) {
> > >> > >     MessageDigest md = MessageDigest.getInstance("SHA1");
> > >> > >     String rowID =
> Base64.encode(md.finish(path.getBytes("UTF-8")));
> > >> > >     return "select * from "+cf+" where rowid = '"+rowID+"'";
> > >> > > }
> > >> > >
> > >> > > yields
> > >> > >
> > >> > > select * from pictures where rowid = 'NzdlZmU4OTZmNGM4MzMwYzZ'
> > >> > >
> > >> > > If you want to find the parent then
> > >> > >
> > >> > > mapOfCassandraMappers.get("catalogue").getCQL("catalogue", "
> > >> > > capacitors/electrolytic/axial/16v")
> > >> > >
> > >> > > select * from pictures where rowid = 'ZGFzZGZzZnNkYWZzYWRmc2R'
> > >> > >
> > >> > > And if the parent is stored in the property parent then
> > >> > >
> > >> > > select * from pictures where parent = 'ZGFzZGZzZnNkYWZzYWRmc2R'
> > >> > >
> > >> > > will generate a list of children. (Not sure about performance)
> > >> > >
> > >> > >
> > >> > > Example 3:
> > >> > > User is allowed to enter the RowID directly (identical to Example
> 1
> > >> > > URL
> > >> > >
> > >> > >
> > >> >
> > >>
> >
> /content/cassandra/cannesfilmfestival/TomCruiseCassino-20130402112345-ieb.jpg
> > >> > >
> > >> > > where
> > >> > > String getCQL(String cf, String path) {
> > >> > >     return "select * from "+cf+" where rowid = '"+path+"'";
> > >> > > }
> > >> > >
> > >> > > yields:
> > >> > > select * from pictures where rowid = '
> > >> > > TomCruiseCassino-20130402112345-ieb.jpg'
> > >> > >
> > >> >
> > >> > This should be corrected as
> > >> > select * from cannesfilmfestival where rowid = '
> > >> > TomCruiseCassino-20130402112345-ieb.jpg'
> > >> >
> > >> >
> > >> > >
> > >> > >
> > >> > > Does that make sense ?
> > >> > >
> > >> >
> > >>
> > >> Hi
> > >>
> > >>
> > >> > Hi Ian,
> > >> > I was in fact practicing some cql stuff in related to this response
> > >> (with
> > >> > cassandra cql terminal). This is quite a wonderful explanation for a
> > new
> > >> > comer like me. Thank you very much for the explanation again. Now it
> > >> really
> > >> > makes sense.
> > >> >
> > >>
> > >> excellent!
> > >>
> > >>
> > >> >
> > >> > Other than the zero depth approach, I believe users will be more
> > >> > comfortable with Example 2 approach.
> > >> > Shall we go ahead with it ?
> > >> >
> > >>
> > >>
> > >> Yes, go for it. It will be interesting to see how hard it is to
> > implement
> > >> and how well (or not) it works. Remember, keep it as simple as
> possible
> > >> and
> > >> dont try and and cover every use case at the expense of getting a PoC
> > >> working.
> > >>
> > > +1.
> > >
> > >>
> > >> However, dont forget, Unit tests mocked with Mockito are a quicker way
> > of
> > >> getting to working code, than no unit test coverage.
> > >>
> > >> Best Regards
> > >> Ian
> > >>
> > >>
> > >>
> > >>
> > >> >
> > >> >
> > >> > > Ian
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > > On 25 June 2013 05:29, Dishara Wijewardana <
> ddwijewardana@gmail.com
> > >
> > >> > > wrote:
> > >> > >
> > >> > > > On Mon, Jun 24, 2013 at 4:02 AM, Ian Boston <ie...@tfd.co.uk>
> > wrote:
> > >> > > >
> > >> > > > > Hi Dishara,
> > >> > > > > Yes. 1 resource == 1 row.
> > >> > > > > The columns within that row represent the properties of the
> > >> resource.
> > >> > > > > I suggest that you use standard property names where
> appropriate
> > >> (eg
> > >> > > > > sling:resourceType is the Resource.resourceType etc)
> > >> > > > >
> > >> > > > > The Resource itself should be adaptable to a generic
> > >> > CassandraResource
> > >> > > > > (which will probably implement Resource) which will have a map
> > of
> > >> > > > > properties containing all the columns of the cassandra row.
> > >> (optimise
> > >> > > > > later) A CassandraResource might look and feel like a
> > Map<String,
> > >> > > Object>
> > >> > > > > or it might have a Map<String, Object> getProperties() method,
> > or
> > >> > > better
> > >> > > > > still be adaptable to a Map. The essential think is dont hard
> > code
> > >> > the
> > >> > > > > property names in the interface of CassandraResource for the
> > >> moment.
> > >> > ie
> > >> > > > no
> > >> > > > > getContentType() and no getMimeType(), as we dont really know
> > >> what a
> > >> > > > > CassandraResource will store.
> > >> > > > >
> > >> > > > > ResourceMetadata should be built from a subset of the
> > >> > CassandraResource
> > >> > > > > properties.
> > >> > > > >
> > >> > > > > You won't need to implement a ResourceResolver, only a
> > >> > ResourceProvider
> > >> > > > > (and Factory). I would use CQL in preference to other API
> > methods.
> > >> > > > >
> > >> > > > > There is one thing that hasnt been mentioned, and thats the
> URL
> > ->
> > >> > > > > Cassandra Row mapping.
> > >> > > > > There are several ways of doing this.
> > >> > > > >
> > >> > > > > eg:
> > >> > > > > URL = /content/cassandra/<columnFamily>/<rowID>
> > >> > > > >  Cassandra Column Family = columnFamily
> > >> > > > >  Cassandra RowID = rowID
> > >> > > > > or
> > >> > > > > URL =
> > >> /content/cassandra/<columnFamilySelector>/remainder/of/the/path
> > >> > > > >  Cassandra  Cassandra Column Family =
> > >> > > > > mapOfColumnFamilies.get(columnFamilySelector)
> > >> > > > >  Cassandra  RowID = function(/remainder/of/the/path)
> > >> > > > >
> > >> > > > > or to take that one stage further
> > >> > > > >
> > >> > > > > public interface CassandraMapper {
> > >> > > > >       String getCQL(String columnFamilySelector, String path);
> > >> > > > > }
> > >> > > > >
> > >> > > > Hi Ian
> > >> > > > Thank you for the detailed explanation.
> > >> > > >
> > >> > > > OK. +1 for this approach with the mentioned flexibility.But  I
> > need
> > >> a
> > >> > > small
> > >> > > > clarification. With this approach,
> > >> > > >
> > >> > > > URL = /content/cassandra/<columnFamilySelector>ROW-ID
> > >> > > > ROW-ID - function(/remainder/of/the/path).
> > >> > > > So you mean ROW-ID is something we have to programatically
> > uniquely
> > >> > > create
> > >> > > >  right ? like a UUID.
> > >> > > >
> > >> > > > What is this "/remainder/of/the/path" means ? Can you give an
> > >> example
> > >> > > with
> > >> > > > real values in the context of a user who want to obtain a
> resource
> > >> from
> > >> > > > cassandra.
> > >> > > > This is just for my understanding.
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > >
> > >> > > > > URL =
> > /content/cassandra/<columnFamilySelector>/<remainderOfPath>
> > >> > > > >
> > >> > > > >  String cqlQuery =
> > >> > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> mapOfCassandraMappers.get(columnFamilySelector).getCQL(columnFamilySelector,
> > >> > > > > remainderOfPath);
> > >> > > > >
> > >> > > > > Which would allow us provided one or more implementations of
> > >> > > > > CassandraMapper to map between URL and CQL.
> > >> > > > >
> > >> > > > >
> > >> > > > > HTH
> > >> > > > > Ian
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > > On 23 June 2013 19:29, Dishara Wijewardana <
> > >> ddwijewardana@gmail.com>
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > Hi Ian,
> > >> > > > > >
> > >> > > > > > What is the data mapping should be between Cassandra and
> Sling
> > >> > > > resource.
> > >> > > > > I
> > >> > > > > > mean is a Sling Resource maps to a Cassandra Column ? Or
> > Column
> > >> > > Family
> > >> > > > ?
> > >> > > > > >
> > >> > > > > > Because to get this Cassandra and Sling story correct we
> need
> > to
> > >> > > > finalize
> > >> > > > > > this.
> > >> > > > > > For an example what we eventually returns is a Sling
> resource.
> > >> > > > Everything
> > >> > > > > > that needs to fill in to create Sling resource should be
> > stored
> > >> in
> > >> > > > > > Cassandra.
> > >> > > > > > In a Sling resource,
> > >> > > > > >
> > >> > > > > >    - Path - direct sling resource path
> > >> > > > > >    - ResourceType - nt:cassandra
> > >> > > > > >    - ResourceSuperType - ?
> > >> > > > > >    - ResourceMetadata - we can create this on the fly with
> the
> > >> data
> > >> > > > from
> > >> > > > > >    the corresponding column. At insertion, those need to be
> > >> stored.
> > >> > > > > > Following
> > >> > > > > >    are the ones which I thought might be useful by default
> to
> > be
> > >> > set
> > >> > > > for
> > >> > > > > > any
> > >> > > > > >    node. Please add if we need anything more.
> > >> > > > > >       - ContentType
> > >> > > > > >       - ContentLength
> > >> > > > > >       - CreationTime
> > >> > > > > >       - ModificationTime
> > >> > > > > >    - ResourceResolver -  Do we need a resolver in this case
> ?
> > >> > > > > >
> > >> > > > > >
> > >> > > > > >  So I believe in CQL context, one ROW should represent a
> Sling
> > >> > > > resource.
> > >> > > > > If
> > >> > > > > > that is the case for ResourceMetadata we might need a
> separate
> > >> > column
> > >> > > > to
> > >> > > > > > store it since it has multiple values. I am not sure whether
> > we
> > >> can
> > >> > > do
> > >> > > > it
> > >> > > > > > with CQL, but it should be possible with hector APIs may be.
> > >> > > > > >
> > >> > > > > > Appreciate your thoughts ?
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > On Wed, Jun 19, 2013 at 1:19 AM, Dishara Wijewardana <
> > >> > > > > > ddwijewardana@gmail.com> wrote:
> > >> > > > > >
> > >> > > > > > > Hi Ian,
> > >> > > > > > > I am starting this thread to keep track on things related
> to
> > >> the
> > >> > > GSoC
> > >> > > > > > > project related milestone status updates and related
> > >> discussions.
> > >> > > > > > > So the first task over view will be as follows as per GSoC
> > >> > proposal
> > >> > > > > > > provided.
> > >> > > > > > >
> > >> > > > > > > 1. Implementing a CassandraResourceProvider  to READ from
> > >> > > Cassandra.
> > >> > > > > > > Implementation Details [1]
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > [1] : Implementation Details:
> > >> > > > > > >
> > >> > > > > > >  1.A) Write a CassanrdaResourceProviderUtil  which is
> > >> basically a
> > >> > > > > > > cassendra client which will facilitate all cassandra
> related
> > >> > > > operations
> > >> > > > > > > required by other modules (CassandraResourceProvider and
> > >> > > > > > > CassandraResourceResolver).
> > >> > > > > > >
> > >> > > > > > > 1.B) Implementation of  CassandraResourceProvider
> > >> > > > > > >
> > >> > > > > > > 1.C)  Implementation of CassandraResourceResolver
> > >> > > > > > >
> > >> > > > > > > 1.D) Implementation of CassandraResource
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > And I will start writing the CassanrdaResourceProviderUtil
> > >> class
> > >> > > > which
> > >> > > > > > > will do basic add and get using hector API. Please provide
> > any
> > >> > > > feedback
> > >> > > > > > > that will be useful to accomplish this task.
> > >> > > > > > > So for this how does path mapping should be done. Because
> > for
> > >> > > > example,
> > >> > > > > > the
> > >> > > > > > > path of the cassendra node will not be same as the jcr
> node
> > >> path.
> > >> > > i.e
> > >> > > > > > > provider will ask a node path /system/myapps/test/foo and
> > >> where
> > >> > > > should
> > >> > > > > we
> > >> > > > > > > return it from Cassandra. Aren't we have to first consider
> > the
> > >> > > WRITE
> > >> > > > > > aspect
> > >> > > > > > > to Cassandra ?
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > --
> > >> > > > > > > Thanks
> > >> > > > > > > /Dishara
> > >> > > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > --
> > >> > > > > > Thanks
> > >> > > > > > /Dishara
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > --
> > >> > > > Thanks
> > >> > > > /Dishara
> > >> > > >
> > >> > >
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > Thanks
> > >> > /Dishara
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > > Thanks
> > > /Dishara
> > >
> >
> >
> >
> > --
> > Thanks
> > /Dishara
> >
>



-- 
Thanks
/Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Ian Boston <ie...@tfd.co.uk>.
Hi,
Have you tried the TypeInferringSerializer [1] for the value serializer?
It claims to detect what the column value is based on the byte array.

Failing that, I would consider making everything byte[] and using your own
serializer that writes and reads values to and from a byte[] using
DataInputStream and DataOutputStream.

[2] is an example of a serializer written for that purpose that was used
with Cassandra over raw Thrift. It's not easy to read what it outputs to the
storage layer, but it is compact and efficient. I would not use it directly,
as it does some very specific things like slicing large byte[]s into 1MB
chunks and bypassing the 64K limit on reading and writing UTF8 strings with
DataInputStream.
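
As a rough illustration of the "everything is byte[]" fallback described
above, here is a minimal sketch of a type-tagged codec built on
DataOutputStream and DataInputStream. The class name SimpleValueCodec, the
tag values and the set of supported types are assumptions made for this
example; it is not the serializer in [2] and not part of Hector or Sling.
Strings are written with an explicit length prefix rather than writeUTF(),
which is one way to stay clear of the 64K limit mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: encodes a value as one type-tag byte followed by its payload. */
final class SimpleValueCodec {
    private static final byte TYPE_STRING = 1;
    private static final byte TYPE_LONG = 2;
    private static final byte TYPE_BOOLEAN = 3;

    /** Serializes a String, Long or Boolean; any other type is rejected. */
    static byte[] toBytes(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            if (value instanceof String) {
                byte[] utf8 = ((String) value).getBytes(StandardCharsets.UTF_8);
                out.writeByte(TYPE_STRING);
                out.writeInt(utf8.length); // int length prefix instead of writeUTF()
                out.write(utf8);
            } else if (value instanceof Long) {
                out.writeByte(TYPE_LONG);
                out.writeLong((Long) value);
            } else if (value instanceof Boolean) {
                out.writeByte(TYPE_BOOLEAN);
                out.writeBoolean((Boolean) value);
            } else {
                throw new IllegalArgumentException("Unsupported value: " + value);
            }
        } catch (IOException e) {
            // In-memory streams should not fail; rethrow unchecked just in case.
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    /** Reads the type tag and decodes the payload written by toBytes(). */
    static Object fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            byte type = in.readByte();
            switch (type) {
                case TYPE_STRING:
                    byte[] utf8 = new byte[in.readInt()];
                    in.readFully(utf8);
                    return new String(utf8, StandardCharsets.UTF_8);
                case TYPE_LONG:
                    return in.readLong();
                case TYPE_BOOLEAN:
                    return in.readBoolean();
                default:
                    throw new IllegalArgumentException("Unknown type tag: " + type);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With something like this, every column value could be stored through a
plain byte[] serializer, and the type would be recovered on read from the
leading tag byte.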

Try the TypeInferringSerializer first. If it works, great; there is no need
to do anything more complex.


Ian

[1]
http://hector-client.github.io/hector/source/content/API/core/0.8.0-2/me/prettyprint/cassandra/serializers/TypeInferringSerializer.html

[2]
https://github.com/ieb/sparsemapcontent/tree/master/core/src/main/java/org/sakaiproject/nakamura/lite/storage/spi/types


On 28 June 2013 05:14, Dishara Wijewardana <dd...@gmail.com> wrote:

> Hi Ian,
> I am having a problem with CQL..
>
> For example:
>         CqlQuery*<String,String,Long>* cqlQuery = new CqlQuery*
> <String,String,Long>*(keyspace, new StringSerializer(),new
> StringSerializer(), new LongSerializer();
>         cqlQuery.setQuery("insert into mytable (KEY,password,gender,userid)
> values (3,'pass1','male',34);");
>         QueryResult<CqlRows<String,String,Long>> result =
> cqlQuery.execute();
>
> This will successfully insert the row with pass1,male and 34 values under
> rowId=3.
>
> But in sling scenario, we need to have more serializers for a query as
> follows. Since we have more columns.
> i.e
>         CqlQuery*<String,String,String,String> *cqlQuery = new CqlQuery*
> <String,String,String,String>*(keyspace, new StringSerializer(),new
> StringSerializer(),new       StringSerializer(),new StringSerializer());
>         cqlQuery.setQuery("insert into mytable
> (KEY,path,resourceType,resourceSuperType,metadata) values
> (3,'/content/cassandra/foo/bar','nt:cassandra','nt:super','metadata');
>         QueryResult<CqlRows<String,String,Long>> result =
> cqlQuery.execute();
>
> Here I am using me.prettyprint.cassandra.model.CqlQuery class. Any idea how
> to proceed with this.
>
> Am I doing something wring or is this a limitation of the API I am using ?
>
>
> On Thu, Jun 27, 2013 at 7:41 AM, Dishara Wijewardana <
> ddwijewardana@gmail.com> wrote:
>
> >
> >
> > On Thu, Jun 27, 2013 at 4:26 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> >
> >> On 27 June 2013 02:34, Dishara Wijewardana <dd...@gmail.com>
> >> wrote:
> >>
> >> > On Tue, Jun 25, 2013 at 4:52 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > (I might have errors in the CQL, Cassandra schema and the functions
> >> need
> >> > > proper escaping)
> >> > >
> >> > >
> >> > > Example 1:
> >> > > Zero depth tree wiht UUID as the rowid or key.
> >> > >
> >> > > URL /content/cassandra/pictures/13f58d5c95c70b6f
> >> > >
> >> > > then the column family is pictures and the URL -> ROWID function
> just
> >> > > results in the ROWID being 13f58d5c95c70b6f and
> >> > >
> >> > > String cql =
> mapOfCassandraMappers.get("pictures").getCQL("pictures",
> >> "
> >> > > 13f58d5c95c70b6f")
> >> > > System.err.println(cql);
> >> > >
> >> > > where
> >> > > String getCQL(String cf, String path) {
> >> > >     return "select * from "+cf+" where rowid = '"+path+"'";
> >> > > }
> >> > >
> >> > > yields:
> >> > > select * from pictures where rowid = '13f58d5c95c70b6f'
> >> > >
> >> > >
> >> > > 13f58d5c95c70b6f would be generated by the application when the user
> >> > > created a new picture (by upload).
> >> > >
> >> > >
> >> > >
> >> > > Example 2:
> >> > > User specified
> >> > >
> >> > > URL
> >> /content/cassandra/catalogue/capacitors/electrolytic/axial/16v/10uf
> >> > >
> >> > > String cql =
> >> mapOfCassandraMappers.get("catalogue").getCQL("catalogue", "
> >> > > capacitors/electrolytic/axial/16v/10uf")
> >> > > System.err.println(cql);
> >> > >
> >> > > where
> >> > > String getCQL(String cf, String path) {
> >> > >     MessageDigest md = MessageDigest.getInstance("SHA1");
> >> > >     String rowID = Base64.encode(md.finish(path.getBytes("UTF-8")));
> >> > >     return "select * from "+cf+" where rowid = '"+rowID+"'";
> >> > > }
> >> > >
> >> > > yields
> >> > >
> >> > > select * from pictures where rowid = 'NzdlZmU4OTZmNGM4MzMwYzZ'
> >> > >
> >> > > If you want to find the parent then
> >> > >
> >> > > mapOfCassandraMappers.get("catalogue").getCQL("catalogue", "
> >> > > capacitors/electrolytic/axial/16v")
> >> > >
> >> > > select * from pictures where rowid = 'ZGFzZGZzZnNkYWZzYWRmc2R'
> >> > >
> >> > > And if the parent is stored in the property parent then
> >> > >
> >> > > select * from pictures where parent = 'ZGFzZGZzZnNkYWZzYWRmc2R'
> >> > >
> >> > > will generate a list of children. (Not sure about performance)
> >> > >
> >> > >
> >> > > Example 3:
> >> > > User is allowed to enter the RowID directly (identical to Example 1
> >> > > URL
> >> > >
> >> > >
> >> >
> >>
> /content/cassandra/cannesfilmfestival/TomCruiseCassino-20130402112345-ieb.jpg
> >> > >
> >> > > where
> >> > > String getCQL(String cf, String path) {
> >> > >     return "select * from "+cf+" where rowid = '"+path+"'";
> >> > > }
> >> > >
> >> > > yields:
> >> > > select * from pictures where rowid = '
> >> > > TomCruiseCassino-20130402112345-ieb.jpg'
> >> > >
> >> >
> >> > This should be corrected as
> >> > select * from cannesfilmfestival where rowid = '
> >> > TomCruiseCassino-20130402112345-ieb.jpg'
> >> >
> >> >
> >> > >
> >> > >
> >> > > Does that make sense ?
> >> > >
> >> >
> >>
> >> Hi
> >>
> >>
> >> > Hi Ian,
> >> > I was in fact practicing some cql stuff in related to this response
> >> (with
> >> > cassandra cql terminal). This is quite a wonderful explanation for a
> new
> >> > comer like me. Thank you very much for the explanation again. Now it
> >> really
> >> > makes sense.
> >> >
> >>
> >> excellent!
> >>
> >>
> >> >
> >> > Other than the zero depth approach, I believe users will be more
> >> > comfortable with Example 2 approach.
> >> > Shall we go ahead with it ?
> >> >
> >>
> >>
> >> Yes, go for it. It will be interesting to see how hard it is to
> implement
> >> and how well (or not) it works. Remember, keep it as simple as possible
> >> and
> >> dont try and and cover every use case at the expense of getting a PoC
> >> working.
> >>
> > +1.
> >
> >>
> >> However, dont forget, Unit tests mocked with Mockito are a quicker way
> of
> >> getting to working code, than no unit test coverage.
> >>
> >> Best Regards
> >> Ian
> >>
> >>
> >>
> >>
> >> >
> >> >
> >> > > Ian
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On 25 June 2013 05:29, Dishara Wijewardana <ddwijewardana@gmail.com
> >
> >> > > wrote:
> >> > >
> >> > > > On Mon, Jun 24, 2013 at 4:02 AM, Ian Boston <ie...@tfd.co.uk>
> wrote:
> >> > > >
> >> > > > > Hi Dishara,
> >> > > > > Yes. 1 resource == 1 row.
> >> > > > > The columns within that row represent the properties of the
> >> resource.
> >> > > > > I suggest that you use standard property names where appropriate
> >> (eg
> >> > > > > sling:resourceType is the Resource.resourceType etc)
> >> > > > >
> >> > > > > The Resource itself should be adaptable to a generic
> >> > CassandraResource
> >> > > > > (which will probably implement Resource) which will have a map
> of
> >> > > > > properties containing all the columns of the cassandra row.
> >> (optimise
> >> > > > > later) A CassandraResource might look and feel like a
> Map<String,
> >> > > Object>
> >> > > > > or it might have a Map<String, Object> getProperties() method,
> or
> >> > > better
> >> > > > > still be adaptable to a Map. The essential think is dont hard
> code
> >> > the
> >> > > > > property names in the interface of CassandraResource for the
> >> moment.
> >> > ie
> >> > > > no
> >> > > > > getContentType() and no getMimeType(), as we dont really know
> >> what a
> >> > > > > CassandraResource will store.
> >> > > > >
> >> > > > > ResourceMetadata should be built from a subset of the
> >> > CassandraResource
> >> > > > > properties.
> >> > > > >
> >> > > > > You won't need to implement a ResourceResolver, only a
> >> > ResourceProvider
> >> > > > > (and Factory). I would use CQL in preference to other API
> methods.
> >> > > > >
> >> > > > > There is one thing that hasnt been mentioned, and thats the URL
> ->
> >> > > > > Cassandra Row mapping.
> >> > > > > There are several ways of doing this.
> >> > > > >
> >> > > > > eg:
> >> > > > > URL = /content/cassandra/<columnFamily>/<rowID>
> >> > > > >  Cassandra Column Family = columnFamily
> >> > > > >  Cassandra RowID = rowID
> >> > > > > or
> >> > > > > URL =
> >> /content/cassandra/<columnFamilySelector>/remainder/of/the/path
> >> > > > >  Cassandra  Cassandra Column Family =
> >> > > > > mapOfColumnFamilies.get(columnFamilySelector)
> >> > > > >  Cassandra  RowID = function(/remainder/of/the/path)
> >> > > > >
> >> > > > > or to take that one stage further
> >> > > > >
> >> > > > > public interface CassandraMapper {
> >> > > > >       String getCQL(String columnFamilySelector, String path);
> >> > > > > }
> >> > > > >
> >> > > > Hi Ian
> >> > > > Thank you for the detailed explanation.
> >> > > >
> >> > > > OK. +1 for this approach with the mentioned flexibility.But  I
> need
> >> a
> >> > > small
> >> > > > clarification. With this approach,
> >> > > >
> >> > > > URL = /content/cassandra/<columnFamilySelector>ROW-ID
> >> > > > ROW-ID - function(/remainder/of/the/path).
> >> > > > So you mean ROW-ID is something we have to programatically
> uniquely
> >> > > create
> >> > > >  right ? like a UUID.
> >> > > >
> >> > > > What is this "/remainder/of/the/path" means ? Can you give an
> >> example
> >> > > with
> >> > > > real values in the context of a user who want to obtain a resource
> >> from
> >> > > > cassandra.
> >> > > > This is just for my understanding.
> >> > > >
> >> > > >
> >> > > >
> >> > > > >
> >> > > > > URL =
> /content/cassandra/<columnFamilySelector>/<remainderOfPath>
> >> > > > >
> >> > > > >  String cqlQuery =
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> mapOfCassandraMappers.get(columnFamilySelector).getCQL(columnFamilySelector,
> >> > > > > remainderOfPath);
> >> > > > >
> >> > > > > Which would allow us provided one or more implementations of
> >> > > > > CassandraMapper to map between URL and CQL.
> >> > > > >
> >> > > > >
> >> > > > > HTH
> >> > > > > Ian
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > On 23 June 2013 19:29, Dishara Wijewardana <
> >> ddwijewardana@gmail.com>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Hi Ian,
> >> > > > > >
> >> > > > > > What is the data mapping should be between Cassandra and Sling
> >> > > > resource.
> >> > > > > I
> >> > > > > > mean is a Sling Resource maps to a Cassandra Column ? Or
> Column
> >> > > Family
> >> > > > ?
> >> > > > > >
> >> > > > > > Because to get this Cassandra and Sling story correct we need
> to
> >> > > > finalize
> >> > > > > > this.
> >> > > > > > For an example what we eventually returns is a Sling resource.
> >> > > > Everything
> >> > > > > > that needs to fill in to create Sling resource should be
> stored
> >> in
> >> > > > > > Cassandra.
> >> > > > > > In a Sling resource,
> >> > > > > >
> >> > > > > >    - Path - direct sling resource path
> >> > > > > >    - ResourceType - nt:cassandra
> >> > > > > >    - ResourceSuperType - ?
> >> > > > > >    - ResourceMetadata - we can create this on the fly with the
> >> data
> >> > > > from
> >> > > > > >    the corresponding column. At insertion, those need to be
> >> stored.
> >> > > > > > Following
> >> > > > > >    are the ones which I thought might be useful by default to
> be
> >> > set
> >> > > > for
> >> > > > > > any
> >> > > > > >    node. Please add if we need anything more.
> >> > > > > >       - ContentType
> >> > > > > >       - ContentLength
> >> > > > > >       - CreationTime
> >> > > > > >       - ModificationTime
> >> > > > > >    - ResourceResolver -  Do we need a resolver in this case ?
> >> > > > > >
> >> > > > > >
> >> > > > > >  So I believe in CQL context, one ROW should represent a Sling
> >> > > > resource.
> >> > > > > If
> >> > > > > > that is the case for ResourceMetadata we might need a separate
> >> > column
> >> > > > to
> >> > > > > > store it since it has multiple values. I am not sure whether
> we
> >> can
> >> > > do
> >> > > > it
> >> > > > > > with CQL, but it should be possible with hector APIs may be.
> >> > > > > >
> >> > > > > > Appreciate your thoughts ?
> >> > > > > >
> >> > > > > >
> >> > > > > > On Wed, Jun 19, 2013 at 1:19 AM, Dishara Wijewardana <
> >> > > > > > ddwijewardana@gmail.com> wrote:
> >> > > > > >
> >> > > > > > > Hi Ian,
> >> > > > > > > I am starting this thread to keep track on things related to
> >> the
> >> > > GSoC
> >> > > > > > > project related milestone status updates and related
> >> discussions.
> >> > > > > > > So the first task over view will be as follows as per GSoC
> >> > proposal
> >> > > > > > > provided.
> >> > > > > > >
> >> > > > > > > 1. Implementing a CassandraResourceProvider  to READ from
> >> > > Cassandra.
> >> > > > > > > Implementation Details [1]
> >> > > > > > >
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > [1] : Implementation Details:
> >> > > > > > >
> >> > > > > > >  1.A) Write a CassanrdaResourceProviderUtil  which is
> >> basically a
> >> > > > > > > cassendra client which will facilitate all cassandra related
> >> > > > operations
> >> > > > > > > required by other modules (CassandraResourceProvider and
> >> > > > > > > CassandraResourceResolver).
> >> > > > > > >
> >> > > > > > > 1.B) Implementation of  CassandraResourceProvider
> >> > > > > > >
> >> > > > > > > 1.C)  Implementation of CassandraResourceResolver
> >> > > > > > >
> >> > > > > > > 1.D) Implementation of CassandraResource
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > And I will start writing the CassanrdaResourceProviderUtil
> >> class
> >> > > > which
> >> > > > > > > will do basic add and get using hector API. Please provide
> any
> >> > > > feedback
> >> > > > > > > that will be useful to accomplish this task.
> >> > > > > > > So for this how does path mapping should be done. Because
> for
> >> > > > example,
> >> > > > > > the
> >> > > > > > > path of the cassendra node will not be same as the jcr node
> >> path.
> >> > > i.e
> >> > > > > > > provider will ask a node path /system/myapps/test/foo and
> >> where
> >> > > > should
> >> > > > > we
> >> > > > > > > return it from Cassandra. Aren't we have to first consider
> the
> >> > > WRITE
> >> > > > > > aspect
> >> > > > > > > to Cassandra ?
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > --
> >> > > > > > > Thanks
> >> > > > > > > /Dishara
> >> > > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > --
> >> > > > > > Thanks
> >> > > > > > /Dishara
> >> > > > > >
> >> > > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Thanks
> >> > > > /Dishara
> >> > > >
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > Thanks
> >> > /Dishara
> >> >
> >>
> >
> >
> >
> > --
> > Thanks
> > /Dishara
> >
>
>
>
> --
> Thanks
> /Dishara
>
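
For reference, the "Example 2" mapping quoted above (the column family is
taken from the first path segment after /content/cassandra/, and the row ID
is a hash of the remainder of the path) could be sketched roughly as below.
The class name HashedPathMapper, the single-argument getCQL(url) signature
and the column family name check are assumptions made for illustration. The
quoted pseudocode called md.finish() and Base64.encode(); this sketch uses
MessageDigest.digest() and java.util.Base64 (Java 8+) instead, with the
URL-safe alphabet so that the row ID cannot contain a quote character.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

/** Hypothetical mapper from a Sling URL to a CQL select statement. */
final class HashedPathMapper {
    private static final String ROOT = "/content/cassandra/";

    /** First path segment after ROOT, restricted to a conservative identifier pattern. */
    static String columnFamily(String url) {
        String rest = stripRoot(url);
        int slash = rest.indexOf('/');
        String cf = slash < 0 ? rest : rest.substring(0, slash);
        if (!cf.matches("[A-Za-z][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Invalid column family: " + cf);
        }
        return cf;
    }

    /** Everything after the column family segment, without a leading slash. */
    static String remainder(String url) {
        String rest = stripRoot(url);
        int slash = rest.indexOf('/');
        if (slash < 0 || slash == rest.length() - 1) {
            throw new IllegalArgumentException("No path after column family: " + url);
        }
        return rest.substring(slash + 1);
    }

    /** Row ID = URL-safe Base64 (unpadded) of the SHA-1 digest of the UTF-8 path. */
    static String rowId(String remainder) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(remainder.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    /** Builds the select statement for the row that backs the given URL. */
    static String getCQL(String url) {
        return "select * from " + columnFamily(url)
                + " where rowid = '" + rowId(remainder(url)) + "'";
    }

    private static String stripRoot(String url) {
        if (!url.startsWith(ROOT)) {
            throw new IllegalArgumentException("Not under " + ROOT + ": " + url);
        }
        return url.substring(ROOT.length());
    }
}
```

For the URL /content/cassandra/catalogue/capacitors/electrolytic/axial/16v/10uf
this selects from the catalogue column family. The parent's row ID can be
obtained by passing capacitors/electrolytic/axial/16v to rowId().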

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Dishara Wijewardana <dd...@gmail.com>.
Hi Ian,
I am having a problem with CQL.

For example:
        CqlQuery<String,String,Long> cqlQuery =
                new CqlQuery<String,String,Long>(keyspace,
                        new StringSerializer(), new StringSerializer(),
                        new LongSerializer());
        cqlQuery.setQuery("insert into mytable (KEY,password,gender,userid) "
                + "values (3,'pass1','male',34);");
        QueryResult<CqlRows<String,String,Long>> result = cqlQuery.execute();

This successfully inserts the row with the values pass1, male and 34 under
rowId=3.

But in the Sling scenario we have more columns, so we would need more
serializers for a query, as follows:
        CqlQuery<String,String,String,String> cqlQuery =
                new CqlQuery<String,String,String,String>(keyspace,
                        new StringSerializer(), new StringSerializer(),
                        new StringSerializer(), new StringSerializer());
        cqlQuery.setQuery("insert into mytable "
                + "(KEY,path,resourceType,resourceSuperType,metadata) values "
                + "(3,'/content/cassandra/foo/bar','nt:cassandra','nt:super','metadata');");
        QueryResult<CqlRows<String,String,Long>> result = cqlQuery.execute();

Here I am using the me.prettyprint.cassandra.model.CqlQuery class. Any idea
how to proceed with this?

Am I doing something wrong, or is this a limitation of the API I am using?


On Thu, Jun 27, 2013 at 7:41 AM, Dishara Wijewardana <
ddwijewardana@gmail.com> wrote:

>
>
> On Thu, Jun 27, 2013 at 4:26 AM, Ian Boston <ie...@tfd.co.uk> wrote:
>
>> On 27 June 2013 02:34, Dishara Wijewardana <dd...@gmail.com>
>> wrote:
>>
>> > On Tue, Jun 25, 2013 at 4:52 AM, Ian Boston <ie...@tfd.co.uk> wrote:
>> >
>> > > Hi,
>> > >
>> > > (I might have errors in the CQL, Cassandra schema and the functions
>> need
>> > > proper escaping)
>> > >
>> > >
>> > > Example 1:
>> > > Zero depth tree wiht UUID as the rowid or key.
>> > >
>> > > URL /content/cassandra/pictures/13f58d5c95c70b6f
>> > >
>> > > then the column family is pictures and the URL -> ROWID function just
>> > > results in the ROWID being 13f58d5c95c70b6f and
>> > >
>> > > String cql = mapOfCassandraMappers.get("pictures").getCQL("pictures",
>> "
>> > > 13f58d5c95c70b6f")
>> > > System.err.println(cql);
>> > >
>> > > where
>> > > String getCQL(String cf, String path) {
>> > >     return "select * from "+cf+" where rowid = '"+path+"'";
>> > > }
>> > >
>> > > yields:
>> > > select * from pictures where rowid = '13f58d5c95c70b6f'
>> > >
>> > >
>> > > 13f58d5c95c70b6f would be generated by the application when the user
>> > > created a new picture (by upload).
>> > >
>> > >
>> > >
>> > > Example 2:
>> > > User specified
>> > >
>> > > URL
>> /content/cassandra/catalogue/capacitors/electrolytic/axial/16v/10uf
>> > >
>> > > String cql =
>> mapOfCassandraMappers.get("catalogue").getCQL("catalogue", "
>> > > capacitors/electrolytic/axial/16v/10uf")
>> > > System.err.println(cql);
>> > >
>> > > where
>> > > String getCQL(String cf, String path) {
>> > >     MessageDigest md = MessageDigest.getInstance("SHA1");
>> > >     String rowID = Base64.encode(md.digest(path.getBytes("UTF-8")));
>> > >     return "select * from "+cf+" where rowid = '"+rowID+"'";
>> > > }
>> > >
>> > > yields
>> > >
>> > > select * from catalogue where rowid = 'NzdlZmU4OTZmNGM4MzMwYzZ'
>> > >
>> > > If you want to find the parent then
>> > >
>> > > mapOfCassandraMappers.get("catalogue").getCQL("catalogue",
>> > >     "capacitors/electrolytic/axial/16v")
>> > >
>> > > select * from catalogue where rowid = 'ZGFzZGZzZnNkYWZzYWRmc2R'
>> > >
>> > > And if the parent is stored in the property parent then
>> > >
>> > > select * from catalogue where parent = 'ZGFzZGZzZnNkYWZzYWRmc2R'
>> > >
>> > > will generate a list of children. (Not sure about performance)
>> > >
>> > >
>> > > Example 3:
>> > > User is allowed to enter the RowID directly (identical to Example 1
>> > > URL
>> > >
>> > >
>> >
>> /content/cassandra/cannesfilmfestival/TomCruiseCassino-20130402112345-ieb.jpg
>> > >
>> > > where
>> > > String getCQL(String cf, String path) {
>> > >     return "select * from "+cf+" where rowid = '"+path+"'";
>> > > }
>> > >
>> > > yields:
>> > > select * from pictures where rowid = '
>> > > TomCruiseCassino-20130402112345-ieb.jpg'
>> > >
>> >
>> > This should be corrected as
>> > select * from cannesfilmfestival where rowid = '
>> > TomCruiseCassino-20130402112345-ieb.jpg'
>> >
>> >
>> > >
>> > >
>> > > Does that make sense ?
>> > >
>> >
>>
>> Hi
>>
>>
>> > Hi Ian,
>> > I was in fact practicing some cql stuff in related to this response
>> (with
>> > cassandra cql terminal). This is quite a wonderful explanation for a new
>> > comer like me. Thank you very much for the explanation again. Now it
>> really
>> > makes sense.
>> >
>>
>> excellent!
>>
>>
>> >
>> > Other than the zero depth approach, I believe users will be more
>> > comfortable with Example 2 approach.
>> > Shall we go ahead with it ?
>> >
>>
>>
>> Yes, go for it. It will be interesting to see how hard it is to implement
>> and how well (or not) it works. Remember, keep it as simple as possible
>> and don't try to cover every use case at the expense of getting a PoC
>> working.
>>
> +1.
>
>>
>> However, don't forget: unit tests mocked with Mockito are a quicker way of
>> getting to working code than no unit test coverage.
>>
>> Best Regards
>> Ian
>>
>>
>>
>>
>> >
>> >
>> > > Ian
>> > >
>> > >
>> > >
>> > >
>> > > On 25 June 2013 05:29, Dishara Wijewardana <dd...@gmail.com>
>> > > wrote:
>> > >
>> > > > On Mon, Jun 24, 2013 at 4:02 AM, Ian Boston <ie...@tfd.co.uk> wrote:
>> > > >
>> > > > > Hi Dishara,
>> > > > > Yes. 1 resource == 1 row.
>> > > > > The columns within that row represent the properties of the
>> resource.
>> > > > > I suggest that you use standard property names where appropriate
>> (eg
>> > > > > sling:resourceType is the Resource.resourceType etc)
>> > > > >
>> > > > > The Resource itself should be adaptable to a generic
>> > CassandraResource
>> > > > > (which will probably implement Resource) which will have a map of
>> > > > > properties containing all the columns of the cassandra row.
>> (optimise
>> > > > > later) A CassandraResource might look and feel like a Map<String,
>> > > Object>
>> > > > > or it might have a Map<String, Object> getProperties() method, or
>> > > better
>> > > > > still be adaptable to a Map. The essential think is dont hard code
>> > the
>> > > > > property names in the interface of CassandraResource for the
>> moment.
>> > ie
>> > > > no
>> > > > > getContentType() and no getMimeType(), as we dont really know
>> what a
>> > > > > CassandraResource will store.
>> > > > >
>> > > > > ResourceMetadata should be built from a subset of the
>> > CassandraResource
>> > > > > properties.
>> > > > >
>> > > > > You won't need to implement a ResourceResolver, only a
>> > ResourceProvider
>> > > > > (and Factory). I would use CQL in preference to other API methods.
>> > > > >
>> > > > > There is one thing that hasnt been mentioned, and thats the URL ->
>> > > > > Cassandra Row mapping.
>> > > > > There are several ways of doing this.
>> > > > >
>> > > > > eg:
>> > > > > URL = /content/cassandra/<columnFamily>/<rowID>
>> > > > >  Cassandra Column Family = columnFamily
>> > > > >  Cassandra RowID = rowID
>> > > > > or
>> > > > > URL =
>> /content/cassandra/<columnFamilySelector>/remainder/of/the/path
>> > > > >  Cassandra  Cassandra Column Family =
>> > > > > mapOfColumnFamilies.get(columnFamilySelector)
>> > > > >  Cassandra  RowID = function(/remainder/of/the/path)
>> > > > >
>> > > > > or to take that one stage further
>> > > > >
>> > > > > public interface CassandraMapper {
>> > > > >       String getCQL(String columnFamilySelector, String path);
>> > > > > }
>> > > > >
>> > > > Hi Ian
>> > > > Thank you for the detailed explanation.
>> > > >
>> > > > OK. +1 for this approach with the mentioned flexibility.But  I need
>> a
>> > > small
>> > > > clarification. With this approach,
>> > > >
>> > > > URL = /content/cassandra/<columnFamilySelector>ROW-ID
>> > > > ROW-ID - function(/remainder/of/the/path).
>> > > > So you mean ROW-ID is something we have to programatically uniquely
>> > > create
>> > > >  right ? like a UUID.
>> > > >
>> > > > What is this "/remainder/of/the/path" means ? Can you give an
>> example
>> > > with
>> > > > real values in the context of a user who want to obtain a resource
>> from
>> > > > cassandra.
>> > > > This is just for my understanding.
>> > > >
>> > > >
>> > > >
>> > > > >
>> > > > > URL = /content/cassandra/<columnFamilySelector>/<remainderOfPath>
>> > > > >
>> > > > >  String cqlQuery =
>> > > > >
>> > > > >
>> > > >
>> > >
>> >
>> mapOfCassandraMappers.get(columnFamilySelector).getCQL(columnFamilySelector,
>> > > > > remainderOfPath);
>> > > > >
>> > > > > Which would allow us provided one or more implementations of
>> > > > > CassandraMapper to map between URL and CQL.
>> > > > >
>> > > > >
>> > > > > HTH
>> > > > > Ian
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > On 23 June 2013 19:29, Dishara Wijewardana <
>> ddwijewardana@gmail.com>
>> > > > > wrote:
>> > > > >
>> > > > > > Hi Ian,
>> > > > > >
>> > > > > > What is the data mapping should be between Cassandra and Sling
>> > > > resource.
>> > > > > I
>> > > > > > mean is a Sling Resource maps to a Cassandra Column ? Or Column
>> > > Family
>> > > > ?
>> > > > > >
>> > > > > > Because to get this Cassandra and Sling story correct we need to
>> > > > finalize
>> > > > > > this.
>> > > > > > For an example what we eventually returns is a Sling resource.
>> > > > Everything
>> > > > > > that needs to fill in to create Sling resource should be stored
>> in
>> > > > > > Cassandra.
>> > > > > > In a Sling resource,
>> > > > > >
>> > > > > >    - Path - direct sling resource path
>> > > > > >    - ResourceType - nt:cassandra
>> > > > > >    - ResourceSuperType - ?
>> > > > > >    - ResourceMetadata - we can create this on the fly with the
>> data
>> > > > from
>> > > > > >    the corresponding column. At insertion, those need to be
>> stored.
>> > > > > > Following
>> > > > > >    are the ones which I thought might be useful by default to be
>> > set
>> > > > for
>> > > > > > any
>> > > > > >    node. Please add if we need anything more.
>> > > > > >       - ContentType
>> > > > > >       - ContentLength
>> > > > > >       - CreationTime
>> > > > > >       - ModificationTime
>> > > > > >    - ResourceResolver -  Do we need a resolver in this case ?
>> > > > > >
>> > > > > >
>> > > > > >  So I believe in CQL context, one ROW should represent a Sling
>> > > > resource.
>> > > > > If
>> > > > > > that is the case for ResourceMetadata we might need a separate
>> > column
>> > > > to
>> > > > > > store it since it has multiple values. I am not sure whether we
>> can
>> > > do
>> > > > it
>> > > > > > with CQL, but it should be possible with hector APIs may be.
>> > > > > >
>> > > > > > Appreciate your thoughts ?
>> > > > > >
>> > > > > >
>> > > > > > On Wed, Jun 19, 2013 at 1:19 AM, Dishara Wijewardana <
>> > > > > > ddwijewardana@gmail.com> wrote:
>> > > > > >
>> > > > > > > Hi Ian,
>> > > > > > > I am starting this thread to keep track on things related to
>> the
>> > > GSoC
>> > > > > > > project related milestone status updates and related
>> discussions.
>> > > > > > > So the first task over view will be as follows as per GSoC
>> > proposal
>> > > > > > > provided.
>> > > > > > >
>> > > > > > > 1. Implementing a CassandraResourceProvider  to READ from
>> > > Cassandra.
>> > > > > > > Implementation Details [1]
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > [1] : Implementation Details:
>> > > > > > >
>> > > > > > >  1.A) Write a CassanrdaResourceProviderUtil  which is
>> basically a
>> > > > > > > cassendra client which will facilitate all cassandra related
>> > > > operations
>> > > > > > > required by other modules (CassandraResourceProvider and
>> > > > > > > CassandraResourceResolver).
>> > > > > > >
>> > > > > > > 1.B) Implementation of  CassandraResourceProvider
>> > > > > > >
>> > > > > > > 1.C)  Implementation of CassandraResourceResolver
>> > > > > > >
>> > > > > > > 1.D) Implementation of CassandraResource
>> > > > > > >
>> > > > > > >
>> > > > > > > And I will start writing the CassanrdaResourceProviderUtil
>> class
>> > > > which
>> > > > > > > will do basic add and get using hector API. Please provide any
>> > > > feedback
>> > > > > > > that will be useful to accomplish this task.
>> > > > > > > So for this how does path mapping should be done. Because for
>> > > > example,
>> > > > > > the
>> > > > > > > path of the cassendra node will not be same as the jcr node
>> path.
>> > > i.e
>> > > > > > > provider will ask a node path /system/myapps/test/foo and
>> where
>> > > > should
>> > > > > we
>> > > > > > > return it from Cassandra. Aren't we have to first consider the
>> > > WRITE
>> > > > > > aspect
>> > > > > > > to Cassandra ?
>> > > > > > >
>> > > > > > >
>> > > > > > > --
>> > > > > > > Thanks
>> > > > > > > /Dishara
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > Thanks
>> > > > > > /Dishara
>> > > > > >
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > Thanks
>> > > > /Dishara
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > Thanks
>> > /Dishara
>> >
>>
>
>
>
> --
> Thanks
> /Dishara
>



-- 
Thanks
/Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Dishara Wijewardana <dd...@gmail.com>.
On Thu, Jun 27, 2013 at 4:26 AM, Ian Boston <ie...@tfd.co.uk> wrote:

> On 27 June 2013 02:34, Dishara Wijewardana <dd...@gmail.com>
> wrote:
>
> > On Tue, Jun 25, 2013 at 4:52 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> >
> > > Hi,
> > >
> > > (I might have errors in the CQL, Cassandra schema and the functions
> need
> > > proper escaping)
> > >
> > >
> > > Example 1:
> > > Zero-depth tree with a UUID as the rowid or key.
> > >
> > > URL /content/cassandra/pictures/13f58d5c95c70b6f
> > >
> > > then the column family is pictures and the URL -> ROWID function just
> > > results in the ROWID being 13f58d5c95c70b6f and
> > >
> > > String cql = mapOfCassandraMappers.get("pictures").getCQL("pictures", "
> > > 13f58d5c95c70b6f")
> > > System.err.println(cql);
> > >
> > > where
> > > String getCQL(String cf, String path) {
> > >     return "select * from "+cf+" where rowid = '"+path+"'";
> > > }
> > >
> > > yields:
> > > select * from pictures where rowid = '13f58d5c95c70b6f'
> > >
> > >
> > > 13f58d5c95c70b6f would be generated by the application when the user
> > > created a new picture (by upload).
> > >
> > >
> > >
> > > Example 2:
> > > User specified
> > >
> > > URL /content/cassandra/catalogue/capacitors/electrolytic/axial/16v/10uf
> > >
> > > String cql =
> mapOfCassandraMappers.get("catalogue").getCQL("catalogue", "
> > > capacitors/electrolytic/axial/16v/10uf")
> > > System.err.println(cql);
> > >
> > > where
> > > String getCQL(String cf, String path) {
> > >     MessageDigest md = MessageDigest.getInstance("SHA1");
> > >     String rowID = Base64.encode(md.digest(path.getBytes("UTF-8")));
> > >     return "select * from "+cf+" where rowid = '"+rowID+"'";
> > > }
> > >
> > > yields
> > >
> > > select * from catalogue where rowid = 'NzdlZmU4OTZmNGM4MzMwYzZ'
> > >
> > > If you want to find the parent then
> > >
> > > mapOfCassandraMappers.get("catalogue").getCQL("catalogue",
> > >     "capacitors/electrolytic/axial/16v")
> > >
> > > select * from catalogue where rowid = 'ZGFzZGZzZnNkYWZzYWRmc2R'
> > >
> > > And if the parent is stored in the property parent then
> > >
> > > select * from catalogue where parent = 'ZGFzZGZzZnNkYWZzYWRmc2R'
> > >
> > > will generate a list of children. (Not sure about performance)
> > >
> > >
> > > Example 3:
> > > User is allowed to enter the RowID directly (identical to Example 1
> > > URL
> > >
> > >
> >
> /content/cassandra/cannesfilmfestival/TomCruiseCassino-20130402112345-ieb.jpg
> > >
> > > where
> > > String getCQL(String cf, String path) {
> > >     return "select * from "+cf+" where rowid = '"+path+"'";
> > > }
> > >
> > > yields:
> > > select * from pictures where rowid = '
> > > TomCruiseCassino-20130402112345-ieb.jpg'
> > >
> >
> > This should be corrected as
> > select * from cannesfilmfestival where rowid = '
> > TomCruiseCassino-20130402112345-ieb.jpg'
> >
> >
> > >
> > >
> > > Does that make sense ?
> > >
> >
>
> Hi
>
>
> > Hi Ian,
> > I was in fact practicing some cql stuff in related to this response (with
> > cassandra cql terminal). This is quite a wonderful explanation for a new
> > comer like me. Thank you very much for the explanation again. Now it
> really
> > makes sense.
> >
>
> excellent!
>
>
> >
> > Other than the zero depth approach, I believe users will be more
> > comfortable with Example 2 approach.
> > Shall we go ahead with it ?
> >
>
>
> Yes, go for it. It will be interesting to see how hard it is to implement
> and how well (or not) it works. Remember, keep it as simple as possible and
> don't try to cover every use case at the expense of getting a PoC
> working.
>
+1.

>
> However, don't forget: unit tests mocked with Mockito are a quicker way of
> getting to working code than no unit test coverage.
>
> Best Regards
> Ian
>
>
>
>
> >
> >
> > > Ian
> > >
> > >
> > >
> > >
> > > On 25 June 2013 05:29, Dishara Wijewardana <dd...@gmail.com>
> > > wrote:
> > >
> > > > On Mon, Jun 24, 2013 at 4:02 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> > > >
> > > > > Hi Dishara,
> > > > > Yes. 1 resource == 1 row.
> > > > > The columns within that row represent the properties of the
> resource.
> > > > > I suggest that you use standard property names where appropriate
> (eg
> > > > > sling:resourceType is the Resource.resourceType etc)
> > > > >
> > > > > The Resource itself should be adaptable to a generic
> > CassandraResource
> > > > > (which will probably implement Resource) which will have a map of
> > > > > properties containing all the columns of the cassandra row.
> (optimise
> > > > > later) A CassandraResource might look and feel like a Map<String,
> > > Object>
> > > > > or it might have a Map<String, Object> getProperties() method, or
> > > better
> > > > > still be adaptable to a Map. The essential think is dont hard code
> > the
> > > > > property names in the interface of CassandraResource for the
> moment.
> > ie
> > > > no
> > > > > getContentType() and no getMimeType(), as we dont really know what
> a
> > > > > CassandraResource will store.
> > > > >
> > > > > ResourceMetadata should be built from a subset of the
> > CassandraResource
> > > > > properties.
> > > > >
> > > > > You won't need to implement a ResourceResolver, only a
> > ResourceProvider
> > > > > (and Factory). I would use CQL in preference to other API methods.
> > > > >
> > > > > There is one thing that hasnt been mentioned, and thats the URL ->
> > > > > Cassandra Row mapping.
> > > > > There are several ways of doing this.
> > > > >
> > > > > eg:
> > > > > URL = /content/cassandra/<columnFamily>/<rowID>
> > > > >  Cassandra Column Family = columnFamily
> > > > >  Cassandra RowID = rowID
> > > > > or
> > > > > URL =
> /content/cassandra/<columnFamilySelector>/remainder/of/the/path
> > > > >  Cassandra  Cassandra Column Family =
> > > > > mapOfColumnFamilies.get(columnFamilySelector)
> > > > >  Cassandra  RowID = function(/remainder/of/the/path)
> > > > >
> > > > > or to take that one stage further
> > > > >
> > > > > public interface CassandraMapper {
> > > > >       String getCQL(String columnFamilySelector, String path);
> > > > > }
> > > > >
> > > > Hi Ian
> > > > Thank you for the detailed explanation.
> > > >
> > > > OK. +1 for this approach with the mentioned flexibility.But  I need a
> > > small
> > > > clarification. With this approach,
> > > >
> > > > URL = /content/cassandra/<columnFamilySelector>ROW-ID
> > > > ROW-ID - function(/remainder/of/the/path).
> > > > So you mean ROW-ID is something we have to programatically uniquely
> > > create
> > > >  right ? like a UUID.
> > > >
> > > > What is this "/remainder/of/the/path" means ? Can you give an example
> > > with
> > > > real values in the context of a user who want to obtain a resource
> from
> > > > cassandra.
> > > > This is just for my understanding.
> > > >
> > > >
> > > >
> > > > >
> > > > > URL = /content/cassandra/<columnFamilySelector>/<remainderOfPath>
> > > > >
> > > > >  String cqlQuery =
> > > > >
> > > > >
> > > >
> > >
> >
> mapOfCassandraMappers.get(columnFamilySelector).getCQL(columnFamilySelector,
> > > > > remainderOfPath);
> > > > >
> > > > > Which would allow us provided one or more implementations of
> > > > > CassandraMapper to map between URL and CQL.
> > > > >
> > > > >
> > > > > HTH
> > > > > Ian
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On 23 June 2013 19:29, Dishara Wijewardana <
> ddwijewardana@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Ian,
> > > > > >
> > > > > > What is the data mapping should be between Cassandra and Sling
> > > > resource.
> > > > > I
> > > > > > mean is a Sling Resource maps to a Cassandra Column ? Or Column
> > > Family
> > > > ?
> > > > > >
> > > > > > Because to get this Cassandra and Sling story correct we need to
> > > > finalize
> > > > > > this.
> > > > > > For an example what we eventually returns is a Sling resource.
> > > > Everything
> > > > > > that needs to fill in to create Sling resource should be stored
> in
> > > > > > Cassandra.
> > > > > > In a Sling resource,
> > > > > >
> > > > > >    - Path - direct sling resource path
> > > > > >    - ResourceType - nt:cassandra
> > > > > >    - ResourceSuperType - ?
> > > > > >    - ResourceMetadata - we can create this on the fly with the
> data
> > > > from
> > > > > >    the corresponding column. At insertion, those need to be
> stored.
> > > > > > Following
> > > > > >    are the ones which I thought might be useful by default to be
> > set
> > > > for
> > > > > > any
> > > > > >    node. Please add if we need anything more.
> > > > > >       - ContentType
> > > > > >       - ContentLength
> > > > > >       - CreationTime
> > > > > >       - ModificationTime
> > > > > >    - ResourceResolver -  Do we need a resolver in this case ?
> > > > > >
> > > > > >
> > > > > >  So I believe in CQL context, one ROW should represent a Sling
> > > > resource.
> > > > > If
> > > > > > that is the case for ResourceMetadata we might need a separate
> > column
> > > > to
> > > > > > store it since it has multiple values. I am not sure whether we
> can
> > > do
> > > > it
> > > > > > with CQL, but it should be possible with hector APIs may be.
> > > > > >
> > > > > > Appreciate your thoughts ?
> > > > > >
> > > > > >
> > > > > > On Wed, Jun 19, 2013 at 1:19 AM, Dishara Wijewardana <
> > > > > > ddwijewardana@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Ian,
> > > > > > > I am starting this thread to keep track on things related to
> the
> > > GSoC
> > > > > > > project related milestone status updates and related
> discussions.
> > > > > > > So the first task over view will be as follows as per GSoC
> > proposal
> > > > > > > provided.
> > > > > > >
> > > > > > > 1. Implementing a CassandraResourceProvider  to READ from
> > > Cassandra.
> > > > > > > Implementation Details [1]
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > [1] : Implementation Details:
> > > > > > >
> > > > > > >  1.A) Write a CassanrdaResourceProviderUtil  which is
> basically a
> > > > > > > cassendra client which will facilitate all cassandra related
> > > > operations
> > > > > > > required by other modules (CassandraResourceProvider and
> > > > > > > CassandraResourceResolver).
> > > > > > >
> > > > > > > 1.B) Implementation of  CassandraResourceProvider
> > > > > > >
> > > > > > > 1.C)  Implementation of CassandraResourceResolver
> > > > > > >
> > > > > > > 1.D) Implementation of CassandraResource
> > > > > > >
> > > > > > >
> > > > > > > And I will start writing the CassanrdaResourceProviderUtil
> class
> > > > which
> > > > > > > will do basic add and get using hector API. Please provide any
> > > > feedback
> > > > > > > that will be useful to accomplish this task.
> > > > > > > So for this how does path mapping should be done. Because for
> > > > example,
> > > > > > the
> > > > > > > path of the cassendra node will not be same as the jcr node
> path.
> > > i.e
> > > > > > > provider will ask a node path /system/myapps/test/foo and where
> > > > should
> > > > > we
> > > > > > > return it from Cassandra. Aren't we have to first consider the
> > > WRITE
> > > > > > aspect
> > > > > > > to Cassandra ?
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Thanks
> > > > > > > /Dishara
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Thanks
> > > > > > /Dishara
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Thanks
> > > > /Dishara
> > > >
> > >
> >
> >
> >
> > --
> > Thanks
> > /Dishara
> >
>



-- 
Thanks
/Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Ian Boston <ie...@tfd.co.uk>.
On 27 June 2013 02:34, Dishara Wijewardana <dd...@gmail.com> wrote:

> On Tue, Jun 25, 2013 at 4:52 AM, Ian Boston <ie...@tfd.co.uk> wrote:
>
> > Hi,
> >
> > (I might have errors in the CQL, Cassandra schema and the functions need
> > proper escaping)
> >
> >
> > Example 1:
> > Zero-depth tree with a UUID as the rowid or key.
> >
> > URL /content/cassandra/pictures/13f58d5c95c70b6f
> >
> > then the column family is pictures and the URL -> ROWID function just
> > results in the ROWID being 13f58d5c95c70b6f and
> >
> > String cql = mapOfCassandraMappers.get("pictures").getCQL("pictures", "
> > 13f58d5c95c70b6f")
> > System.err.println(cql);
> >
> > where
> > String getCQL(String cf, String path) {
> >     return "select * from "+cf+" where rowid = '"+path+"'";
> > }
> >
> > yields:
> > select * from pictures where rowid = '13f58d5c95c70b6f'
> >
> >
> > 13f58d5c95c70b6f would be generated by the application when the user
> > created a new picture (by upload).
> >
> >
> >
> > Example 2:
> > User specified
> >
> > URL /content/cassandra/catalogue/capacitors/electrolytic/axial/16v/10uf
> >
> > String cql = mapOfCassandraMappers.get("catalogue").getCQL("catalogue", "
> > capacitors/electrolytic/axial/16v/10uf")
> > System.err.println(cql);
> >
> > where
> > String getCQL(String cf, String path) {
> >     MessageDigest md = MessageDigest.getInstance("SHA1");
> >     String rowID = Base64.encode(md.digest(path.getBytes("UTF-8")));
> >     return "select * from "+cf+" where rowid = '"+rowID+"'";
> > }
> >
> > yields
> >
> > select * from catalogue where rowid = 'NzdlZmU4OTZmNGM4MzMwYzZ'
> >
> > If you want to find the parent then
> >
> > mapOfCassandraMappers.get("catalogue").getCQL("catalogue",
> >     "capacitors/electrolytic/axial/16v")
> >
> > select * from catalogue where rowid = 'ZGFzZGZzZnNkYWZzYWRmc2R'
> >
> > And if the parent is stored in the property parent then
> >
> > select * from catalogue where parent = 'ZGFzZGZzZnNkYWZzYWRmc2R'
> >
> > will generate a list of children. (Not sure about performance)
> >
> >
> > Example 3:
> > User is allowed to enter the RowID directly (identical to Example 1
> > URL
> >
> >
> /content/cassandra/cannesfilmfestival/TomCruiseCassino-20130402112345-ieb.jpg
> >
> > where
> > String getCQL(String cf, String path) {
> >     return "select * from "+cf+" where rowid = '"+path+"'";
> > }
> >
> > yields:
> > select * from pictures where rowid = '
> > TomCruiseCassino-20130402112345-ieb.jpg'
> >
>
> This should be corrected as
> select * from cannesfilmfestival where rowid = '
> TomCruiseCassino-20130402112345-ieb.jpg'
>
>
> >
> >
> > Does that make sense ?
> >
>

Hi


> Hi Ian,
> I was in fact practicing some cql stuff in related to this response (with
> cassandra cql terminal). This is quite a wonderful explanation for a new
> comer like me. Thank you very much for the explanation again. Now it really
> makes sense.
>

excellent!


>
> Other than the zero depth approach, I believe users will be more
> comfortable with Example 2 approach.
> Shall we go ahead with it ?
>


Yes, go for it. It will be interesting to see how hard it is to implement
and how well (or not) it works. Remember, keep it as simple as possible and
don't try to cover every use case at the expense of getting a PoC
working.
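
As a starting point, the Example 2 mapping could be sketched roughly as
follows. This is only an illustration under assumptions: it uses the JDK's
java.security.MessageDigest and java.util.Base64 (URL-safe, unpadded, so the
rowid never contains a quote character), and HashedPathMapperSketch, rowId,
parentPath and getChildrenCQL are hypothetical names. The column family name
is still concatenated unescaped.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical sketch of the "Example 2" mapping: rowid = Base64(SHA-1(path)).
final class HashedPathMapperSketch {
    private HashedPathMapperSketch() {}

    // Hash the user-visible path into a fixed-length, quote-free rowid.
    static String rowId(String path) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] hash = md.digest(path.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Everything before the last '/', or "" when there is no parent segment.
    static String parentPath(String path) {
        int slash = path.lastIndexOf('/');
        return slash <= 0 ? "" : path.substring(0, slash);
    }

    // CQL to fetch the row for one path.
    static String getCQL(String cf, String path) {
        return "select * from " + cf + " where rowid = '" + rowId(path) + "'";
    }

    // CQL to list children, assuming each row stores its parent's rowid
    // in a column named "parent".
    static String getChildrenCQL(String cf, String parentPath) {
        return "select * from " + cf + " where parent = '"
                + rowId(parentPath) + "'";
    }

    public static void main(String[] args) {
        String path = "capacitors/electrolytic/axial/16v/10uf";
        System.out.println(getCQL("catalogue", path));
        System.out.println(getChildrenCQL("catalogue", parentPath(path)));
    }
}
```

Listing children this way depends on the parent column being queryable,
which is where the performance question above would need measuring.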

However, don't forget: unit tests mocked with Mockito are a quicker way of
getting to working code than no unit test coverage.
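
To illustrate the kind of seam that makes this testable without a running
Cassandra, here is a rough sketch. It is an assumption-laden example, not the
project's design: CqlExecutor and PathResolverSketch are hypothetical names,
and hand-written fakes stand in for Mockito mocks so the sketch has no
dependencies. Only the CassandraMapper interface and the
/content/cassandra/<columnFamilySelector>/<remainderOfPath> layout come from
this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Mapper interface as proposed earlier in this thread.
interface CassandraMapper {
    String getCQL(String columnFamilySelector, String path);
}

// Hypothetical seam: runs a CQL string and returns one row as a property map.
// In a unit test this is the piece a mock or fake replaces.
interface CqlExecutor {
    Map<String, Object> queryRow(String cql);
}

// Hypothetical resolver: splits the URL, picks a mapper, runs the query.
final class PathResolverSketch {
    static final String ROOT = "/content/cassandra/";

    private final Map<String, CassandraMapper> mappers;
    private final CqlExecutor executor;

    PathResolverSketch(Map<String, CassandraMapper> mappers,
                       CqlExecutor executor) {
        this.mappers = new HashMap<>(mappers);
        this.executor = executor;
    }

    // Returns the row's properties, or null when the URL is not ours.
    Map<String, Object> resolve(String url) {
        if (!url.startsWith(ROOT)) {
            return null;
        }
        String rest = url.substring(ROOT.length());
        int slash = rest.indexOf('/');
        if (slash <= 0 || slash == rest.length() - 1) {
            return null; // no selector, or nothing after it
        }
        String selector = rest.substring(0, slash);
        String remainder = rest.substring(slash + 1);
        CassandraMapper mapper = mappers.get(selector);
        if (mapper == null) {
            return null; // unknown column family selector
        }
        return executor.queryRow(mapper.getCQL(selector, remainder));
    }
}
```

A test can then pass a lambda as the CqlExecutor, record the CQL it receives,
and assert on it, which exercises the URL to CQL mapping on its own.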

Best Regards
Ian




>
>
> > Ian
> >
> >
> >
> >
> > On 25 June 2013 05:29, Dishara Wijewardana <dd...@gmail.com>
> > wrote:
> >
> > > On Mon, Jun 24, 2013 at 4:02 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> > >
> > > > Hi Dishara,
> > > > Yes. 1 resource == 1 row.
> > > > The columns within that row represent the properties of the resource.
> > > > I suggest that you use standard property names where appropriate (eg
> > > > sling:resourceType is the Resource.resourceType etc)
> > > >
> > > > The Resource itself should be adaptable to a generic
> CassandraResource
> > > > (which will probably implement Resource) which will have a map of
> > > > properties containing all the columns of the cassandra row. (optimise
> > > > later) A CassandraResource might look and feel like a Map<String,
> > Object>
> > > > or it might have a Map<String, Object> getProperties() method, or
> > better
> > > > still be adaptable to a Map. The essential think is dont hard code
> the
> > > > property names in the interface of CassandraResource for the moment.
> ie
> > > no
> > > > getContentType() and no getMimeType(), as we dont really know what a
> > > > CassandraResource will store.
> > > >
> > > > ResourceMetadata should be built from a subset of the
> CassandraResource
> > > > properties.
> > > >
> > > > You won't need to implement a ResourceResolver, only a
> ResourceProvider
> > > > (and Factory). I would use CQL in preference to other API methods.
> > > >
> > > > There is one thing that hasnt been mentioned, and thats the URL ->
> > > > Cassandra Row mapping.
> > > > There are several ways of doing this.
> > > >
> > > > eg:
> > > > URL = /content/cassandra/<columnFamily>/<rowID>
> > > >  Cassandra Column Family = columnFamily
> > > >  Cassandra RowID = rowID
> > > > or
> > > > URL = /content/cassandra/<columnFamilySelector>/remainder/of/the/path
> > > >  Cassandra  Cassandra Column Family =
> > > > mapOfColumnFamilies.get(columnFamilySelector)
> > > >  Cassandra  RowID = function(/remainder/of/the/path)
> > > >
> > > > or to take that one stage further
> > > >
> > > > public interface CassandraMapper {
> > > >       String getCQL(String columnFamilySelector, String path);
> > > > }
> > > >
> > > Hi Ian
> > > Thank you for the detailed explanation.
> > >
> > > OK. +1 for this approach with the mentioned flexibility.But  I need a
> > small
> > > clarification. With this approach,
> > >
> > > URL = /content/cassandra/<columnFamilySelector>ROW-ID
> > > ROW-ID - function(/remainder/of/the/path).
> > > So you mean ROW-ID is something we have to programatically uniquely
> > create
> > >  right ? like a UUID.
> > >
> > > What is this "/remainder/of/the/path" means ? Can you give an example
> > with
> > > real values in the context of a user who want to obtain a resource from
> > > cassandra.
> > > This is just for my understanding.
> > >
> > >
> > >
> > > >
> > > > URL = /content/cassandra/<columnFamilySelector>/<remainderOfPath>
> > > >
> > > >  String cqlQuery =
> > > >
> > > >
> > >
> >
> mapOfCassandraMappers.get(columnFamilySelector).getCQL(columnFamilySelector,
> > > > remainderOfPath);
> > > >
> > > > Which would allow us provided one or more implementations of
> > > > CassandraMapper to map between URL and CQL.
> > > >
> > > >
> > > > HTH
> > > > Ian
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On 23 June 2013 19:29, Dishara Wijewardana <dd...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Ian,
> > > > >
> > > > > What is the data mapping should be between Cassandra and Sling
> > > resource.
> > > > I
> > > > > mean is a Sling Resource maps to a Cassandra Column ? Or Column
> > Family
> > > ?
> > > > >
> > > > > Because to get this Cassandra and Sling story correct we need to
> > > finalize
> > > > > this.
> > > > > For an example what we eventually returns is a Sling resource.
> > > Everything
> > > > > that needs to fill in to create Sling resource should be stored in
> > > > > Cassandra.
> > > > > In a Sling resource,
> > > > >
> > > > >    - Path - direct sling resource path
> > > > >    - ResourceType - nt:cassandra
> > > > >    - ResourceSuperType - ?
> > > > >    - ResourceMetadata - we can create this on the fly with the data
> > > from
> > > > >    the corresponding column. At insertion, those need to be stored.
> > > > > Following
> > > > >    are the ones which I thought might be useful by default to be
> set
> > > for
> > > > > any
> > > > >    node. Please add if we need anything more.
> > > > >       - ContentType
> > > > >       - ContentLength
> > > > >       - CreationTime
> > > > >       - ModificationTime
> > > > >    - ResourceResolver -  Do we need a resolver in this case ?
> > > > >
> > > > >
> > > > >  So I believe in CQL context, one ROW should represent a Sling
> > > resource.
> > > > If
> > > > > that is the case for ResourceMetadata we might need a separate
> column
> > > to
> > > > > store it since it has multiple values. I am not sure whether we can
> > do
> > > it
> > > > > with CQL, but it should be possible with hector APIs may be.
> > > > >
> > > > > Appreciate your thoughts ?
> > > > >
> > > > >
> > > > > On Wed, Jun 19, 2013 at 1:19 AM, Dishara Wijewardana <
> > > > > ddwijewardana@gmail.com> wrote:
> > > > >
> > > > > > Hi Ian,
> > > > > > I am starting this thread to keep track on things related to the
> > GSoC
> > > > > > project related milestone status updates and related discussions.
> > > > > > So the first task over view will be as follows as per GSoC
> proposal
> > > > > > provided.
> > > > > >
> > > > > > 1. Implementing a CassandraResourceProvider  to READ from
> > Cassandra.
> > > > > > Implementation Details [1]
> > > > > >
> > > > > >
> > > > > >
> > > > > > [1] : Implementation Details:
> > > > > >
> > > > > >  1.A) Write a CassanrdaResourceProviderUtil  which is basically a
> > > > > > cassendra client which will facilitate all cassandra related
> > > operations
> > > > > > required by other modules (CassandraResourceProvider and
> > > > > > CassandraResourceResolver).
> > > > > >
> > > > > > 1.B) Implementation of  CassandraResourceProvider
> > > > > >
> > > > > > 1.C)  Implementation of CassandraResourceResolver
> > > > > >
> > > > > > 1.D) Implementation of CassandraResource
> > > > > >
> > > > > >
> > > > > > And I will start writing the CassanrdaResourceProviderUtil class
> > > which
> > > > > > will do basic add and get using hector API. Please provide any
> > > feedback
> > > > > > that will be useful to accomplish this task.
> > > > > > So for this how does path mapping should be done. Because for
> > > example,
> > > > > the
> > > > > > path of the cassendra node will not be same as the jcr node path.
> > i.e
> > > > > > provider will ask a node path /system/myapps/test/foo and where
> > > should
> > > > we
> > > > > > return it from Cassandra. Aren't we have to first consider the
> > WRITE
> > > > > aspect
> > > > > > to Cassandra ?
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Thanks
> > > > > > /Dishara
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Thanks
> > > > > /Dishara
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Thanks
> > > /Dishara
> > >
> >
>
>
>
> --
> Thanks
> /Dishara
>

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Dishara Wijewardana <dd...@gmail.com>.
On Tue, Jun 25, 2013 at 4:52 AM, Ian Boston <ie...@tfd.co.uk> wrote:

> Hi,
>
> (I might have errors in the CQL, Cassandra schema and the functions need
> proper escaping)
>
>
> Example 1:
> Zero depth tree wiht UUID as the rowid or key.
>
> URL /content/cassandra/pictures/13f58d5c95c70b6f
>
> then the column family is pictures and the URL -> ROWID function just
> results in the ROWID being 13f58d5c95c70b6f and
>
> String cql = mapOfCassandraMappers.get("pictures").getCQL("pictures", "
> 13f58d5c95c70b6f")
> System.err.println(cql);
>
> where
> String getCQL(String cf, String path) {
>     return "select * from "+cf+" where rowid = '"+path+"'";
> }
>
> yields:
> select * from pictures where rowid = '13f58d5c95c70b6f'
>
>
> 13f58d5c95c70b6f would be generated by the application when the user
> created a new picture (by upload).
>
>
>
> Example 2:
> User specified
>
> URL /content/cassandra/catalogue/capacitors/electrolytic/axial/16v/10uf
>
> String cql = mapOfCassandraMappers.get("catalogue").getCQL("catalogue", "
> capacitors/electrolytic/axial/16v/10uf")
> System.err.println(cql);
>
> where
> String getCQL(String cf, String path) {
>     MessageDigest md = MessageDigest.getInstance("SHA1");
>     String rowID = Base64.encode(md.finish(path.getBytes("UTF-8")));
>     return "select * from "+cf+" where rowid = '"+rowID+"'";
> }
>
> yields
>
> select * from pictures where rowid = 'NzdlZmU4OTZmNGM4MzMwYzZ'
>
> If you want to find the parent then
>
> mapOfCassandraMappers.get("catalogue").getCQL("catalogue", "
> capacitors/electrolytic/axial/16v")
>
> select * from pictures where rowid = 'ZGFzZGZzZnNkYWZzYWRmc2R'
>
> And if the parent is stored in the property parent then
>
> select * from pictures where parent = 'ZGFzZGZzZnNkYWZzYWRmc2R'
>
> will generate a list of children. (Not sure about performance)
>
>
> Example 3:
> User is allowed to enter the RowID directly (identical to Example 1
> URL
>
> /content/cassandra/cannesfilmfestival/TomCruiseCassino-20130402112345-ieb.jpg
>
> where
> String getCQL(String cf, String path) {
>     return "select * from "+cf+" where rowid = '"+path+"'";
> }
>
> yields:
> select * from pictures where rowid = '
> TomCruiseCassino-20130402112345-ieb.jpg'
>

This should be corrected to:
select * from cannesfilmfestival where rowid = 'TomCruiseCassino-20130402112345-ieb.jpg'


>
>
> Does that make sense ?
>
Hi Ian,
I was in fact trying out some CQL related to this response in the
Cassandra CQL shell. This is a wonderful explanation for a newcomer like
me. Thank you very much again; it makes sense now.

Other than the zero-depth approach, I believe users will be more
comfortable with the Example 2 approach.
Shall we go ahead with it?


> Ian
>
>
>
>
> On 25 June 2013 05:29, Dishara Wijewardana <dd...@gmail.com>
> wrote:
>
> > On Mon, Jun 24, 2013 at 4:02 AM, Ian Boston <ie...@tfd.co.uk> wrote:
> >
> > > Hi Dishara,
> > > Yes. 1 resource == 1 row.
> > > The columns within that row represent the properties of the resource.
> > > I suggest that you use standard property names where appropriate (eg
> > > sling:resourceType is the Resource.resourceType etc)
> > >
> > > The Resource itself should be adaptable to a generic CassandraResource
> > > (which will probably implement Resource) which will have a map of
> > > properties containing all the columns of the cassandra row. (optimise
> > > later) A CassandraResource might look and feel like a Map<String,
> Object>
> > > or it might have a Map<String, Object> getProperties() method, or
> better
> > > still be adaptable to a Map. The essential think is dont hard code the
> > > property names in the interface of CassandraResource for the moment. ie
> > no
> > > getContentType() and no getMimeType(), as we dont really know what a
> > > CassandraResource will store.
> > >
> > > ResourceMetadata should be built from a subset of the CassandraResource
> > > properties.
> > >
> > > You won't need to implement a ResourceResolver, only a ResourceProvider
> > > (and Factory). I would use CQL in preference to other API methods.
> > >
> > > There is one thing that hasnt been mentioned, and thats the URL ->
> > > Cassandra Row mapping.
> > > There are several ways of doing this.
> > >
> > > eg:
> > > URL = /content/cassandra/<columnFamily>/<rowID>
> > >  Cassandra Column Family = columnFamily
> > >  Cassandra RowID = rowID
> > > or
> > > URL = /content/cassandra/<columnFamilySelector>/remainder/of/the/path
> > >  Cassandra  Cassandra Column Family =
> > > mapOfColumnFamilies.get(columnFamilySelector)
> > >  Cassandra  RowID = function(/remainder/of/the/path)
> > >
> > > or to take that one stage further
> > >
> > > public interface CassandraMapper {
> > >       String getCQL(String columnFamilySelector, String path);
> > > }
> > >
> > Hi Ian
> > Thank you for the detailed explanation.
> >
> > OK. +1 for this approach with the mentioned flexibility.But  I need a
> small
> > clarification. With this approach,
> >
> > URL = /content/cassandra/<columnFamilySelector>ROW-ID
> > ROW-ID - function(/remainder/of/the/path).
> > So you mean ROW-ID is something we have to programatically uniquely
> create
> >  right ? like a UUID.
> >
> > What is this "/remainder/of/the/path" means ? Can you give an example
> with
> > real values in the context of a user who want to obtain a resource from
> > cassandra.
> > This is just for my understanding.
> >
> >
> >
> > >
> > > URL = /content/cassandra/<columnFamilySelector>/<remainderOfPath>
> > >
> > >  String cqlQuery =
> > >
> > >
> >
> mapOfCassandraMappers.get(columnFamilySelector).getCQL(columnFamilySelector,
> > > remainderOfPath);
> > >
> > > Which would allow us provided one or more implementations of
> > > CassandraMapper to map between URL and CQL.
> > >
> > >
> > > HTH
> > > Ian
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On 23 June 2013 19:29, Dishara Wijewardana <dd...@gmail.com>
> > > wrote:
> > >
> > > > Hi Ian,
> > > >
> > > > What is the data mapping should be between Cassandra and Sling
> > resource.
> > > I
> > > > mean is a Sling Resource maps to a Cassandra Column ? Or Column
> Family
> > ?
> > > >
> > > > Because to get this Cassandra and Sling story correct we need to
> > finalize
> > > > this.
> > > > For an example what we eventually returns is a Sling resource.
> > Everything
> > > > that needs to fill in to create Sling resource should be stored in
> > > > Cassandra.
> > > > In a Sling resource,
> > > >
> > > >    - Path - direct sling resource path
> > > >    - ResourceType - nt:cassandra
> > > >    - ResourceSuperType - ?
> > > >    - ResourceMetadata - we can create this on the fly with the data
> > from
> > > >    the corresponding column. At insertion, those need to be stored.
> > > > Following
> > > >    are the ones which I thought might be useful by default to be set
> > for
> > > > any
> > > >    node. Please add if we need anything more.
> > > >       - ContentType
> > > >       - ContentLength
> > > >       - CreationTime
> > > >       - ModificationTime
> > > >    - ResourceResolver -  Do we need a resolver in this case ?
> > > >
> > > >
> > > >  So I believe in CQL context, one ROW should represent a Sling
> > resource.
> > > If
> > > > that is the case for ResourceMetadata we might need a separate column
> > to
> > > > store it since it has multiple values. I am not sure whether we can
> do
> > it
> > > > with CQL, but it should be possible with hector APIs may be.
> > > >
> > > > Appreciate your thoughts ?
> > > >
> > > >
> > > > On Wed, Jun 19, 2013 at 1:19 AM, Dishara Wijewardana <
> > > > ddwijewardana@gmail.com> wrote:
> > > >
> > > > > Hi Ian,
> > > > > I am starting this thread to keep track on things related to the
> GSoC
> > > > > project related milestone status updates and related discussions.
> > > > > So the first task over view will be as follows as per GSoC proposal
> > > > > provided.
> > > > >
> > > > > 1. Implementing a CassandraResourceProvider  to READ from
> Cassandra.
> > > > > Implementation Details [1]
> > > > >
> > > > >
> > > > >
> > > > > [1] : Implementation Details:
> > > > >
> > > > >  1.A) Write a CassanrdaResourceProviderUtil  which is basically a
> > > > > cassendra client which will facilitate all cassandra related
> > operations
> > > > > required by other modules (CassandraResourceProvider and
> > > > > CassandraResourceResolver).
> > > > >
> > > > > 1.B) Implementation of  CassandraResourceProvider
> > > > >
> > > > > 1.C)  Implementation of CassandraResourceResolver
> > > > >
> > > > > 1.D) Implementation of CassandraResource
> > > > >
> > > > >
> > > > > And I will start writing the CassanrdaResourceProviderUtil class
> > which
> > > > > will do basic add and get using hector API. Please provide any
> > feedback
> > > > > that will be useful to accomplish this task.
> > > > > So for this how does path mapping should be done. Because for
> > example,
> > > > the
> > > > > path of the cassendra node will not be same as the jcr node path.
> i.e
> > > > > provider will ask a node path /system/myapps/test/foo and where
> > should
> > > we
> > > > > return it from Cassandra. Aren't we have to first consider the
> WRITE
> > > > aspect
> > > > > to Cassandra ?
> > > > >
> > > > >
> > > > > --
> > > > > Thanks
> > > > > /Dishara
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Thanks
> > > > /Dishara
> > > >
> > >
> >
> >
> >
> > --
> > Thanks
> > /Dishara
> >
>



-- 
Thanks
/Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Ian Boston <ie...@tfd.co.uk>.
Hi,

(There may be errors in the CQL and the Cassandra schema, and the functions
need proper escaping.)


Example 1:
Zero-depth tree with a UUID as the rowid or key.

URL /content/cassandra/pictures/13f58d5c95c70b6f

Here the column family is pictures, and the URL -> ROWID function simply
returns 13f58d5c95c70b6f as the ROWID, so

String cql = mapOfCassandraMappers.get("pictures")
        .getCQL("pictures", "13f58d5c95c70b6f");
System.err.println(cql);

where
String getCQL(String cf, String path) {
    return "select * from "+cf+" where rowid = '"+path+"'";
}

yields:
select * from pictures where rowid = '13f58d5c95c70b6f'


13f58d5c95c70b6f would be generated by the application when the user
created a new picture (by upload).



Example 2:
User-specified path, hashed to produce the rowid.

URL /content/cassandra/catalogue/capacitors/electrolytic/axial/16v/10uf

String cql = mapOfCassandraMappers.get("catalogue")
        .getCQL("catalogue", "capacitors/electrolytic/axial/16v/10uf");
System.err.println(cql);

where
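String getCQL(String cf, String path) throws Exception {
    // Hash the path so that the row ID has a fixed length.
    MessageDigest md = MessageDigest.getInstance("SHA-1");
    byte[] hash = md.digest(path.getBytes("UTF-8"));
    // java.util.Base64 (Java 8+); any Base64 encoder would do here.
    String rowID = Base64.getEncoder().encodeToString(hash);
    return "select * from "+cf+" where rowid = '"+rowID+"'";
}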

yields

select * from catalogue where rowid = 'NzdlZmU4OTZmNGM4MzMwYzZ'

If you want to find the parent, then

mapOfCassandraMappers.get("catalogue")
        .getCQL("catalogue", "capacitors/electrolytic/axial/16v")

yields

select * from catalogue where rowid = 'ZGFzZGZzZnNkYWZzYWRmc2R'

And if the parent's rowid is stored in a property named parent, then

select * from catalogue where parent = 'ZGFzZGZzZnNkYWZzYWRmc2R'

will generate a list of children. (Not sure about performance.)
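
As a rough illustration of Example 2, here is a minimal, self-contained
sketch of a path-hashing mapper. It is not the project's implementation:
the class name HashedPathMapper, the helper methods, the use of
java.util.Base64 (Java 8+) and the identifier check are all assumptions
made for this example. Only the CassandraMapper interface, the "rowid"
column and the "parent" property come from the discussion above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Interface as proposed in this thread.
interface CassandraMapper {
    String getCQL(String columnFamilySelector, String path);
}

// Hypothetical mapper: row ID = URL-safe Base64 of SHA-1(path).
final class HashedPathMapper implements CassandraMapper {

    // Derives a fixed-length row ID from an arbitrary-depth path.
    static String rowId(String path) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] hash = md.digest(path.getBytes(StandardCharsets.UTF_8));
            // The URL-safe alphabet contains no quote characters,
            // so the ID can be embedded in a CQL string literal.
            return Base64.getUrlEncoder().withoutPadding().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Returns the path with its last segment removed ("" at the top level).
    static String parentPath(String path) {
        int slash = path.lastIndexOf('/');
        return slash < 0 ? "" : path.substring(0, slash);
    }

    // Very small guard against injection through the column family name.
    private static String checkName(String cf) {
        if (!cf.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Invalid column family: " + cf);
        }
        return cf;
    }

    @Override
    public String getCQL(String columnFamilySelector, String path) {
        return "select * from " + checkName(columnFamilySelector)
                + " where rowid = '" + rowId(path) + "'";
    }

    // Lists children, assuming each row stores its parent's row ID
    // in a column named "parent".
    String getChildrenCQL(String columnFamilySelector, String path) {
        return "select * from " + checkName(columnFamilySelector)
                + " where parent = '" + rowId(path) + "'";
    }
}
```

Hashing keeps row IDs at a fixed length regardless of path depth, at the
cost of not being able to recover the path from the ID, which is why the
parent link has to be stored explicitly. Cassandra generally needs a
secondary index on a non-key column such as parent before it can be used
in a where clause.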


Example 3:
User is allowed to enter the RowID directly (identical to Example 1).

URL /content/cassandra/cannesfilmfestival/TomCruiseCassino-20130402112345-ieb.jpg

where
String getCQL(String cf, String path) {
    return "select * from "+cf+" where rowid = '"+path+"'";
}

yields:
select * from pictures where rowid = 'TomCruiseCassino-20130402112345-ieb.jpg'


Does that make sense?
Ian




On 25 June 2013 05:29, Dishara Wijewardana <dd...@gmail.com> wrote:

> On Mon, Jun 24, 2013 at 4:02 AM, Ian Boston <ie...@tfd.co.uk> wrote:
>
> > Hi Dishara,
> > Yes. 1 resource == 1 row.
> > The columns within that row represent the properties of the resource.
> > I suggest that you use standard property names where appropriate (eg
> > sling:resourceType is the Resource.resourceType etc)
> >
> > The Resource itself should be adaptable to a generic CassandraResource
> > (which will probably implement Resource) which will have a map of
> > properties containing all the columns of the cassandra row. (optimise
> > later) A CassandraResource might look and feel like a Map<String, Object>
> > or it might have a Map<String, Object> getProperties() method, or better
> > still be adaptable to a Map. The essential think is dont hard code the
> > property names in the interface of CassandraResource for the moment. ie
> no
> > getContentType() and no getMimeType(), as we dont really know what a
> > CassandraResource will store.
> >
> > ResourceMetadata should be built from a subset of the CassandraResource
> > properties.
> >
> > You won't need to implement a ResourceResolver, only a ResourceProvider
> > (and Factory). I would use CQL in preference to other API methods.
> >
> > There is one thing that hasnt been mentioned, and thats the URL ->
> > Cassandra Row mapping.
> > There are several ways of doing this.
> >
> > eg:
> > URL = /content/cassandra/<columnFamily>/<rowID>
> >  Cassandra Column Family = columnFamily
> >  Cassandra RowID = rowID
> > or
> > URL = /content/cassandra/<columnFamilySelector>/remainder/of/the/path
> >  Cassandra  Cassandra Column Family =
> > mapOfColumnFamilies.get(columnFamilySelector)
> >  Cassandra  RowID = function(/remainder/of/the/path)
> >
> > or to take that one stage further
> >
> > public interface CassandraMapper {
> >       String getCQL(String columnFamilySelector, String path);
> > }
> >
> Hi Ian
> Thank you for the detailed explanation.
>
> OK. +1 for this approach with the mentioned flexibility.But  I need a small
> clarification. With this approach,
>
> URL = /content/cassandra/<columnFamilySelector>ROW-ID
> ROW-ID - function(/remainder/of/the/path).
> So you mean ROW-ID is something we have to programatically uniquely create
>  right ? like a UUID.
>
> What is this "/remainder/of/the/path" means ? Can you give an example with
> real values in the context of a user who want to obtain a resource from
> cassandra.
> This is just for my understanding.
>
>
>
> >
> > URL = /content/cassandra/<columnFamilySelector>/<remainderOfPath>
> >
> >  String cqlQuery =
> >
> >
> mapOfCassandraMappers.get(columnFamilySelector).getCQL(columnFamilySelector,
> > remainderOfPath);
> >
> > Which would allow us provided one or more implementations of
> > CassandraMapper to map between URL and CQL.
> >
> >
> > HTH
> > Ian
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > On 23 June 2013 19:29, Dishara Wijewardana <dd...@gmail.com>
> > wrote:
> >
> > > Hi Ian,
> > >
> > > What is the data mapping should be between Cassandra and Sling
> resource.
> > I
> > > mean is a Sling Resource maps to a Cassandra Column ? Or Column Family
> ?
> > >
> > > Because to get this Cassandra and Sling story correct we need to
> finalize
> > > this.
> > > For an example what we eventually returns is a Sling resource.
> Everything
> > > that needs to fill in to create Sling resource should be stored in
> > > Cassandra.
> > > In a Sling resource,
> > >
> > >    - Path - direct sling resource path
> > >    - ResourceType - nt:cassandra
> > >    - ResourceSuperType - ?
> > >    - ResourceMetadata - we can create this on the fly with the data
> from
> > >    the corresponding column. At insertion, those need to be stored.
> > > Following
> > >    are the ones which I thought might be useful by default to be set
> for
> > > any
> > >    node. Please add if we need anything more.
> > >       - ContentType
> > >       - ContentLength
> > >       - CreationTime
> > >       - ModificationTime
> > >    - ResourceResolver -  Do we need a resolver in this case ?
> > >
> > >
> > >  So I believe in CQL context, one ROW should represent a Sling
> resource.
> > If
> > > that is the case for ResourceMetadata we might need a separate column
> to
> > > store it since it has multiple values. I am not sure whether we can do
> it
> > > with CQL, but it should be possible with hector APIs may be.
> > >
> > > Appreciate your thoughts ?
> > >
> > >
> > > On Wed, Jun 19, 2013 at 1:19 AM, Dishara Wijewardana <
> > > ddwijewardana@gmail.com> wrote:
> > >
> > > > Hi Ian,
> > > > I am starting this thread to keep track on things related to the GSoC
> > > > project related milestone status updates and related discussions.
> > > > So the first task over view will be as follows as per GSoC proposal
> > > > provided.
> > > >
> > > > 1. Implementing a CassandraResourceProvider  to READ from Cassandra.
> > > > Implementation Details [1]
> > > >
> > > >
> > > >
> > > > [1] : Implementation Details:
> > > >
> > > >  1.A) Write a CassanrdaResourceProviderUtil  which is basically a
> > > > cassendra client which will facilitate all cassandra related
> operations
> > > > required by other modules (CassandraResourceProvider and
> > > > CassandraResourceResolver).
> > > >
> > > > 1.B) Implementation of  CassandraResourceProvider
> > > >
> > > > 1.C)  Implementation of CassandraResourceResolver
> > > >
> > > > 1.D) Implementation of CassandraResource
> > > >
> > > >
> > > > And I will start writing the CassanrdaResourceProviderUtil class
> which
> > > > will do basic add and get using hector API. Please provide any
> feedback
> > > > that will be useful to accomplish this task.
> > > > So for this how does path mapping should be done. Because for
> example,
> > > the
> > > > path of the cassendra node will not be same as the jcr node path. i.e
> > > > provider will ask a node path /system/myapps/test/foo and where
> should
> > we
> > > > return it from Cassandra. Aren't we have to first consider the WRITE
> > > aspect
> > > > to Cassandra ?
> > > >
> > > >
> > > > --
> > > > Thanks
> > > > /Dishara
> > > >
> > >
> > >
> > >
> > > --
> > > Thanks
> > > /Dishara
> > >
> >
>
>
>
> --
> Thanks
> /Dishara
>

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Dishara Wijewardana <dd...@gmail.com>.
On Mon, Jun 24, 2013 at 4:02 AM, Ian Boston <ie...@tfd.co.uk> wrote:

> Hi Dishara,
> Yes. 1 resource == 1 row.
> The columns within that row represent the properties of the resource.
> I suggest that you use standard property names where appropriate (eg
> sling:resourceType is the Resource.resourceType etc)
>
> The Resource itself should be adaptable to a generic CassandraResource
> (which will probably implement Resource) which will have a map of
> properties containing all the columns of the cassandra row. (optimise
> later) A CassandraResource might look and feel like a Map<String, Object>
> or it might have a Map<String, Object> getProperties() method, or better
> still be adaptable to a Map. The essential think is dont hard code the
> property names in the interface of CassandraResource for the moment. ie no
> getContentType() and no getMimeType(), as we dont really know what a
> CassandraResource will store.
>
> ResourceMetadata should be built from a subset of the CassandraResource
> properties.
>
> You won't need to implement a ResourceResolver, only a ResourceProvider
> (and Factory). I would use CQL in preference to other API methods.
>
> There is one thing that hasnt been mentioned, and thats the URL ->
> Cassandra Row mapping.
> There are several ways of doing this.
>
> eg:
> URL = /content/cassandra/<columnFamily>/<rowID>
>  Cassandra Column Family = columnFamily
>  Cassandra RowID = rowID
> or
> URL = /content/cassandra/<columnFamilySelector>/remainder/of/the/path
>  Cassandra  Cassandra Column Family =
> mapOfColumnFamilies.get(columnFamilySelector)
>  Cassandra  RowID = function(/remainder/of/the/path)
>
> or to take that one stage further
>
> public interface CassandraMapper {
>       String getCQL(String columnFamilySelector, String path);
> }
>
Hi Ian,
Thank you for the detailed explanation.

OK, +1 for this approach and the flexibility it gives. But I need a small
clarification. With this approach,

URL = /content/cassandra/<columnFamilySelector>/ROW-ID
ROW-ID = function(/remainder/of/the/path)

So you mean the ROW-ID is something we have to generate programmatically
and uniquely, like a UUID, right?

What does "/remainder/of/the/path" mean? Can you give an example with
real values, in the context of a user who wants to obtain a resource from
Cassandra? This is just for my understanding.



>
> URL = /content/cassandra/<columnFamilySelector>/<remainderOfPath>
>
>  String cqlQuery =
>
> mapOfCassandraMappers.get(columnFamilySelector).getCQL(columnFamilySelector,
> remainderOfPath);
>
> Which would allow us provided one or more implementations of
> CassandraMapper to map between URL and CQL.
>
>
> HTH
> Ian
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On 23 June 2013 19:29, Dishara Wijewardana <dd...@gmail.com>
> wrote:
>
> > Hi Ian,
> >
> > What is the data mapping should be between Cassandra and Sling resource.
> I
> > mean is a Sling Resource maps to a Cassandra Column ? Or Column Family ?
> >
> > Because to get this Cassandra and Sling story correct we need to finalize
> > this.
> > For an example what we eventually returns is a Sling resource. Everything
> > that needs to fill in to create Sling resource should be stored in
> > Cassandra.
> > In a Sling resource,
> >
> >    - Path - direct sling resource path
> >    - ResourceType - nt:cassandra
> >    - ResourceSuperType - ?
> >    - ResourceMetadata - we can create this on the fly with the data from
> >    the corresponding column. At insertion, those need to be stored.
> > Following
> >    are the ones which I thought might be useful by default to be set for
> > any
> >    node. Please add if we need anything more.
> >       - ContentType
> >       - ContentLength
> >       - CreationTime
> >       - ModificationTime
> >    - ResourceResolver -  Do we need a resolver in this case ?
> >
> >
> >  So I believe in CQL context, one ROW should represent a Sling resource.
> If
> > that is the case for ResourceMetadata we might need a separate column to
> > store it since it has multiple values. I am not sure whether we can do it
> > with CQL, but it should be possible with hector APIs may be.
> >
> > Appreciate your thoughts ?
> >
> >
> > On Wed, Jun 19, 2013 at 1:19 AM, Dishara Wijewardana <
> > ddwijewardana@gmail.com> wrote:
> >
> > > Hi Ian,
> > > I am starting this thread to keep track on things related to the GSoC
> > > project related milestone status updates and related discussions.
> > > So the first task over view will be as follows as per GSoC proposal
> > > provided.
> > >
> > > 1. Implementing a CassandraResourceProvider  to READ from Cassandra.
> > > Implementation Details [1]
> > >
> > >
> > >
> > > [1] : Implementation Details:
> > >
> > >  1.A) Write a CassanrdaResourceProviderUtil  which is basically a
> > > cassendra client which will facilitate all cassandra related operations
> > > required by other modules (CassandraResourceProvider and
> > > CassandraResourceResolver).
> > >
> > > 1.B) Implementation of  CassandraResourceProvider
> > >
> > > 1.C)  Implementation of CassandraResourceResolver
> > >
> > > 1.D) Implementation of CassandraResource
> > >
> > >
> > > And I will start writing the CassanrdaResourceProviderUtil class which
> > > will do basic add and get using hector API. Please provide any feedback
> > > that will be useful to accomplish this task.
> > > So for this how does path mapping should be done. Because for example,
> > the
> > > path of the cassendra node will not be same as the jcr node path. i.e
> > > provider will ask a node path /system/myapps/test/foo and where should
> we
> > > return it from Cassandra. Aren't we have to first consider the WRITE
> > aspect
> > > to Cassandra ?
> > >
> > >
> > > --
> > > Thanks
> > > /Dishara
> > >
> >
> >
> >
> > --
> > Thanks
> > /Dishara
> >
>



-- 
Thanks
/Dishara

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Ian Boston <ie...@tfd.co.uk>.
Hi Dishara,
Yes. 1 resource == 1 row.
The columns within that row represent the properties of the resource.
I suggest that you use standard property names where appropriate (e.g.
sling:resourceType for Resource.resourceType, etc.).

The Resource itself should be adaptable to a generic CassandraResource
(which will probably implement Resource), which will have a map of
properties containing all the columns of the Cassandra row (optimise
later). A CassandraResource might look and feel like a Map<String, Object>,
or it might have a Map<String, Object> getProperties() method, or better
still be adaptable to a Map. The essential thing is: don't hard-code the
property names in the interface of CassandraResource for the moment, i.e. no
getContentType() and no getMimeType(), as we don't really know what a
CassandraResource will store.

ResourceMetadata should be built from a subset of the CassandraResource
properties.
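
To make the "one row, one generic property map" idea concrete, here is a
minimal sketch. It deliberately does not implement Sling's Resource
interface so that it stays self-contained; the class name
CassandraRowResource and the method metadataSubset are hypothetical, and
only the sling:resourceType property name is taken from the text above.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for one Cassandra row exposed as a resource.
final class CassandraRowResource {
    private final String path;
    private final Map<String, Object> properties;

    // row: column name -> column value, exactly as read from Cassandra.
    CassandraRowResource(String path, Map<String, Object> row) {
        this.path = path;
        this.properties = Collections.unmodifiableMap(new LinkedHashMap<>(row));
    }

    String getPath() {
        return path;
    }

    // All columns of the row; no property names are hard-coded in the API.
    Map<String, Object> getProperties() {
        return properties;
    }

    // The resource type is simply read from the standard property name.
    String getResourceType() {
        Object type = properties.get("sling:resourceType");
        return type == null ? null : type.toString();
    }

    // Builds metadata from a caller-chosen subset of the row's columns;
    // keys that are absent from the row are skipped.
    Map<String, Object> metadataSubset(String... keys) {
        Map<String, Object> subset = new LinkedHashMap<>();
        for (String key : keys) {
            if (properties.containsKey(key)) {
                subset.put(key, properties.get(key));
            }
        }
        return subset;
    }
}
```

Because the metadata keys are passed in rather than baked into the class,
the set of columns a row stores can change without touching this type.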

You won't need to implement a ResourceResolver, only a ResourceProvider
(and Factory). I would use CQL in preference to other API methods.

There is one thing that hasn't been mentioned, and that's the URL ->
Cassandra row mapping.
There are several ways of doing this.

e.g.:
URL = /content/cassandra/<columnFamily>/<rowID>
 Cassandra Column Family = columnFamily
 Cassandra RowID = rowID
or
URL = /content/cassandra/<columnFamilySelector>/remainder/of/the/path
 Cassandra Column Family = mapOfColumnFamilies.get(columnFamilySelector)
 Cassandra RowID = function(/remainder/of/the/path)

or to take that one stage further

public interface CassandraMapper {
      String getCQL(String columnFamilySelector, String path);
}

URL = /content/cassandra/<columnFamilySelector>/<remainderOfPath>

String cqlQuery = mapOfCassandraMappers.get(columnFamilySelector)
        .getCQL(columnFamilySelector, remainderOfPath);

This would allow us, given one or more implementations of
CassandraMapper, to map between URL and CQL.
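
The lookup described above could be sketched roughly as follows. This is
only an illustration under assumptions: the class CassandraPathRouter, its
register and toCQL methods, and the fixed /content/cassandra/ prefix
handling are invented for this example; only the CassandraMapper interface
and the URL layout come from this message.

```java
import java.util.HashMap;
import java.util.Map;

// Interface as proposed in this thread.
interface CassandraMapper {
    String getCQL(String columnFamilySelector, String path);
}

// Hypothetical router: splits a resource path into selector and remainder
// and delegates to the mapper registered for that selector.
final class CassandraPathRouter {
    private static final String ROOT = "/content/cassandra/";
    private final Map<String, CassandraMapper> mapOfCassandraMappers = new HashMap<>();

    void register(String columnFamilySelector, CassandraMapper mapper) {
        mapOfCassandraMappers.put(columnFamilySelector, mapper);
    }

    // Returns the CQL for the path, or null if the path is not handled here.
    String toCQL(String resourcePath) {
        if (!resourcePath.startsWith(ROOT)) {
            return null; // outside the Cassandra subtree
        }
        String rest = resourcePath.substring(ROOT.length());
        int slash = rest.indexOf('/');
        if (slash <= 0 || slash == rest.length() - 1) {
            return null; // missing selector or missing remainder
        }
        String columnFamilySelector = rest.substring(0, slash);
        String remainderOfPath = rest.substring(slash + 1);
        CassandraMapper mapper = mapOfCassandraMappers.get(columnFamilySelector);
        return mapper == null
                ? null
                : mapper.getCQL(columnFamilySelector, remainderOfPath);
    }
}
```

Since CassandraMapper has a single method, a mapper can be registered as a
lambda, for example
router.register("pictures", (cf, p) -> "select * from " + cf + " where rowid = '" + p + "'");
although a real mapper would also need to escape its inputs.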


HTH
Ian


On 23 June 2013 19:29, Dishara Wijewardana <dd...@gmail.com> wrote:

> Hi Ian,
>
> What is the data mapping should be between Cassandra and Sling resource. I
> mean is a Sling Resource maps to a Cassandra Column ? Or Column Family ?
>
> Because to get this Cassandra and Sling story correct we need to finalize
> this.
> For an example what we eventually returns is a Sling resource. Everything
> that needs to fill in to create Sling resource should be stored in
> Cassandra.
> In a Sling resource,
>
>    - Path - direct sling resource path
>    - ResourceType - nt:cassandra
>    - ResourceSuperType - ?
>    - ResourceMetadata - we can create this on the fly with the data from
>    the corresponding column. At insertion, those need to be stored.
> Following
>    are the ones which I thought might be useful by default to be set for
> any
>    node. Please add if we need anything more.
>       - ContentType
>       - ContentLength
>       - CreationTime
>       - ModificationTime
>    - ResourceResolver -  Do we need a resolver in this case ?
>
>
>  So I believe in CQL context, one ROW should represent a Sling resource. If
> that is the case for ResourceMetadata we might need a separate column to
> store it since it has multiple values. I am not sure whether we can do it
> with CQL, but it should be possible with hector APIs may be.
>
> Appreciate your thoughts ?
>
>
> On Wed, Jun 19, 2013 at 1:19 AM, Dishara Wijewardana <
> ddwijewardana@gmail.com> wrote:
>
> > Hi Ian,
> > I am starting this thread to keep track on things related to the GSoC
> > project related milestone status updates and related discussions.
> > So the first task over view will be as follows as per GSoC proposal
> > provided.
> >
> > 1. Implementing a CassandraResourceProvider  to READ from Cassandra.
> > Implementation Details [1]
> >
> >
> >
> > [1] : Implementation Details:
> >
> >  1.A) Write a CassanrdaResourceProviderUtil  which is basically a
> > cassendra client which will facilitate all cassandra related operations
> > required by other modules (CassandraResourceProvider and
> > CassandraResourceResolver).
> >
> > 1.B) Implementation of  CassandraResourceProvider
> >
> > 1.C)  Implementation of CassandraResourceResolver
> >
> > 1.D) Implementation of CassandraResource
> >
> >
> > And I will start writing the CassanrdaResourceProviderUtil class which
> > will do basic add and get using hector API. Please provide any feedback
> > that will be useful to accomplish this task.
> > So for this how does path mapping should be done. Because for example,
> the
> > path of the cassendra node will not be same as the jcr node path. i.e
> > provider will ask a node path /system/myapps/test/foo and where should we
> > return it from Cassandra. Aren't we have to first consider the WRITE
> aspect
> > to Cassandra ?
> >
> >
> > --
> > Thanks
> > /Dishara
> >
>
>
>
> --
> Thanks
> /Dishara
>

Re: [Status Update] Apache Cassandra backend for Sling

Posted by Dishara Wijewardana <dd...@gmail.com>.
Hi Ian,

What should the data mapping between Cassandra and a Sling resource be? I
mean, does a Sling Resource map to a Cassandra column, or to a column family?

To get this Cassandra and Sling story right, we need to finalize this.
For example, what we eventually return is a Sling resource, so everything
needed to create that Sling resource has to be stored in Cassandra.
In a Sling resource,

   - Path - direct sling resource path
   - ResourceType - nt:cassandra
   - ResourceSuperType - ?
   - ResourceMetadata - we can create this on the fly from the data in
   the corresponding column. Those values need to be stored at insertion.
   The following are the ones I thought might be useful to set by default
   for any node. Please add any others we need.
      - ContentType
      - ContentLength
      - CreationTime
      - ModificationTime
   - ResourceResolver - Do we need a resolver in this case?


So I believe that, in a CQL context, one row should represent a Sling
resource. If so, we might need a separate column to store the
ResourceMetadata, since it has multiple values. I am not sure whether we
can do that with CQL, but it should be possible with the Hector API.

I would appreciate your thoughts.


On Wed, Jun 19, 2013 at 1:19 AM, Dishara Wijewardana <
ddwijewardana@gmail.com> wrote:

> Hi Ian,
> I am starting this thread to keep track on things related to the GSoC
> project related milestone status updates and related discussions.
> So the first task over view will be as follows as per GSoC proposal
> provided.
>
> 1. Implementing a CassandraResourceProvider  to READ from Cassandra.
> Implementation Details [1]
>
>
>
> [1] : Implementation Details:
>
>  1.A) Write a CassanrdaResourceProviderUtil  which is basically a
> cassendra client which will facilitate all cassandra related operations
> required by other modules (CassandraResourceProvider and
> CassandraResourceResolver).
>
> 1.B) Implementation of  CassandraResourceProvider
>
> 1.C)  Implementation of CassandraResourceResolver
>
> 1.D) Implementation of CassandraResource
>
>
> And I will start writing the CassanrdaResourceProviderUtil class which
> will do basic add and get using hector API. Please provide any feedback
> that will be useful to accomplish this task.
> So for this how does path mapping should be done. Because for example, the
> path of the cassendra node will not be same as the jcr node path. i.e
> provider will ask a node path /system/myapps/test/foo and where should we
> return it from Cassandra. Aren't we have to first consider the WRITE aspect
> to Cassandra ?
>
>
> --
> Thanks
> /Dishara
>



-- 
Thanks
/Dishara