Posted to user@cassandra.apache.org by Aditya <ad...@gmail.com> on 2011/12/27 07:17:05 UTC

Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

I need to store data of all activities by a user's followees in a single row. I
am trying to do that using composite column names in a single user-specific
row named 'rowX'.

On any activity by one of the user's followees on an item, a column is stored in
'rowX'. The column has a composite column name made up of
itemId+userId (which makes the column name unique within rowX), and the column
value contains the activity data related to that item by that followee.


Now I want to retrieve activity by all users on a list of items. So I need
to retrieve all composite columns whose first component matches one of the
itemIds. Is it possible to run such a query against Cassandra? I am using
Hector.

Re: Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

Posted by Martin Arrowsmith <ar...@gmail.com>.
I believe this calls for Cassandra Cookbook 2nd edition :)

On Wed, Dec 28, 2011 at 10:26 AM, Edward Capriolo <ed...@gmail.com>wrote:

> Super columns have the same fundamental problem and perform worse in
> general. So switching from composites to super columns is NEVER a good idea.
>
>
> On Wed, Dec 28, 2011 at 1:19 PM, Aditya <ad...@gmail.com> wrote:
>
>> Since I have around 20 items to query, I guess making 20 queries to
>> retrieve activities by all followees on all of those 20 items would be too
>> inefficient. So, to take advantage of more efficient queries, are
>> supercolumns recommended for this case? In case I use
>> supercolumns, I always need to retrieve the entire supercolumn, and
>> I write its subcolumn(s) at different times, not
>> all at once.
>>
>> On Wed, Dec 28, 2011 at 8:07 PM, Edward Capriolo <ed...@gmail.com>wrote:
>>
>>> You need to execute one get slice operation for each item id or, if the
>>> row is not large, you can try one large get slice on the entire row and
>>> deal with the results client side.
>>>
>>> If you try method 1: when doing slices on composites, you can set the
>>> start and end values to be inclusive or exclusive so that you get only the
>>> columns you want and not some extra columns up to the slice range size.
>>>
>>>
>>>
>>
>>
>
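
The two options quoted above (one slice per item id, or one slice over the whole row filtered on the client) can be sketched in plain Python. This is only an in-memory model under stated assumptions, not Hector or Thrift code: the row is a dict keyed by hypothetical (itemId, userId) integer tuples, and the function names slice_per_item and slice_whole_row are made up for this illustration.

```python
from bisect import bisect_left


def slice_per_item(row, item_ids):
    """Option 1: one range lookup per item id over the sorted column names."""
    names = sorted(row)  # Cassandra keeps column names sorted; we sort here
    hits = {}
    for item_id in item_ids:
        # (item_id,) sorts before every (item_id, user_id);
        # (item_id + 1,) sorts after all of them.
        lo = bisect_left(names, (item_id,))
        hi = bisect_left(names, (item_id + 1,))
        for name in names[lo:hi]:
            hits[name] = row[name]
    return hits


def slice_whole_row(row, item_ids):
    """Option 2: read every column once, then filter on the client."""
    wanted = set(item_ids)
    return {name: value for name, value in row.items() if name[0] in wanted}


if __name__ == "__main__":
    # Hypothetical sample data: (itemId, userId) -> activity
    row_x = {(1, 10): "like", (1, 11): "share", (2, 10): "view", (7, 12): "like"}
    print(slice_per_item(row_x, [1, 7]))
    print(slice_whole_row(row_x, [1, 7]))
```

Both functions return the same columns; which one is cheaper depends on how wide the row is compared with the number of item ids being requested.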

Re: Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

Posted by Aditya <ad...@gmail.com>.
Another point worth noting is that there will be at most 8-10 subcolumns
per supercolumn.
I need to write one subcolumn at a time (but always read the entire
supercolumn).

On Fri, Dec 30, 2011 at 12:20 AM, Aditya <ad...@gmail.com> wrote:

> @Edward: Perhaps you missed that I always need to retrieve 'all
> columns' under the supercolumn. Given my query
> requirements, if I use composite columns instead of supercolumns then it is
> impossible to do wildcard queries like the one asked about in this thread's
> subject line, whereas this is much easier to do with supercolumns.
>
>
> On Thu, Dec 29, 2011 at 11:06 PM, Edward Capriolo <ed...@gmail.com>wrote:
>
>> The use case in question was: Only accessing some columns.
>>
>> Even if that is not the case:
>>
>> SuperColumns: 1 extra level of nesting
>> Composite Columns: Arbitrary levels of nesting
>>
>> SuperColumns: More overhead (space on disk) than using your own delimiter
>> '_'
>> SuperColumns: Likely going to be replaced in a future C* version behind
>> the scenes by composite columns anyway
>> SuperColumns: Usually an afterthought for API developers (support for
>> them comes "later")
>> SuperColumns: Almost always utilized incorrectly by users; users speak
>> of '10%' performance gains after they switch away from them.
>>
>> There are some cases (a small %) where SuperColumns are a better
>> choice, but this is rare. With composites and concatenating columns
>> they have no great purpose any more, (bad analogy coming!) like a
>> mechanical typewriter.
>>
>> On 12/29/11, Philippe <wa...@gmail.com> wrote:
>> > Would you stand by that statement in case all columns inside the super
>> > column need to be read?  Why?
>> >
>> > Thanks

Re: Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

Posted by Philippe <wa...@gmail.com>.
I currently have
scf[c1][sc1]=value
scf[c1][sc2]=value
...
scf[c2][sc1]=value
scf[c2][sc2]=value
scf[c2][sc3]=value
scf[c2][sc4]=value

99% of the time, I do multiget super slices: for multiple keys, I query for
explicitly named super columns, e.g. c1, c2, c10, c12.
1% of the time, I do a multiget range super slice where, for multiple keys, I
query for a range of super columns.
As Tyler said, this can be done by specifying super columns in the slice
predicate; it implicitly returns all of their subcolumns. I use Hector and it
works great.
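
As a rough illustration of that access pattern, here is a small in-memory Python model. It is a sketch only: the nested dict stands in for the super column family, and multiget_super_slice is a hypothetical helper name, not a Hector or Thrift call. Naming a super column returns all of its subcolumns without listing them.

```python
def multiget_super_slice(scf, keys, super_column_names):
    """For each row key, return the named super columns with ALL their subcolumns.

    scf is modelled as {row_key: {super_column: {subcolumn: value}}}.
    Super columns that a row does not contain are simply skipped.
    """
    result = {}
    for key in keys:
        row = scf.get(key, {})
        result[key] = {
            name: dict(row[name])  # copy so callers cannot mutate the model
            for name in super_column_names
            if name in row
        }
    return result


if __name__ == "__main__":
    # Hypothetical sample data in the scf[c][sc]=value shape, one level per row key
    scf = {
        "k1": {"c1": {"sc1": "v1", "sc2": "v2"}, "c2": {"sc1": "v3"}, "c3": {"sc1": "v4"}},
        "k2": {"c2": {"sc4": "v5"}},
    }
    print(multiget_super_slice(scf, ["k1", "k2"], ["c1", "c2"]))
```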

Now interestingly enough, column names sc1, sc2, sc3 are in fact home-made
composite columns.

I could and would switch to full composite columns because I am fishing for
every drop of performance I can get. However, I would need what Tyler
suggested: "Letting multiget_slice accept multiple SlicePredicates per key
could also accomplish this."
Can anyone on the dev team comment on doing this? Is it a no-no?

Thanks

2011/12/29 Edward Capriolo <ed...@gmail.com>

> Hum...
>
> Do you have this?
> scf [b][1][a]=value
> scf [b][1][x]=value
> scf [b][7][b]=value
>
> and you want to slice:
> scf [b][1][*]
>
> Which would result in
>
> scf [b][1][a]=value
> scf [b][1][x]=value
>
> ?
>
> The composite version of this would be:
> cf [b][1:a]=value
> cf [b][1:x]=value
> cf [b][7:b]=value
>
> I am not sure exactly what you are doing, because a SlicePredicate
> takes either a list of columns or a SliceRange. A ColumnPath takes a
> single SuperColumn.
>
> I do not see how this is done with Columns or SuperColumns. Maybe you
> can provide a code snippet and/or some sample data?
>

Re: Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

Posted by Tyler Hobbs <ty...@datastax.com>.
On Thu, Dec 29, 2011 at 3:13 PM, Edward Capriolo <ed...@gmail.com>wrote:

>
> You seem to say you can query for a list of supercolumns. I am not
> sure how this works because the ColumnParent seems to only accept a
> single SuperColumn, but if you can do it I am not calling you a liar.
>

If you don't specify a super column ColumnParent, then
SlicePredicate.columns are assumed to be super column names.


>
> Maybe this is a good case for 'server side scanners'. Ow man I know
> jbellis read this and put my face up on a dart board.


Letting multiget_slice accept multiple SlicePredicates per key could also
accomplish this.

-- 
Tyler Hobbs
DataStax <http://datastax.com/>

Re: Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

Posted by Edward Capriolo <ed...@gmail.com>.
vi ./src/java/org/apache/cassandra/db/marshal/CompositeType.java

'end-of-component' byte should always be 0 for actual column name.
 * However, it can set to 1 for query bounds. This allows to query for the
 * equivalent of 'give me the full super-column'. That is, if during a slice
 * query uses:
 *   start = <3><"foo".getBytes()><0>
 *   end   = <3><"foo".getBytes()><1>

So with composite columns you can do:

scf [b][1][*]

by setting the end-of-component byte on the start and end bounds.

But you cannot do
scf [b][1][*]
scf [b][7][*]
in a single operation with composites.
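
To make the end-of-component byte concrete, here is a minimal Python sketch (not Hector or Cassandra code). It assumes each component is encoded as a 2-byte big-endian length, the value, and one end-of-component byte, as in the comment quoted above. Comparing component values as raw bytes is a simplification of the real per-component comparators, and all helper names are made up for this example.

```python
import struct


def component(value: bytes, eoc: int = 0) -> bytes:
    """One composite component: <2-byte length><value><end-of-component byte>."""
    return struct.pack(">H", len(value)) + value + bytes([eoc])


def column_name(*values: bytes) -> bytes:
    """A stored column name: every component ends with eoc = 0."""
    return b"".join(component(v, 0) for v in values)


def prefix_bounds(*prefix: bytes):
    """Start/end slice bounds covering every name that begins with `prefix`."""
    head = b"".join(component(v, 0) for v in prefix[:-1])
    return head + component(prefix[-1], 0), head + component(prefix[-1], 1)


def decode(name: bytes):
    """Split an encoded name into (value, eoc) pairs."""
    parts, i = [], 0
    while i < len(name):
        (n,) = struct.unpack_from(">H", name, i)
        parts.append((name[i + 2:i + 2 + n], name[i + 2 + n]))
        i += n + 3
    return parts


def compare(a: bytes, b: bytes) -> int:
    """Simplified ordering: component values first, then the eoc byte."""
    pa, pb = decode(a), decode(b)
    for (va, ea), (vb, eb) in zip(pa, pb):
        if va != vb:
            return -1 if va < vb else 1
        if ea != eb:
            return -1 if ea < eb else 1
    # All shared components are equal: the shorter name sorts first.
    return (len(pa) > len(pb)) - (len(pa) < len(pb))


def slice_row(names, start, end):
    """Return the names that fall between the start and end bounds."""
    return [n for n in names if compare(start, n) <= 0 and compare(n, end) <= 0]


if __name__ == "__main__":
    row = [column_name(b"1", b"a"), column_name(b"1", b"x"), column_name(b"7", b"b")]
    # [1][*] is one slice; [7][*] needs a second slice with its own bounds.
    for item in (b"1", b"7"):
        start, end = prefix_bounds(item)
        print(item, slice_row(row, start, end))
```

Each prefix gets its own start/end pair, which is why two prefixes take two slice operations.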

You seem to say you can query for a list of supercolumns. I am not
sure how this works because the ColumnParent seems to only accept a
single SuperColumn, but if you can do it I am not calling you a liar.

Maybe this is a good case for 'server side scanners'. Ow man I know
jbellis read this and put my face up on a dart board.




On 12/29/11, Aditya <ad...@gmail.com> wrote:
> On Fri, Dec 30, 2011 at 1:42 AM, Edward Capriolo
> <ed...@gmail.com>wrote:
>
>> Hum...
>>
>> Do you have this?
>> scf [b][1][a]=value
>> scf [b][1][x]=value
>> scf [b][7][b]=value
>>
>> and you want to slice:
>> scf [b][1][*]
>>
>> Which would result in
>>
>> scf [b][1][a]=value
>> scf [b][1][x]=value
>>
>> ?
>>
>
> Exactly, I have this!
> As for the queries, I want to retrieve the columns matching any of a list
> of wildcard names, something like below:
>
> scf [b][1][*]
> scf [b][7][*]
>
> This type of query is not possible with composite columns, but it is
> easily achievable with supercolumns: I can simply query for
> a list of supercolumns (with all of their subcolumns) by name. Right?
>
> So this is easier in terms of designing the query, but since I don't
> understand much about the internals, I am not sure whether this is the best
> option for me, though looking at my retrieval needs I feel somewhat
> biased towards using supercolumns.
>

Re: Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

Posted by Aditya <ad...@gmail.com>.
On Fri, Dec 30, 2011 at 1:42 AM, Edward Capriolo <ed...@gmail.com>wrote:

> Hum...
>
> Do you have this?
> scf [b][1][a]=value
> scf [b][1][x]=value
> scf [b][7][b]=value
>
> and you want to slice:
> scf [b][1][*]
>
> Which would result in
>
> scf [b][1][a]=value
> scf [b][1][x]=value
>
> ?
>

Exactly, I have this!
As for the queries, I want to retrieve the columns matching any of a list
of wildcard names, something like below:

scf [b][1][*]
scf [b][7][*]

This type of query is not possible with composite columns, but it is
easily achievable with supercolumns: I can simply query for
a list of supercolumns (with all of their subcolumns) by name. Right?

So this is easier in terms of designing the query, but since I don't
understand much about the internals, I am not sure whether this is the best
option for me, though looking at my retrieval needs I feel somewhat
biased towards using supercolumns.


Re: Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

Posted by Edward Capriolo <ed...@gmail.com>.
Hum...

Do you have this?
scf [b][1][a]=value
scf [b][1][x]=value
scf [b][7][b]=value

and you want to slice:
scf [b][1][*]

Which would result in

scf [b][1][a]=value
scf [b][1][x]=value

?

The composite version of this would be:
cf [b][1:a]=value
cf [b][1:x]=value
cf [b][7:b]=value

I am not sure exactly what you are doing, because a SlicePredicate
takes either a list of columns or a SliceRange. A ColumnPath takes a
single SuperColumn.

I do not see how this is done with Columns or SuperColumns. Maybe you
can provide a code snippet and/or some sample data?

On 12/29/11, Aditya <ad...@gmail.com> wrote:
> @Edward: Perhaps you missed to notice that I need to always retrieve 'all
> columns' under the supercolumn at any time.. and as per my query
> requirements if I use composite columns instead of supercolumns then it is
> impossible to do wildcard queries like the ones asked in this thread's
> headline but which is much easier to do through the use of supercolumns.
>
> On Thu, Dec 29, 2011 at 11:06 PM, Edward Capriolo
> <ed...@gmail.com>wrote:
>
>> The use case in question was: Only accessing some columns.
>>
>> Even if that is not the case:
>>
>> SuperColumns: 1 extra level of nesting
>> Composite Colunns: Arbitrary levels of nesting
>>
>> SuperColumns: More overhead (space on disk) then using your own delimiter
>> '_'
>> SuperColumns: Likely going to be replaced in future c* version behind
>> the scenes by composite columns anyway
>> SuperColumns: Usually an afterthought for API developers, (support for
>> them comes "later")
>> SuperColumns: Almost always utilized incorrectly by users, users speak
>> of '10%' performance gains after they switch away from them.
>>
>> There are some (a small % of cases) where SuperColumns are a better
>> choice, but this is rare. With composites and concatenating columns
>> they have no great purpose any more, (bad analogy coming!) like a
>> mechanical type writer.
>>
>> On 12/29/11, Philippe <wa...@gmail.com> wrote:
>> > Would you stand by that statement in case all colums inside the super
>> > column need to be read?  Why?
>> >
>> > Thanks
>> > Le 28 déc. 2011 19:26, "Edward Capriolo" <ed...@gmail.com> a
>> écrit :
>> >
>> >> Super columns have the same fundamental problem and perform worse in
>> >> general. So switching from composites to super columns is NEVER a good
>> >> idea.
>> >>
>> >>
>> >> On Wed, Dec 28, 2011 at 1:19 PM, Aditya <ad...@gmail.com> wrote:
>> >>
>> >>> Since I have around 20 items to query, I guess making 20 queries to
>> >>> retrieve activities by all followies on all of those 20 columns would
>> too
>> >>> inefficient, so to take the advantage of more efficient queries, are
>> >>> supercolumns recommended for this case ? Anyways, in case I use
>> >>> supercolumns, I need to retrieve the entire supercolumn at any point
>> >>> of
>> >>> time & I am writing subcolumn(s) to the supercolumn at different times
>> >>> not
>> >>> at once.
>> >>>
>> >>> On Wed, Dec 28, 2011 at 8:07 PM, Edward Capriolo
>> >>> <ed...@gmail.com>wrote:
>> >>>
>> >>>> You need to execute one get slice operation for each item id or if
>> >>>> the
>> >>>> row is not large , you can try one large get slice on the entire row
>> and
>> >>>> deal with the results client side.
>> >>>>
>> >>>> If you try method 1 When doing slices on composites you can set the
>> >>>> start inclusive or exclusive values to get only the column you want
>> and
>> >>>> not
>> >>>> some extra columns up to slice range size.
>> >>>>
>> >>>>
>> >>>> On Tuesday, December 27, 2011, Aditya <ad...@gmail.com> wrote:
>> >>>> > I need to store data of all activities by user's followies in
>> >>>> > single
>> >>>> row. I am trying to do that making use of composite column names in a
>> >>>> single user specific row named 'rowX'.
>> >>>> > On any activity by a user's followie on an item, a column is stored
>> in
>> >>>> 'rowX'. The column has a composite type column name made up of
>> >>>> itemId+userId (which makes it unique col. name) in rowX. (& column
>> value
>> >>>> contains the activity data related to that item by that followie)
>> >>>> >
>> >>>> > Now I want to retrieve activity by all users on a list of items. So
>> I
>> >>>> need to retrieve all composite columns with composite's first
>> component
>> >>>> matching the itemId. Is it possible to do such a query to Cassandra ?
>> I
>> >>>> am
>> >>>> using Hector.
>> >>>>
>> >>>
>> >>>
>> >>
>> >
>>
>

Re: Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

Posted by Aditya <ad...@gmail.com>.
@Edward: Perhaps you missed that I always need to retrieve 'all
columns' under the supercolumn. Given my query
requirements, if I use composite columns instead of supercolumns then it is
impossible to do wildcard queries like the one asked about in this thread's
subject line, whereas this is much easier to do with supercolumns.

On Thu, Dec 29, 2011 at 11:06 PM, Edward Capriolo <ed...@gmail.com>wrote:

> The use case in question was: Only accessing some columns.
>
> Even if that is not the case:
>
> SuperColumns: 1 extra level of nesting
> Composite Columns: Arbitrary levels of nesting
>
> SuperColumns: More overhead (space on disk) than using your own delimiter
> '_'
> SuperColumns: Likely going to be replaced in a future C* version behind
> the scenes by composite columns anyway
> SuperColumns: Usually an afterthought for API developers (support for
> them comes "later")
> SuperColumns: Almost always utilized incorrectly by users; users speak
> of '10%' performance gains after they switch away from them.
>
> There are some cases (a small %) where SuperColumns are a better
> choice, but this is rare. With composites and concatenating columns
> they have no great purpose any more, (bad analogy coming!) like a
> mechanical typewriter.

Re: Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

Posted by Edward Capriolo <ed...@gmail.com>.
The use case in question was: Only accessing some columns.

Even if that is not the case:

SuperColumns: 1 extra level of nesting
Composite Columns: Arbitrary levels of nesting

SuperColumns: More overhead (space on disk) than using your own delimiter '_'
SuperColumns: Likely going to be replaced in a future C* version behind
the scenes by composite columns anyway
SuperColumns: Usually an afterthought for API developers (support for
them comes "later")
SuperColumns: Almost always utilized incorrectly by users; users speak
of '10%' performance gains after they switch away from them.

There are some cases (a small %) where SuperColumns are a better
choice, but this is rare. With composites and concatenating columns
they have no great purpose any more, (bad analogy coming!) like a
mechanical typewriter.
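
To illustrate the nesting point, here is a minimal Python sketch. It is an in-memory model only, not Hector or Cassandra API, and the three-component column names and the values are made up for the example. Reading every column that shares a leading component plays the role of reading a whole supercolumn, and the same idea works at any depth. The last helper shows the "own delimiter" alternative of flattening the components into a single name.

```python
# Hypothetical column names with three components: (item_id, user_id, activity).
# The values are placeholders; a real row would hold the activity data.
columns = {
    (10, 1, "like"): "t1",
    (10, 1, "share"): "t2",
    (10, 7, "comment"): "t3",
    (11, 3, "like"): "t4",
}


def with_prefix(cols, *prefix):
    """Return (name, value) pairs whose leading components equal `prefix`.

    Results come back in sorted name order, which is how a composite
    comparator would order the columns inside one row.
    """
    n = len(prefix)
    return [(name, cols[name]) for name in sorted(cols) if name[:n] == prefix]


def delimited(name, sep="_"):
    """Flatten a composite name into one delimited string, e.g. '10_7_comment'."""
    return sep.join(str(part) for part in name)


# One level deep: everything for item 10 (the "whole supercolumn" read).
item_10 = with_prefix(columns, 10)
# Two levels deep: everything user 1 did on item 10.
item_10_user_1 = with_prefix(columns, 10, 1)
```

Note that the delimited form sorts as a string, so numeric ids would need zero padding (or a similar fixed-width encoding) if range queries over them matter.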

On 12/29/11, Philippe <wa...@gmail.com> wrote:
> Would you stand by that statement if all columns inside the super
> column need to be read?  Why?
>
> Thanks

Re: Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

Posted by Philippe <wa...@gmail.com>.
Would you stand by that statement if all columns inside the super
column need to be read?  Why?

Thanks
On Dec 28, 2011 at 19:26, "Edward Capriolo" <ed...@gmail.com> wrote:

> Super columns have the same fundamental problem and perform worse in
> general. So switching from composites to super columns is NEVER a good idea.

Re: Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

Posted by Edward Capriolo <ed...@gmail.com>.
Super columns have the same fundamental problem and perform worse in
general. So switching from composites to super columns is NEVER a good idea.


On Wed, Dec 28, 2011 at 1:19 PM, Aditya <ad...@gmail.com> wrote:

> Since I have around 20 items to query, I guess making 20 queries to
> retrieve activities by all followies on all of those 20 items would be too
> inefficient. To take advantage of more efficient queries, are
> supercolumns recommended for this case? In any case, if I use
> supercolumns, I need to retrieve the entire supercolumn at any point in
> time, and I write subcolumn(s) to the supercolumn at different times, not
> all at once.

Re: Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

Posted by Aditya <ad...@gmail.com>.
Since I have around 20 items to query, I guess making 20 queries to
retrieve activities by all followies on all of those 20 items would be too
inefficient. To take advantage of more efficient queries, are
supercolumns recommended for this case? In any case, if I use
supercolumns, I need to retrieve the entire supercolumn at any point in
time, and I write subcolumn(s) to the supercolumn at different times, not
all at once.

On Wed, Dec 28, 2011 at 8:07 PM, Edward Capriolo <ed...@gmail.com>wrote:

> You need to execute one get slice operation for each item id, or, if the
> row is not large, you can try one large get slice on the entire row and
> deal with the results client side.
>
> If you try method 1: when doing slices on composites, you can set the
> start and end bounds as inclusive or exclusive to get only the columns you
> want, and not some extra columns up to the slice range size.

Re: Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers

Posted by Edward Capriolo <ed...@gmail.com>.
You need to execute one get slice operation for each item id, or, if the
row is not large, you can try one large get slice on the entire row and
deal with the results client side.

If you try method 1: when doing slices on composites, you can set the
start and end bounds as inclusive or exclusive to get only the columns you
want, and not some extra columns up to the slice range size.
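
As a rough illustration of the two methods, here is a minimal Python sketch. It models 'rowX' as a sorted in-memory list rather than calling Hector or Cassandra, and the sample data, function names, and the assumption that item ids are integers are all made up for the example. Method 1 bounds each slice on the first component only; method 2 reads the whole row once and filters client side.

```python
from bisect import bisect_left

# Hypothetical stand-in for 'rowX': columns sorted by composite name
# (item_id, user_id), the order a composite comparator would keep them in.
row_x = sorted([
    ((10, 1), "user 1 liked item 10"),
    ((10, 7), "user 7 commented on item 10"),
    ((11, 3), "user 3 liked item 11"),
    ((12, 1), "user 1 shared item 12"),
    ((12, 9), "user 9 liked item 12"),
])
names = [name for name, _ in row_x]


def slice_for_item(item_id):
    """Method 1: one slice per item id, bounded on the first component.

    The start bound (item_id,) sorts before every (item_id, user_id), and the
    end bound (item_id + 1,) sorts before every column of the next item, so
    the slice holds exactly the columns for item_id and nothing extra.
    Assumes integer item ids.
    """
    start = bisect_left(names, (item_id,))
    end = bisect_left(names, (item_id + 1,))
    return row_x[start:end]


def slices_for_items(item_ids):
    """Run method 1 once per requested item id and concatenate the results."""
    result = []
    for item_id in item_ids:
        result.extend(slice_for_item(item_id))
    return result


def filter_whole_row(item_ids):
    """Method 2: read the entire row once and filter on the client."""
    wanted = set(item_ids)
    return [(name, value) for name, value in row_x if name[0] in wanted]
```

Both approaches return the same columns; which one is cheaper depends on how large the row is compared with the number of item ids requested.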

On Tuesday, December 27, 2011, Aditya <ad...@gmail.com> wrote:
> I need to store data of all activities by user's followies in single row.
> I am trying to do that making use of composite column names in a single
> user specific row named 'rowX'.
> On any activity by a user's followie on an item, a column is stored in
> 'rowX'. The column has a composite type column name made up of
> itemId+userId (which makes it unique col. name) in rowX. (& column value
> contains the activity data related to that item by that followie)
>
> Now I want to retrieve activity by all users on a list of items. So I
> need to retrieve all composite columns with composite's first component
> matching the itemId. Is it possible to do such a query to Cassandra ? I am
> using Hector.