Posted to user@cassandra.apache.org by Miriam Allalouf <mi...@gmail.com> on 2010/07/25 15:05:58 UTC

Can we filter a key or a column name using regular expression?

Hi,
I need to build a model where I can retrieve an ordered list of objects
that share the same prefix and contain a certain delimiter.

For example: get all the object names that start with 'root' and
contain '/',
given the names:
root@abc/ddd
roo@bbb/c
root@pppp

This should return the first two names (root@abc/ddd  roo@bbb/c).

I can build the model such that the object name is a key or a column name.
Can we retrieve a key or a column name using such a filter or other
regular-expression-like filters?
Thanks, I would appreciate your help,
Miriam

Re: Can we filter a key or a column name using regular expression?

Posted by Benjamin Black <b...@b3k.us>.
Nope.


Re: Can we filter a key or a column name using regular expression?

Posted by Aaron Morton <aa...@thelastpickle.com>.
Say you have the columns "foo", "foo.bar", "foo.baz", and "monkeys".

And you want to read all the columns that start with "foo."

You could set the start column for the SliceRange to "foo." and the end column to "", then make repeated get_slice calls until you see a column that does not start with "foo."

Or you could set the start column to "foo." and the end column to "foo.~", and again make repeated calls; this time Cassandra should not return the "monkeys" column.

(I am saying to make repeated calls in case there are lots and lots of columns that start with "foo.")

The ~ character has the highest printable ASCII code (126). If you are sorting by ASCII, it's a handy end marker.

e.g. in Python:
In [133]: s1 = "foo"
In [134]: s2 = "foo.bar"
In [135]: s3 = "monkeys"
In [136]: q = "foo~"
In [137]: s1<q
Out[137]: True
In [138]: s2<q
Out[138]: True
In [139]: s3<q
Out[139]: False
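
The repeated-call idea above can be sketched without a cluster. This is only an illustration: get_slice here is a hypothetical stand-in that works on a sorted Python list, not the real Thrift call, and columns_with_prefix and page_size are names made up for the example. It assumes column names sort in plain ASCII/byte order and that both range bounds are inclusive, which is why the first column of each later page repeats the last one already seen.

```python
import bisect


def get_slice(sorted_names, start, finish, count):
    """Hypothetical stand-in for one get_slice call over a single row.

    Returns at most `count` names in [start, finish], both bounds inclusive.
    An empty `finish` means "no upper bound".
    """
    lo = bisect.bisect_left(sorted_names, start)
    if finish == "":
        hi = len(sorted_names)
    else:
        hi = bisect.bisect_right(sorted_names, finish)
    return sorted_names[lo:min(hi, lo + count)]


def columns_with_prefix(sorted_names, prefix, page_size=2):
    """Collect every name in [prefix, prefix + "~"], one page at a time."""
    if page_size < 2:
        # With an inclusive start bound, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")
    found = []
    start = prefix
    finish = prefix + "~"  # "~" (126) is the highest printable ASCII character
    while True:
        raw = get_slice(sorted_names, start, finish, page_size)
        # The start bound is inclusive, so drop the column we already have.
        if found and raw and raw[0] == found[-1]:
            new = raw[1:]
        else:
            new = raw
        found.extend(new)
        if len(raw) < page_size:
            return found  # a short page means the range is exhausted
        start = raw[-1]  # resume from the last column seen


names = sorted(["foo", "foo.bar", "foo.baz", "foo.qux", "monkeys"])
print(columns_with_prefix(names, "foo."))
```

A real client would use a much larger page size (the thread suggests around 1000). Note that a name such as "foo.~x" sorts after "foo.~" and would be missed, so the trick relies on names not using "~" right after the prefix.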


Aaron


Re: Can we filter a key or a column name using regular expression?

Posted by Miriam Allalouf <mi...@gmail.com>.
> You can query for things that start with a sub string but specifying a start
> value and an empty end value or an end value that is the start value
> concatenated with the max ascii character. Then just make multiple calls,
> say getting 1000 cols/rows at a time.

What do you mean by "concatenated with the max ascii character" when
looking for things that end with something?
Can you please give an example?
Thanks, Miriam



Re: Can we filter a key or a column name using regular expression?

Posted by Aaron Morton <aa...@thelastpickle.com>.
Have a look at how SliceRange works for get_slice and how KeyRange works for get_range_slices.

You can query for things that start with a substring by specifying a start value and either an empty end value or an end value that is the start value concatenated with the max ASCII character. Then just make multiple calls, say getting 1000 cols/rows at a time.

You cannot do a "contains" type query; it's just byte/string matching. You'll need to think about how to store your data in a way that lets you get the same result. For example, you could store the keys in a Column Family for "all" and a CF for the ones that have a sub part. Or you could store just "root@foo" as the key, then store columns for every child.

Not great examples but the general idea is to denormalise to support the queries.
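
To make the denormalising idea concrete, here is a rough sketch in Python using in-memory structures in place of column families. Everything in it is an assumption for illustration: NameIndex, all_names, with_child and names_with_delimiter are invented names, and the second sample name is taken to be "root@bbb/c" so that it matches the 'root' prefix. The "contains '/'" test is done once at write time, so the read only needs a prefix match over the parents.

```python
import bisect


class NameIndex:
    """In-memory stand-in for two column families (hypothetical layout)."""

    def __init__(self):
        # Stands in for the "all" column family: every name, kept sorted.
        self.all_names = []
        # Stands in for the second column family: the part before "/" is
        # the row key, and each child after "/" is a column in that row.
        self.with_child = {}

    def add(self, name):
        bisect.insort(self.all_names, name)
        parent, sep, child = name.partition("/")
        if sep:  # the "contains '/'" decision is made here, at write time
            bisect.insort(self.with_child.setdefault(parent, []), child)

    def names_with_delimiter(self, prefix):
        """Names that start with `prefix` and contain '/', in sorted order."""
        result = []
        # Walking the sorted parents mimics a prefix range scan over row keys.
        for parent in sorted(self.with_child):
            if parent.startswith(prefix):
                for child in self.with_child[parent]:
                    result.append(parent + "/" + child)
        return result


index = NameIndex()
for name in ["root@abc/ddd", "root@bbb/c", "root@pppp"]:
    index.add(name)
print(index.names_with_delimiter("root"))
```

In Cassandra itself, a prefix scan over row keys like this only works when rows are stored in key order (an order-preserving partitioner); with parents as column names in a single row, the column comparator provides the ordering instead.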

Hope that helps
Aaron
