Posted to user@cassandra.apache.org by Ruchir Jha <ru...@gmail.com> on 2015/01/16 19:32:05 UTC

Retrieving all row keys of a CF

We have a column family that has about 800K rows and, on average, about a
million columns per row. I am interested in getting all the row keys in this
column family, and I am using the Astyanax code snippet below to do this.

This query never finishes (I ran it for 2 days and it did not complete).


This query, however, works with CFs that have fewer columns. This leads me
to believe that there might be an API that retrieves just the row keys and
does not depend on the number of columns in the CF. Any suggestions are
appreciated.



I am running Cassandra 2.0.9 and this is a 4 node cluster.



    keyspace.prepareQuery(this.wideRowTables.get(group))
            .setConsistencyLevel(ConsistencyLevel.CL_QUORUM)
            .getAllRows()
            .setRowLimit(1000)
            .setRepeatLastToken(false)
            .withColumnRange(new RangeBuilder().setLimit(1).build())
            .executeWithCallback(new RowCallback<String, T>() {

                @Override
                public boolean failure(ConnectionException e)
                {
                    return true;
                }

                @Override
                public void success(Rows<String, T> rows)
                {
                    // iterating over rows here
                }
            });

RE: Retrieving all row keys of a CF

Posted by Mohammed Guller <mo...@glassbeam.com>.
Ruchir,
I am curious if you had better luck with the AllRowsReader recipe.

Mohammed

From: Eric Stevens [mailto:mightye@gmail.com]
Sent: Friday, January 16, 2015 12:33 PM
To: user@cassandra.apache.org
Subject: Re: Retrieving all row keys of a CF

Note that getAllRows() is deprecated in Astyanax (see here <https://github.com/Netflix/astyanax/wiki/Getting-Started#iterate-through-the-entire-keyspace-deprecated>).

You should prefer to use the AllRowsReader recipe: https://github.com/Netflix/astyanax/wiki/AllRowsReader-All-rows-query

Note the section titled Reading only the row keys <https://github.com/Netflix/astyanax/wiki/AllRowsReader-All-rows-query#reading-only-the-row-keys>, which seems to match your use case exactly. You should start getting row keys back very, very quickly.

On Fri, Jan 16, 2015 at 11:32 AM, Ruchir Jha <ru...@gmail.com> wrote:
We have a column family that has about 800K rows and, on average, about a million columns per row. I am interested in getting all the row keys in this column family, and I am using the Astyanax code snippet below to do this.
This query never finishes (I ran it for 2 days and it did not complete).

This query, however, works with CFs that have fewer columns. This leads me to believe that there might be an API that retrieves just the row keys and does not depend on the number of columns in the CF. Any suggestions are appreciated.

I am running Cassandra 2.0.9 and this is a 4 node cluster.

    keyspace.prepareQuery(this.wideRowTables.get(group))
            .setConsistencyLevel(ConsistencyLevel.CL_QUORUM)
            .getAllRows()
            .setRowLimit(1000)
            .setRepeatLastToken(false)
            .withColumnRange(new RangeBuilder().setLimit(1).build())
            .executeWithCallback(new RowCallback<String, T>() {

                @Override
                public boolean failure(ConnectionException e)
                {
                    return true;
                }

                @Override
                public void success(Rows<String, T> rows)
                {
                    // iterating over rows here
                }
            });


Re: Retrieving all row keys of a CF

Posted by Eric Stevens <mi...@gmail.com>.
Note that getAllRows() is deprecated in Astyanax (see
<https://github.com/Netflix/astyanax/wiki/Getting-Started#iterate-through-the-entire-keyspace-deprecated>).

You should prefer to use the AllRowsReader recipe:
https://github.com/Netflix/astyanax/wiki/AllRowsReader-All-rows-query

Note the section titled Reading only the row keys
<https://github.com/Netflix/astyanax/wiki/AllRowsReader-All-rows-query#reading-only-the-row-keys>,
which seems to match your use case exactly.  You should start getting row
keys back very, very quickly.
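
One practical note on consuming the keys: as far as I recall from the recipe's wiki page, the per-row callback can be invoked from several reader threads at once, so whatever collects the keys needs to be thread-safe. Below is a minimal sketch of such a collector using only the JDK. The class name RowKeyCollector and its methods are hypothetical, and a plain thread pool stands in for the reader here; no Astyanax API is called.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical helper: gathers row keys handed to a callback from many threads.
public class RowKeyCollector {
    // Concurrent set, so simultaneous add() calls from reader threads are safe.
    private final Set<String> keys = ConcurrentHashMap.newKeySet();

    // Per-row callback: records the key and returns true to keep scanning.
    public Function<String, Boolean> callback() {
        return key -> {
            keys.add(key);
            return true;
        };
    }

    public Set<String> keys() {
        return keys;
    }

    public static void main(String[] args) throws InterruptedException {
        RowKeyCollector collector = new RowKeyCollector();
        Function<String, Boolean> cb = collector.callback();

        // Stand-in for a parallel reader (assumption): 4 threads, 1000 keys each.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int t = 0; t < 4; t++) {
            final int slice = t;
            pool.submit(() -> {
                for (int i = 0; i < 1000; i++) {
                    cb.apply("row-" + slice + "-" + i);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);

        System.out.println(collector.keys().size()); // prints 4000
    }
}
```

With a real reader you would adapt the callback to whatever row type it passes in and pull the key out of the row before adding it to the set.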

On Fri, Jan 16, 2015 at 11:32 AM, Ruchir Jha <ru...@gmail.com> wrote:

> We have a column family that has about 800K rows and, on average, about a
> million columns per row. I am interested in getting all the row keys in this
> column family, and I am using the Astyanax code snippet below to do this.
>
> This query never finishes (I ran it for 2 days and it did not complete).
>
>
> This query, however, works with CFs that have fewer columns. This leads me
> to believe that there might be an API that retrieves just the row keys and
> does not depend on the number of columns in the CF. Any suggestions are
> appreciated.
>
>
>
> I am running Cassandra 2.0.9 and this is a 4 node cluster.
>
>
>
>     keyspace.prepareQuery(this.wideRowTables.get(group))
>             .setConsistencyLevel(ConsistencyLevel.CL_QUORUM)
>             .getAllRows()
>             .setRowLimit(1000)
>             .setRepeatLastToken(false)
>             .withColumnRange(new RangeBuilder().setLimit(1).build())
>             .executeWithCallback(new RowCallback<String, T>() {
>
>                 @Override
>                 public boolean failure(ConnectionException e)
>                 {
>                     return true;
>                 }
>
>                 @Override
>                 public void success(Rows<String, T> rows)
>                 {
>                     // iterating over rows here
>                 }
>             });
>