Posted to user@cassandra.apache.org by Richard Crowley <r...@rcrowley.org> on 2012/08/24 03:54:28 UTC

Secondary index partially created

I have a three-node cluster running Cassandra 1.0.10.  In this cluster
is a keyspace with RF=3.  I *updated* a column family via Astyanax to
add a column definition with an index on that column.  Then I ran a
backfill to populate the column in every row.  Then I tried to query
the index from Java and it failed but so did cassandra-cli:

    get my_column_family where my_column = 'my_value';

Two out of the three nodes are unable to query the new index and throw
this error:

    InvalidRequestException(why:No indexed columns present in index clause with operator EQ)

The third is able to query the new index happily but doesn't find any
results, even when I expect it to.

`describe cluster;` in cassandra-cli confirms that all three nodes
have the same schema and `show schema;` confirms that schema includes
the new column definition and its index.

The my_column_family.my_index-hd-* files only exist on that one node
that can query the index.
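
The check described above can be scripted per node. This is a minimal sketch, not part of the original report: it assumes the SSTable naming pattern quoted above (`my_column_family.my_index-hd-*`), and the data directory path and keyspace name in the usage section are hypothetical. Finding files only shows that some index SSTables exist on the node, not that the index build has finished.

```python
import glob
import os


def index_sstables(data_dir, column_family, index_name):
    """Return the sorted SSTable files for one secondary index in data_dir."""
    # Pattern taken from the thread: my_column_family.my_index-hd-*
    pattern = os.path.join(data_dir, "%s.%s-hd-*" % (column_family, index_name))
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # ASSUMPTION: hypothetical keyspace directory; adjust it to match the
    # data_file_directories setting in cassandra.yaml on the node.
    files = index_sstables("/var/lib/cassandra/data/my_keyspace",
                           "my_column_family", "my_index")
    print("index files present" if files else "no index files on this node")
```

Running it on each node in turn would show which nodes have started writing index SSTables.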

I ran `nodetool repair` on each node and waited for `nodetool
compactionstats` to report zero pending tasks.  Ditto for `nodetool
compact`.  The nodes that failed still fail.  The node that succeeded
still succeeds.

Can anyone shed some light?  How do I convince it to let me query the
index from any node?  How do I get it to find results?

Thanks,

Richard

Re: Secondary index partially created

Posted by Richard Crowley <r...@rcrowley.org>.

On Mon, Aug 27, 2012 at 12:59 AM, aaron morton <aa...@thelastpickle.com> wrote:
> If you are still having problems, can you post the query and the output from
> nodetool cfstats on one of the nodes that fails?

driftx got me sorted.  It escaped me that a rolling restart was
necessary to build secondary indexes, which was masked by one node
deciding to build its portion without a restart.
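
For anyone landing here later, a rolling restart can be sketched as a dry-run plan like the one below. This is an illustration under stated assumptions, not the exact procedure driftx suggested: the host names are hypothetical, and the ssh-based restart command depends entirely on how Cassandra is installed. Only `nodetool drain` and `nodetool ring` are standard nodetool commands.

```python
def rolling_restart_plan(hosts, restart_cmd="sudo service cassandra restart"):
    """Return the shell commands for restarting hosts one at a time."""
    plan = []
    for host in hosts:
        # Flush memtables and stop accepting writes before the restart.
        plan.append("nodetool -h %s drain" % host)
        # ASSUMPTION: the service is restarted over ssh with restart_cmd;
        # substitute whatever your installation actually uses.
        plan.append("ssh %s %s" % (host, restart_cmd))
        # Check by hand that the node shows as Up before the next host.
        plan.append("nodetool -h %s ring" % host)
    return plan


if __name__ == "__main__":
    # Hypothetical host names for a three-node cluster.
    for command in rolling_restart_plan(["cass1", "cass2", "cass3"]):
        print(command)
```

The function only prints the commands rather than running them, so each step can be reviewed and executed by hand, one node at a time.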

Thanks,

Richard

Re: Secondary index partially created

Posted by aaron morton <aa...@thelastpickle.com>.

If you are still having problems, can you post the query and the output from `nodetool cfstats` on one of the nodes that fails?

`cfstats` will tell us if the secondary index was built.

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 25/08/2012, at 6:02 AM, Roshni Rajagopal <Ro...@wal-mart.com> wrote:

> What does `list my_column_family;` in the CLI show on all the nodes?
> Perhaps the syntax you're using isn't correct?  You should be getting the
> same data on all the nodes irrespective of which node's CLI you use.
> The replication factor is for redundancy to have copies of the data on
> different nodes to help if nodes go down. Even if you had a replication
> factor of 1 you should still get the same data on all nodes.
> 
> 
> 
> On 24/08/12 11:05 PM, "Richard Crowley" <r...@rcrowley.org> wrote:
> 
>> On Thu, Aug 23, 2012 at 6:54 PM, Richard Crowley <r...@rcrowley.org> wrote:
>>> I have a three-node cluster running Cassandra 1.0.10.  In this cluster
>>> is a keyspace with RF=3.  I *updated* a column family via Astyanax to
>>> add a column definition with an index on that column.  Then I ran a
>>> backfill to populate the column in every row.  Then I tried to query
>>> the index from Java and it failed but so did cassandra-cli:
>>> 
>>>    get my_column_family where my_column = 'my_value';
>>> 
>>> Two out of the three nodes are unable to query the new index and throw
>>> this error:
>>> 
>>>    InvalidRequestException(why:No indexed columns present in index clause with operator EQ)
>>> 
>>> The third is able to query the new index happily but doesn't find any
>>> results, even when I expect it to.
>> 
>> This morning the one node that's able to query the index is also able
>> to produce the expected results.  I'm a dummy and didn't use science
>> so I don't know if the `nodetool compact` I ran across the cluster had
>> anything to do with it.  Regardless, it did not change the situation
>> in any other way.
>> 
>>> 
>>> `describe cluster;` in cassandra-cli confirms that all three nodes
>>> have the same schema and `show schema;` confirms that schema includes
>>> the new column definition and its index.
>>> 
>>> The my_column_family.my_index-hd-* files only exist on that one node
>>> that can query the index.
>>> 
>>> I ran `nodetool repair` on each node and waited for `nodetool
>>> compactionstats` to report zero pending tasks.  Ditto for `nodetool
>>> compact`.  The nodes that failed still fail.  The node that succeeded
>>> still succeeds.
>>> 
>>> Can anyone shed some light?  How do I convince it to let me query the
>>> index from any node?  How do I get it to find results?
>>> 
>>> Thanks,
>>> 
>>> Richard
> 
> This email and any files transmitted with it are confidential and intended solely for the individual or entity to whom they are addressed. If you have received this email in error destroy it immediately. *** Walmart Confidential ***

Re: Secondary index partially created

Posted by Roshni Rajagopal <Ro...@wal-mart.com>.

What does `list my_column_family;` in the CLI show on all the nodes?
Perhaps the syntax you're using isn't correct?  You should be getting the
same data on all the nodes irrespective of which node's CLI you use.
The replication factor is for redundancy to have copies of the data on
different nodes to help if nodes go down. Even if you had a replication
factor of 1 you should still get the same data on all nodes.



On 24/08/12 11:05 PM, "Richard Crowley" <r...@rcrowley.org> wrote:

>On Thu, Aug 23, 2012 at 6:54 PM, Richard Crowley <r...@rcrowley.org> wrote:
>> I have a three-node cluster running Cassandra 1.0.10.  In this cluster
>> is a keyspace with RF=3.  I *updated* a column family via Astyanax to
>> add a column definition with an index on that column.  Then I ran a
>> backfill to populate the column in every row.  Then I tried to query
>> the index from Java and it failed but so did cassandra-cli:
>>
>>     get my_column_family where my_column = 'my_value';
>>
>> Two out of the three nodes are unable to query the new index and throw
>> this error:
>>
>>     InvalidRequestException(why:No indexed columns present in index clause with operator EQ)
>>
>> The third is able to query the new index happily but doesn't find any
>> results, even when I expect it to.
>
>This morning the one node that's able to query the index is also able
>to produce the expected results.  I'm a dummy and didn't use science
>so I don't know if the `nodetool compact` I ran across the cluster had
>anything to do with it.  Regardless, it did not change the situation
>in any other way.
>
>>
>> `describe cluster;` in cassandra-cli confirms that all three nodes
>> have the same schema and `show schema;` confirms that schema includes
>> the new column definition and its index.
>>
>> The my_column_family.my_index-hd-* files only exist on that one node
>> that can query the index.
>>
>> I ran `nodetool repair` on each node and waited for `nodetool
>> compactionstats` to report zero pending tasks.  Ditto for `nodetool
>> compact`.  The nodes that failed still fail.  The node that succeeded
>> still succeeds.
>>
>> Can anyone shed some light?  How do I convince it to let me query the
>> index from any node?  How do I get it to find results?
>>
>> Thanks,
>>
>> Richard


Re: Secondary index partially created

Posted by Richard Crowley <r...@rcrowley.org>.

On Thu, Aug 23, 2012 at 6:54 PM, Richard Crowley <r...@rcrowley.org> wrote:
> I have a three-node cluster running Cassandra 1.0.10.  In this cluster
> is a keyspace with RF=3.  I *updated* a column family via Astyanax to
> add a column definition with an index on that column.  Then I ran a
> backfill to populate the column in every row.  Then I tried to query
> the index from Java and it failed but so did cassandra-cli:
>
>     get my_column_family where my_column = 'my_value';
>
> Two out of the three nodes are unable to query the new index and throw
> this error:
>
>     InvalidRequestException(why:No indexed columns present in index clause with operator EQ)
>
> The third is able to query the new index happily but doesn't find any
> results, even when I expect it to.

This morning the one node that's able to query the index is also able
to produce the expected results.  I'm a dummy and didn't use science
so I don't know if the `nodetool compact` I ran across the cluster had
anything to do with it.  Regardless, it did not change the situation
in any other way.

>
> `describe cluster;` in cassandra-cli confirms that all three nodes
> have the same schema and `show schema;` confirms that schema includes
> the new column definition and its index.
>
> The my_column_family.my_index-hd-* files only exist on that one node
> that can query the index.
>
> I ran `nodetool repair` on each node and waited for `nodetool
> compactionstats` to report zero pending tasks.  Ditto for `nodetool
> compact`.  The nodes that failed still fail.  The node that succeeded
> still succeeds.
>
> Can anyone shed some light?  How do I convince it to let me query the
> index from any node?  How do I get it to find results?
>
> Thanks,
>
> Richard