Posted to user@cassandra.apache.org by Peter Davies <di...@gmail.com> on 2010/12/14 13:22:13 UTC

Cassandra-Pig keyspace not found issue

Pig seems to think my keyspace doesn't exist. I'm connecting to a remote
Cassandra instance configured through the environment variables
PIG_RPC_PORT and PIG_INITIAL_ADDRESS (an IP address).

I get the following output in the backend log:

**************************
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: cassandra://ActivityLog_peter/Users
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:269)
        at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
        at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
        at java.lang.Thread.run(Thread.java:636)
Caused by: java.io.IOException: Could not get input splits
        at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:127)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:258)
        ... 7 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: InvalidRequestException(why:Keyspace ActivityLog_peter does not exist)
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
        at java.util.concurrent.FutureTask.get(FutureTask.java:111)
        at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:123)
        ... 8 more
**************************

Note the "why:Keyspace ActivityLog_peter does not exist" message.
The keyspace does exist, and connections to Cassandra are permitted from the
server my Pig client runs on (I tested this with a simple pycassa script).
If I change the connection parameters, I get a different 'unable to connect'
message, as expected.
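
For anyone reproducing this, here is a minimal sketch of a client-side sanity
check. It is not the pycassa script mentioned above and uses only the Python
standard library. It reads the two environment variables named in this post
and splits the cassandra:// location from the error into keyspace and column
family, so the exact values the Pig job will use can be printed and compared
against the cluster. The function names are hypothetical, chosen for
illustration only.

```python
import os


def read_pig_cassandra_env(environ=os.environ):
    """Return (host, port) from PIG_INITIAL_ADDRESS and PIG_RPC_PORT.

    Hypothetical helper: it only reads the variables named in the post,
    it does not contact Cassandra.
    """
    host = environ.get("PIG_INITIAL_ADDRESS")
    port = environ.get("PIG_RPC_PORT")
    if not host or not port:
        raise ValueError("PIG_INITIAL_ADDRESS and PIG_RPC_PORT must both be set")
    return host, int(port)


def split_cassandra_location(location):
    """Split 'cassandra://Keyspace/ColumnFamily' into (keyspace, column_family).

    Hypothetical helper: the name is kept exactly as written, so any
    difference in spelling or case from the cluster's keyspace is visible.
    """
    prefix = "cassandra://"
    if not location.startswith(prefix):
        raise ValueError("not a cassandra:// location: %r" % location)
    keyspace, _, column_family = location[len(prefix):].partition("/")
    return keyspace, column_family
```

Printing the result of split_cassandra_location("cassandra://ActivityLog_peter/Users")
next to the keyspace list reported by the cluster makes it easy to spot a
mismatch in spelling or case, or a job that is pointed at a different
cluster than expected.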

Are there configuration settings I'm missing? Do I have to describe the
Cassandra schema locally (to Pig)?

Many Thanks,

Peter.