Posted to dev@gora.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/03/03 23:46:04 UTC

[jira] [Comment Edited] (GORA-416) Error when populating data into Cassandra super column - InvalidRequestException(why:supercolumn parameter is not optional for super CF sc

    [ https://issues.apache.org/jira/browse/GORA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345902#comment-14345902 ] 

Lewis John McGibbney edited comment on GORA-416 at 3/3/15 10:45 PM:
--------------------------------------------------------------------

OK so I've debugged this right through on the [FetcherJob|https://github.com/apache/nutch/blob/2.x/src/java/org/apache/nutch/fetcher/FetcherJob.java] task.
What is happening here is that we iterate through the UNION structure of the [protocolStatus field|https://github.com/apache/nutch/blob/2.x/src/gora/webpage.avsc#L58-L95] of the Nutch WebPage object, with the field value at position 1 being created as **protocolStatus_UnionIndex** and a [subColumn being created as we desire|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L590]. 
However, once this has been done, when we come to the field value at position 1 we recurse into [addOrUpdateField|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L598], where we then encounter the [RECORD which is the actual protocolStatus|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L506]. This one contains the actual value.
What happens now is that we [add this as a normal column|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L512] instead of the super column that it is defined as. This is what results in the InvalidRequestException.
Patch for master branch coming up. I will also try to provide a test case that replicates the problem.
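To make the failure mode above concrete, here is a minimal toy sketch in Java. It does not use the real Gora, Avro, or Hector classes: `Field`, `ToySuperCfClient`, both `write...` methods, and the `_record` subcolumn name are hypothetical stand-ins invented for illustration. It only models the flow described above (UNION index written as a subColumn, then recursion into the RECORD branch, which is written as a normal column and rejected). The second method is one possible direction under these assumptions, not the actual patch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are the real Gora, Avro, or Hector classes.
final class UnionWriteSketch {

    enum Kind { UNION, RECORD }

    /** Hypothetical field descriptor; a UNION carries its selected branch. */
    static final class Field {
        final String name;
        final Kind kind;
        final Field selectedBranch; // non-null only when kind == UNION

        Field(String name, Kind kind, Field selectedBranch) {
            this.name = name;
            this.kind = kind;
            this.selectedBranch = selectedBranch;
        }
    }

    /** Hypothetical client for a single super column family. */
    static final class ToySuperCfClient {
        final List<String> writes = new ArrayList<>();

        void addSubColumn(String superColumn, String subColumn) {
            writes.add(superColumn + "/" + subColumn);
        }

        // Stands in for the server rejecting a write that names no super column.
        void addColumn(String column) {
            throw new IllegalStateException(
                "supercolumn parameter is not optional for super CF (column " + column + ")");
        }
    }

    // The flow as described: UNION writes its index as a subcolumn and recurses,
    // then the RECORD branch is written as a normal column, which is rejected.
    static void writeAsDescribed(ToySuperCfClient client, Field field) {
        switch (field.kind) {
            case UNION:
                client.addSubColumn(field.name, field.name + "_UnionIndex");
                writeAsDescribed(client, field.selectedBranch);
                break;
            case RECORD:
                client.addColumn(field.name);
                break;
        }
    }

    // Hypothetical alternative (not the actual patch): the RECORD branch stays
    // inside the super column. The "_record" suffix is made up for this sketch.
    static void writeKeepingSuperColumn(ToySuperCfClient client, Field field) {
        switch (field.kind) {
            case UNION:
                client.addSubColumn(field.name, field.name + "_UnionIndex");
                writeKeepingSuperColumn(client, field.selectedBranch);
                break;
            case RECORD:
                client.addSubColumn(field.name, field.name + "_record");
                break;
        }
    }

    public static void main(String[] args) {
        Field record = new Field("protocolStatus", Kind.RECORD, null);
        Field union = new Field("protocolStatus", Kind.UNION, record);

        ToySuperCfClient failing = new ToySuperCfClient();
        try {
            writeAsDescribed(failing, union);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("writes before rejection: " + failing.writes);

        ToySuperCfClient ok = new ToySuperCfClient();
        writeKeepingSuperColumn(ok, union);
        System.out.println("writes with super column kept: " + ok.writes);
    }
}
```

In this toy model the union index subColumn is written successfully before the rejection, which matches the trace below where the failing call sits at the RECORD branch of the recursion.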



> Error when populating data into Cassandra super column - InvalidRequestException(why:supercolumn parameter is not optional for super CF sc
> ------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GORA-416
>                 URL: https://issues.apache.org/jira/browse/GORA-416
>             Project: Apache Gora
>          Issue Type: Bug
>          Components: gora-cassandra
>    Affects Versions: 0.6
>         Environment: Nutch 2.4-SNAPSHOT, Gora 0.6.1-SNAPSHOT, Hadoop 2.5.2, Cassandra 2.0.7
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Blocker
>             Fix For: 0.6.1
>
>
> Error when populating data into Cassandra super column.
> {code}
> lmcgibbn@LMC-032857 /usr/local/2webgui/runtime/local(master) $ ./bin/nutch fetch 1425410774-370456822
> FetcherJob: starting at 2015-03-03 11:27:57
> FetcherJob: batchId: 1425410774-370456822
> FetcherJob: threads: 10
> FetcherJob: parsing: false
> FetcherJob: resuming: false
> FetcherJob : timelimit set for : -1
> 2015-03-03 11:27:58.101 java[3267:1903] Unable to load realm info from SCDynamicStore
> Using queue mode : byHost
> Fetcher: threads: 10
> QueueFeeder finished: total 1 records. Hit by time limit :0
> fetching http://nutch.apache.org/ (queue crawl delay=5000ms)
> -finishing thread FetcherThread1, activeThreads=1
> -finishing thread FetcherThread2, activeThreads=1
> -finishing thread FetcherThread3, activeThreads=1
> -finishing thread FetcherThread4, activeThreads=1
> -finishing thread FetcherThread5, activeThreads=1
> -finishing thread FetcherThread6, activeThreads=1
> -finishing thread FetcherThread7, activeThreads=1
> -finishing thread FetcherThread8, activeThreads=1
> Fetcher: throughput threshold: -1
> -finishing thread FetcherThread9, activeThreads=1
> Fetcher: throughput threshold sequence: 5
> -finishing thread FetcherThread0, activeThreads=0
> 0/0 spinwaiting/active, 1 pages, 0 errors, 0.2 0 pages/s, 82 82 kb/s, 0 URLs in 0 queues
> -activeThreads=0
> me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:supercolumn parameter is not optional for super CF sc)
> 	at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
> 	at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:260)
> 	at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
> 	at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
> 	at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:69)
> 	at org.apache.gora.cassandra.store.HectorUtils.insertColumn(HectorUtils.java:46)
> 	at org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:293)
> 	at org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:512)
> 	at org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:598)
> 	at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:316)
> 	at org.apache.gora.cassandra.store.CassandraStore.close(CassandraStore.java:160)
> 	at org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:56)
> 	at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
> 	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: InvalidRequestException(why:supercolumn parameter is not optional for super CF sc)
> 	at org.apache.cassandra.thrift.Cassandra$batch_mutate_result$batch_mutate_resultStandardScheme.read(Cassandra.java:28082)
> 	at org.apache.cassandra.thrift.Cassandra$batch_mutate_result$batch_mutate_resultStandardScheme.read(Cassandra.java:28068)
> 	at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:28002)
> 	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
> 	at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:1060)
> 	at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1046)
> 	at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:246)
> 	at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:243)
> 	at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:104)
> 	at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:253)
> 	... 19 more
> {code}


