Posted to dev@usergrid.apache.org by "Michael Russo (JIRA)" <ji...@apache.org> on 2015/07/08 07:05:04 UTC

[jira] [Commented] (USERGRID-785) Unable to run 2.0 to 2.1 data migration on large data set.

    [ https://issues.apache.org/jira/browse/USERGRID-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617979#comment-14617979 ] 

Michael Russo commented on USERGRID-785:
----------------------------------------

The root cause of this issue predates the migration: it arose when the data was loaded into Usergrid via 2.0.  The failure is due to a large number of tombstones in the Graph_Source_Node_Edges column family.  These tombstones were created by the internal shard implementation in Usergrid.  Entities were moved in Cassandra during shard compaction, causing a large number of deletes (more than 100k, which is the default tombstone_failure_threshold in Cassandra).  Example log statements showing that the entities were moved:

{code}
2015-07-01 18:28:28,954 [graphTaskExecutor-1] INFO  org.apache.usergrid.persistence.graph.serialization.impl.shard.impl.ShardGroupCompactionImpl- Finished compacting [Shard{shardIndex=1435770644766002, createdTime=1435771087041, compacted=true}] shards and moved 74009 edges
{code}

{code}
2015-07-01 18:28:28,755 [graphTaskExecutor-1] INFO  org.apache.usergrid.persistence.graph.serialization.impl.shard.impl.ShardGroupCompactionImpl- Finished compacting [Shard{shardIndex=1435770644766002, createdTime=1435771087041, compacted=true}] shards and moved 58072 edges
{code}
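As a rough illustration of why these moves are enough to trip the threshold, the counts in the two log statements above can be summed and compared with the 100k default. This is only a sketch: it assumes each moved edge leaves at least one tombstone behind and that those tombstones are scanned by the same query. The function and constant names are hypothetical and not part of Usergrid.

```python
import re

# Matches the tail of the ShardGroupCompactionImpl INFO lines quoted above.
MOVED_EDGES = re.compile(r"shards and moved (\d+) edges")

# Default tombstone_failure_threshold cited in this comment.
TOMBSTONE_FAILURE_THRESHOLD = 100_000

# Message portions of the two log statements quoted above.
SAMPLE_LINES = [
    "Finished compacting [Shard{shardIndex=1435770644766002, "
    "createdTime=1435771087041, compacted=true}] shards and moved 74009 edges",
    "Finished compacting [Shard{shardIndex=1435770644766002, "
    "createdTime=1435771087041, compacted=true}] shards and moved 58072 edges",
]


def total_moved_edges(log_lines):
    """Sum the 'moved N edges' counts found in the given log lines."""
    total = 0
    for line in log_lines:
        match = MOVED_EDGES.search(line)
        if match:
            total += int(match.group(1))
    return total


moved = total_moved_edges(SAMPLE_LINES)
# Assumption: one tombstone per moved edge, all read by one slice query.
print(moved, moved > TOMBSTONE_FAILURE_THRESHOLD)
```

Under that assumption, the two compaction runs shown here would by themselves account for more deletes than the failure threshold allows.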

To move past this, the workaround was to update the Graph_Source_Node_Edges column family in Cassandra to use SizeTieredCompactionStrategy and to set gc_grace to something low, such as 60 seconds.  After this update was applied and the gc_grace period had passed, the following command was run on the Cassandra nodes to manually force a compaction, which removed the tombstones:

{code}
nodetool compact <yourKeyspaceName> Graph_Source_Node_Edges
{code}
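
One way to check whether the forced compaction helped is to scan the Cassandra log for further occurrences of the "Scanned over ... tombstones" error quoted in the issue description below. The following is a minimal sketch, assuming the log lines keep the format shown in this issue; the function name is hypothetical, and the lines would normally be read from the Cassandra system log on each node.

```python
import re
from collections import Counter

# Matches the SliceQueryFilter error quoted in the issue description.
TOMBSTONE_ABORT = re.compile(
    r"Scanned over (\d+) tombstones in ([\w.]+); query aborted"
)


def count_tombstone_aborts(log_lines):
    """Count aborted-query errors per keyspace.column_family name."""
    counts = Counter()
    for line in log_lines:
        match = TOMBSTONE_ABORT.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# Lines taken from the Cassandra log excerpt in the issue description.
SAMPLE_LOG = [
    "ERROR [ReadStage:16] 2015-07-02 23:33:29,720 SliceQueryFilter.java (line 206) "
    "Scanned over 100000 tombstones in ug_migrate_test.Graph_Source_Node_Edges; "
    "query aborted (see tombstone_failure_threshold)",
    "ERROR [ReadStage:27] 2015-07-02 23:33:39,723 SliceQueryFilter.java (line 206) "
    "Scanned over 100000 tombstones in ug_migrate_test.Graph_Source_Node_Edges; "
    "query aborted (see tombstone_failure_threshold)",
    "ERROR [ReadStage:27] 2015-07-02 23:33:39,723 CassandraDaemon.java (line 258) "
    "Exception in thread Thread[ReadStage:27,5,main]",
]

for table, hits in count_tombstone_aborts(SAMPLE_LOG).items():
    print(table, hits)
```

If the count for Graph_Source_Node_Edges stops growing once the compaction has finished, that suggests the tombstones are no longer being scanned. Restricting the scan to lines written after the compaction is left out here for brevity.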

> Unable to run 2.0 to 2.1 data migration on large data set.
> ----------------------------------------------------------
>
>                 Key: USERGRID-785
>                 URL: https://issues.apache.org/jira/browse/USERGRID-785
>             Project: Usergrid
>          Issue Type: Bug
>            Reporter: Michael Russo
>
> Attempted to run the entity data migration (UG 2.0 to UG 2.1) with an instance having 1.1 million entities in a single collection, within a single application.  This causes the following exception in Cassandra:
> {code}
> ERROR [ReadStage:16] 2015-07-02 23:33:29,720 SliceQueryFilter.java (line 206) Scanned over 100000 tombstones in ug_migrate_test.Graph_Source_Node_Edges; query aborted (see tombstone_failure_threshold)
> ERROR [ReadStage:27] 2015-07-02 23:33:39,723 SliceQueryFilter.java (line 206) Scanned over 100000 tombstones in ug_migrate_test.Graph_Source_Node_Edges; query aborted (see tombstone_failure_threshold)
> ERROR [ReadStage:27] 2015-07-02 23:33:39,723 CassandraDaemon.java (line 258) Exception in thread Thread[ReadStage:27,5,main]
> java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
> 	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2016)
> 	at 
> {code}
> which results in the following exception in Usergrid:
> {code}
> 2015-07-02 23:33:59,680 [Index migrate data formats] ERROR org.apache.usergrid.rest.MigrateResource- Unable to migrate data
> java.lang.RuntimeException: Unable to connect to casandra
> 	at org.apache.usergrid.persistence.core.astyanax.MultiRowColumnIterator.advance(MultiRowColumnIterator.java:190)
> 	at org.apache.usergrid.persistence.core.astyanax.MultiRowColumnIterator.hasNext(MultiRowColumnIterator.java:122)
> 	at org.apache.usergrid.persistence.graph.serialization.impl.shard.impl.ShardsColumnIterator.hasNext(ShardsColumnIterator.java:65)
> 	at org.apache.usergrid.persistence.graph.serialization.impl.shard.impl.ShardGroupColumnIterator.advance(ShardGroupColumnIterator.java:120)
> 	at org.apache.usergrid.persistence.graph.serialization.impl.shard.impl.ShardGroupColumnIterator.hasNext(ShardGroupColumnIterator.java:68)
> 	at org.apache.usergrid.persistence.core.rx.ObservableIterator.call(ObservableIterator.java:66)
> 	at org.apache.usergrid.persistence.core.rx.ObservableIterator.call(ObservableIterator.java:38)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable.unsafeSubscribe(Observable.java:7495)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.handleNewSource(OperatorMerge.java:215)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.onNext(OperatorMerge.java:185)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.onNext(OperatorMerge.java:120)
> 	at rx.internal.operators.OperatorMap$1.onNext(OperatorMap.java:55)
> 	at rx.internal.operators.OperatorDoOnEach$1.onNext(OperatorDoOnEach.java:84)
> 	at org.apache.usergrid.persistence.core.rx.ObservableIterator.call(ObservableIterator.java:71)
> 	at org.apache.usergrid.persistence.core.rx.ObservableIterator.call(ObservableIterator.java:38)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable.unsafeSubscribe(Observable.java:7495)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.handleNewSource(OperatorMerge.java:215)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.onNext(OperatorMerge.java:185)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.onNext(OperatorMerge.java:120)
> 	at rx.internal.operators.OnSubscribeFromIterable$IterableProducer.request(OnSubscribeFromIterable.java:96)
> 	at rx.Subscriber.setProducer(Subscriber.java:177)
> 	at rx.internal.operators.OnSubscribeFromIterable.call(OnSubscribeFromIterable.java:47)
> 	at rx.internal.operators.OnSubscribeFromIterable.call(OnSubscribeFromIterable.java:33)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable.unsafeSubscribe(Observable.java:7495)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.handleNewSource(OperatorMerge.java:215)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.onNext(OperatorMerge.java:185)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.onNext(OperatorMerge.java:120)
> 	at rx.internal.operators.OperatorMap$1.onNext(OperatorMap.java:55)
> 	at rx.internal.operators.OperatorMerge$InnerSubscriber.emit(OperatorMerge.java:676)
> 	at rx.internal.operators.OperatorMerge$InnerSubscriber.onNext(OperatorMerge.java:586)
> 	at rx.internal.operators.OperatorMerge$InnerSubscriber.emit(OperatorMerge.java:676)
> 	at rx.internal.operators.OperatorMerge$InnerSubscriber.onNext(OperatorMerge.java:586)
> 	at rx.internal.operators.OperatorMap$1.onNext(OperatorMap.java:55)
> 	at rx.internal.operators.OperatorFilter$1.onNext(OperatorFilter.java:54)
> 	at rx.internal.operators.OperatorDoOnEach$1.onNext(OperatorDoOnEach.java:84)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.handleScalarSynchronousObservableWithRequestLimits(OperatorMerge.java:280)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.handleScalarSynchronousObservable(OperatorMerge.java:243)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.onNext(OperatorMerge.java:176)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.onNext(OperatorMerge.java:120)
> 	at rx.internal.operators.OperatorMap$1.onNext(OperatorMap.java:55)
> 	at rx.internal.operators.OperatorDoOnEach$1.onNext(OperatorDoOnEach.java:84)
> 	at org.apache.usergrid.persistence.collection.impl.EntityCollectionManagerImpl$1.call(EntityCollectionManagerImpl.java:248)
> 	at org.apache.usergrid.persistence.collection.impl.EntityCollectionManagerImpl$1.call(EntityCollectionManagerImpl.java:240)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable.unsafeSubscribe(Observable.java:7495)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.handleNewSource(OperatorMerge.java:215)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.onNext(OperatorMerge.java:185)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.onNext(OperatorMerge.java:120)
> 	at rx.internal.operators.OperatorMap$1.onNext(OperatorMap.java:55)
> 	at rx.internal.operators.OperatorDoOnEach$1.onNext(OperatorDoOnEach.java:84)
> 	at rx.internal.operators.OperatorMerge$InnerSubscriber.emit(OperatorMerge.java:676)
> 	at rx.internal.operators.OperatorMerge$InnerSubscriber.onNext(OperatorMerge.java:586)
> 	at rx.internal.operators.OperatorFilter$1.onNext(OperatorFilter.java:54)
> 	at rx.internal.operators.OnSubscribeFromIterable$IterableProducer.request(OnSubscribeFromIterable.java:96)
> 	at rx.Subscriber.setProducer(Subscriber.java:177)
> 	at rx.Subscriber.setProducer(Subscriber.java:171)
> 	at rx.internal.operators.OnSubscribeFromIterable.call(OnSubscribeFromIterable.java:47)
> 	at rx.internal.operators.OnSubscribeFromIterable.call(OnSubscribeFromIterable.java:33)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable.unsafeSubscribe(Observable.java:7495)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.handleNewSource(OperatorMerge.java:215)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.onNext(OperatorMerge.java:185)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.onNext(OperatorMerge.java:120)
> 	at rx.internal.operators.OperatorMap$1.onNext(OperatorMap.java:55)
> 	at rx.internal.operators.OperatorBufferWithSize$1.onCompleted(OperatorBufferWithSize.java:119)
> 	at org.apache.usergrid.persistence.core.rx.ObservableIterator.call(ObservableIterator.java:75)
> 	at org.apache.usergrid.persistence.core.rx.ObservableIterator.call(ObservableIterator.java:38)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable.unsafeSubscribe(Observable.java:7495)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.handleNewSource(OperatorMerge.java:215)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.onNext(OperatorMerge.java:185)
> 	at rx.internal.operators.OperatorMerge$MergeSubscriber.onNext(OperatorMerge.java:120)
> 	at rx.internal.operators.OnSubscribeFromIterable$IterableProducer.request(OnSubscribeFromIterable.java:96)
> 	at rx.Subscriber.setProducer(Subscriber.java:177)
> 	at rx.internal.operators.OnSubscribeFromIterable.call(OnSubscribeFromIterable.java:47)
> 	at rx.internal.operators.OnSubscribeFromIterable.call(OnSubscribeFromIterable.java:33)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable$1.call(Observable.java:144)
> 	at rx.Observable$1.call(Observable.java:136)
> 	at rx.Observable.unsafeSubscribe(Observable.java:7495)
> 	at rx.internal.operators.OperatorSubscribeOn$1$1.call(OperatorSubscribeOn.java:62)
> 	at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: com.netflix.astyanax.connectionpool.exceptions.OperationTimeoutException: OperationTimeoutException: [host=10.16.4.135(10.16.4.135):9160, latency=5001(30012), attempts=6]TimedOutException()
> 	at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:171)
> 	at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65)
> 	at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28)
> 	at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151)
> 	at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:119)
> 	at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:338)
> 	at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$4.execute(ThriftColumnFamilyQueryImpl.java:532)
> 	at org.apache.usergrid.persistence.core.astyanax.MultiRowColumnIterator.advance(MultiRowColumnIterator.java:187)
> 	... 146 more
> Caused by: TimedOutException()
> 	at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.read(Cassandra.java:10480)
> 	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
> 	at org.apache.cassandra.thrift.Cassandra$Client.recv_multiget_slice(Cassandra.java:673)
> 	at org.apache.cassandra.thrift.Cassandra$Client.multiget_slice(Cassandra.java:657)
> 	at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$4$1.internalExecute(ThriftColumnFamilyQueryImpl.java:538)
> 	at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$4$1.internalExecute(ThriftColumnFamilyQueryImpl.java:535)
> 	at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
> 	... 152 more
> {code}


