Posted to user@cassandra.apache.org by Filippo Diotalevi <fi...@ntoklo.com> on 2012/08/15 11:32:09 UTC

Migrating to a new cluster (using SSTableLoader or other approaches)

Hi,  
we are trying to use SSTableLoader to bootstrap a new 7-node cassandra (v. 1.0.10) cluster with the snapshots taken from a 3-node cassandra cluster. The new cluster is in a different data centre.  

After reading the articles at
[1] http://www.datastax.com/dev/blog/bulk-loading
[2] http://geekswithblogs.net/johnsPerfBlog/archive/2011/07/26/how-to-use-cassandrs-sstableloader.aspx

we tried to follow this procedure:  
1) we took a snapshot of our keyspaces in the old cluster and moved them to the data folder of 3 of the new machines
2) we started Cassandra in the new cluster
but we noticed that some column families were missing, and others had missing data.

After that we tried to use sstableloader:
1) we reinstalled Cassandra in the new cluster
2) we ran sstableloader (as explained in [2]) to load the keyspaces

SSTableLoader starts, but the progress is always 0 and the transfer rate is 0MB/s. Some warnings and exceptions are present in the logs:

./sstableloader /opt/analytics/analytics/
Starting client (and waiting 30 seconds for gossip) ...
Streaming revelant part of /opt/analytics/analytics/chart-hd-104-Data.db /opt/analytics/analytics/chart-hd-105-Data.db /opt/analytics/analytics/chart-hd-106-Data.db /opt/analytics/analytics/chart-hd-107-Data.db /opt/analytics/analytics/chart-hd-108-Data.db to [/1x.xx.xx.xx5, /1x.xx.xx.xx7, /1x.xx.xx.xx0, /1x.xx.xx.xx7, /1x.xx.xx.xx3, /1x.xx.xx.xx8, /1x.xx.xx.xx7]
WARN 09:02:38,534 Unable to instantiate cache provider org.apache.cassandra.cache.SerializingCacheProvider; using default org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider@5d59054d instead
WARN 09:02:38,549 Unable to instantiate cache provider org.apache.cassandra.cache.SerializingCacheProvider; using default org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider@5d59054d instead


[….]
ERROR 09:02:38,614 Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.EOFException: unable to seek to position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db (65737276 bytes) in read-only mode
at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:689)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.EOFException: unable to seek to position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db (65737276 bytes) in read-only mode
at org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:253)
at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:136)
at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more
Exception in thread "Streaming:1" java.lang.RuntimeException: java.io.EOFException: unable to seek to position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db (65737276 bytes) in read-only mode
at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:689)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.EOFException: unable to seek to position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db (65737276 bytes) in read-only mode
at org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:253)
at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:136)
at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more
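
The EOFException above reports a seek position that is larger than the file it names. As a minimal sketch for reading such lines, the Python below extracts the two numbers and reports how far the requested offset lies past the end of the file. The regular expression is an assumption based only on the message text in this trace, and `seek_overshoot` is a hypothetical helper name, not part of Cassandra.

```python
import re

# Pattern derived from the message text in the trace above (an assumption
# about its format, not a documented Cassandra log contract).
EOF_PATTERN = re.compile(
    r"unable to seek to position (?P<position>\d+) in (?P<path>\S+) "
    r"\((?P<size>\d+) bytes\)"
)


def seek_overshoot(log_line):
    """Return (path, position, size, overshoot), or None if the line does not match."""
    match = EOF_PATTERN.search(log_line)
    if match is None:
        return None
    position = int(match.group("position"))
    size = int(match.group("size"))
    # overshoot > 0 means the requested offset lies beyond the end of the file
    return match.group("path"), position, size, position - size


line = (
    "java.io.EOFException: unable to seek to position 93069003 in "
    "/opt/analytics/analytics/chart-hd-104-Data.db (65737276 bytes) in read-only mode"
)
path, position, size, overshoot = seek_overshoot(line)
print(f"{path}: offset {position} is {overshoot} bytes past the end of a {size}-byte file")
```

For the line in this thread the requested offset is 27331727 bytes past the end of the file, so the offset the loader asked for does not fit the file on disk. The sketch only makes that mismatch visible; it does not say what caused it.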





---------------


What's the correct approach to migrate data from one cluster to another? How can I troubleshoot the problem with sstableloader?  

Thanks,
--  
Filippo Diotalevi



Re: Migrating to a new cluster (using SSTableLoader or other approaches)

Posted by aaron morton <aa...@thelastpickle.com>.
> Which nodetool command are you referring to? (info, cfstats, ring,….)
My bad. I meant to write sstableloader

> Do I modify the log4j-tools.properties in $CASSANDRA_HOME/conf to set the nodetool logs to DEBUG?
You can use the --debug option with sstableloader to get a better exception message. 

Also change the logging level in log4j-tools.properties to get DEBUG messages, so we can see what's going on.
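
As a rough illustration of the logging change suggested above, the Python sketch below rewrites the root logger level in the text of a log4j 1.x properties file. It assumes the file sets the level through the standard `log4j.rootLogger=LEVEL,appender` key. The sample contents and the helper name `set_root_level` are hypothetical, and the real log4j-tools.properties in $CASSANDRA_HOME/conf may look different.

```python
import re


def set_root_level(properties_text, level="DEBUG"):
    """Return properties_text with the log4j 1.x root logger level replaced by `level`."""
    # Match only an uncommented "log4j.rootLogger=<LEVEL>" assignment and keep
    # everything after the level (the appender list) untouched.
    pattern = re.compile(r"^(\s*log4j\.rootLogger\s*=\s*)[A-Za-z]+", re.MULTILINE)
    return pattern.sub(lambda m: m.group(1) + level, properties_text, count=1)


# Hypothetical file contents, for illustration only.
sample = (
    "log4j.rootLogger=WARN,stderr\n"
    "log4j.appender.stderr=org.apache.log4j.ConsoleAppender\n"
)
print(set_root_level(sample))
```

Editing the file by hand has the same effect; the sketch only shows which line changes.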

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/08/2012, at 8:51 PM, Filippo Diotalevi <fi...@ntoklo.com> wrote:

>>> ERROR 09:02:38,614 Error in ThreadPoolExecutor
>>> java.lang.RuntimeException: java.io.EOFException: unable to seek to position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db (65737276 bytes) in read-only mode
>> 
>> 
>> This one looks like an error.
>> 
>> Can you run nodetool with DEBUG level logging and post the logs ?  
> 
> Thanks Aaron.
> Which nodetool command are you referring to? (info, cfstats, ring,….)
> Do I modify the log4j-tools.properties in $CASSANDRA_HOME/conf to set the nodetool logs to DEBUG?
> 
> Thanks,
> --  
> Filippo Diotalevi


Re: Migrating to a new cluster (using SSTableLoader or other approaches)

Posted by Filippo Diotalevi <fi...@ntoklo.com>.
> > ERROR 09:02:38,614 Error in ThreadPoolExecutor
> > java.lang.RuntimeException: java.io.EOFException: unable to seek to position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db (65737276 bytes) in read-only mode
>  
>  
> This one looks like an error.
>  
> Can you run nodetool with DEBUG level logging and post the logs ?  

Thanks Aaron.
Which nodetool command are you referring to? (info, cfstats, ring, ...)
Do I modify the log4j-tools.properties in $CASSANDRA_HOME/conf to set the nodetool logs to DEBUG?

Thanks,
--  
Filippo Diotalevi



On Wednesday, 15 August 2012 at 22:53, aaron morton wrote:

> > WARN 09:02:38,534 Unable to instantiate cache provider org.apache.cassandra.cache.SerializingCacheProvider; using default org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider@5d59054d instead
>  
> Happens when JNA is not in the path. Nothing to worry about when using the sstableloader.  
>  
> > ERROR 09:02:38,614 Error in ThreadPoolExecutor
> > java.lang.RuntimeException: java.io.EOFException: unable to seek to position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db (65737276 bytes) in read-only mode
>  
> This one looks like an error.  
>  
> Can you run nodetool with DEBUG level logging and post the logs ?  
>  
> Cheers
>  
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com




Re: Migrating to a new cluster (using SSTableLoader or other approaches)

Posted by Filippo Diotalevi <fi...@ntoklo.com>.
> > ERROR 09:02:38,614 Error in ThreadPoolExecutor
> > java.lang.RuntimeException: java.io.EOFException: unable to seek to position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db (65737276 bytes) in read-only mode
>  
>  
> This one looks like an error.
>  
> Can you run nodetool with DEBUG level logging and post the logs ?

Here are some of the errors from the sstableloader run:

DEBUG [Thread-24] 2012-08-16 22:01:02,494 FileUtils.java (line 51) Deleting chart-tmp-hd-1-CompressionInfo.db
DEBUG [Thread-24] 2012-08-16 22:01:02,495 SSTable.java (line 146) Deleted /opt/SP/data/cassandra/analytics/chart-tmp-hd-1
INFO [Thread-24] 2012-08-16 22:01:02,495 StreamInSession.java (line 144) Streaming of file /opt/test/analytics/analytics/chart-hd-2400-Data.db sections=2 progress=0/-25429537 - 0% from org.apache.cassandra.streaming.StreamInSession@e66db21 failed: requesting a retry.
ERROR [Thread-24] 2012-08-16 22:01:02,496 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-24,5,main]
java.lang.AssertionError: attempted to delete non-existing file chart-tmp-hd-1-Data.db
at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:49)
at org.apache.cassandra.streaming.IncomingStreamReader.retry(IncomingStreamReader.java:173)
at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:93)
at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:184)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:80)

DEBUG [ScheduledTasks:1] 2012-08-16 22:01:08,571 LoadBroadcaster.java (line 86) Disseminating load info ...  
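
The StreamInSession line above reports `progress=0/-25429537`, that is, a negative total. As a small sketch for spotting such lines when collecting logs, the Python below parses the progress fields and flags a negative total. The pattern is an assumption based only on the line shown in this message, and `parse_stream_progress` is a hypothetical helper name.

```python
import re

# Pattern derived from the StreamInSession line quoted above (an assumption
# about its format, not a documented Cassandra log contract).
PROGRESS_PATTERN = re.compile(
    r"Streaming of file (?P<path>\S+) .*?progress=(?P<done>-?\d+)/(?P<total>-?\d+)"
)


def parse_stream_progress(log_line):
    """Return (path, done, total, total_is_negative), or None if the line does not match."""
    match = PROGRESS_PATTERN.search(log_line)
    if match is None:
        return None
    done = int(match.group("done"))
    total = int(match.group("total"))
    return match.group("path"), done, total, total < 0


line = (
    "INFO [Thread-24] 2012-08-16 22:01:02,495 StreamInSession.java (line 144) "
    "Streaming of file /opt/test/analytics/analytics/chart-hd-2400-Data.db "
    "sections=2 progress=0/-25429537 - 0% from "
    "org.apache.cassandra.streaming.StreamInSession@e66db21 failed: requesting a retry."
)
print(parse_stream_progress(line))
```

A negative total is not a meaningful byte count, so it is a reasonable thing to flag alongside the stack trace. The sketch does not diagnose why the total is negative.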



--  
Filippo Diotalevi







Re: Migrating to a new cluster (using SSTableLoader or other approaches)

Posted by aaron morton <aa...@thelastpickle.com>.
> WARN 09:02:38,534 Unable to instantiate cache provider org.apache.cassandra.cache.SerializingCacheProvider; using default org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider@5d59054d instead
Happens when JNA is not in the path. Nothing to worry about when using the sstableloader. 

> ERROR 09:02:38,614 Error in ThreadPoolExecutor
> java.lang.RuntimeException: java.io.EOFException: unable to seek to position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db (65737276 bytes) in read-only mode
This one looks like an error. 

Can you run nodetool with DEBUG level logging and post the logs ? 

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
