Posted to user@cassandra.apache.org by pr...@wipro.com on 2013/04/29 15:05:37 UTC

How to use Write Consistency 'ANY' with SSTABLELOADER - DSE Cassandra 1.1.9

Hi All,

We have a requirement to load approximately 10 million records, each record with approximately 100 columns. We are planning to use the Bulk-loader program to convert the data into SSTables and then load them using SSTABLELOADER.

Everything works fine when all nodes are up and running, and the performance is very good. However, when a node is down, the streaming fails and the operation stops. We then have to run SSTABLELOADER with the '-i' option to exclude the node that is down. I was wondering if we can enforce a Consistency Level of 'ANY' with SSTABLELOADER as well.

We tried specifying the consistency level 'ANY' at the Keyspace level. However, SSTABLELOADER does not use it. It still expects all the nodes to be available.

1. Please can anyone suggest how we can enforce Write Consistency level when using SSTABLELOADER?

2. Would Sqoop be a good option in these scenarios? Are there any performance stats for loading data into Cassandra with Sqoop?

Environment:

Cassandra 1.1.9 provided as part of DSE 3.0
3 Nodes
Replication Factor – 3
Consistency Level – ANY

Regards,
Praveen

Wipro Limited (Company Regn No in UK FC 019088)
Address: Level 2, West wing, 3 Sheldon Square, London W2 6PS, United Kingdom. Tel +44 20 7432 8500 Fax: +44 20 7286 5703 

VAT Number: 563 1964 27

(Branch of Wipro Limited (Incorporated in India at Bangalore with limited liability vide Reg no L99999KA1945PLC02800 with Registrar of Companies at Bangalore, India. Authorized share capital  Rs 5550 mn))


Re: How to use Write Consistency 'ANY' with SSTABLELOADER - DSE Cassandra 1.1.9

Posted by Robert Coli <rc...@eventbrite.com>.
On Mon, Apr 29, 2013 at 1:17 PM, aaron morton <aa...@thelastpickle.com> wrote:
> Bulk Loader does not use CL, it's more like a repair / bootstrap.
> If you have to skip a node then use repair.

The bulk loader ("sstableloader") can ignore replica nodes via the -i option:

./src/java/org/apache/cassandra/tools/BulkLoader.java
"
    options.addOption("i", IGNORE_NODES_OPTION, "NODES",
                      "don't stream to this (comma separated) list of nodes");
"
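
To picture what an ignore list like this does, here is a minimal sketch. It is not the Cassandra source: the class name, method names and IP addresses are hypothetical, and it only illustrates the idea of splitting a comma separated host list and dropping those hosts from the set of stream targets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not taken from BulkLoader.java.
class IgnoreNodesSketch {

    // Split a comma separated host list, trimming spaces and skipping blanks.
    static Set<String> parseIgnoreList(String csv) {
        Set<String> ignored = new LinkedHashSet<>();
        if (csv == null) {
            return ignored;
        }
        for (String host : csv.split(",")) {
            String trimmed = host.trim();
            if (!trimmed.isEmpty()) {
                ignored.add(trimmed);
            }
        }
        return ignored;
    }

    // Keep only the replicas that are not on the ignore list, preserving order.
    static List<String> streamTargets(List<String> replicas, Set<String> ignored) {
        List<String> targets = new ArrayList<>();
        for (String replica : replicas) {
            if (!ignored.contains(replica)) {
                targets.add(replica);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // Made-up addresses for a 3 node ring where the third node is down.
        List<String> replicas = Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3");
        Set<String> ignored = parseIgnoreList("10.0.0.3");
        System.out.println(streamTargets(replicas, ignored)); // prints [10.0.0.1, 10.0.0.2]
    }
}
```

The skipped node receives nothing from the load, so it would still need the data by some other route (for example a repair) once it is back.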

This is one of the major differences between the bulkLoad JMX call and
sstableloader: if bulkLoad fails on a single replica, you have to send
to all replicas again.

Also of note, it is confusing that the JMX "bulkLoad" operation actually
uses the SSTableLoader class and not the BulkLoader class, while the
sstableloader tool uses the BulkLoader class. Perhaps this line in
BulkLoader.java should be changed from:

"    private static final String TOOL_NAME = "sstableloader"; "
to
"    private static final String TOOL_NAME = "bulkloader"; "

?

=Rob

Re: How to use Write Consistency 'ANY' with SSTABLELOADER - DSE Cassandra 1.1.9

Posted by aaron morton <aa...@thelastpickle.com>.
Is this a once off data load or something you need to do regularly?

One option you have with RF3 and 3 Nodes is to place a copy of all the SSTables on each node and use nodetool refresh to directly load the sstables into the node without any streaming. 

> 1. Please can anyone suggest how we can enforce Write Consistency level when using SSTABLELOADER?
Bulk Loader does not use CL, it's more like a repair / bootstrap. 
If you have to skip a node then use repair. 
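
As a rough sketch of the "skip the node, then repair" approach, the helper below only builds the command lines as strings; it does not run anything. The class and method names, host addresses, keyspace name and directory are all hypothetical. The -i flag is the one discussed in this thread; the -d flag (initial contact hosts) and the `nodetool -h <host> repair <keyspace>` form are assumptions that should be checked against the help output of your Cassandra version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: assembles command strings for an operator to review.
class LoadPlanSketch {

    // sstableloader call that contacts the live hosts and skips the down ones.
    // Assumption: -d takes a comma separated list of initial hosts.
    static String loaderCommand(List<String> liveHosts, List<String> downHosts, String sstableDir) {
        StringBuilder cmd = new StringBuilder("sstableloader -d ");
        cmd.append(String.join(",", liveHosts));
        if (!downHosts.isEmpty()) {
            cmd.append(" -i ").append(String.join(",", downHosts));
        }
        return cmd.append(' ').append(sstableDir).toString();
    }

    // One repair command per skipped node, to be run after the node is back up.
    static List<String> repairCommands(List<String> downHosts, String keyspace) {
        List<String> commands = new ArrayList<>();
        for (String host : downHosts) {
            commands.add("nodetool -h " + host + " repair " + keyspace);
        }
        return commands;
    }

    public static void main(String[] args) {
        // Made-up values: two live nodes, one down node, a staging directory.
        List<String> live = Arrays.asList("10.0.0.1", "10.0.0.2");
        List<String> down = Arrays.asList("10.0.0.3");
        System.out.println(loaderCommand(live, down, "/data/staging/MyKeyspace/MyCF"));
        for (String repair : repairCommands(down, "MyKeyspace")) {
            System.out.println(repair);
        }
    }
}
```

Keeping the two steps separate reflects the point above: the load itself has no consistency level, so a node that was skipped only catches up when a repair is run against it later.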

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com
