Posted to user@cassandra.apache.org by James Pirz <ja...@gmail.com> on 2012/06/27 02:07:49 UTC

bulk load problem

Dear all,

I am trying to use "sstableloader" in Cassandra 1.1.1 to bulk load some
data into a single-node cluster.
I am running the following command:

bin/sstableloader -d 192.168.100.1 /data/ssTable/tpch/tpch/

from "another" node (not the node on which Cassandra is running). The data
should be loaded into a keyspace named "tpch". I made sure that the second
node, from which I run sstableloader, has the same copy of cassandra.yaml
as the destination node.
I have put

tpch-cf0-hd-1-Data.db
tpch-cf0-hd-1-Index.db

under the path I passed to sstableloader.

But I am getting the following error:

Could not retrieve endpoint ranges:

Any hints?

Thanks in advance,

James

Re: bulk load problem

Posted by Pushpalanka Jayawardhana <pu...@gmail.com>.
Hi,

Thanks, Brian, for your code, and thanks, Yuki.

The directory structure was one problem, and I corrected it with Yuki's
guidance. The error was still the same, though, and that turned out to be due
to a wrong Thrift rpc address. After correcting it, bulk loading was successful.
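
For anyone hitting the same thing: a quick way to rule out the rpc address is to check that a plain TCP connection to it succeeds from the machine that runs sstableloader. The sketch below is only an illustration. The class name RpcPortCheck is made up, and it assumes rpc_port is still at the default 9160 (check rpc_address and rpc_port in cassandra.yaml). A successful connect only shows that something is listening there, not that it is Thrift.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Cassandra.
final class RpcPortCheck {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, unresolvable host, and so on.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: RpcPortCheck <rpc address> [rpc port]");
            return;
        }
        String host = args[0];
        // Assumption: 9160 is the default Thrift rpc_port unless cassandra.yaml says otherwise.
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9160;
        System.out.println(host + ":" + port
                + (canConnect(host, port, 2000) ? " is reachable" : " is NOT reachable"));
    }
}
```

Run it with the same address you pass to -d. If it reports "NOT reachable", the loader has little chance of fetching the ring from that address either.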


On Mon, Jul 9, 2012 at 8:13 PM, Yuki Morishita <mo...@gmail.com> wrote:

> Because the directory structure changed in version 1.1, you have to create
> a directory like
>
> /path/to/sstables/<Keyspace name>/<ColumnFamily name>
>
> and put your sstables in it.
>
> In your case, I think it would be "/data/ssTable/tpch/tpch/cf0".
> You then have to pass that directory as the argument to sstableloader:
>
> bin/sstableloader -d 192.168.100.1 /data/ssTable/tpch/tpch/cf0
>
> Yuki
>


-- 
Pushpalanka Jayawardhana | Undergraduate | Computer Science and Engineering
University of Moratuwa

+94779716248 | http://pushpalankajaya.blogspot.com

Twitter: http://twitter.com/Pushpalanka | Slideshare:
http://www.slideshare.net/Pushpalanka

Re: bulk load problem

Posted by Yuki Morishita <mo...@gmail.com>.
Because the directory structure changed in version 1.1, you have to create a directory like

/path/to/sstables/<Keyspace name>/<ColumnFamily name>

and put your sstables in it.

In your case, I think it would be "/data/ssTable/tpch/tpch/cf0".
You then have to pass that directory as the argument to sstableloader:

bin/sstableloader -d 192.168.100.1 /data/ssTable/tpch/tpch/cf0
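
To make the layout rule concrete, here is a small sketch that checks a directory before it is handed to sstableloader. It is not part of Cassandra. The class name SstableLayoutCheck is made up, and the rule that file names start with "<Keyspace name>-<ColumnFamily name>-" is an assumption taken from the file names in this thread (tpch-cf0-hd-1-Data.db).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cassandra.
final class SstableLayoutCheck {

    /** Returns the problems found; an empty list means the layout looks consistent. */
    static List<String> check(Path dir) throws IOException {
        List<String> problems = new ArrayList<String>();
        Path abs = dir.toAbsolutePath().normalize();
        Path parent = abs.getParent();
        if (!Files.isDirectory(abs) || parent == null || parent.getFileName() == null) {
            problems.add("expected .../<Keyspace name>/<ColumnFamily name>, got: " + abs);
            return problems;
        }
        // The last path element is taken as the column family, its parent as the keyspace.
        String columnFamily = abs.getFileName().toString();
        String keyspace = parent.getFileName().toString();
        // Assumption: sstable components are named <keyspace>-<columnfamily>-...
        String prefix = keyspace + "-" + columnFamily + "-";
        int dataFiles = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(abs, "*.db")) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                if (!name.startsWith(prefix)) {
                    problems.add(name + " does not start with " + prefix);
                } else if (name.endsWith("-Data.db")) {
                    dataFiles++;
                }
            }
        }
        if (dataFiles == 0) {
            problems.add("no " + prefix + "*-Data.db file found in " + abs);
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: SstableLayoutCheck <sstable directory>");
            return;
        }
        List<String> problems = check(Paths.get(args[0]));
        if (problems.isEmpty()) {
            System.out.println("layout looks consistent");
        }
        for (String problem : problems) {
            System.out.println("problem: " + problem);
        }
    }
}
```

With the files sitting directly in /data/ssTable/tpch/tpch/, this check would take both the keyspace and the column family to be "tpch" and flag the files, which is the mismatch described above.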

Yuki




Re: bulk load problem

Posted by Brian Jeltema <br...@digitalenvoy.net>.
I couldn't get the same-host sstableloader to work either. But it's easier to use the JMX bulk-load hook that's built
into Cassandra anyway. The following is what I implemented to do this:

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import javax.management.JMX;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

import org.apache.cassandra.service.StorageServiceMBean;

public class JmxBulkLoader {

    private JMXConnector connector;
    private StorageServiceMBean storageBean;

    public JmxBulkLoader(String host, int port) throws Exception
    {
        connect(host, port);
    }

    private void connect(String host, int port) throws IOException, MalformedObjectNameException
    {
        JMXServiceURL jmxUrl = new JMXServiceURL(String.format("service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi", host, port));
        Map<String,Object> env = new HashMap<String,Object>();
        connector = JMXConnectorFactory.connect(jmxUrl, env);
        MBeanServerConnection mbeanServerConn = connector.getMBeanServerConnection();
        ObjectName name = new ObjectName("org.apache.cassandra.db:type=StorageService");
        storageBean = JMX.newMBeanProxy(mbeanServerConn, name, StorageServiceMBean.class);
    }

    public void close() throws IOException
    {
        connector.close();
    }
    
    public void bulkLoad(String path) {
        storageBean.bulkLoad(path);
    }
    
    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            throw new IllegalArgumentException("usage: paths to bulk files");
        }
        JmxBulkLoader np = new JmxBulkLoader("localhost", 7199);
        try {
            for (String arg : args) {
                np.bulkLoad(arg);
            }
        } finally {
            // Close the JMX connection even if one of the loads fails.
            np.close();
        }
    }
}
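
A possible variation, in case putting the Cassandra jars on the client classpath just for StorageServiceMBean is inconvenient: the same operation can in principle be called through the generic MBeanServerConnection.invoke API, naming the operation as a string. This is only a sketch. The class name GenericJmxBulkLoad is made up, and it assumes the bean name and the single-String bulkLoad operation are exactly as in the code above.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical variant of the loader above that needs no Cassandra classes.
final class GenericJmxBulkLoad {

    // Same bean name as used in the loader above.
    static final String STORAGE_SERVICE = "org.apache.cassandra.db:type=StorageService";

    /** Builds the standard RMI-based JMX service URL for host:port. */
    static String jmxUrl(String host, int port) {
        return String.format("service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi", host, port);
    }

    /** Invokes bulkLoad(String) by name instead of through a typed proxy. */
    static void bulkLoad(MBeanServerConnection conn, String path) throws Exception {
        conn.invoke(new ObjectName(STORAGE_SERVICE), "bulkLoad",
                new Object[] { path }, new String[] { String.class.getName() });
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            throw new IllegalArgumentException("usage: paths to bulk files");
        }
        JMXConnector connector =
                JMXConnectorFactory.connect(new JMXServiceURL(jmxUrl("localhost", 7199)));
        try {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            for (String path : args) {
                bulkLoad(conn, path);
            }
        } finally {
            connector.close();
        }
    }
}
```

Since the operation runs inside the Cassandra JVM, the path is, as far as I can tell, resolved on the node that receives the JMX call, so the sstables need to be readable there.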

On Jul 9, 2012, at 5:16 AM, Pushpalanka Jayawardhana wrote:

> Hi all,
> 
> I am facing the same problem when trying to load data into Cassandra using sstableloader. I am running a Cassandra instance on my own machine, and sstableloader is also called from the same machine. These are the steps I followed:
> 
> - get a copy of the running Cassandra instance
> - set another loopback address with "sudo ifconfig lo:0 127.0.0.2 netmask 255.0.0.0 up"
> - set listen address and rpc address of the copied Cassandra's cassandra.yaml to 127.0.0.2
> - ran "./sstableloader -d 127.0.0.2 <directory of created sstables>"
> 
> But this gives me the error 'Could not retrieve endpoint ranges: ' and nothing more.
> 
> I would be grateful for any hints on getting past this.
> What I actually want is to run sstableloader from Java code, but I couldn't get that working either, and I am still trying to understand the arguments it requires. It would be great if someone could help with either case.
> 
> Thanks in advance!
> 
> 
> 
> On Tue, Jul 3, 2012 at 5:16 AM, aaron morton <aa...@thelastpickle.com> wrote:
> Do you have the full stack trace? It will include a cause.
> 
> Cheers
> 
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 


Re: bulk load problem

Posted by Pushpalanka Jayawardhana <pu...@gmail.com>.
Hi all,

I am facing the same problem when trying to load data into Cassandra using
sstableloader. I am running a Cassandra instance on my own machine, and
sstableloader is also called from the same machine. These are the steps
I followed:


   - get a copy of the running Cassandra instance
   - set another loopback address with "sudo ifconfig lo:0 127.0.0.2
   netmask 255.0.0.0 up"
   - set listen address and rpc address of the copied Cassandra's
   cassandra.yaml to 127.0.0.2
   - ran "./sstableloader -d 127.0.0.2 <directory of created sstables>"

But this gives me the error 'Could not retrieve endpoint ranges: ' and nothing
more.

I would be grateful for any hints on getting past this.
What I actually want is to run sstableloader from Java code, but I couldn't
get that working either, and I am still trying to understand the arguments it
requires. It would be great if someone could help with either case.
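
One way to drive sstableloader from Java without touching its internals is to launch the script as a child process with exactly the arguments used on the command line. The sketch below only illustrates that idea. The class name SstableLoaderLauncher is made up, and the only flag it passes is the -d shown in the steps above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper that shells out to the sstableloader script.
final class SstableLoaderLauncher {

    /** Builds: <loaderScript> -d <host> <sstableDir> */
    static List<String> buildCommand(String loaderScript, String host, String sstableDir) {
        List<String> command = new ArrayList<String>();
        command.add(loaderScript);
        command.add("-d");
        command.add(host);
        command.add(sstableDir);
        return command;
    }

    /** Runs the command, echoes its combined output, and returns the exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout so errors are not lost
        Process process = builder.start();
        BufferedReader output =
                new BufferedReader(new InputStreamReader(process.getInputStream()));
        try {
            String line;
            while ((line = output.readLine()) != null) {
                System.out.println(line);
            }
        } finally {
            output.close();
        }
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: SstableLoaderLauncher <path to sstableloader> <host> <sstable directory>");
            return;
        }
        int exitCode = run(buildCommand(args[0], args[1], args[2]));
        System.out.println("sstableloader exited with code " + exitCode);
    }
}
```

This keeps the Java side independent of the loader's classes. Any error text the script prints ends up on the Java program's standard output.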

Thanks in advance!



On Tue, Jul 3, 2012 at 5:16 AM, aaron morton <aa...@thelastpickle.com> wrote:

> Do you have the full stack trace? It will include a cause.
>
> Cheers
>
>
>   -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>



Re: bulk load problem

Posted by aaron morton <aa...@thelastpickle.com>.
Do you have the full stack trace? It will include a cause.
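
To illustrate why the cause matters: a message that ends in a bare colon, like the one above, usually means the interesting part sits in a wrapped exception. The sketch below is generic Java, not Cassandra code. The class name CauseChain is made up, and the ConnectException in main is an invented stand-in for whatever the real cause turns out to be.

```java
// Hypothetical illustration of walking an exception's cause chain.
final class CauseChain {

    /** Renders an exception and all of its causes, one per line. */
    static String describe(Throwable top) {
        StringBuilder text = new StringBuilder();
        for (Throwable current = top; current != null; current = current.getCause()) {
            if (text.length() > 0) {
                text.append("\ncaused by: ");
            }
            text.append(current.getClass().getName())
                .append(": ")
                .append(current.getMessage());
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Invented example cause; the real one would come from the actual stack trace.
        Throwable cause = new java.net.ConnectException("Connection refused");
        Throwable wrapped = new RuntimeException("Could not retrieve endpoint ranges: ", cause);
        System.out.println(describe(wrapped));
    }
}
```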

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
