Posted to commits@cassandra.apache.org by "Brandon Williams (Resolved) (JIRA)" <ji...@apache.org> on 2012/02/17 15:02:01 UTC

[jira] [Resolved] (CASSANDRA-3928) Bulk loading to cassandra with Python Hadoop Job.

     [ https://issues.apache.org/jira/browse/CASSANDRA-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams resolved CASSANDRA-3928.
-----------------------------------------

    Resolution: Won't Fix
      Reviewer:   (was: jbellis)

The only way to do this, short of reimplementing everything in Python, would be to use Jython to write the sstables via BulkOutputFormat (BOF) and stream them in.  Alternatively, you could insert the data via Thrift from CPython.
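
A minimal sketch of the second option (inserting from CPython), assuming the job runs under Hadoop Streaming and emits tab-separated row_key, column, value records. That record layout and the names parse_line, batches and load are illustrative assumptions, not part of the issue. insert_batch is a placeholder for whatever Thrift client call actually performs the write; no specific client API is implied.

```python
from itertools import islice


def parse_line(line):
    """Split one 'row_key<TAB>column<TAB>value' record into a tuple.

    The three-field layout is an assumption made for this sketch.
    """
    key, column, value = line.rstrip("\n").split("\t", 2)
    return key, column, value


def batches(records, size):
    """Yield lists of at most `size` records from any iterable."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load(lines, insert_batch, batch_size=500):
    """Parse lines and hand them to insert_batch in chunks.

    insert_batch is a hypothetical callable that takes a list of
    (key, column, value) tuples and writes them to Cassandra, for
    example by wrapping a Thrift client's batch insert.
    Returns the number of records handed over.
    """
    total = 0
    # Skip blank lines; everything else must have three fields.
    records = (parse_line(l) for l in lines if l.strip())
    for chunk in batches(records, batch_size):
        insert_batch(chunk)
        total += len(chunk)
    return total
```

In a streaming reducer this would be called as load(sys.stdin, insert_batch). Batching keeps the number of round trips down, but this path still goes through the normal write path rather than streaming sstables, so it is not equivalent to what BulkOutputFormat does.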
                
> Bulk loading to cassandra with Python Hadoop Job.
> -------------------------------------------------
>
>                 Key: CASSANDRA-3928
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3928
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Hadoop, Tools
>    Affects Versions: 1.2
>            Reporter: Samarth Gahire
>            Assignee: Brandon Williams
>            Priority: Minor
>              Labels: bulkloader, hadoop, python, sstableloader
>             Fix For: 1.2
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> I was wondering if we can have an OutputFormat to bulk load data into Cassandra from a Hadoop job written in Python.
> I have a very complex Hadoop job written in Python which processes test data and generates structured data in a sequential file. I then read this data and stream it to Cassandra using BulkOutputFormat.
> Is there any way to avoid writing to the sequential file and instead process and stream the data directly to Cassandra (with the Hadoop job written in Python)?
> What would be a possible solution for this?
