Posted to user@flink.apache.org by Lasse Nedergaard <la...@gmail.com> on 2020/02/23 09:03:06 UTC

Batch reading from Cassandra

Hi.

We would like to run some batch analytics on our data set stored in Cassandra and are looking for an efficient way to load data from a single table: not by key, but a random 15%, 50% or 100% of the data.
Databricks has created an efficient way to load Cassandra data into Apache Spark; they do it by reading the underlying SSTables in parallel.
Do we have something similar in Flink? If not, what is the most efficient way to load all, or a large random sample, of the data from a single Cassandra table into Flink?

Any suggestions and/or recommendations are highly appreciated.

Thanks in advance

Lasse Nedergaard

Re: Batch reading from Cassandra

Posted by Piotr Nowojski <pi...@ververica.com>.

Hi,

I’m afraid that we don’t have any native support for reading from Cassandra at the moment. The only things I could find are streaming sinks [1][2].

Piotrek

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/cassandra.html
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/connect.html#further-tablesources-and-tablesinks
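
Without a native source, one common way to read a whole table in parallel is to split the partitioner's token ring into contiguous ranges and run one range query per parallel reader. The following is only a minimal sketch of that idea, not something proposed in this thread: it assumes the cluster uses Murmur3Partitioner (tokens span the signed 64-bit range), and the table name `ks.events` and partition key `id` are hypothetical placeholders. It only builds the CQL strings; executing them with a driver and feeding the rows into Flink is left out.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class TokenRangeSplitter {
    // Assumption: Murmur3Partitioner, whose tokens cover the signed 64-bit range.
    private static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    private static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /** Splits the full token ring into `splits` contiguous {start, end} pairs. */
    public static List<long[]> split(int splits) {
        if (splits < 1) {
            throw new IllegalArgumentException("splits must be >= 1");
        }
        BigInteger span = MAX.subtract(MIN); // 2^64 - 1, too large for a long
        BigInteger n = BigInteger.valueOf(splits);
        List<long[]> ranges = new ArrayList<>();
        BigInteger start = MIN;
        for (int i = 1; i <= splits; i++) {
            // The last range always ends exactly at MAX so nothing is dropped.
            BigInteger end = (i == splits)
                    ? MAX
                    : MIN.add(span.multiply(BigInteger.valueOf(i)).divide(n));
            ranges.add(new long[] {start.longValueExact(), end.longValueExact()});
            start = end; // the next range starts where this one ended
        }
        return ranges;
    }

    /**
     * Builds a range query. The first range uses ">=" so the lowest token is
     * included; later ranges use ">" so adjacent ranges do not overlap.
     */
    public static String query(String table, String partitionKey, long[] range, boolean first) {
        String lower = first ? ">=" : ">";
        return "SELECT * FROM " + table
                + " WHERE token(" + partitionKey + ") " + lower + " " + range[0]
                + " AND token(" + partitionKey + ") <= " + range[1];
    }

    public static void main(String[] args) {
        List<long[]> ranges = split(4);
        for (int i = 0; i < ranges.size(); i++) {
            // "ks.events" and "id" are hypothetical names used for illustration.
            System.out.println(query("ks.events", "id", ranges.get(i), i == 0));
        }
    }
}
```

Each generated query could then be handed to one parallel reader. Because Murmur3 spreads partition keys roughly uniformly over the ring, reading only a subset of the ranges should give approximately that fraction of the partitions, which may be enough for the 15% or 50% samples asked about above.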