Posted to user@flink.apache.org by Piotr Nowojski <pi...@ververica.com> on 2020/02/28 13:19:32 UTC

Re: Batch reading from Cassandra

Hi,

I’m afraid that we don’t have any native support for reading from Cassandra at the moment. The only things I could find are streaming sinks [1][2].
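
If you need batch reads today, one workaround could be to roll your own InputFormat on top of the DataStax Java driver and split the scan by token ranges, so that each parallel subtask reads a disjoint slice of the ring. The sketch below is only an illustration, not a shipped connector: the class name TokenRangeInputFormat and the columns id/payload are made up, and it assumes the 3.x DataStax driver on the classpath.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import org.apache.flink.api.common.io.GenericInputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.core.io.GenericInputSplit;

import java.io.IOException;
import java.math.BigInteger;
import java.util.Iterator;

// Hypothetical sketch: scans a table by Murmur3 token ranges,
// one disjoint slice of the ring per parallel input split.
public class TokenRangeInputFormat extends GenericInputFormat<Tuple2<Long, String>> {

    private final String host;
    private final String table; // fully qualified, e.g. "ks.mytable"
    private final String pk;    // partition key column

    private transient Cluster cluster;
    private transient Session session;
    private transient Iterator<Row> rows;

    public TokenRangeInputFormat(String host, String table, String pk) {
        this.host = host;
        this.table = table;
        this.pk = pk;
    }

    @Override
    public void open(GenericInputSplit split) throws IOException {
        super.open(split);
        cluster = Cluster.builder().addContactPoint(host).build();
        session = cluster.connect();

        // Carve the Murmur3 ring (-2^63 .. 2^63-1) into equal slices,
        // one per split; BigInteger avoids signed-overflow pitfalls.
        int n = split.getTotalNumberOfSplits();
        int i = split.getSplitNumber();
        BigInteger min = BigInteger.valueOf(Long.MIN_VALUE);
        BigInteger width = BigInteger.ONE.shiftLeft(64).divide(BigInteger.valueOf(n));
        long lo = min.add(width.multiply(BigInteger.valueOf(i))).longValue();
        long hi = (i == n - 1)
                ? Long.MAX_VALUE
                : min.add(width.multiply(BigInteger.valueOf(i + 1))).longValue() - 1;

        String cql = String.format(
                "SELECT * FROM %s WHERE token(%s) >= %d AND token(%s) <= %d",
                table, pk, lo, pk, hi);
        rows = session.execute(cql).iterator(); // the driver pages transparently
    }

    @Override
    public boolean reachedEnd() {
        return !rows.hasNext();
    }

    @Override
    public Tuple2<Long, String> nextRecord(Tuple2<Long, String> reuse) {
        Row r = rows.next();
        reuse.f0 = r.getLong("id");        // "id" / "payload" are made-up columns
        reuse.f1 = r.getString("payload");
        return reuse;
    }

    @Override
    public void close() throws IOException {
        if (session != null) session.close();
        if (cluster != null) cluster.close();
    }
}

You would use it like any other input format:

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(8);
DataSet<Tuple2<Long, String>> data = env.createInput(
        new TokenRangeInputFormat("cassandra-host", "ks.mytable", "id"));

Since Murmur3 spreads partitions roughly uniformly over the ring, an approximate 15% sample could be obtained by only issuing queries for about 15% of the slices; this is the same token-range idea the Spark connector builds on.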

Piotrek

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/cassandra.html
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/connect.html#further-tablesources-and-tablesinks

> On 23 Feb 2020, at 10:03, Lasse Nedergaard <la...@gmail.com> wrote:
> 
> Hi.
> 
> We would like to do some batch analytics on our data set stored in Cassandra and are looking for an efficient way to load data from a single table: not by key, but a random 15%, 50%, or 100% of the rows.
> Databricks has created an efficient way to load Cassandra data into Apache Spark; they do it by reading from the underlying SSTables in parallel.
> Do we have something similar in Flink? If not, what is the most efficient way to load all, or a random subset of, the rows from a single Cassandra table into Flink?
> 
> Any suggestions and/or recommendations are highly appreciated.
> 
> Thanks in advance
> 
> Lasse Nedergaard