Posted to user@cassandra.apache.org by "Bochkarev, Peter" <Pe...@Dell.com> on 2022/05/30 11:37:05 UTC

RE: Fetch all data from Cassandra 3.4.4

Hi guys!

We use Cassandra 3.4.4. We have 2 nodes with full replication.
We have the following issue. We use the old Java driver com.datastax.cassandra.cassandra-driver-core and can't simply upgrade.
We need to fetch all data from a table, but our driver returns only 23 000 records. The latest Java driver, com.datastax.oss.java-driver-core (https://mvnrepository.com/artifact/com.datastax.oss/java-driver-core), fetches 25 000 records.
Is there anything we can do to fix the issue other than moving to the latest driver?

We also use Spring framework 1.5.2 in our app.
Request: select * from my_table;


Internal Use - Confidential

RE: Fetch all data from Cassandra 3.4.4

Posted by "Durity, Sean R" <SE...@homedepot.com>.
A select with no where clause is not a good access pattern for Cassandra, regardless of driver version. It will not scale for large data sets or a large number of nodes.

Ideally you want to select from a single partition for each query. So, depending on the size of the rows, one answer may be to create a partition to hold the 25,000 rows. This is assuming the rows are relatively small (under 100 MB total for the partition) and that you are often dealing with the whole partition or a subset. Of course, this strategy could produce a hot spot on the cluster if there were more nodes.
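
As a rough sanity check on the single-partition idea, the arithmetic can be sketched as below. This is only an illustration: the 100 MB budget and the 25,000 rows come from this thread, while the class name, the method name, and the choice of 1 MB = 1024 * 1024 bytes are assumptions made for the example.

```java
/** Hypothetical helper: how large may the average row be for the partition to stay within budget? */
final class PartitionBudget {

    /**
     * Returns the largest average row size (in bytes, rounded down) that keeps
     * rowCount rows within partitionBudgetBytes.
     */
    static long maxAverageRowBytes(long partitionBudgetBytes, long rowCount) {
        if (partitionBudgetBytes < 0) {
            throw new IllegalArgumentException("partitionBudgetBytes must not be negative");
        }
        if (rowCount <= 0) {
            throw new IllegalArgumentException("rowCount must be positive");
        }
        return partitionBudgetBytes / rowCount;
    }
}
```

Calling PartitionBudget.maxAverageRowBytes(100L * 1024 * 1024, 25_000L) gives 4194, so the rows would need to average roughly 4 KB or less for one partition to hold all of them under that budget.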

Others might chime in with Spark-related answers for working through large data sets. If it is only 25,000 rows, that really isn't large, but Spark is one answer to the general problem of analytics-type queries (needing all rows).
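
Without Spark, one common way to read every row is to scan the table in slices of the token ring instead of issuing a single unbounded select. The sketch below only builds the slice boundaries and the query text; it does not talk to a cluster. It assumes the default Murmur3Partitioner (tokens are signed 64-bit values), and the partition key column name "id" is hypothetical, since the thread does not show the schema of my_table. The class and method names are made up for the example.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that cuts the Murmur3 token ring into contiguous slices. */
final class TokenRangeSplitter {

    /** One slice of the ring: start is exclusive, end is inclusive. */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Splits (Long.MIN_VALUE, Long.MAX_VALUE] into the given number of contiguous slices. */
    static List<Range> split(int slices) {
        if (slices < 1) {
            throw new IllegalArgumentException("slices must be >= 1");
        }
        // BigInteger avoids overflow: the full span (2^64 - 1) does not fit in a long.
        BigInteger min = BigInteger.valueOf(Long.MIN_VALUE);
        BigInteger span = BigInteger.valueOf(Long.MAX_VALUE).subtract(min);
        BigInteger n = BigInteger.valueOf(slices);

        List<Range> ranges = new ArrayList<>();
        long start = Long.MIN_VALUE;
        for (int i = 1; i <= slices; i++) {
            long end = (i == slices)
                    ? Long.MAX_VALUE
                    : min.add(span.multiply(BigInteger.valueOf(i)).divide(n)).longValueExact();
            ranges.add(new Range(start, end));
            start = end; // the next slice begins where this one ended
        }
        return ranges;
    }

    /** Builds the CQL text for one slice; table and partitionKey are supplied by the caller. */
    static String toCql(String table, String partitionKey, Range r) {
        return "SELECT * FROM " + table
                + " WHERE token(" + partitionKey + ") > " + r.start
                + " AND token(" + partitionKey + ") <= " + r.end;
    }
}
```

For example, TokenRangeSplitter.split(4) yields four slices, and toCql("my_table", "id", slice) gives one query per slice. Each query can then be run separately and the results combined. The first slice uses an exclusive lower bound at Long.MIN_VALUE, which relies on the assumption that Murmur3Partitioner never assigns that value to a real partition key.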


Sean R. Durity



INTERNAL USE
From: Bochkarev, Peter <Pe...@Dell.com>
Sent: Monday, May 30, 2022 7:37 AM
To: user@cassandra.apache.org
Cc: Thondavada, Saiprasad <Sa...@dell.com>; Pikalev, Sergey <Se...@dell.com>; Yaroslavskiy, Vladimir <Vl...@dell.com>
Subject: [EXTERNAL] RE: Fetch all data from Cassandra 3.4.4
