Posted to user@spark.apache.org by Dennis Birkholz <bi...@pubgrade.com> on 2016/01/13 13:17:59 UTC

Spark Cassandra Java Connector: records missing despite consistency=ALL

Hi all,

We use Cassandra to log event data and process it every 15 minutes with
Spark. We are using the Cassandra Java Connector for Spark.

Occasionally our Spark runs produce too few output records because no data
is returned from Cassandra for a window of several minutes of input data.
When querying the data with cqlsh, the data eventually becomes available
after multiple tries.

To solve the problem, we tried to use consistency=ALL when reading the
data in Spark. We use the
CassandraJavaUtil.javaFunctions().cassandraTable() method and have set
"spark.cassandra.input.consistency.level"="ALL" on the config when
creating the Spark context. The problem persists, although according to
http://stackoverflow.com/a/25043599 a consistency level of ONE on the
write side (which we use) and ALL on the read side should be sufficient
for data consistency.

I would really appreciate it if someone could give me a hint on how to
fix this problem, thanks!

Regards,
Dennis

P.S.:
Some information about our setup:
Cassandra 2.1.12 in a two-node configuration with replication factor=2
Spark 1.5.1
Cassandra Java Driver 2.2.0-rc3
Spark Cassandra Java Connector 2.10-1.5.0-M2
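
The consistency argument in this message can be checked with simple
arithmetic: a read is guaranteed to overlap with the latest acknowledged
write when (replicas contacted on write) + (replicas contacted on read) is
greater than the replication factor. The following is only a minimal Java
sketch of that rule, assuming the setup listed above (replication factor 2,
writes at ONE, reads at ALL). The class and method names are hypothetical
and are not part of the Cassandra driver or the Spark connector.

```java
public class ConsistencyOverlap {

    /** Replicas that must respond for a consistency level (hypothetical helper). */
    public static int replicasRequired(String level, int replicationFactor) {
        switch (level) {
            case "ONE":    return 1;
            case "TWO":    return 2;
            case "QUORUM": return replicationFactor / 2 + 1;
            case "ALL":    return replicationFactor;
            default:
                throw new IllegalArgumentException("Unsupported level: " + level);
        }
    }

    /** True when every read must touch at least one replica that acknowledged the write. */
    public static boolean readSeesAcknowledgedWrite(String writeLevel,
                                                    String readLevel,
                                                    int replicationFactor) {
        int w = replicasRequired(writeLevel, replicationFactor);
        int r = replicasRequired(readLevel, replicationFactor);
        return w + r > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 2; // replication factor from the setup described above
        System.out.println("write=ONE, read=ALL: "
                + readSeesAcknowledgedWrite("ONE", "ALL", rf));
        System.out.println("write=ONE, read=ONE: "
                + readSeesAcknowledgedWrite("ONE", "ONE", rf));
    }
}
```

With these numbers the ONE/ALL combination satisfies the rule (1 + 2 > 2)
and ONE/ONE does not (1 + 1 = 2), so missing rows under ONE/ALL would
suggest that the reads are not actually being issued at ALL.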

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Spark Cassandra Java Connector: records missing despite consistency=ALL

Posted by Alex Popescu <al...@datastax.com>.
Dennis,

You'll have a better chance of getting an answer on the
spark-cassandra-connector mailing list
https://groups.google.com/a/lists.datastax.com/forum/#!forum/spark-connector-user
or on IRC at #spark-cassandra-connector




-- 
Bests,

Alex Popescu | @al3xandru
Sen. Product Manager @ DataStax

Re: Spark Cassandra Java Connector: records missing despite consistency=ALL

Posted by Dennis Birkholz <bi...@pubgrade.com>.
Hi Anthony,

No, the logging is not done via Spark but via PHP. That does not really
matter, though, as the records are eventually there. So it is the
READ_CONSISTENCY=ALL that is not working.

By the way, it seems that using withReadConf() and setting the consistency
level there works, but I need to wait a few more days before I am sure of
that.

Kind regards,
Dennis

On 19.01.2016 at 19:39, Femi Anthony wrote:
> So is the logging to Cassandra being done via Spark?


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org



Re: Spark Cassandra Java Connector: records missing despite consistency=ALL

Posted by Femi Anthony <fe...@gmail.com>.
So is the logging to Cassandra being done via Spark?


-- 
http://www.femibyte.com/twiki5/bin/view/Tech/
http://www.nextmatrix.com
"Great spirits have always encountered violent opposition from mediocre
minds." - Albert Einstein.