Posted to user@cassandra.apache.org by Dennis Birkholz <bi...@pubgrade.com> on 2016/01/13 13:17:59 UTC
Spark Cassandra Java Connector: records missing despite consistency=ALL
Hi together,
we use Cassandra to log event data and process it every 15 minutes with
Spark. We are using the Cassandra Java Connector for Spark.
Randomly, our Spark runs produce too few output records because no data
is returned from Cassandra for a several-minute window of input data.
When querying the data (with cqlsh), after multiple tries, the data
eventually becomes available.
To solve the problem, we tried to use consistency=ALL when reading the
data in Spark. We use the
CassandraJavaUtil.javaFunctions().cassandraTable() method and have set
"spark.cassandra.input.consistency.level"="ALL" on the config when
creating the Spark context. The problem persists, but according to
http://stackoverflow.com/a/25043599, using a consistency level of ONE on
the write side (which we use) and ALL on the read side should be
sufficient for data consistency.
I would really appreciate it if someone could give me a hint on how to
fix this problem, thanks!
Greets,
Dennis
P.s.:
some information about our setup:
Cassandra 2.1.12 in a two Node configuration with replication factor=2
Spark 1.5.1
Cassandra Java Driver 2.2.0-rc3
Spark Cassandra Java Connector 2.10-1.5.0-M2
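For reference, the setup described above might look roughly like this (a
sketch assuming the 1.5-era connector Java API; the host, app name, and
keyspace/table names are invented placeholders):

```java
// Sketch: read a Cassandra table with the connector's consistency
// level forced to ALL via the Spark configuration property.
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class EventCount {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("event-processing")
                // hypothetical contact point
                .set("spark.cassandra.connection.host", "127.0.0.1")
                // request CL=ALL for all connector reads
                .set("spark.cassandra.input.consistency.level", "ALL");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // "events_ks" / "events" are placeholder keyspace/table names.
        long rows = javaFunctions(sc)
                .cassandraTable("events_ks", "events")
                .count();
        System.out.println("rows read: " + rows);
        sc.stop();
    }
}
```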
Re: Spark Cassandra Java Connector: records missing despite consistency=ALL
Posted by Alex Popescu <al...@datastax.com>.
Dennis,
You'll have better chances to get an answer on the
spark-cassandra-connector mailing list
https://groups.google.com/a/lists.datastax.com/forum/#!forum/spark-connector-user
or on IRC #spark-cassandra-connector
On Wed, Jan 13, 2016 at 4:17 AM, Dennis Birkholz <bi...@pubgrade.com>
wrote:
> Hi together,
>
> we use Cassandra to log event data and process it every 15 minutes with Spark.
> We are using the Cassandra Java Connector for Spark.
>
> Randomly, our Spark runs produce too few output records because no data is
> returned from Cassandra for a several-minute window of input data. When
> querying the data (with cqlsh), after multiple tries, the data eventually
> becomes available.
>
> To solve the problem, we tried to use consistency=ALL when reading the
> data in Spark. We use the
> CassandraJavaUtil.javaFunctions().cassandraTable() method and have set
> "spark.cassandra.input.consistency.level"="ALL" on the config when creating
> the Spark context. The problem persists but according to
> http://stackoverflow.com/a/25043599 using a consistency level of ONE on
> the write side (which we use) and ALL on the READ side should be sufficient
> for data consistency.
>
> I would really appreciate it if someone could give me a hint on how to fix this
> problem, thanks!
>
> Greets,
> Dennis
>
> P.s.:
> some information about our setup:
> Cassandra 2.1.12 in a two Node configuration with replication factor=2
> Spark 1.5.1
> Cassandra Java Driver 2.2.0-rc3
> Spark Cassandra Java Connector 2.10-1.5.0-M2
>
--
Bests,
Alex Popescu | @al3xandru
Sen. Product Manager @ DataStax
Re: Spark Cassandra Java Connector: records missing despite consistency=ALL
Posted by Dennis Birkholz <bi...@pubgrade.com>.
Hi Anthony,
no, the logging is not done via Spark but via PHP. That does not really
matter, though, as the records are eventually there. So it is the
READ_CONSISTENCY=ALL that is not working.
By the way, it seems that setting the consistency level via
withReadConf() is working, but I need to wait a few more days before I
am sure of that.
Kind regards,
Dennis
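The withReadConf() approach mentioned above might look roughly like this
(a sketch only: ReadConf.fromSparkConf() and withReadConf() are from the
1.5-era connector API, but exact signatures vary by version, and the
keyspace/table names are placeholders):

```java
// Sketch: attach a ReadConf to a specific table read instead of
// relying only on the context-level configuration property.
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

import com.datastax.spark.connector.rdd.ReadConf;
import org.apache.spark.api.java.JavaSparkContext;

public class ReadWithConf {
    // sc is an already-created JavaSparkContext whose SparkConf sets
    // "spark.cassandra.input.consistency.level" to "ALL".
    static long countEvents(JavaSparkContext sc) {
        // Build a ReadConf from the Spark configuration, which should
        // carry the consistency level, then bind it to this read.
        ReadConf readConf = ReadConf.fromSparkConf(sc.getConf());
        return javaFunctions(sc)
                .cassandraTable("events_ks", "events") // placeholder names
                .withReadConf(readConf)
                .count();
    }
}
```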
On 19.01.2016 at 19:39, Femi Anthony wrote:
> So is the logging to Cassandra being done via Spark ?
>
> On Wed, Jan 13, 2016 at 7:17 AM, Dennis Birkholz <birkholz@pubgrade.com
> <ma...@pubgrade.com>> wrote:
>
> Hi together,
>
> we use Cassandra to log event data and process it every 15 minutes with
> Spark. We are using the Cassandra Java Connector for Spark.
>
> Randomly, our Spark runs produce too few output records because no
> data is returned from Cassandra for a several-minute window of
> input data. When querying the data (with cqlsh), after multiple
> tries, the data eventually becomes available.
>
> To solve the problem, we tried to use consistency=ALL when reading
> the data in Spark. We use the
> CassandraJavaUtil.javaFunctions().cassandraTable() method and have
> set "spark.cassandra.input.consistency.level"="ALL" on the config
> when creating the Spark context. The problem persists but according
> to http://stackoverflow.com/a/25043599 using a consistency level of
> ONE on the write side (which we use) and ALL on the READ side should
> be sufficient for data consistency.
>
> I would really appreciate it if someone could give me a hint on how to fix
> this problem, thanks!
>
> Greets,
> Dennis
>
> P.s.:
> some information about our setup:
> Cassandra 2.1.12 in a two Node configuration with replication factor=2
> Spark 1.5.1
> Cassandra Java Driver 2.2.0-rc3
> Spark Cassandra Java Connector 2.10-1.5.0-M2
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> <ma...@spark.apache.org>
> For additional commands, e-mail: user-help@spark.apache.org
> <ma...@spark.apache.org>
>
>
>
>
> --
> http://www.femibyte.com/twiki5/bin/view/Tech/
> http://www.nextmatrix.com
> "Great spirits have always encountered violent opposition from mediocre
> minds." - Albert Einstein.
Re: Spark Cassandra Java Connector: records missing despite consistency=ALL
Posted by Femi Anthony <fe...@gmail.com>.
So is the logging to Cassandra being done via Spark ?
On Wed, Jan 13, 2016 at 7:17 AM, Dennis Birkholz <bi...@pubgrade.com>
wrote:
> Hi together,
>
> we use Cassandra to log event data and process it every 15 minutes with Spark.
> We are using the Cassandra Java Connector for Spark.
>
> Randomly, our Spark runs produce too few output records because no data is
> returned from Cassandra for a several-minute window of input data. When
> querying the data (with cqlsh), after multiple tries, the data eventually
> becomes available.
>
> To solve the problem, we tried to use consistency=ALL when reading the
> data in Spark. We use the
> CassandraJavaUtil.javaFunctions().cassandraTable() method and have set
> "spark.cassandra.input.consistency.level"="ALL" on the config when creating
> the Spark context. The problem persists but according to
> http://stackoverflow.com/a/25043599 using a consistency level of ONE on
> the write side (which we use) and ALL on the READ side should be sufficient
> for data consistency.
>
> I would really appreciate it if someone could give me a hint on how to fix this
> problem, thanks!
>
> Greets,
> Dennis
>
> P.s.:
> some information about our setup:
> Cassandra 2.1.12 in a two Node configuration with replication factor=2
> Spark 1.5.1
> Cassandra Java Driver 2.2.0-rc3
> Spark Cassandra Java Connector 2.10-1.5.0-M2
>
--
http://www.femibyte.com/twiki5/bin/view/Tech/
http://www.nextmatrix.com
"Great spirits have always encountered violent opposition from mediocre
minds." - Albert Einstein.