Posted to user@spark.apache.org by Peter Sg <pe...@varonis.com> on 2017/02/02 09:41:44 UTC

filters Pushdown

Can the community help me figure out some details about Spark:
-	Does Spark support filter pushdown for these types:
  o	Int/long
  o	DateTime
  o	String
-	Does Spark support pushdown of join operations for partitioned tables (when
the join condition includes the partitioning field)?
-	Does Spark support pushdown on Parquet and ORC?
  o	Should I use Hadoop, or is NTFS/NFS an option as well?




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/filters-Pushdown-tp28357.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


RE: filters Pushdown

Posted by vincent gromakowski <vi...@gmail.com>.
There are some native data sources (in the docs) and some third-party ones (in
Spark Packages: https://spark-packages.org/?q=tags%3A"Data%20Sources").
Parquet is the preferred native format. Cassandra/FiloDB provide the most
advanced pushdown.
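
To illustrate why a columnar format such as Parquet benefits from filter pushdown, here is a minimal sketch in plain Python (not Spark code). It assumes a toy model in which each "row group" stores min/max statistics, so a reader can skip groups that cannot match the filter. RowGroup and scan_greater_than are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RowGroup:
    """Toy stand-in for a row group: values plus min/max statistics."""
    values: List[int]
    min_value: int = field(init=False)
    max_value: int = field(init=False)

    def __post_init__(self) -> None:
        # Statistics are computed once, when the group is "written".
        self.min_value = min(self.values)
        self.max_value = max(self.values)


def scan_greater_than(groups: List[RowGroup], threshold: int) -> Tuple[List[int], int]:
    """Return (matching values, number of row groups actually read)."""
    matches: List[int] = []
    groups_read = 0
    for group in groups:
        if group.max_value <= threshold:
            continue  # pruned using statistics only; values are never touched
        groups_read += 1
        matches.extend(v for v in group.values if v > threshold)
    return matches, groups_read


groups = [RowGroup([1, 5, 9]), RowGroup([10, 15, 20]), RowGroup([21, 30, 42])]
result, groups_read = scan_greater_than(groups, 18)
print(result, groups_read)
```

In this toy run the first group is skipped because its maximum is below the threshold, so only two of the three groups are read.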

On 2 Feb 2017 at 11:23 AM, "Peter Shmukler" <pe...@varonis.com> wrote:

> Hi Vincent,
>
> Thank you for your answer. (I don’t see your answer on the mailing list, so
> I’m answering directly.)
>
> Which connectors can I work with from Spark?
>
> Can you provide a link where I can read about them? I see nothing in the
> Spark documentation.

Re: filters Pushdown

Posted by ayan guha <gu...@gmail.com>.
Look at the Spark Packages website. If your questions were targeted at Hive,
then I think in general all the answers are yes.
On Thu, 2 Feb 2017 at 9:23 pm, Peter Shmukler <pe...@varonis.com> wrote:

> Hi Vincent,
>
> Thank you for your answer. (I don’t see your answer on the mailing list, so
> I’m answering directly.)
>
> Which connectors can I work with from Spark?
>
> Can you provide a link where I can read about them? I see nothing in the
> Spark documentation.
-- 
Best Regards,
Ayan Guha

RE: filters Pushdown

Posted by Peter Shmukler <pe...@varonis.com>.
Hi Vincent,
Thank you for your answer. (I don’t see your answer on the mailing list, so I’m answering directly.)

Which connectors can I work with from Spark?
Can you provide a link where I can read about them? I see nothing in the Spark documentation.


From: vincent gromakowski [mailto:vincent.gromakowski@gmail.com]
Sent: Thursday, February 2, 2017 12:12 PM
To: Peter Shmukler <pe...@varonis.com>
Cc: user@spark.apache.org
Subject: Re: filters Pushdown


Pushdown depends on the source connector.
Join pushdown is available with Cassandra only.
Filter pushdown works with nearly all sources, subject to some source-specific constraints.

________________________________
This email and any attachments thereto may contain private, confidential, and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.

Re: filters Pushdown

Posted by vincent gromakowski <vi...@gmail.com>.
Pushdown depends on the source connector.
Join pushdown is available with Cassandra only.
Filter pushdown works with nearly all sources, subject to some source-specific constraints.
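
As a rough illustration of "pushdown depends on the source connector", here is a small sketch in plain Python (not Spark's actual API). It assumes a hypothetical connector, ToySource, that can evaluate only equality filters itself. The engine pushes those down and re-applies the rest on the rows it gets back. This is similar in spirit to how a Spark data source can receive filters and report the ones it does not handle, but every name below is made up for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


@dataclass(frozen=True)
class Filter:
    column: str
    op: str  # "=" or ">" in this toy model
    value: Any

    def matches(self, row: Row) -> bool:
        if self.op == "=":
            return row[self.column] == self.value
        if self.op == ">":
            return row[self.column] > self.value
        raise ValueError(f"unsupported operator: {self.op}")


class ToySource:
    """Hypothetical connector that can only evaluate equality filters."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows

    def split_filters(self, filters: List[Filter]) -> Tuple[List[Filter], List[Filter]]:
        # Decide which filters this source handles and which it leaves to the engine.
        handled = [f for f in filters if f.op == "="]
        unhandled = [f for f in filters if f.op != "="]
        return handled, unhandled

    def scan(self, handled: List[Filter]) -> List[Row]:
        # Only rows passing the pushed-down filters leave the source.
        return [r for r in self.rows if all(f.matches(r) for f in handled)]


def query(source: ToySource, filters: List[Filter]) -> Tuple[List[Row], int]:
    """Return (final rows, number of rows the source had to hand over)."""
    handled, unhandled = source.split_filters(filters)
    scanned = source.scan(handled)
    # The engine re-applies whatever the source could not evaluate.
    result = [r for r in scanned if all(f.matches(r) for f in unhandled)]
    return result, len(scanned)


rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "a"}]
filters = [Filter("name", "=", "a"), Filter("id", ">", 1)]
result, transferred = query(ToySource(rows), filters)
print(result, transferred)
```

The more filter types a connector can evaluate on its side, the fewer rows it has to hand over to the engine. Here the equality filter is pushed down and the range filter is not.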

On 2 Feb 2017 at 10:42 AM, "Peter Sg" <pe...@varonis.com> wrote:

> Can the community help me figure out some details about Spark:
> -       Does Spark support filter pushdown for these types:
>   o     Int/long
>   o     DateTime
>   o     String
> -       Does Spark support pushdown of join operations for partitioned
> tables (when the join condition includes the partitioning field)?
> -       Does Spark support pushdown on Parquet and ORC?
>   o     Should I use Hadoop, or is NTFS/NFS an option as well?