Posted to user@spark.apache.org by Ashok Kumar <as...@yahoo.com.INVALID> on 2016/02/24 23:40:26 UTC

Filter on a column having multiple values

Hi,
I would like to do the following:
select count(*) from <table> where column1 in (1,5)
I define
scala> var t = HiveContext.table("table")
This works:
t.filter($"column1" === 1)
How can I extend this filter to match rows where column1 is either 1 or 5?
Thanks

Re: Filter on a column having multiple values

Posted by Yin Yang <yy...@gmail.com>.
However, when the number of choices gets big, the following notation
becomes cumbersome.


On Wed, Feb 24, 2016 at 3:41 PM, Mich Talebzadeh <
mich.talebzadeh@cloudtechnologypartners.co.uk> wrote:

> You can use operators here.
>
> t.filter($"column1" === 1 || $"column1" === 5)

Re: Filter on a column having multiple values

Posted by Mich Talebzadeh <mi...@cloudtechnologypartners.co.uk>.
 

You can use operators here. 

t.filter($"column1" === 1 || $"column1" === 5)


-- 

Dr Mich Talebzadeh

LinkedIn
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

NOTE: The information in this email is proprietary and confidential.
This message is for the designated recipient only, if you are not the
intended recipient, you should destroy it immediately. Any information
in this message shall not be understood as given or endorsed by Cloud
Technology Partners Ltd, its subsidiaries or their employees, unless
expressly so stated. It is the responsibility of the recipient to ensure
that this email is virus free, therefore neither Cloud Technology
partners Ltd, its subsidiaries nor their employees accept any
responsibility.

 

Re: Filter on a column having multiple values

Posted by Michael Armbrust <mi...@databricks.com>.
You can do this either with expr("... IN ...") or isin.

Here is a full example:
<https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/1023043053387187/1075277772969592/2840265927289860/2388bac36e.html>
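
For readers who cannot open the linked notebook, here is a minimal sketch of both options, written for spark-shell. It is not taken from the notebook. It assumes Spark 2.x or later with a local SparkSession (the thread itself used a HiveContext), and the sample DataFrame and the `wanted` list are hypothetical stand-ins for the real table and values.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("in-filter-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in for HiveContext.table("table").
val t = Seq(1, 2, 5, 5, 7).toDF("column1")

// Hypothetical list of accepted values; it can be as long as needed.
val wanted = Seq(1, 5)

// Option 1: Column.isin takes varargs, so expand the list with `: _*`.
val viaIsin = t.filter($"column1".isin(wanted: _*))

// Option 2: write the IN predicate as a SQL expression string.
val viaExpr = t.filter(expr("column1 IN (1, 5)"))

// Equivalent of: select count(*) from <table> where column1 in (1,5)
println(viaIsin.count())
println(viaExpr.count())
```

With isin the list of values can be built at runtime, which avoids chaining a long series of === comparisons with ||.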
