Posted to user@spark.apache.org by nayan sharma <na...@gmail.com> on 2017/04/17 14:35:19 UTC
isin query
I have a DataFrame (df) with a String column msrid containing the values m_123, m_111, m_145, m_098, m_666.
I want to keep only the rows whose msrid is m_123, m_111 or m_145:
df.filter($"msrid".isin("m_123","m_111","m_145")).count
count =0
while
df.filter($"msrid".isin("m_123")).count
count=121212
I have tried using queries like
df.filter($"msrid" isin (List("m_123","m_111","m_145"):_*))
count =0
but
df.filter($"msrid" isin (List("m_123"):_*))
count=121212
Any suggestion would be a great help.
Thanks,
Nayan
Re: isin query
Posted by Koert Kuipers <ko...@tresata.com>.
I don't see this behavior on the current Spark master:
scala> val df = Seq("m_123", "m_111", "m_145", "m_098", "m_666").toDF("msrid")
df: org.apache.spark.sql.DataFrame = [msrid: string]
scala> df.filter($"msrid".isin("m_123")).count
res0: Long = 1
scala> df.filter($"msrid".isin("m_123","m_111","m_145")).count
res1: Long = 3
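
For anyone who wants to repeat this check outside spark-shell, here is a minimal sketch of the same test as a standalone program. It covers both forms from the original question (literal arguments and a List expanded with :_*). The object name IsinCheck, the local master and the app name are assumptions made for illustration; they are not from the thread.

```scala
import org.apache.spark.sql.SparkSession

object IsinCheck {
  // Returns (count using literal arguments, count using a List expanded to varargs).
  def counts(spark: SparkSession): (Long, Long) = {
    import spark.implicits._

    // Same five sample values as in the question.
    val df = Seq("m_123", "m_111", "m_145", "m_098", "m_666").toDF("msrid")
    val wanted = List("m_123", "m_111", "m_145")

    val literalCount = df.filter($"msrid".isin("m_123", "m_111", "m_145")).count()
    val listCount = df.filter($"msrid".isin(wanted: _*)).count()
    (literalCount, listCount)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed settings).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("isin-check")
      .getOrCreate()

    val (literalCount, listCount) = counts(spark)
    println(s"literal=$literalCount list=$listCount")

    spark.stop()
  }
}
```

On this sample data both counts should be 3. If the same two filters return 0 on the real data, the difference is more likely in the data or the Spark version than in the isin syntax itself.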
Fwd: isin query
Posted by nayan sharma <na...@gmail.com>.
Thanks for responding.
df.filter($"msrid" === "m_123" || $"msrid" === "m_111")
There are lots of workarounds for my question, but can you let me know what's wrong with the "isin" query?
Regards,
Nayan
Re: isin query
Posted by ayan guha <gu...@gmail.com>.
How about using OR operator in filter?
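
A minimal sketch of what the OR suggestion could look like when the values come from a list rather than being written out by hand. The helper name anyOf and the object name OrFilter are hypothetical, and the sample DataFrame and local session settings are assumed for illustration.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.col

object OrFilter {
  // Builds (column === v1) || (column === v2) || ...
  // Assumes `values` is non-empty; reduce throws on an empty collection.
  def anyOf(column: String, values: Seq[String]): Column =
    values.map(v => col(column) === v).reduce(_ || _)

  // Counts the sample rows whose msrid matches any of `values`.
  def countMatching(spark: SparkSession, values: Seq[String]): Long = {
    import spark.implicits._
    val df = Seq("m_123", "m_111", "m_145", "m_098", "m_666").toDF("msrid")
    df.filter(anyOf("msrid", values)).count()
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed settings).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("or-filter")
      .getOrCreate()

    println(countMatching(spark, Seq("m_123", "m_111", "m_145")))

    spark.stop()
  }
}
```

On the five sample values this selects the same three rows as the isin filter, so it can also serve as a cross-check when the two disagree on real data.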
--
Best Regards,
Ayan Guha