Posted to user@spark.apache.org by Tim Gautier <ti...@gmail.com> on 2016/05/24 22:46:35 UTC

Dataset Set Operations

Hello All,

I've been trying to subtract one dataset from another. Both datasets
contain case classes of the same type. When I subtract B from A, I end up
with a copy of A that still has the records of B in it. (An intersection of
A and B always returns 0 results.) All I can figure is that Spark is
doing an equality check that determines nothing matches. What is that
equality function, and is there some way I can change it?

Thanks
Tim
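
For reference, the operations described above can be sketched roughly as
follows. This is only a minimal sketch, assuming Spark 2.x (SparkSession,
Dataset.except, Dataset.intersect). The Record case class and its sample
rows are hypothetical, since the actual schema is not given in this message.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical record type; the real case class is not shown in the thread.
case class Record(id: Int, name: String)

object SetOpsSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumes Spark 2.x).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("set-ops-sketch")
      .getOrCreate()
    import spark.implicits._

    // Two datasets holding the same case class type (made-up sample data).
    val a = Seq(Record(1, "x"), Record(2, "y"), Record(3, "z")).toDS()
    val b = Seq(Record(2, "y")).toDS()

    // "A minus B": rows of a that have no equal row in b.
    a.except(b).show()

    // Rows that appear in both a and b.
    a.intersect(b).show()

    spark.stop()
  }
}
```

As far as I understand it, these calls are planned as relational set
operations over the encoded columns, so rows are compared by column value
rather than through the case class's own equals method. Treat that as an
assumption to verify against the Spark version in use.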

Re: Dataset Set Operations

Posted by Michael Armbrust <mi...@databricks.com>.
What is the schema of the case class?

On Tue, May 24, 2016 at 3:46 PM, Tim Gautier <ti...@gmail.com> wrote:

> Hello All,
>
> I've been trying to subtract one dataset from another. Both datasets
> contain case classes of the same type. When I subtract B from A, I end up
> with a copy of A that still has the records of B in it. (An intersection of
> A and B always results in 0 results.) All I can figure is that spark is
> doing an equality check that determines nothing matches. What is that
> equality function and is there some way I can change it?
>
> Thanks
> Tim
>