Posted to dev@spark.apache.org by Bryn Keller <xo...@xoltar.org> on 2014/02/28 06:34:35 UTC

How best to ensure partitioners behave?

Hi Folks,

I just filed https://spark-project.atlassian.net/browse/SPARK-1149 - I'm
happy to fix it, but I'd like input about how best to go about it. The
problem is, if a partitioner misbehaves by, say, returning a negative
partition number, Spark hangs. This is easier to do than it sounds.
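
For illustration, one way a negative partition number can arise (an
assumption here, not necessarily the case behind the JIRA issue) is taking
a key's hashCode modulo the partition count. On the JVM the % operator
keeps the sign of the dividend, so a negative hash yields a negative
result. The class and method names below are hypothetical and not part of
Spark:

```java
// Hypothetical illustration only; none of these names exist in Spark.
final class NaiveModulo {
    private NaiveModulo() {}

    // Naive approach: % keeps the sign of the dividend, so a key with a
    // negative hashCode produces a negative partition number.
    static int naivePartition(Object key, int numPartitions) {
        return key.hashCode() % numPartitions;
    }

    // Math.floorMod returns a value in [0, numPartitions) whenever
    // numPartitions is positive, regardless of the sign of the hash.
    static int safePartition(Object key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```

With an Integer key of -7 and 4 partitions, naivePartition returns -3 while
safePartition returns 1.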

I'd like to fix that so that instead we get an exception that lets the
developer know their partitioner has done something wrong.

Unfortunately, I don't see an easy way to do that without changing method
signatures. Would a reasonable compromise be to write a utility method that
checks the partition number against Partitioner.numPartitions, and then
update every place that currently calls Partitioner.getPartition directly
to go through that method instead?
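
A minimal sketch of what such a utility method might look like, written
against a stand-in abstract class so it compiles without Spark on the
classpath. SketchPartitioner, PartitionChecks, and checkedPartition are
hypothetical names, and the choice of exception type is an assumption;
this is not the change that went into Spark:

```java
// Stand-in for Spark's Partitioner (hypothetical), reduced to the two
// members the check needs.
abstract class SketchPartitioner {
    public abstract int numPartitions();
    public abstract int getPartition(Object key);
}

// Hypothetical helper class; not part of Spark.
final class PartitionChecks {
    private PartitionChecks() {}

    // Calls getPartition and fails fast if the result falls outside
    // [0, numPartitions), instead of letting a bad index propagate.
    static int checkedPartition(SketchPartitioner partitioner, Object key) {
        int numPartitions = partitioner.numPartitions();
        int partition = partitioner.getPartition(key);
        if (partition < 0 || partition >= numPartitions) {
            throw new IllegalStateException(
                "Partitioner " + partitioner.getClass().getName()
                + " returned partition " + partition + " for key " + key
                + "; expected a value in [0, " + numPartitions + ")");
        }
        return partition;
    }
}
```

Call sites would then use PartitionChecks.checkedPartition(p, key) in place
of p.getPartition(key), which leaves the partitioner's own method
signatures unchanged.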

Anyone have a better suggestion?

Thanks,
Bryn