Posted to issues@spark.apache.org by "Yuval Yaari (JIRA)" <ji...@apache.org> on 2018/11/08 12:45:00 UTC

[jira] [Updated] (SPARK-25976) Allow rdd.reduce on empty rdd by returning an Option[T]

     [ https://issues.apache.org/jira/browse/SPARK-25976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuval Yaari updated SPARK-25976:
--------------------------------
    Description: 
It is sometimes useful to let the user decide what value to return when reducing an empty RDD.

Currently, if there is no data to reduce, an UnsupportedOperationException is thrown.

Although the user can catch that exception, this seems like a "shaky" solution, because an UnsupportedOperationException might also be thrown from a different location.
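
To illustrate why catching the exception is fragile, here is a minimal sketch that uses plain Scala collections as a stand-in for an RDD (Scala's own reduce also throws UnsupportedOperationException on an empty collection). The names ShakyCatch, picky and sumOrDefault are hypothetical and exist only for this illustration; they are not part of Spark.

```scala
object ShakyCatch {
  // A reduce function that can itself throw UnsupportedOperationException.
  def picky(a: Int, b: Int): Int =
    if (b < 0) throw new UnsupportedOperationException("negative input")
    else a + b

  // Catch-based fallback: every UnsupportedOperationException is treated
  // as "the collection was empty", whatever its real origin.
  def sumOrDefault(xs: Seq[Int], default: Int): Int =
    try xs.reduce((a, b) => picky(a, b))
    catch { case _: UnsupportedOperationException => default }
}
```

With this helper, an empty input and an input that makes the reduce function fail both yield the default, so the caller cannot tell the two cases apart.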

Instead, we can overload the reduce method by adding a new method:

reduce(f: (T, T) => T, defaultIfEmpty: () => T): T

The existing reduce API will not be affected, as it will simply call the second reduce method with a default that throws an UnsupportedOperationException.
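
As a rough sketch of how the proposed overload could behave, the code below models an RDD as a sequence of partitions (Seq[Seq[T]]) so that it runs without Spark. It is an assumption-laden illustration, not Spark's implementation: the object ReduceSketch and the helper reduceOption are hypothetical names, and only the two-argument reduce signature comes from the proposal above.

```scala
object ReduceSketch {
  // Merge the optional results of two partitions; an empty side is ignored.
  def merge[T](f: (T, T) => T)(a: Option[T], b: Option[T]): Option[T] =
    (a, b) match {
      case (Some(x), Some(y)) => Some(f(x, y))
      case (None, y)          => y
      case (x, None)          => x
    }

  // Hypothetical Option-returning reduce: None when there is no data at all.
  def reduceOption[T](partitions: Seq[Seq[T]])(f: (T, T) => T): Option[T] =
    partitions
      // Reduce each partition on its own; an empty partition gives None.
      .map(_.foldLeft(Option.empty[T])((acc, x) => Some(acc.fold(x)(a => f(a, x)))))
      // Combine the per-partition results.
      .foldLeft(Option.empty[T])((a, b) => merge(f)(a, b))

  // The overload from the proposal: the default is only evaluated when empty.
  def reduce[T](partitions: Seq[Seq[T]])(f: (T, T) => T, defaultIfEmpty: () => T): T =
    reduceOption(partitions)(f).getOrElse(defaultIfEmpty())

  // The one-argument form keeps today's behaviour by delegating with a
  // default that throws.
  def reduce[T](partitions: Seq[Seq[T]])(f: (T, T) => T): T =
    reduce(partitions)(f, () => throw new UnsupportedOperationException("empty collection"))
}
```

Taking the default as () => T rather than T means it can throw or perform an expensive computation only when the data really is empty, which is what lets the existing reduce delegate to the new method.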

 

  was:
It is sometimes useful to let the user decide what value to return when reducing an empty RDD.

Currently, if there is no data to reduce, an UnsupportedOperationException is thrown.

Although the user can catch that exception, this seems like a "shaky" solution, because an UnsupportedOperationException might also be thrown from a different location.

Instead, we can overload the reduce method by adding a new method:

reduce(f: (T, T) => T, defaultIfEmpty: T): T

The existing reduce API will not be affected, as it will simply call the second reduce method, throwing an UnsupportedOperationException as the default value.

 


> Allow rdd.reduce on empty rdd by returning an Option[T]
> -------------------------------------------------------
>
>                 Key: SPARK-25976
>                 URL: https://issues.apache.org/jira/browse/SPARK-25976
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.3.2
>            Reporter: Yuval Yaari
>            Priority: Minor
>
> It is sometimes useful to let the user decide what value to return when reducing an empty RDD.
> Currently, if there is no data to reduce, an UnsupportedOperationException is thrown.
> Although the user can catch that exception, this seems like a "shaky" solution, because an UnsupportedOperationException might also be thrown from a different location.
> Instead, we can overload the reduce method by adding a new method:
> reduce(f: (T, T) => T, defaultIfEmpty: () => T): T
> The existing reduce API will not be affected, as it will simply call the second reduce method with a default that throws an UnsupportedOperationException.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
