Posted to issues@spark.apache.org by "Sean R. Owen (Jira)" <ji...@apache.org> on 2019/12/09 21:43:00 UTC

[jira] [Created] (SPARK-30195) Some imports, function need more explicit resolution in 2.13

Sean R. Owen created SPARK-30195:
------------------------------------

             Summary: Some imports, function need more explicit resolution in 2.13
                 Key: SPARK-30195
                 URL: https://issues.apache.org/jira/browse/SPARK-30195
             Project: Spark
          Issue Type: Sub-task
          Components: ML, Spark Core, SQL, Structured Streaming
    Affects Versions: 3.0.0
            Reporter: Sean R. Owen
            Assignee: Sean R. Owen


This is a grouping of related, but not identical, issues in the 2.13 migration where the compiler is pickier about explicit types and imports.

Some are fairly self-evident, like wanting an explicit generic type. In a few cases it looks like import resolution rules have tightened up a bit, and imports have to be made explicit.

A few more cause problems like:

{code}
[ERROR] [Error] /Users/seanowen/Documents/spark_2.13/mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala:220: missing parameter type for expanded function
The argument types of an anonymous function must be fully known. (SLS 8.5)
Expected type was: ?
{code}

In some cases it's just a matter of adding an explicit type, like {{.map { m: Matrix =>}}.
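As an illustrative sketch (not Spark code, and the names here are hypothetical): one common way the expected type ends up as "?" is an overloaded method, in which case the function literal's parameter type must be written out:

```scala
// Hypothetical sketch: an overloaded method leaves the expected type of a
// function-literal argument unknown ("Expected type was: ?"), so the
// parameter type must be annotated explicitly.
object Transform {
  def apply(f: Int => Int): Int = f(1)
  def apply(f: String => String): String = f("x")
}

object ExplicitTypeDemo {
  def main(args: Array[String]): Unit = {
    // Transform(m => m + 1)             // error: missing parameter type (SLS 8.5)
    val r = Transform((m: Int) => m + 1) // explicit parameter type disambiguates
    println(r)                           // prints 2
  }
}
```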

Many seem to concern functions of tuples, or tuples of tuples.

For example, {{.mapGroups { case (g, iter) =>}} needs to become simply {{.mapGroups { (g, iter) =>}}.
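The distinction between the two shapes can be sketched with a toy stand-in for mapGroups (hypothetical names, not Spark code): the method expects a plain two-parameter function, whereas {{case (g, iter)}} is a pattern match over a single tuple argument.

```scala
// Hypothetical toy stand-in for Spark's mapGroups: a method expecting a
// plain two-parameter function, the shape 2.13 wants.
object MapGroupsShapeDemo {
  def mapGroups[K, V, U](m: Map[K, Seq[V]])(f: (K, Seq[V]) => U): Seq[U] =
    m.toSeq.map { case (k, vs) => f(k, vs) }

  def main(args: Array[String]): Unit = {
    val groups = Map("a" -> Seq(1, 2), "b" -> Seq(3))

    // 2.13-friendly: a plain two-parameter function literal.
    val sums = mapGroups(groups) { (g, vs) => (g, vs.sum) }

    // The { case (g, vs) => ... } shape instead pattern-matches one tuple
    // argument; with Spark's overloaded mapGroups the expected type isn't
    // fully known, so 2.13 rejects that form there.
    println(sums.toMap)
  }
}
```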

Or more annoyingly:

{code}
    }.reduceByKey { case ((wc1, df1), (wc2, df2)) =>
      (wc1 + wc2, df1 + df2)
    }
{code}

Apparently the argument types can only be fully known when the tuple patterns aren't nested; even annotating the nested elements explicitly doesn't help. This _won't_ work:

{code}
    }.reduceByKey { case ((wc1: Long, df1: Int), (wc2: Long, df2: Int)) =>
      (wc1 + wc2, df1 + df2)
    }
{code}

This does:

{code}
    }.reduceByKey { (wcdf1, wcdf2) =>
      (wcdf1._1 + wcdf2._1, wcdf1._2 + wcdf2._2)
    }
{code}
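The working shape can be approximated with plain 2.13 collections (a hedged sketch, not Spark code, using {{groupMapReduce}} as a stand-in for {{reduceByKey}}):

```scala
// Hypothetical sketch emulating the reduceByKey merge with 2.13's plain
// collections: groupMapReduce merges (word count, doc frequency) pairs per
// key without pattern-matching nested tuples in the function literal.
object ReduceByKeyShapeDemo {
  def main(args: Array[String]): Unit = {
    val pairs: Seq[(String, (Long, Int))] =
      Seq(("a", (2L, 1)), ("b", (3L, 2)), ("a", (5L, 4)))

    val merged: Map[String, (Long, Int)] =
      pairs.groupMapReduce(_._1)(_._2) { (wcdf1, wcdf2) =>
        (wcdf1._1 + wcdf2._1, wcdf1._2 + wcdf2._2)
      }

    println(merged("a")) // prints (7,5)
  }
}
```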

I'm not entirely sure why most of the problems affect reduceByKey. One plausible reason: reduceByKey is overloaded (with and without a partitioner or number of partitions), so the expected type of the function argument is not fully known, which is exactly the situation SLS 8.5 describes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org