Posted to dev@sedona.apache.org by GitBox <gi...@apache.org> on 2021/10/22 11:39:56 UTC

[GitHub] [incubator-sedona] Kimahriman opened a new pull request #557: [SEDONA-67] Support Spark 3.2

Kimahriman opened a new pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557


   ## Is this PR related to a proposed Issue?
   https://issues.apache.org/jira/browse/SEDONA-67
   
   ## What changes were proposed in this PR?
   Adds support for Spark 3.2 while maintaining compatibility with the other supported Spark versions.
   
   Specifically to get 3.2 working I had to:
   - Rework how functions are registered to use an API present in all supported versions of Spark.
   - Add `withNewChildrenInternal` to all expressions. Because it is declared without `override`, it overrides the method in the new version and is simply ignored in older versions.
   - Add a new profile for Spark 3.2, because it needs a newer version of Jackson to test with.
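   To illustrate the `withNewChildrenInternal` point, here is a minimal, self-contained sketch. `OldNode` and `NewNode` are hypothetical stand-ins for the expression base class before and after Spark 3.2 (they are not Spark classes), and the sketch assumes the newer base declares the method as abstract. The same declaration, written without `override`, compiles against both:

   ```scala
   object WithNewChildrenSketch {
     // Hypothetical stand-in for the older base class: no withNewChildrenInternal at all.
     abstract class OldNode { def children: Seq[OldNode] }

     // Hypothetical stand-in for the newer base class: the method is abstract here.
     abstract class NewNode {
       def children: Seq[NewNode]
       protected def withNewChildrenInternal(newChildren: IndexedSeq[NewNode]): NewNode
       final def withNewChildren(newChildren: Seq[NewNode]): NewNode =
         withNewChildrenInternal(newChildren.toIndexedSeq)
     }

     case class OldExpr(children: Seq[OldNode]) extends OldNode {
       // No `override`: against the old base this is just an extra, unused method.
       protected def withNewChildrenInternal(newChildren: IndexedSeq[OldNode]): OldNode =
         copy(children = newChildren)
     }

     case class NewExpr(children: Seq[NewNode]) extends NewNode {
       // Same declaration: against the new base it implements the abstract member.
       protected def withNewChildrenInternal(newChildren: IndexedSeq[NewNode]): NewNode =
         copy(children = newChildren)
     }
   }
   ```

   Adding `override` to the declaration in `OldExpr` would be a compile error, since there is nothing to override there, which is why the modifier has to be left off when one source tree targets both versions.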
   
   Additionally I cleaned up a few things:
   - Removed assert statements from the eval method of expressions. This is an analysis-time check that can be done on instantiation of the case class.
   - Removed a number of `return` statements to make things consistent.
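   As a rough sketch of the assert cleanup, assuming a simplified expression that takes its arguments as a plain `Seq[String]` (the `ExampleGeomFromText` name and its `eval` body are hypothetical, not Sedona code): a check placed in the case class body runs at construction, so a wrong argument count fails even if `eval` is never called.

   ```scala
   object ArityCheckSketch {
     // Hypothetical stand-in for an expression; real expressions take Seq[Expression].
     case class ExampleGeomFromText(inputs: Seq[String]) {
       // Class-body statements run in the constructor, so this fires on instantiation.
       assert(inputs.length == 1, s"expected 1 argument, got ${inputs.length}")

       // eval no longer re-checks the argument count on every call.
       def eval(): String = inputs.head.trim
     }
   }
   ```

   One consequence is that a call with the wrong number of arguments now fails as soon as the expression is built, rather than only when a row is actually evaluated.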
   
   ## How was this patch tested?
   Existing tests, though I do need help figuring out how the GitHub workflow should be updated.
   
   ## Did this PR include necessary documentation updates?
   Haven't looked at the documentation yet; the compatibility page probably still needs to be updated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-sedona] jiayuasu commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-950245019


   @Kimahriman 
   
   Thanks for your update. This looks good to me.
   
   Would you mind adding me (@jiayuasu) and Pawel (@Imbruced) as collaborators on your Sedona fork? That way I can update your GitHub Actions config and Pawel can fix the Python part.
   
   @Imbruced BTW, I cannot add you as a reviewer on the Sedona PR because you don't have "write" access to this repo. Please fix your ASF account.





[GitHub] [incubator-sedona] jiayuasu commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-966047257


   @arturdryomov Good question. I will start preparing the Sedona 1.1.1 release this weekend.





[GitHub] [incubator-sedona] yitao-li commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
yitao-li commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-961386177


   @jiayuasu I pushed some more extensive changes to `sparklyr` in https://github.com/sparklyr/sparklyr/pull/3198. `spark_install()` should be flexible enough to handle new/unknown versions of Apache Spark now.





[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-955238308


   Python tests failed because of https://github.com/apache/incubator-sedona/blob/6fafc93217b14745d11819f44e5a166480746baa/python/tests/serialization/test_deserializers.py#L68, which calls geomfromwkt with no parameters and doesn't make any assertions. Is there any point in that test? The expression never gets evaluated, which is why the assertion wasn't thrown before.








[GitHub] [incubator-sedona] Imbruced commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Imbruced commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-949641099


   @Kimahriman Can you add me as a reviewer? I can also help you with the Python API.





[GitHub] [incubator-sedona] yitao-li commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
yitao-li commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-964694607


   @jiayuasu OK. I re-enabled R builds for Spark 3.2 in https://github.com/apache/incubator-sedona/pull/561.








[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-958945580












[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-950846782


   Whoops forgot about viz too 😅 thanks!
   
   I've also been playing around with a Scala 2.13 build for fun, and I think I got that working. I assume that will be easier in a follow-on PR?





[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-955632064


   Hm, I didn't realize that would cause an issue. I think I saw another project create its own version of UnaryExecNode/BinaryExecNode to get around this, so that might be an option?
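   
   Roughly what that could look like, as a sketch only: `Plan` is a hypothetical stand-in for Spark's plan base class so the snippet compiles on its own, and the `Local*` trait names and the `Scan`/`Filter`/`RangeJoin` nodes are made up for illustration. The idea is that the project ships its own small traits instead of extending the ones whose definition changed between Spark versions.

   ```scala
   object ExecNodeSketch {
     // Hypothetical stand-in for Spark's plan base class.
     abstract class Plan { def children: Seq[Plan] }

     // Project-owned copy of a one-child node: children is derived from `child`.
     trait LocalUnaryExecNode extends Plan {
       def child: Plan
       override final def children: Seq[Plan] = child :: Nil
     }

     // Project-owned copy of a two-child node: children is derived from `left` and `right`.
     trait LocalBinaryExecNode extends Plan {
       def left: Plan
       def right: Plan
       override final def children: Seq[Plan] = Seq(left, right)
     }

     // Illustrative plan nodes built on the local traits.
     case object Scan extends Plan { def children: Seq[Plan] = Nil }
     case class Filter(child: Plan) extends LocalUnaryExecNode
     case class RangeJoin(left: Plan, right: Plan) extends LocalBinaryExecNode
   }
   ```

   The trade-off is that any later improvements to the upstream traits would have to be copied over by hand.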
   
   On a separate but related note, we could consider dropping 2.4 support at some point, since it's EOL, to ease some of the burden.





[GitHub] [incubator-sedona] yitao-li commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
yitao-li commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-961243866


   @Kimahriman Spark 3.0.3 should be working now too.





[GitHub] [incubator-sedona] arturdryomov commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
arturdryomov commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-966020217


   @jiayuasu, is there a rough timeline for when Spark 3.2 support might ship?





[GitHub] [incubator-sedona] jiayuasu commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-958669497


   Let's wait for Yitao's response, since he is the main contributor to Sedona R.





[GitHub] [incubator-sedona] jiayuasu commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-961491472


   @yitao-li R tests on Spark 3.2.0 failed due to the Spark temp view bug https://github.com/apache/spark/pull/34473
   
   Apparently, it will take some time for Spark to release 3.2.1, so we have to work around this issue in our test cases. We have already worked around it in the Scala code: https://github.com/apache/incubator-sedona/pull/557/files#diff-d2c003f25373535c8866fbbbeba085f66985ff2a9f417538d7b60e090839d1bf. Could you please fix it in Sedona R?
   
   I can accept this PR and then you can create a new PR on Sedona to fix Sedona R. What do you think?








[GitHub] [incubator-sedona] jiayuasu merged pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu merged pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557


   





[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-949705962


   > @Kimahriman Can you add me as a reviewer ? And also I can help you with Python API
   
   I'm not sure how to do that. Also didn't even think about checking the Python API hah





[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-955744043


   Python tests passed 🎉 
   
   Don't really know about the R build though. It looks like sparklyr may have a hard-coded list of available Spark versions, so it just can't find the new version to download and use?
   
   Also had to leave out a few explain improvements that were added to Unary/BinaryExecNode, because they weren't available in Spark 2.4. I could try to add them with the anchors, but they involve some multiline strings, so I would have to rewrite those and add anchors to every line hah.





[GitHub] [incubator-sedona] jiayuasu commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-955576126


   The Scala/Java tests passed but the Python tests failed. This is possibly caused by the Spark 3.2.0 change in UnaryExecNode and BinaryExecNode. So technically, the Sedona join extends a different ExecNode in Spark 3.2.0.
   
   Spark < 3.2.0 UnaryExecNode and BinaryExecNode implementation:
   
   ![image](https://user-images.githubusercontent.com/10948864/139554566-039438b3-d5ce-49ad-84ab-1184032ca1cf.png)
   
   ![image](https://user-images.githubusercontent.com/10948864/139554739-b4c5ac56-d6ba-4c2e-81cb-0032de47c6e0.png)
   
   Spark 3.2.0 UnaryExecNode and BinaryExecNode implementation:
   
   ![image](https://user-images.githubusercontent.com/10948864/139554810-d5904226-f817-4c71-98e1-394048f13691.png)
   
   ![image](https://user-images.githubusercontent.com/10948864/139554824-adca11d0-6f11-4f5b-9247-d20ff580c8ce.png)
   
   This is related to https://github.com/apache/incubator-sedona/pull/558
   
   However, on Sedona's side, this does not require any actual code change, so the Scala/Java tests passed. But I don't understand why the Python tests cannot figure this out.
   
   @Kimahriman @Imbruced Can we figure out a solution for this? In the worst case, we have to cut a separate release compiled against Spark 3.2+. In short, we would have to release Sedona for Spark 2.4, 3.0 (3.0-3.1), and 3.2. This is a headache...
   





[GitHub] [incubator-sedona] Imbruced commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Imbruced commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-956428060


   @jiayuasu We can stop supporting Spark 2.4, but I am worried that many users are still using Spark 2.4.x.





[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-961239301


   Is it pulling sparklyr from the master branch on GitHub rather than the latest released version? It looks like 3.0.3 is still missing.





[GitHub] [incubator-sedona] yitao-li edited a comment on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
yitao-li edited a comment on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-961386177


   @jiayuasu I pushed some more extensive changes to `sparklyr` in https://github.com/sparklyr/sparklyr/pull/3198. `spark_install()` should be flexible enough to handle new/unknown versions of Apache Spark now.
   
   Please try running the R build again.





[GitHub] [incubator-sedona] yitao-li commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
yitao-li commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-961241588


   @Kimahriman At the moment we are using `sparklyr` from the master branch. I'll add URLs for Spark 3.0.3 asap.





[GitHub] [incubator-sedona] jiayuasu commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-961258366


   @yitao-li It looks like Spark 2.4.8 has not been added to `sparklyr` yet. Can you add it too?





[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-954706389


   It seems to be something with creating a temp view with SQL. If you change
   
   ```
   spark.sql(
     """
       |CREATE OR REPLACE TEMP VIEW pixels AS
       |SELECT pixel, shape FROM pointtable
       |LATERAL VIEW EXPLODE(ST_Pixelize(shape, 1000, 1000, ST_PolygonFromEnvelope(-126.790180,24.863836,-64.630926,50.000))) AS pixel
     """.stripMargin)
   
   // Test visualization partitioner
   val zoomLevel = 2
   val newDf = VizPartitioner(spark.table("pixels"), zoomLevel, "pixel", new Envelope(0, 1000, 0, 1000))
   ```
   to
   ```
   val table = spark.sql(
     """
       |SELECT pixel, shape FROM pointtable
       |LATERAL VIEW EXPLODE(ST_Pixelize(shape, 1000, 1000, ST_PolygonFromEnvelope(-126.790180,24.863836,-64.630926,50.000))) AS pixel
     """.stripMargin)
   
   // Test visualization partitioner
   val zoomLevel = 2
   val newDf = VizPartitioner(table, zoomLevel, "pixel", new Envelope(0, 1000, 0, 1000))
   ```
   It works fine. And this also works:
   ```
   val table = spark.sql(
     """
       |SELECT pixel, shape FROM pointtable
       |LATERAL VIEW EXPLODE(ST_Pixelize(shape, 1000, 1000, ST_PolygonFromEnvelope(-126.790180,24.863836,-64.630926,50.000))) AS pixel
     """.stripMargin)
   // Register the view through the DataFrame API instead of SQL
   table.createOrReplaceTempView("pixels")
   
   // Test visualization partitioner
   val zoomLevel = 2
   val newDf = VizPartitioner(spark.table("pixels"), zoomLevel, "pixel", new Envelope(0, 1000, 0, 1000))
   ```
   
   So I'm not sure if it's a bug with creating temp views in SQL or some new "feature".





[GitHub] [incubator-sedona] jiayuasu commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-954898024


   @Kimahriman Could you ask about this in the Spark community? If this is intentional, we will proceed with your solution.





[GitHub] [incubator-sedona] jiayuasu edited a comment on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu edited a comment on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-955576126


   The failed test cases are caused by the Spark 3.2.0 change in UnaryExecNode and BinaryExecNode. So technically, the Sedona join extends a different ExecNode in Spark 3.2.0.
   
   Spark < 3.2.0 UnaryExecNode and BinaryExecNode implementation:
   
   ![image](https://user-images.githubusercontent.com/10948864/139554566-039438b3-d5ce-49ad-84ab-1184032ca1cf.png)
   
   ![image](https://user-images.githubusercontent.com/10948864/139554739-b4c5ac56-d6ba-4c2e-81cb-0032de47c6e0.png)
   
   Spark 3.2.0 UnaryExecNode and BinaryExecNode implementation:
   
   ![image](https://user-images.githubusercontent.com/10948864/139554810-d5904226-f817-4c71-98e1-394048f13691.png)
   
   ![image](https://user-images.githubusercontent.com/10948864/139554824-adca11d0-6f11-4f5b-9247-d20ff580c8ce.png)
   
   This is related to https://github.com/apache/incubator-sedona/pull/558
   
   However, on Sedona's side, this does not require any actual code change.
   
   @Kimahriman @Imbruced Can we figure out a solution for this? In the worst case, we have to cut a separate release compiled against Spark 3.2+. In short, we would have to release Sedona for Spark 2.4, 3.0 (3.0-3.1), and 3.2. This is a headache...
   





[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-953771469


   The viz tests keep failing for me with `Undefined function: 'ST_PolygonFromEnvelope'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.` and I can't figure out why; the function is definitely being registered. Is something weird happening with a temp view?





[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-950322398


   You should both be added now





[GitHub] [incubator-sedona] jiayuasu edited a comment on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu edited a comment on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-955824940


   @Kimahriman This is a brilliant idea. Let's use this alternative for a few releases. When Spark 3.1 reaches EOL, we can use Spark's own ExecNode. We may need to periodically pull the latest ExecNode changes from Spark in the future.
   
   We can actually drop Spark 2.4 support right now (in the next release) since it is EOL. @Imbruced @netanel246 What do you think?
   
   @yitao-li Hi Yitao, could you please check how to set up Spark 3.2 support in our GitHub Actions workflow? Please ping @Kimahriman for write access to his fork if needed.





[GitHub] [incubator-sedona] yitao-li commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
yitao-li commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-961180756


   @Kimahriman @jiayuasu I made the necessary changes in `sparklyr`. You can run the R builds again and everything should work.











[GitHub] [incubator-sedona] jiayuasu merged pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu merged pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557


   





[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-960715817


   Confirmed that it does fix the SQL create view behavior that was failing the test.





[GitHub] [incubator-sedona] jiayuasu commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-954886928


   @Kimahriman Thanks for finding this. This is why I stopped fixing the issue last week. This is really strange. Do you think we should proceed with your current solution?





[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-955079396


   Sent out an email. For now, I'll update this to fix it.





[GitHub] [incubator-sedona] Kimahriman edited a comment on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman edited a comment on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-955238308


   Python tests failed because of https://github.com/apache/incubator-sedona/blob/6fafc93217b14745d11819f44e5a166480746baa/python/tests/serialization/test_deserializers.py#L68-L72, which makes a geomfromwkt call with no parameters and doesn't make any assertions. Is there any point in that? The expression never gets evaluated, which is why the assertion wasn't thrown before.
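
   To illustrate the difference between checking arguments at evaluation time and checking them on instantiation, here is a minimal Python sketch. It is not Sedona code: both classes and the helper are hypothetical stand-ins, and the argument count of one is assumed only for the example.

```python
class LazyGeomFromWkt:
    """Hypothetical stand-in for the old behavior: the check lives in eval()."""

    def __init__(self, *args):
        # No validation here, so a call with no arguments builds fine.
        self.args = args

    def eval(self):
        # Only reached if something actually evaluates the expression.
        assert len(self.args) == 1, "expected exactly 1 argument"
        return f"GEOMETRY({self.args[0]})"


class EagerGeomFromWkt:
    """Hypothetical stand-in for the new behavior: the check runs on construction."""

    def __init__(self, *args):
        if len(args) != 1:
            raise ValueError(f"expected exactly 1 argument, got {len(args)}")
        self.args = args

    def eval(self):
        return f"GEOMETRY({self.args[0]})"


def builds_without_error(expr_cls, *args):
    """Return True if the expression can be constructed with the given arguments."""
    try:
        expr_cls(*args)
        return True
    except ValueError:
        return False
```

   In this sketch, `builds_without_error(LazyGeomFromWkt)` succeeds because `eval()` is never called, while `builds_without_error(EagerGeomFromWkt)` fails immediately, which mirrors why a test that builds the call but never evaluates it only starts failing once the check moves to instantiation.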





[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-958945580


   This is an attempt at fixing the bug with registering the temp views in SQL: https://github.com/apache/spark/pull/34473. I haven't tested it yet, but I'll confirm whether it indeed fixes the issue with the viz test. If so, it'll just be a known issue with a workaround until 3.2.1 is released.








[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-961239301









[GitHub] [incubator-sedona] yitao-li edited a comment on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
yitao-li edited a comment on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-964694607


   @jiayuasu OK. I was able to make R builds for Spark 3.2 work again with the changes from https://github.com/apache/incubator-sedona/pull/561.





[GitHub] [incubator-sedona] Imbruced commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Imbruced commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-949723197


   @Kimahriman You probably don't have enough privileges; @jiayuasu can assign reviewers. Maybe I should update something on my Apache account to get such privileges myself. By the way, we should set default reviewers.





[GitHub] [incubator-sedona] Imbruced commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Imbruced commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-949647862


   I definitely like your changes to the asserts :+1:





[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-961500374


   Should we just take the 3.2 R tests out of the GitHub workflow and add them back in a follow-up?





[GitHub] [incubator-sedona] Imbruced commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Imbruced commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-956428060


   @jiayuasu We can stop supporting Spark 2.4, but I am worried that many users are still using Spark 2.4.x.





[GitHub] [incubator-sedona] jiayuasu commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-961697656


   Spark 3.2 R tests have been removed. @yitao-li Please create a Sedona PR to fix Sedona R tests when you have time. Thank you again for your help!





[GitHub] [incubator-sedona] jiayuasu commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-961491472









[GitHub] [incubator-sedona] jiayuasu commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-961258366












[GitHub] [incubator-sedona] jiayuasu commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-955824940


   @Kimahriman This is a brilliant idea. Let's use this alternative for a few releases. When Spark 3.1 reaches EOL, we can use Spark's own ExecNode. We may need to periodically pull the latest ExecNode changes from Spark in the future.
   
   @yitao-li Hi Yitao, could you please check how to set the Spark 3.2 support in our GitHub action? Please ping @Kimahriman for the write access in his fork if needed.





[GitHub] [incubator-sedona] Kimahriman commented on pull request #557: [SEDONA-67] Support Spark 3.2

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #557:
URL: https://github.com/apache/incubator-sedona/pull/557#issuecomment-954888460


   I think so? I can try asking on the Spark dev mailing list whether anyone knows if that's intentional or a bug.

