You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sedona.apache.org by GitBox <gi...@apache.org> on 2022/07/23 03:11:19 UTC

[GitHub] [incubator-sedona] Kimahriman opened a new pull request, #647: [SEDONA-132] Move some functions to a common module

Kimahriman opened a new pull request, #647:
URL: https://github.com/apache/incubator-sedona/pull/647

   ## Did you read the Contributor Guide?
   
   - Yes, I have read [Contributor Rules](https://sedona.apache.org/community/rule/) and [Contributor Development Guide](https://sedona.apache.org/community/develop/)
   
   ## Is this PR related to a JIRA ticket?
   
   - Yes, the URL of the assoicated JIRA ticket is https://issues.apache.org/jira/browse/SEDONA-132.
   
   ## What changes were proposed in this PR?
   Begin creating a new module with common SQL functions across Spark and Flink
   
   ## How was this patch tested?
   Existing UTs
   
   ## Did this PR include necessary documentation updates?
   - No, this PR does not affect any public API so no need to change the docs.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman commented on a diff in pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on code in PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#discussion_r914254218


##########
common/src/main/java/org/apache/sedona/Functions.java:
##########
@@ -0,0 +1,28 @@
+package org.apache.sedona;

Review Comment:
   Wasn't sure if it should go in any further package from here. Definitely open to suggestions on a lot of the naming involved.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman commented on a diff in pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on code in PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#discussion_r923550167


##########
common/src/main/java/org/apache/sedona/Functions.java:
##########
@@ -0,0 +1,28 @@
+package org.apache.sedona;

Review Comment:
   Fair point, will update



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] jiayuasu commented on pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1179497791

   @Kimahriman This sounds like a good idea to me! But there are way more functions need to move to this Sedona-common, such as spatial partitioning, serializer, format reader... I believe this will take lots of efforts


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman commented on pull request #647: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1193047236

   So I was actually just playing around with that now that I think I understand how the modules currently work. Basically every dependency is provided scope except for python-adapter, which has basically everything bundled inside of it. Is there a reason it's setup that way? Any reason not to change it to what I think is the more traditional approach of just having compile scope dependencies


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Imbruced commented on a diff in pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Imbruced commented on code in PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#discussion_r917268766


##########
common/src/main/java/org/apache/sedona/Functions.java:
##########
@@ -0,0 +1,28 @@
+package org.apache.sedona;
+
+import org.locationtech.jts.geom.Coordinate;
+import org.locationtech.jts.geom.Geometry;
+
+public class Functions {
+    public static double ST_Distance(Geometry left, Geometry right) {
+        return left.distance(right);
+    }
+
+    public static double ST_YMin(Geometry geometry) {

Review Comment:
   I like the idea about applying DRY to Sedona in that scenarios,  but dont know if that naming is right in this context, maybe we should keep camel case here but only in SQL functions where we are forced we can use ST_* like naming ? WDYT ? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman closed pull request #647: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman closed pull request #647: [SEDONA-132] Move some functions to a common module
URL: https://github.com/apache/incubator-sedona/pull/647


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman commented on a diff in pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on code in PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#discussion_r917274740


##########
.github/workflows/java.yml:
##########
@@ -3,7 +3,7 @@ name: Scala and Java build
 on:
   push:
     branches:
-      - master
+      - '*'

Review Comment:
   Was just temporary so I could get the tests to run before making the PR, will change back



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman commented on pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1187704032

   > > 
   > 
   > Doing it piece by piece is probably a good idea but it would be nice to have a clear end goal.
   > 
   > Spatial partitioning and the kryoserializer are not intertwined with other apis and should be pretty easy to move. I don't think it's strictly nessesary to move format readers. Some ST-constructors depend on format readers. They could be refactored to call JTS and geojson libraries directly. Then the format readers could use the functions in common where it makes sense.
   
   How about limiting this PR to the "Functions" defined in Flink? And then make separate issues for other things Flink already has (predicates, constructors), and then other things can be done as they are used/added to Flink


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] jiayuasu commented on pull request #647: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1193055918

   @Kimahriman Users were supposed to call Sedona modules individually. They can mix and match different modules. Each module does not have any compile scope dependencies. This is great for Scala/Java experts to easily manage dependencies.
   
   However, when @Imbruced initially designed the python-adapter for Sedona PySpark, he suggests that we should provide a fat jar solution for Python users because people in Python world are not familiar with these Maven packaging stuff. Therefore, we decided to put all dependencies (including all Sedona modules, JTS, GeoJson, excluding Geotools) into sedona-python-adapter.
   
   This python-adapter was supposed to be only used by Python users but later we found that many Scala/Java/R users also suffer from these packaging stuff. So on Sedona website, we recommend that new comers should use python-adapter jar unless they are confident about the packaging stuff.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Imbruced commented on a diff in pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Imbruced commented on code in PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#discussion_r917268535


##########
.github/workflows/java.yml:
##########
@@ -3,7 +3,7 @@ name: Scala and Java build
 on:
   push:
     branches:
-      - master
+      - '*'

Review Comment:
   Do we need to change that ? :) What's a benefit from that ? 



##########
.github/workflows/java.yml:
##########
@@ -3,7 +3,7 @@ name: Scala and Java build
 on:
   push:
     branches:
-      - master
+      - '*'

Review Comment:
   Do we need to change that ? :) What's the benefit from that ? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] umartin commented on pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
umartin commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1187477821

   > 
   
   Doing it piece by piece is probably a good idea but it would be nice to have a clear end goal.
   
   Spatial partitioning and the kryoserializer are not intertwined with other apis and should be pretty easy to move.
   I don't think it's strictly nessesary to move format readers. Some ST-constructors depend on format readers. They could be refactored to call JTS and geojson libraries directly. Then the format readers could use the functions in common where it makes sense.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman commented on a diff in pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on code in PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#discussion_r914254002


##########
.github/workflows/java.yml:
##########
@@ -3,7 +3,7 @@ name: Scala and Java build
 on:
   push:
     branches:
-      - master
+      - '*'

Review Comment:
   This was just to get the CI to run before PR'ing it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman commented on a diff in pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on code in PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#discussion_r917274864


##########
common/src/main/java/org/apache/sedona/Functions.java:
##########
@@ -0,0 +1,28 @@
+package org.apache.sedona;
+
+import org.locationtech.jts.geom.Coordinate;
+import org.locationtech.jts.geom.Geometry;
+
+public class Functions {
+    public static double ST_Distance(Geometry left, Geometry right) {
+        return left.distance(right);
+    }
+
+    public static double ST_YMin(Geometry geometry) {

Review Comment:
   I'm fine with that. Also curious about thoughts on package structure/naming



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] jiayuasu commented on pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1188246695

   > > > 
   > > 
   > > 
   > > Doing it piece by piece is probably a good idea but it would be nice to have a clear end goal.
   > > Spatial partitioning and the kryoserializer are not intertwined with other apis and should be pretty easy to move. I don't think it's strictly nessesary to move format readers. Some ST-constructors depend on format readers. They could be refactored to call JTS and geojson libraries directly. Then the format readers could use the functions in common where it makes sense.
   > 
   > How about limiting this PR to the "Functions" defined in Flink? And then make separate issues for other things Flink already has (predicates, constructors), and then other things can be done as they are used/added to Flink
   
   I agree. Let's start with the functions in this PR. And then create other PRs for other functions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] jiayuasu merged pull request #647: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
jiayuasu merged PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] jiayuasu commented on pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1188247158

   > > > 
   > > 
   > > 
   > > Doing it piece by piece is probably a good idea but it would be nice to have a clear end goal.
   > > Spatial partitioning and the kryoserializer are not intertwined with other apis and should be pretty easy to move. I don't think it's strictly nessesary to move format readers. Some ST-constructors depend on format readers. They could be refactored to call JTS and geojson libraries directly. Then the format readers could use the functions in common where it makes sense.
   > 
   > How about limiting this PR to the "Functions" defined in Flink? And then make separate issues for other things Flink already has (predicates, constructors), and then other things can be done as they are used/added to Flink
   
   I agree. Let's start with the functions in this PR. And then create other PRs for other functions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman commented on pull request #647: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1216642700

   > This broke the build for me because of the wrong/not matching version in:
   > 
   > https://github.com/apache/incubator-sedona/blob/f2e61b85dc95235bffd5aa49a8db35ad0735f1a7/common/pom.xml#L25
   > 
   > 
   > It should be `<version>1.3.0-incubating-SNAPSHOT</version>`.
   
   Yep this never got updated after the 1.2.1 release got cut, can you update it in your PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Elephantusparvus commented on pull request #647: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Elephantusparvus commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1216710143

   > > This broke the build for me because of the wrong/not matching version in:
   > > https://github.com/apache/incubator-sedona/blob/f2e61b85dc95235bffd5aa49a8db35ad0735f1a7/common/pom.xml#L25
   > > 
   > > It should be `<version>1.3.0-incubating-SNAPSHOT</version>`.
   > 
   > Yep this never got updated after the 1.2.1 release got cut, can you update it in your PR?
   
   Did so, see https://github.com/apache/incubator-sedona/pull/662/commits/231f98b5016ec7fc0f541593343c5202dace43f3


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman commented on pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1175562463

   Started playing around with this and wanted to get some thoughts. Figured out some Scala magic to reduce the boilerplate needed for each Spark function. Only converted a few to get initial thoughts. Maybe can agree on an initial set to start with and others could slowly convert existing ones, and any new ones get added in the common module.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman commented on pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1179555282

   > @Kimahriman This sounds like a good idea to me! But there are way more functions need to move to this Sedona-common, such as spatial partitioning, serializer, format reader... I believe this will take lots of efforts
   
   Yeah I just wanted to get the conversation started, figure out how to maybe break this up so it's not a massive effort for all the things in one PR, but maybe is done piece by piece (and at least start any new things in the common setup).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman commented on a diff in pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on code in PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#discussion_r923549826


##########
common/src/main/java/org/apache/sedona/Functions.java:
##########
@@ -0,0 +1,28 @@
+package org.apache.sedona;
+
+import org.locationtech.jts.geom.Coordinate;
+import org.locationtech.jts.geom.Geometry;
+
+public class Functions {
+    public static double ST_Distance(Geometry left, Geometry right) {

Review Comment:
   Good point, it's really a matter of safety vs performance if the engine has a better way it can handle null checks. For example it would be pretty trivial to make a Spark codegen wrapper for the functions that only checks for nulls if the input is actually nullable. Really just depends if you want to be able to squeeze every last drop of performance out or not



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Elephantusparvus commented on pull request #647: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Elephantusparvus commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1216570079

   This broke build for me because of the wrong/not matching version in:
   https://github.com/apache/incubator-sedona/blob/f2e61b85dc95235bffd5aa49a8db35ad0735f1a7/common/pom.xml#L25
   It should be  
   `        <version>1.3.0-incubating-SNAPSHOT</version>
   `
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] umartin commented on a diff in pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
umartin commented on code in PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#discussion_r923290517


##########
common/src/main/java/org/apache/sedona/Functions.java:
##########
@@ -0,0 +1,28 @@
+package org.apache.sedona;

Review Comment:
   The package name should probably be org.apache.sedona.common.
   
   It's safer to let each maven module have their own package namespace. For instance OSGi runtimes have a strict requirement on that.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] umartin commented on a diff in pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
umartin commented on code in PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#discussion_r923308073


##########
common/src/main/java/org/apache/sedona/Functions.java:
##########
@@ -0,0 +1,28 @@
+package org.apache.sedona;
+
+import org.locationtech.jts.geom.Coordinate;
+import org.locationtech.jts.geom.Geometry;
+
+public class Functions {
+    public static double ST_Distance(Geometry left, Geometry right) {

Review Comment:
   Should functions in common do null checks or should that be done in the Spark and Flink bindings? I don't have a strong opinion but clarifying the contract for functions in common would be helpful for contributors.
   
   Making functions in common null safe would be safer but each platform (currently Spark and Flink) has its own way of doing input validation so the null check might have to be repeated there. In Spark that is done by overriding checkInputDataTypes in expressions. It's currently not done in Sedona but would be a nice to add in the future. It would make error messages a lot more user friendly.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman commented on a diff in pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on code in PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#discussion_r914300019


##########
python-adapter/pom.xml:
##########
@@ -47,6 +47,11 @@
             <artifactId>jts2geojson</artifactId>
             <version>${jts2geojson.version}</version>
         </dependency>
+        <dependency>
+            <groupId>org.apache.sedona</groupId>
+            <artifactId>sedona-common</artifactId>
+            <version>${project.version}</version>
+        </dependency>

Review Comment:
   Figured out I needed to add this here, but I've never figured out where the fat jar behavior comes from



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] jiayuasu commented on pull request #647: WIP: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1179497837

   @netanel246 @Imbruced @yitao-li Any opinions?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman commented on pull request #647: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1190469440

   Ok I think I got all the Flink Functions. Rewriting the geohash logic in java was a bit of a pain, I wish everything could just be in Scala :sweat_smile: I didn't try to change any tests, was just gonna let the flink/spark-sql tests keep checking the logic for now


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] jiayuasu commented on pull request #647: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1193026589

   @Kimahriman Do you think we should release sedona-common as a separate module or it is automatically packaged in other modules?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] jiayuasu commented on pull request #647: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
jiayuasu commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1193519962

   @Kimahriman Yes, I think it will be good to make Sedona-common as a compile scope dependency to whatever package needs it. So the user no need to include it manually.
   
   As in this PR,
   
   sedona-core pom: `needs sedona-common`, as a compile scope dependency
   sedona-sql pom: `no update needed` because it needs sedona-core (provided scope) for spatial join and kyro serializer
   sedona-viz pom: `no update needed` because it needs sedona-core ans sedona-sql (provided scope) for spatial join and kyro serializer
   sedona-python-adapter pom: `no update needed` because it needs sedona-core and sedona-sql (compile scope)
   sedona-flink: `needs sedona-common` as a compile scope dependency. It currently also needs sedona-core and sedona-sql (provided scope). Note that: sedona-core and sedona-sql dependencies will eventually be dropped as we move all common functions to sedona-common in the next few PRs).
   
   Can you update the PR? Then I will merge this PR.
   
   Eventually I plan to release this PR to Sedona 1.3.0, not the next release 1.2.1. I think this PR is a rather big change to the project structure and is better not put in a maintenance release like 1.2.1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-sedona] Kimahriman commented on pull request #647: [SEDONA-132] Move some functions to a common module

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on PR #647:
URL: https://github.com/apache/incubator-sedona/pull/647#issuecomment-1193123479

   Yeah that's what I ended up doing all the time, I just include the python-adapter and then it automatically includes all the dependencies I need, which is a little odd. In fact, python-adapter has things bundled inside of it _and_ has compile scope dependencies on all the things, so you end up double including all of the classes.
   
   From a Spark perspective (which is all I do, not Flink, and via Python, so not a Java/Scala/Maven expert by any means), I think the two main approaches to including dependencies are via `--packages` (or `spark.jars.packages`), which resolves and automatically includes compile scope dependencies, or creating a fat Jar if you're spark-submitting java/scala code. Either way lends itself to compile scope dependencies correctly, versus users having to know what transitive dependencies are required for them to manually include (and what version they should use). So if everything was just compile scope, and you just wanted to do Sedona SQL things, you could just do `--packages org.apache.sedona:sedona-sql-3.0_2.12:1.2.0` and have everything you need without worrying about anything else.
   
   Would you be open to switching to that type of setup? Then the "common" module would just be another compile scope module that gets automatically included for whatever needs it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@sedona.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org