Posted to dev@sedona.apache.org by Grégory Dugernier <gd...@aloalto.com> on 2021/02/10 10:34:01 UTC

[Bug][Python] Missing Java Class?

Hello,

I've been trying to run Sedona for Python on Databricks for 2 days and I
think I've stumbled upon a bug.

*Configuration*:

   - Spark 3.0.1
   - Scala 2.12
   - Python 3.7

*Libraries*:

   - apache-sedona (from PyPI)
   - org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating
   (from Maven)

*What I'm trying to do:*

I'm trying to load a series of Shapefiles into a DataFrame for
geospatial analysis. See the code snippet below, based on your example notebook
<https://github.com/apache/incubator-sedona/blob/master/python/ApacheSedonaCore.ipynb>


> from sedona.core.formatMapper.shapefileParser import ShapefileReader
> from sedona.register import SedonaRegistrator
> from sedona.utils.adapter import Adapter
>
> SedonaRegistrator.registerAll(spark)
> shape_rdd = ShapefileReader.readToGeometryRDD(spark.sparkContext,
> file_name)
> df = Adapter.toDf(shape_rdd, spark)
>

*Bug*:

ShapefileReader.readToGeometryRDD() currently throws the following
error:

> Py4JJavaError: An error occurred while calling
> z:org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD.
> : java.lang.NoClassDefFoundError: org/opengis/referencing/FactoryException
> at
> org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD(ShapefileReader.java:79)
> at
> org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD(ShapefileReader.java:66)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
> at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
> at py4j.Gateway.invoke(Gateway.java:295)
> at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
> at py4j.commands.CallCommand.execute(CallCommand.java:79)
> at py4j.GatewayConnection.run(GatewayConnection.java:251)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassNotFoundException:
> org.opengis.referencing.FactoryException
> at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
> at
> com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader.loadClass(ClassLoaders.scala:151)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:352)


Adding the org.apache.sedona:sedona-core-3.0_2.12:1.0.0-incubating library
from Maven doesn't solve the error. Adding the org.datasyslab:geospark:1.3.1
library from Maven solves the error, but it creates conflicts with the
underlying org.locationtech.jts dependencies. This makes me think there is
a missing OpenGIS dependency in the sedona-python-adapter.
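
When a trace like the one above is long, it can help to pull out just the class names the JVM could not load, since those point at the jar that is missing from the cluster. The following is a minimal sketch, not part of Sedona or Py4J; the helper name `missing_classes` is hypothetical, and it assumes the trace is available as plain text (for example via `str(error)` on the caught exception).

```python
import re

# Matches both spellings the JVM uses for a class it could not load:
#   java.lang.NoClassDefFoundError: org/opengis/referencing/FactoryException
#   java.lang.ClassNotFoundException: org.opengis.referencing.FactoryException
_MISSING_CLASS = re.compile(
    r"java\.lang\.(?:NoClassDefFoundError|ClassNotFoundException):\s*([\w./$]+)"
)


def missing_classes(trace):
    """Return the distinct missing class names in a trace, in dotted form."""
    found = []
    for raw in _MISSING_CLASS.findall(trace):
        # Normalise the slash form and drop a trailing sentence period, if any.
        name = raw.replace("/", ".").rstrip(".")
        if name not in found:
            found.append(name)
    return found
```

For the trace in this message, the only name reported would be org.opengis.referencing.FactoryException, which is consistent with a missing GeoTools/OpenGIS jar rather than a problem in the Python code.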

Regards,
G. Dugernier

-- 



Grégory Dugernier
Software Engineer

gd@aloalto.com <fp...@aloalto.com>
+32 (0)484 11 26 09

www.aloalto.com
+32 (0)2 736 10 17

-- 




DISCLAIMER : The content of this e-mail
message does not constitute a 
commitment of S.A. ALOALTO N.V. or its
subsidiaries/affiliates. This e-mail 
and any attachments thereto may contain
information which is confidential 
and/or protected by intellectual property
rights and are intended for the 
intended recipient only. Any use of the
information contained herein 
(including, but not limited to, total or partial
reproduction, 
communication or distribution in any form) by persons other than
the 
designated recipient(s) is prohibited. If an addressing or transmission
error has misdirected this e-mail, please notify the author, either by
telephone or by e-mail and delete the material from any computer. 


Re: [Bug][Python] Missing Java Class?

Posted by Netanel Malka <ma...@apache.org>.
Hi Gregory,
Can you please try installing the jars directly on the Databricks cluster?

For example:
Clusters -> choose your cluster -> Libraries -> Install New:
1. Coordinates: org.geotools:gt-main:24.0
2. Repository: https://repo.osgeo.org/repository/release/

This worked for me.
 
Please let me know if it solves your problem
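
The same UI steps can probably be scripted. The sketch below only builds the JSON body and sends nothing; it assumes the Databricks Libraries API install endpoint (POST /api/2.0/libraries/install) accepts a `maven` library entry with `coordinates` and `repo` fields, which should be checked against the Databricks documentation. The helper name and the cluster ID are hypothetical; the two extra GeoTools coordinates are the ones listed in Jia Yu's quoted reply below.

```python
import json


def maven_library_payload(cluster_id, coordinates, repo=None):
    """Build the JSON body for a Libraries API install request."""
    libraries = []
    for coordinate in coordinates:
        maven = {"coordinates": coordinate}
        if repo:
            # Needed for artifacts that are not on Maven Central, like GeoTools.
            maven["repo"] = repo
        libraries.append({"maven": maven})
    return {"cluster_id": cluster_id, "libraries": libraries}


payload = maven_library_payload(
    cluster_id="0000-000000-example0",  # hypothetical cluster ID
    coordinates=[
        "org.geotools:gt-main:24.0",
        "org.geotools:gt-referencing:24.0",
        "org.geotools:gt-epsg-hsql:24.0",
    ],
    repo="https://repo.osgeo.org/repository/release/",
)
print(json.dumps(payload, indent=2))
```

Keeping the repository as an explicit argument mirrors the "Repository" field in the UI, so each library entry carries the non-default repository it has to be resolved from.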

On 2021/02/10 13:16:50, Grégory Dugernier <gd...@aloalto.com> wrote: 
> Thank you for the quick reply!
> 
> It seems my particular situation is a bit more complex than that, since I'm
> running the notebook on a Databricks cluster, and the default Spark config
> doesn't seem to allow for additional jar repositories (GeoTools isn't on Maven
> Central), nor does creating a new SparkSession appear to work. I've tried
> to download the jars and add them manually to the cluster, but that doesn't
> seem to work either. But at least I know where the issue is!
> 
> Thanks again for your help,
> Regards
> 
> On Wed, 10 Feb 2021 at 12:22, Jia Yu <ji...@apache.org> wrote:
> 
> > Hi Gregory,
> >
> > Thanks for letting us know. This is not a bug. We cannot include GeoTools
> > jars due to license issues. But we did forget to update the docs and
> > Jupyter notebook examples. I just updated them. Please read them here:
> >
> >
> > https://github.com/apache/incubator-sedona/blob/master/python/ApacheSedonaSQL.ipynb
> >
> > (Make sure you disable the browser cache or open it in an incognito
> > window)
> > http://sedona.apache.org/download/overview/#install-sedona-python
> >
> > In short, you need to add the following coordinates in the notebook:
> >
> > from pyspark.sql import SparkSession
> > from sedona.utils import KryoSerializer, SedonaKryoRegistrator
> >
> > spark = SparkSession. \
> >     builder. \
> >     appName('appName'). \
> >     config("spark.serializer", KryoSerializer.getName). \
> >     config("spark.kryo.registrator", SedonaKryoRegistrator.getName). \
> >     config("spark.jars.repositories",
> >            'https://repo.osgeo.org/repository/release,'
> >            'https://download.java.net/maven/2'). \
> >     config('spark.jars.packages',
> >            'org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating,'
> >            'org.geotools:gt-main:24.0,'
> >            'org.geotools:gt-referencing:24.0,'
> >            'org.geotools:gt-epsg-hsql:24.0'). \
> >     getOrCreate()
> >

Re: [Bug][Python] Missing Java Class?

Posted by Grégory Dugernier <gd...@aloalto.com>.
Honestly, not the worst thing I had to compile on Windows. Aside from the
Javadoc step and that weird issue with the dynamic versioning, it went
pretty smoothly; it just took a while because every try was taking a fair
bit of time.

I'll keep an eye on further announcements to see what comes next!

Thanks for your time

On Thu, 11 Feb 2021 at 11:06, Jia Yu <ji...@apache.org> wrote:

> Thanks for letting us know. Yes, our source code is not supposed to be
> compiled on Windows. I didn't expect it to be so much trouble to get this
> jar. We will figure out a better way to solve this issue soon.
>
> On Thu, Feb 11, 2021 at 1:46 AM Grégory Dugernier <gd...@aloalto.com> wrote:
>
>>> In fact, you should let us know about your situation early on. In fact,
>>> you can download the GeoTools jars manually and copy to SPARK_HOME/jars/
>>> folder... You don't have to compile the code. Download links are given in
>>> the comments:
>>> http://sedona.apache.org/download/GeoSpark-All-Modules-Maven-Central-Coordinates/#geotools-240
>>
>>
>> I did copy the GeoTools jars and added them to my cluster library, but
>> python-adapter didn't seem to find them in the FileStore. Placing the jars
>> inside SPARK_HOME on the cluster means first determining where the
>> environment variable points to inside the DBFS architecture, then most
>> likely adding them through CLI commands. This presented several short-term
>> obstacles, but also raised many issues down the line, because we are
>> deploying our clusters through Terraform and not all developers will have
>> the elevated permissions to perform CLI commands. A single, compiled jar
>> with all the dependencies within can easily be deployed at cluster creation
>> with a databricks_dbfs_file
>> <https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/dbfs_file>
>> resource and using the library.jar property of databricks_cluster
>> <https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/cluster#library-configuration-block>.
>> The jar ended up being a bit of a headache to produce, but it keeps things
>> high-level and easier to maintain.
>>
>> That is, of course, unless I'm missing the obvious and there was an easy
>> way to add GeoTools jars on the Databricks cluster and let
>> sedona-python-adapter find them, which isn't entirely excluded.
>>
>> On Thu, 11 Feb 2021 at 10:03, Jia Yu <ji...@apache.org> wrote:
>>
>>> Thanks, Gregory. I think this behavior is not expected. We will look
>>> into this.
>>>
>>> In fact, you should let us know about your situation early on. In fact,
>>> you can download the GeoTools jars manually and copy to SPARK_HOME/jars/
>>> folder... You don't have to compile the code. Download links are given in
>>> the comments:
>>> http://sedona.apache.org/download/GeoSpark-All-Modules-Maven-Central-Coordinates/#geotools-240
>>>
>>> We should make our docs clearer.
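
For a self-managed Spark installation, the manual copy suggested above could look like the sketch below. It assumes SPARK_HOME is set (or passed in) and contains a jars/ directory, and that the GeoTools jars have already been downloaded; the function name is hypothetical. As discussed in this thread, it does not by itself address the Databricks case, where locating SPARK_HOME on the cluster is the hard part.

```python
import os
import shutil
from pathlib import Path


def copy_jars_to_spark_home(jar_paths, spark_home=None):
    """Copy the given jar files into <SPARK_HOME>/jars and return the new paths."""
    spark_home = spark_home or os.environ.get("SPARK_HOME")
    if not spark_home:
        raise RuntimeError("SPARK_HOME is not set")
    target_dir = Path(spark_home) / "jars"
    if not target_dir.is_dir():
        raise FileNotFoundError(f"{target_dir} does not exist")
    copied = []
    for jar in jar_paths:
        destination = target_dir / Path(jar).name
        shutil.copy2(jar, destination)  # keeps file metadata such as mtime
        copied.append(destination)
    return copied
```

Spark has to be restarted after the copy for the new jars to end up on the classpath.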
>>>
>>>
>>> On Thu, Feb 11, 2021 at 12:44 AM Grégory Dugernier <gd...@aloalto.com>
>>> wrote:
>>>
>>>> Hi Jia,
>>>>
>>>> After much sweat and tears, I went the long road and compiled the code
>>>> locally. I'm working on Windows, so I had to change a few things in the
>>>> pom.xml:
>>>>
>>>>    - When trying to compile just the python-adapter lib, Maven didn't
>>>>    like the dynamic versioning of sedona-core and sedona-sql, so I had to
>>>>    hardcode the current version.
>>>>    - For some reason, Maven couldn't find spark-version-converter from
>>>>    within the python-adapter directory, so I just decided to compile the full
>>>>    library. It might be possible to compile just the adapter, but pushing
>>>>    further in this direction seemed like it would take longer.
>>>>    - When trying to compile the full library, the attach-javadoc goal
>>>>    just kept erroring out, even with the latest version of
>>>>    maven-javadoc-plugin, so I just removed it entirely.
>>>>
>>>> By the end, I got the jar, uploaded it to Databricks, and it works like
>>>> a charm so far.
>>>>
>>>> I did, however, run into another issue: it seems that when using
>>>> *ShapefileReader.readToGeometryRDD(spark.sparkContext, file_url)* to read
>>>> multiple Shapefiles at once and then using the Adapter, same-named columns
>>>> aren't combined in the resulting DataFrame (see example below). It might be
>>>> normal RDD behavior (I have little experience using them instead of
>>>> DataFrames), and I already found a workaround by creating multiple
>>>> DataFrames and using union(), but I prefer to let you know in case it
>>>> isn't the expected behavior.
>>>> [image: image.png]
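
The union workaround described above could be sketched roughly as follows. This is an illustration rather than the exact code used: it swaps in DataFrame.unionByName() instead of plain union(), because union() matches columns by position while unionByName() matches them by name. The helper name and `shapefile_dirs` are hypothetical.

```python
from functools import reduce


def union_by_name(frames):
    """Fold a non-empty sequence of DataFrames into one with unionByName()."""
    frames = list(frames)
    if not frames:
        raise ValueError("need at least one DataFrame")
    return reduce(lambda left, right: left.unionByName(right), frames)


# Usage sketch (needs a running SparkSession with Sedona registered;
# `shapefile_dirs` is a hypothetical list of Shapefile locations):
#
#   frames = [
#       Adapter.toDf(ShapefileReader.readToGeometryRDD(spark.sparkContext, d), spark)
#       for d in shapefile_dirs
#   ]
#   df = union_by_name(frames)
```

Reading each Shapefile into its own DataFrame first keeps one schema per source, so a mismatch in column names shows up as an error in unionByName() instead of silently producing misaligned columns.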
>>>>
>>>> Regards,
>>>> Grégory
>>>>
>>>> On Thu, 11 Feb 2021 at 07:58, Jia Yu <ji...@apache.org> wrote:
>>>>
>>>>> Hi Gregory,
>>>>>
>>>>> Please let us know if you get your issue fixed. I know many of our
>>>>> users are also using Databricks clusters. We are also interested in the
>>>>> solution.
>>>>>
>>>>> Thanks,
>>>>> Jia
>>>>>
>



Re: [Bug][Python] Missing Java Class?

Posted by Jia Yu <ji...@apache.org>.
Thanks for letting us know. Yes, our source code is not supposed to be
compiled on Windows. I didn't expect it to be so much trouble to get this
jar. We will figure out a better way to solve this issue soon.


Re: [Bug][Python] Missing Java Class?

Posted by Grégory Dugernier <gd...@aloalto.com>.
>
> You should have let us know about your situation earlier. In fact, you can
> download the GeoTools jars manually and copy them to the SPARK_HOME/jars/
> folder. You don't have to compile the code. Download links are given in
> the comments:
> http://sedona.apache.org/download/GeoSpark-All-Modules-Maven-Central-Coordinates/#geotools-240


I did copy the GeoTools jars and added them to my cluster as libraries, but
python-adapter didn't seem to find them in the FileStore. Placing the jars
inside SPARK_HOME on the cluster means first determining where the
environment variable points inside the DBFS architecture, then most
likely adding them through CLI commands. This presented several short-term
obstacles, but also raised issues down the line, because we deploy our
clusters through Terraform and not all developers will have the elevated
permissions needed to run CLI commands. A single compiled jar with all the
dependencies inside can easily be deployed at cluster creation with a
databricks_dbfs_file
<https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/dbfs_file>
resource and the library.jar property of databricks_cluster
<https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/cluster#library-configuration-block>.
The jar ended up being a bit of a headache to produce, but it keeps things
high level and easier to maintain.

That is, of course, unless I'm missing the obvious and there was an easy
way to add the GeoTools jars to the Databricks cluster and let
sedona-python-adapter find them, which I can't entirely rule out.
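
For anyone who wants to check where SPARK_HOME points and whether any
GeoTools jars are already there, a minimal sketch run from a notebook cell
could look like the one below. It assumes SPARK_HOME is set in the driver's
environment and that the GeoTools jar file names start with "gt-" (as the
gt-main, gt-referencing and gt-epsg-hsql artifacts suggest); the function
name is hypothetical and nothing here is specific to Sedona or Databricks.

```python
import os
from pathlib import Path


def find_geotools_jars(spark_home=None, prefix="gt-"):
    """Return the names of jars under <spark_home>/jars that start with prefix.

    spark_home defaults to the SPARK_HOME environment variable. An empty list
    is returned when the variable is unset or the jars folder does not exist.
    The "gt-" prefix is an assumption about how the GeoTools jars are named.
    """
    spark_home = spark_home or os.environ.get("SPARK_HOME")
    if not spark_home:
        return []
    jars_dir = Path(spark_home) / "jars"
    return sorted(p.name for p in jars_dir.glob(f"{prefix}*.jar"))


if __name__ == "__main__":
    print("SPARK_HOME =", os.environ.get("SPARK_HOME"))
    print("GeoTools jars found:", find_geotools_jars())
```

An empty list only says the jars are not in that folder; it says nothing
about libraries attached to the cluster in other ways.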

On Thu, 11 Feb 2021 at 10:03, Jia Yu <ji...@apache.org> wrote:

> Thanks, Gregory. I think this behavior is not expected. We will look into
> this.
>
> You should have let us know about your situation earlier. In fact, you can
> download the GeoTools jars manually and copy them to the SPARK_HOME/jars/
> folder. You don't have to compile the code. Download links are given in
> the comments:
> http://sedona.apache.org/download/GeoSpark-All-Modules-Maven-Central-Coordinates/#geotools-240
>
> We should make our docs clearer.
>
>
> On Thu, Feb 11, 2021 at 12:44 AM Grégory Dugernier <gd...@aloalto.com> wrote:
>
>> Hi Jia,
>>
>> After much sweat and tears, I went the long road and compiled the code
>> locally. I'm working on Windows so I had to change a few things in the
>> POM.xml:
>>
>>    - When trying to compile just the python-adapter lib, Maven didn't
>>    like the dynamic versioning of sedona-core and sedona-sql, so I had to
>>    hardcode the current version.
>>    - For some reason, Maven couldn't find spark-version-converter from
>>    within the python-adapter directory, so I decided to compile the full
>>    library. It might be possible to compile just the adapter, but pushing
>>    further in that direction seemed like it would take longer.
>>    - When trying to compile the full library, the attach-javadoc goal
>>    kept erroring out, even with the latest version of
>>    maven-javadoc-plugin, so I removed it entirely.
>>
>> By the end, I got the jar, uploaded it in Databricks and it works like a
>> charm so far.
>>
>> I did, however, run into another issue: when using
>> *ShapefileReader.readToGeometryRDD(spark.sparkContext, file_url)* to read
>> multiple shapefiles at once and then converting the result with the
>> Adapter, same-named columns aren't combined in the resulting DataFrame (see
>> example below). It might be normal RDD behavior (I have little experience
>> using RDDs instead of DataFrames), and I already found a workaround by
>> creating multiple DataFrames and using union(), but I prefer to let you
>> know in case it isn't the expected behavior.
>> [image: image.png]
>>
>> Regards,
>> Grégory
>>
>> On Thu, 11 Feb 2021 at 07:58, Jia Yu <ji...@apache.org> wrote:
>>
>>> Hi Gregory,
>>>
>>> Please let us know if you get your issue fixed. I know many of our users
>>> are also using Databricks clusters. We are also interested in the solution.
>>>
>>> Thanks,
>>> Jia
>>>
>>> On Wed, Feb 10, 2021 at 5:17 AM Grégory Dugernier <gd...@aloalto.com>
>>> wrote:
>>>
>>>> Thank you for the quick reply!
>>>>
>>>> It seems my particular situation is a bit more complex than that, since
>>>> I'm running the notebook on a Databricks cluster, and the default Spark
>>>> config doesn't seem to allow for more jar repositories (GeoTools isn't on
>>>> Maven Central), nor does creating a new SparkSession appear to work. I've
>>>> tried to download the jars and add them manually to the cluster, but that
>>>> doesn't seem to work either. But at least I know where the issue is!
>>>>
>>>> Thanks again for your help,
>>>> Regards
>>>>
>>>> On Wed, 10 Feb 2021 at 12:22, Jia Yu <ji...@apache.org> wrote:
>>>>
>>>>> Hi Gregory,
>>>>>
>>>>> Thanks for letting us know. This is not a bug. We cannot include
>>>>> GeoTools jars due to license issues. But indeed we forgot to update the
>>>>> docs and jupyter notebook examples. I just updated them. Please read them
>>>>> here:
>>>>>
>>>>>
>>>>> https://github.com/apache/incubator-sedona/blob/master/python/ApacheSedonaSQL.ipynb
>>>>>
>>>>> (Make sure you disable the browser cache or open it in an incognito
>>>>> window)
>>>>> http://sedona.apache.org/download/overview/#install-sedona-python
>>>>>
>>>>> In short, you need to add the following coordinates in the notebook:
>>>>>
>>>>> spark = SparkSession. \
>>>>>     builder. \
>>>>>     appName('appName'). \
>>>>>     config("spark.serializer", KryoSerializer.getName). \
>>>>>     config("spark.kryo.registrator", SedonaKryoRegistrator.getName). \
>>>>>     config("spark.jars.repositories",
>>>>>            'https://repo.osgeo.org/repository/release,'
>>>>>            'https://download.java.net/maven/2'). \
>>>>>     config('spark.jars.packages',
>>>>>            'org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating,'
>>>>>            'org.geotools:gt-main:24.0,'
>>>>>            'org.geotools:gt-referencing:24.0,'
>>>>>            'org.geotools:gt-epsg-hsql:24.0'). \
>>>>>     getOrCreate()
>>>>>



Re: [Bug][Python] Missing Java Class?

Posted by Jia Yu <ji...@apache.org>.
Thanks, Gregory. I think this behavior is not expected. We will look into
this.
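
Until this is looked into, the workaround Grégory mentions (reading each
shapefile separately and combining the resulting DataFrames) might be
sketched as follows. This is only an illustration, not a fix in Sedona: the
helper names and the example paths are hypothetical, and unionByName() is
used here in place of the plain union() he describes so that columns are
matched by name rather than by position. On Spark 3.0 it expects every
DataFrame to have the same set of columns.

```python
from functools import reduce


def read_shapefile_df(spark, path):
    """Read one shapefile location into a DataFrame, using the calls from this thread."""
    # Imported lazily so the helpers can be loaded without Sedona installed.
    from sedona.core.formatMapper.shapefileParser import ShapefileReader
    from sedona.utils.adapter import Adapter

    rdd = ShapefileReader.readToGeometryRDD(spark.sparkContext, path)
    return Adapter.toDf(rdd, spark)


def union_all(dfs):
    """Combine DataFrames pairwise, matching columns by name."""
    dfs = list(dfs)
    if not dfs:
        raise ValueError("no DataFrames to combine")
    return reduce(lambda left, right: left.unionByName(right), dfs)


# Usage (hypothetical paths, one shapefile per folder):
# paths = ["/mnt/shapes/a", "/mnt/shapes/b"]
# df = union_all(read_shapefile_df(spark, p) for p in paths)
```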

You should have let us know about your situation earlier. In fact, you can
download the GeoTools jars manually and copy them to the SPARK_HOME/jars/
folder. You don't have to compile the code. Download links are given in
the comments:
http://sedona.apache.org/download/GeoSpark-All-Modules-Maven-Central-Coordinates/#geotools-240

We should make our docs clearer.
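
As a rough sketch of the manual copy step, assuming the GeoTools jars have
already been downloaded into a local folder and that the process is allowed
to write under SPARK_HOME (neither is guaranteed on a managed cluster), it
could look like this. The function name and the example folder are
hypothetical.

```python
import os
import shutil
from pathlib import Path


def copy_jars_to_spark_home(jar_dir, spark_home=None):
    """Copy every *.jar in jar_dir into <spark_home>/jars and return the copied names."""
    spark_home = spark_home or os.environ.get("SPARK_HOME")
    if not spark_home:
        raise RuntimeError("SPARK_HOME is not set")
    target = Path(spark_home) / "jars"
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        shutil.copy2(jar, target / jar.name)
        copied.append(jar.name)
    return copied


# Usage (hypothetical download folder):
# copy_jars_to_spark_home("/tmp/geotools-jars")
```

Spark generally has to be restarted before jars placed in that folder are
picked up.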



Re: [Bug][Python] Missing Java Class?

Posted by Grégory Dugernier <gd...@aloalto.com>.
Hi Jia,

After much sweat and tears, I went the long road and compiled the code
locally. I'm working on Windows, so I had to change a few things in the
pom.xml:

   - When trying to compile just the python-adapter lib, Maven didn't like
   the dynamic versioning of sedona-core and sedona-sql, so I had to hardcode
   the current version.
   - For some reason, Maven couldn't find spark-version-converter from
   within the python-adapter directory, so I decided to compile the full
   library. It might be possible to compile just the adapter, but pushing
   further in that direction seemed like it would take longer.
   - When trying to compile the full library, the attach-javadoc goal kept
   erroring out, even with the latest version of maven-javadoc-plugin, so
   I removed it entirely.

In the end, I got the jar, uploaded it to Databricks, and it works like a
charm so far.

I did, however, run into another issue. When I use
*ShapefileReader.readToGeometryRDD(spark.sparkContext, file_url)* to read
multiple Shapefiles at once and then use the Adapter, same-named columns
aren't combined in the resulting DataFrame (see example below). It might be
normal RDD behavior (I have little experience using RDDs instead of
DataFrames), and I already found a workaround by creating multiple
DataFrames and using union(), but I prefer to let you know in case it isn't
the expected behavior.
[image: image.png]
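
A minimal sketch of the workaround described above: read each Shapefile
location on its own, convert it with the Adapter, and stack the results. The
function names union_shapefiles and read_one_shapefile are made up for
illustration, and the sketch uses unionByName() rather than the plain
union() mentioned above, so columns are matched by name instead of by
position. It assumes every Shapefile exposes the same set of attribute
columns.

```python
from functools import reduce


def read_one_shapefile(spark, path):
    """Read a single Shapefile location into a DataFrame via Sedona."""
    # Imported lazily so union_shapefiles() can be tried without Sedona.
    from sedona.core.formatMapper.shapefileParser import ShapefileReader
    from sedona.utils.adapter import Adapter

    rdd = ShapefileReader.readToGeometryRDD(spark.sparkContext, path)
    return Adapter.toDf(rdd, spark)


def union_shapefiles(spark, paths, read_one=read_one_shapefile):
    """Read each path separately and stack the results into one DataFrame.

    Hypothetical helper: `read_one` is any callable (spark, path) -> DataFrame.
    """
    if not paths:
        raise ValueError("at least one Shapefile path is required")
    frames = [read_one(spark, path) for path in paths]
    # unionByName matches columns by name rather than by position.
    return reduce(lambda left, right: left.unionByName(right), frames)
```

With an existing session this would be called as
union_shapefiles(spark, [path_a, path_b]), where path_a and path_b stand in
for the real Shapefile directories.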

Regards,
Grégory




Re: [Bug][Python] Missing Java Class?

Posted by Jia Yu <ji...@apache.org>.
Hi Gregory,

Please let us know if you get your issue fixed. I know many of our users
also run on Databricks clusters, so we are interested in the solution too.

Thanks,
Jia


Re: [Bug][Python] Missing Java Class?

Posted by Grégory Dugernier <gd...@aloalto.com>.
Thank you for the quick reply!

It seems my particular situation is a bit more complex than that, since I'm
running the notebook on a Databricks cluster. The default Spark config
doesn't seem to allow additional jar repositories (GeoTools isn't on Maven
Central), and creating a new SparkSession doesn't appear to work. I've also
tried downloading the jars and adding them to the cluster manually, without
success. But at least I know where the issue is!

Thanks again for your help,
Regards
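
For the manual-download route mentioned above, the jar URLs can be derived from the Maven coordinates, because Maven repositories follow a fixed directory layout (group path, artifact, version, then artifact-version.jar). The sketch below is only an illustration: the helper name maven_jar_url is hypothetical, and the repository URL and the three coordinates are taken from the reply in this thread. A manual download does not resolve transitive dependencies, so these three jars alone may not be enough; the thread does not say which additional artifacts would be required.

```python
# Hypothetical helper (not part of Sedona or Spark): builds jar download URLs
# from Maven coordinates using the standard Maven repository layout.
OSGEO_RELEASES = "https://repo.osgeo.org/repository/release"


def maven_jar_url(coordinate: str, repo: str = OSGEO_RELEASES) -> str:
    """Turn 'group:artifact:version' into the URL of its jar inside `repo`."""
    group, artifact, version = coordinate.split(":")
    group_path = group.replace(".", "/")  # org.geotools -> org/geotools
    return (
        f"{repo.rstrip('/')}/{group_path}/{artifact}/{version}/"
        f"{artifact}-{version}.jar"
    )


# Coordinates quoted in the thread; transitive dependencies are NOT included.
GEOTOOLS_COORDINATES = [
    "org.geotools:gt-main:24.0",
    "org.geotools:gt-referencing:24.0",
    "org.geotools:gt-epsg-hsql:24.0",
]

for coordinate in GEOTOOLS_COORDINATES:
    print(maven_jar_url(coordinate))
```

The printed URLs can then be fetched and the jars attached to the cluster as libraries. Whether the cluster's class loader picks them up is a separate question that this sketch does not address.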


Re: [Bug][Python] Missing Java Class?

Posted by Jia Yu <ji...@apache.org>.
Hi Gregory,

Thanks for letting us know. This is not a bug: we cannot bundle the GeoTools
jars because of license issues. However, we did forget to update the docs
and the Jupyter notebook examples. I have just updated them. Please read
them here:

https://github.com/apache/incubator-sedona/blob/master/python/ApacheSedonaSQL.ipynb

(Make sure you disable the browser cache or open it in an incognito window.)
http://sedona.apache.org/download/overview/#install-sedona-python

In short, you need to add the following coordinates in the notebook:

from pyspark.sql import SparkSession
from sedona.utils import KryoSerializer, SedonaKryoRegistrator

spark = SparkSession. \
    builder. \
    appName('appName'). \
    config("spark.serializer", KryoSerializer.getName). \
    config("spark.kryo.registrator", SedonaKryoRegistrator.getName). \
    config("spark.jars.repositories",
           'https://repo.osgeo.org/repository/release,'
           'https://download.java.net/maven/2'). \
    config('spark.jars.packages',
           'org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating,'
           'org.geotools:gt-main:24.0,'
           'org.geotools:gt-referencing:24.0,'
           'org.geotools:gt-epsg-hsql:24.0'). \
    getOrCreate()
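
Both spark.jars.repositories and spark.jars.packages take a single comma-separated string, and stray whitespace introduced by line wrapping can end up inside an entry. As a minimal sketch, the values can be kept in lists and joined programmatically. The helper name join_coordinates is hypothetical; the repository URLs and package coordinates are the ones given above.

```python
# Hypothetical helper (not part of Sedona or Spark): joins entries into the
# comma-separated form used by spark.jars.repositories / spark.jars.packages.
def join_coordinates(items):
    """Join entries with commas, trimming surrounding whitespace."""
    cleaned = [item.strip() for item in items]
    for item in cleaned:
        if not item or " " in item:
            raise ValueError(f"invalid entry: {item!r}")
    return ",".join(cleaned)


REPOSITORIES = [
    "https://repo.osgeo.org/repository/release",
    "https://download.java.net/maven/2",
]

PACKAGES = [
    "org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating",
    "org.geotools:gt-main:24.0",
    "org.geotools:gt-referencing:24.0",
    "org.geotools:gt-epsg-hsql:24.0",
]

jars_repositories = join_coordinates(REPOSITORIES)
jars_packages = join_coordinates(PACKAGES)

print(jars_repositories)
print(jars_packages)
```

The two strings can then be passed as config("spark.jars.repositories", jars_repositories) and config("spark.jars.packages", jars_packages). These settings are read when the Spark application starts, so if a session already exists, getOrCreate() returns that session and the new package settings are unlikely to take effect.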
