Posted to geospatial@apache.org by Martin Desruisseaux <ma...@geomatys.com> on 2023/11/08 22:04:32 UTC

GeoTIFF reader/writer performance comparison

Hello all

Apache SIS 1.4 contains a Cloud Optimized GeoTIFF reader, and a GeoTIFF
writer is currently available in 1.5-SNAPSHOT (not yet fully completed).
Both the reader and the writer are in pure Java (ignoring native code
provided by the JDK itself). There is a widespread belief that Java code
would surely be slower than C/C++ code. To verify, we conducted a
superficial benchmark. It is very superficial in that we tested only one
image with one compression method, no sub-region, no subsampling, no
parallelization, no reprojection, etc. But the results nevertheless call
the above-cited belief into question. In that benchmark, the performance
of Apache SIS and GDAL was equivalent. The numbers actually report Apache
SIS as very slightly faster than GDAL, but we cannot conclude much because
of all the above-cited limitations, because there is a possibility of bias
in time measurements (we tried to compensate for it), because the
differences are close to the margin of error, and because Apache SIS spent
95% of its time in the native code of the `java.util.zip` package.
Assuming that GDAL also uses the `zlib` library (we did not test with
`libdeflate`), 95% of this benchmark is actually measuring the same code.
However, for the remaining 5%, it suggests that the interfacing between
Java and the native code in the `java.nio` and `java.util.zip` packages
can be as efficient as the interfacing between a C/C++ application and
native libraries.
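
To illustrate what "time spent in the native code of `java.util.zip`" 
means, here is a minimal sketch that times Deflate decompression through 
`java.util.zip.Inflater`. It is not the benchmark code and not Apache SIS 
code: the class name `InflateTiming`, the synthetic 4 MiB buffer standing 
in for an image tile, and the warm-up and run counts are all assumptions 
made for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical helper, for illustration only (not part of Apache SIS). */
final class InflateTiming {
    /** Compresses the given bytes with Deflate (implemented natively by the JDK). */
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Decompresses into a buffer of the expected (original) length. */
    static byte[] inflate(byte[] compressed, int originalLength) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] result = new byte[originalLength];
        int offset = 0;
        while (offset < result.length && !inflater.finished()) {
            int n = inflater.inflate(result, offset, result.length - offset);
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                break;  // Truncated or unsupported input: stop instead of looping forever.
            }
            offset += n;
        }
        inflater.end();
        return result;
    }

    public static void main(String[] args) throws DataFormatException {
        // Synthetic "tile": a smooth ramp, which compresses well (an assumption, not real imagery).
        byte[] tile = new byte[4 * 1024 * 1024];
        for (int i = 0; i < tile.length; i++) {
            tile[i] = (byte) ((i % 2048) / 8);
        }
        byte[] compressed = deflate(tile);

        // Warm-up so that the JIT compiler has processed the Java-side loop before timing.
        for (int i = 0; i < 20; i++) {
            inflate(compressed, tile.length);
        }
        int runs = 50;
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            inflate(compressed, tile.length);
        }
        long elapsed = System.nanoTime() - start;

        boolean ok = Arrays.equals(tile, inflate(compressed, tile.length));
        System.out.printf("round trip ok: %b, mean inflate time: %.3f ms%n",
                          ok, elapsed / (runs * 1e6));
    }
}
```

A timing of this kind mostly measures `zlib` itself, which is why a 
difference between two libraries that both delegate to it is hard to 
interpret.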

More details are there: 
https://www.geomatys.com/2023/10/05/geotiff-reader-writer-performance-comparison/

     Martin



Re: GeoTIFF reader/writer performance comparison

Posted by Martin Desruisseaux <ma...@geomatys.com>.
I forgot to say: the snapshots are deployed on the Apache snapshots 
repository. So it is possible to start experimenting with the GeoTIFF 
writer if Sedona wishes. It would be an opportunity to fix reported issues 
before a SIS 1.5 release. For using those snapshots in a Maven project, 
the following should work: 
https://github.com/Geomatys/sis-benchmarks/blob/main/pom.xml

     Martin


On 2023-11-09 at 11:05, Martin Desruisseaux wrote:

> On 2023-11-09 at 08:50, Jia Yu wrote:
>> One follow-up question, when is the target release date of SIS 1.5 
>> which includes the COG writer? Sedona really needs a pure Java COG 
>> writer without GDAL.
> It is up to this community to decide. Contrary to the 1.4 release, 
> which was a big restructuring effort, we are not in the middle of work 
> blocking a release right now. But since the writer is new, I doubt 
> that it is free of bugs, so I think that we need at least one month for 
> experimenting with more images. After that point, we can try to align 
> with the needs of Sedona or Baremap. So if the Sedona community thinks 
> that they may start trying the writer in two months (for example), we 
> can plan a 1.5 release around that time. If issues are discovered that 
> need quick fixes, we can try to do minor releases.



Re: GeoTIFF reader/writer performance comparison

Posted by Jia Yu <ji...@apache.org>.
CC Sedona dev in the loop

On Mon, Nov 20, 2023 at 17:37 Martin Desruisseaux <
martin.desruisseaux@geomatys.com> wrote:

> Hello Jia
>
> No worry for the delay. I'm often late myself.
>
>
> On 2023-11-20 at 05:56, Jia Yu wrote:
>
> > the Sedona community will start to try the COG writer in late January
> > or early Feb 2024. By that time, we will drop the support of Java 8 to
> > be aligned with Apache SIS.
> >
> Thanks for the information. We can target a SIS 1.5 release in January
> then. In the meantime, snapshots will continue to be pushed from time
> to time to the Apache snapshot repositories. Please let me know if you
> face issues.
>
>
> > In addition, do you have any plan to implement a native Java reader
> > and writer for PROJJSON strings? (…snip…) We recently ran into a
> > problem in the Sedona GeoParquet reader and writer: no existing pure
> > Java library can read and write a CRS in PROJJSON format. However,
> > PROJJSON is the CRS representation required by the GeoParquet spec:
> >
> This is under discussion. PROJJSON is currently a format which is
> specific to the PROJ project, not yet a standard. However, it has been
> brought to the Open Geospatial Consortium (OGC) CRS Standard Working
> Group (SWG) for standardization. I'm a member of this group and attended
> the initial discussion. The group is open to standardizing that
> format, but may want to make some changes if they consider that some
> aspects of PROJJSON are incorrect. On the other hand, PROJ is reluctant
> to allow changes because PROJJSON is already in use. In an attempt to
> find an agreement, the CRS SWG has updated its charter as below:
>
> /---- Quote from CRS SWG charter ----/
>
>     The new SWG work item is to define and support an encoding of the
>     self-contained and compact description of Coordinate Reference
>     Systems and Coordinate Operations through a simple JSON schema. The
>     draft standard delivered will be consistent with OGC Abstract
>     Specification Topic 2 and ISO 19111.  This offers the following
>     advantages:
>
>       * Provides a JSON encoding of coordinate reference systems
>         descriptions for use where JSON may be considered a more natural
>         data schema than Well Known Text.
>       * Ensures that a JSON schema is available that will be kept in
>         alignment with OGC Abstract Specification Topic 2 and ISO 19111.
>       * Recognizing the PROJ6 origins of the PROJJSON schema
>         contribution being used as the starting point of this work,
>         effort will be made to try to avoid compatibility issues and to
>         clearly document any issues identified or object names that
>         deviate from the original OGC Abstract Specification Topic 2 and
>         ISO 19111.  If an unresolvable conflict is identified, OGC
>         Abstract Specification Topic 2 and ISO 19111 will guide the
>         resolution and such an issue will be clearly documented.
>
> /---- End of quote ----/
>
> The CRS SWG has not yet started its work on a CRS JSON. In the last 10
> years, progress in the CRS SWG owed a lot to Roger Lott, who
> has done a tremendous amount of work in editing ISO 19111 (a.k.a. OGC
> Topic 2) and 19162 (a.k.a. WKT 2). For now Roger is busy with GGXF,
> which is in the final stage of becoming a standard. Maybe after that, we
> will see progress on CRS JSON.
>
> Meanwhile, GeoParquet has been submitted to become an OGC standard, but
> is not yet a standard. An OGC GeoParquet working group has recently been
> created for that purpose. I joined that group as well, but have not yet
> attended a meeting. I assume that the CRS representation in GeoParquet
> will be a topic of some meetings. We do not know yet what the group
> will decide: will they keep PROJJSON unchanged? Wait for a CRS JSON
> to be standardized before standardizing GeoParquet? Allow an
> alternative encoding such as ISO 19162?
>
> On the Apache SIS side, some JSON parsing exists in the "incubator"
> group of modules. The "incubator" group contains modules that are not
> yet part of Apache SIS releases because they are not considered mature
> yet. One of them is a "Coverage JSON" module. The experience gained
> there may be used for "CRS JSON" support later. Another incubator module
> of potential interest is a Shapefile reader and writer.
>
> About the time frame: we applied for GeoParquet funding made available
> by OGC, with a proposal to do the development in Apache SIS.
>
>
> https://www.ogc.org/requests/funding-available-for-creating-geoparquet-developer-resources/
>
> If our submission is accepted, it would greatly speed up the development,
> with at least a temporary solution for "CRS JSON" due in the time frame
> of that project. If not accepted, we will probably try to do it anyway,
> but it would take longer (we don't know how long).
>
>      Martin
>
>

Re: GeoTIFF reader/writer performance comparison

Posted by Martin Desruisseaux <ma...@geomatys.com>.
Hello Jia

No worry for the delay. I'm often late myself.


On 2023-11-20 at 05:56, Jia Yu wrote:

> the Sedona community will start to try the COG writer in late January 
> or early Feb 2024. By that time, we will drop the support of Java 8 to 
> be aligned with Apache SIS.
>
Thanks for the information. We can target a SIS 1.5 release in January 
then. In the meantime, snapshots will continue to be pushed from time 
to time to the Apache snapshot repositories. Please let me know if you 
face issues.


> In addition, do you have any plan to implement a native Java reader 
> and writer for PROJJSON strings? (…snip…) We recently ran into a 
> problem in the Sedona GeoParquet reader and writer: no existing pure 
> Java library can read and write a CRS in PROJJSON format. However, 
> PROJJSON is the CRS representation required by the GeoParquet spec:
>
This is under discussion. PROJJSON is currently a format which is 
specific to the PROJ project, not yet a standard. However, it has been 
brought to the Open Geospatial Consortium (OGC) CRS Standard Working 
Group (SWG) for standardization. I'm a member of this group and attended 
the initial discussion. The group is open to standardizing that 
format, but may want to make some changes if they consider that some 
aspects of PROJJSON are incorrect. On the other hand, PROJ is reluctant 
to allow changes because PROJJSON is already in use. In an attempt to 
find an agreement, the CRS SWG has updated its charter as below:

/---- Quote from CRS SWG charter ----/

    The new SWG work item is to define and support an encoding of the
    self-contained and compact description of Coordinate Reference
    Systems and Coordinate Operations through a simple JSON schema. The
    draft standard delivered will be consistent with OGC Abstract
    Specification Topic 2 and ISO 19111.  This offers the following
    advantages:

      * Provides a JSON encoding of coordinate reference systems
        descriptions for use where JSON may be considered a more natural
        data schema than Well Known Text.
      * Ensures that a JSON schema is available that will be kept in
        alignment with OGC Abstract Specification Topic 2 and ISO 19111.
      * Recognizing the PROJ6 origins of the PROJJSON schema
        contribution being used as the starting point of this work,
        effort will be made to try to avoid compatibility issues and to
        clearly document any issues identified or object names that
        deviate from the original OGC Abstract Specification Topic 2 and
        ISO 19111.  If an unresolvable conflict is identified, OGC
        Abstract Specification Topic 2 and ISO 19111 will guide the
        resolution and such an issue will be clearly documented.

/---- End of quote ----/

The CRS SWG has not yet started its work on a CRS JSON. In the last 10 
years, progress in the CRS SWG owed a lot to Roger Lott, who 
has done a tremendous amount of work in editing ISO 19111 (a.k.a. OGC 
Topic 2) and 19162 (a.k.a. WKT 2). For now Roger is busy with GGXF, 
which is in the final stage of becoming a standard. Maybe after that, we 
will see progress on CRS JSON.

Meanwhile, GeoParquet has been submitted to become an OGC standard, but 
is not yet a standard. An OGC GeoParquet working group has recently been 
created for that purpose. I joined that group as well, but have not yet 
attended a meeting. I assume that the CRS representation in GeoParquet 
will be a topic of some meetings. We do not know yet what the group 
will decide: will they keep PROJJSON unchanged? Wait for a CRS JSON 
to be standardized before standardizing GeoParquet? Allow an 
alternative encoding such as ISO 19162?

On the Apache SIS side, some JSON parsing exists in the "incubator" 
group of modules. The "incubator" group contains modules that are not 
yet part of Apache SIS releases because they are not considered mature 
yet. One of them is a "Coverage JSON" module. The experience gained 
there may be used for "CRS JSON" support later. Another incubator module 
of potential interest is a Shapefile reader and writer.

About the time frame: we applied for GeoParquet funding made available 
by OGC, with a proposal to do the development in Apache SIS.

    https://www.ogc.org/requests/funding-available-for-creating-geoparquet-developer-resources/

If our submission is accepted, it would greatly speed up the development, 
with at least a temporary solution for "CRS JSON" due in the time frame 
of that project. If not accepted, we will probably try to do it anyway, 
but it would take longer (we don't know how long).

     Martin


Re: GeoTIFF reader/writer performance comparison

Posted by Jia Yu <ji...@apache.org>.
Hi Martin,

Sorry for the late reply. I totally forgot this thread. Yes, the
Sedona community will start to try the COG writer in late January or
early Feb 2024. By that time, we will drop the support of Java 8 to be
aligned with Apache SIS.

In addition, do you have any plan to implement a native Java reader
and writer for PROJJSON strings? I noticed that you are the main
maintainer of PROJ-JNI: https://github.com/OSGeo/PROJ-JNI
We recently ran into a problem in the Sedona GeoParquet reader and
writer: no existing pure Java library can read and write a CRS in
PROJJSON format. However, PROJJSON is the CRS representation
required by the GeoParquet spec:
https://github.com/opengeospatial/geoparquet/blob/main/format-specs/geoparquet.md


Thanks,
Jia

On Thu, Nov 9, 2023 at 2:05 AM Martin Desruisseaux
<ma...@geomatys.com> wrote:
>
> On 2023-11-09 at 08:50, Jia Yu wrote:
>
> > One follow-up question, when is the target release date of SIS 1.5
> > which includes the COG writer? Sedona really needs a pure Java COG
> > writer without GDAL.
> >
> It is up to this community to decide. Contrary to the 1.4 release,
> which was a big restructuring effort, we are not in the middle of work
> blocking a release right now. But since the writer is new, I doubt that
> it is free of bugs, so I think that we need at least one month for
> experimenting with more images. After that point, we can try to align
> with the needs of Sedona or Baremap. So if the Sedona community thinks
> that they may start trying the writer in two months (for example), we
> can plan a 1.5 release around that time. If issues are discovered that
> need quick fixes, we can try to do minor releases.
>
>      Martin
>
>

Re: GeoTIFF reader/writer performance comparison

Posted by Martin Desruisseaux <ma...@geomatys.com>.
On 2023-11-09 at 08:50, Jia Yu wrote:

> One follow-up question, when is the target release date of SIS 1.5 
> which includes the COG writer? Sedona really needs a pure Java COG 
> writer without GDAL.
>
It is up to this community to decide. Contrary to the 1.4 release, 
which was a big restructuring effort, we are not in the middle of work 
blocking a release right now. But since the writer is new, I doubt that 
it is free of bugs, so I think that we need at least one month for 
experimenting with more images. After that point, we can try to align 
with the needs of Sedona or Baremap. So if the Sedona community thinks 
that they may start trying the writer in two months (for example), we 
can plan a 1.5 release around that time. If issues are discovered that 
need quick fixes, we can try to do minor releases.

     Martin



Re: GeoTIFF reader/writer performance comparison

Posted by Jia Yu <ji...@apache.org>.
Martin,

One follow-up question, when is the target release date of SIS 1.5
which includes the COG writer? Sedona really needs a pure Java COG
writer without GDAL.

Thanks,
Jia


On Wed, Nov 8, 2023 at 11:38 PM Jia Yu <ji...@apache.org> wrote:
>
> Hi Martin,
>
> This is great work! Looking forward to the SIS 1.5 release!
>
> Thanks,
> Jia
>
> On Wed, Nov 8, 2023 at 11:28 PM Bertil Chapuis <bc...@gmail.com> wrote:
> >
> > Really interesting benchmark, thank you.
> >
> > > On 8 Nov 2023, at 23:04, Martin Desruisseaux <ma...@geomatys.com> wrote:
> > >
> > > Hello all
> > >
> > > Apache SIS 1.4 contains a Cloud Optimized GeoTIFF reader, and a GeoTIFF writer is currently available in 1.5-SNAPSHOT (not yet fully completed). Those reader/writer are in pure Java (ignoring native code provided by the JDK itself). There is a widespread belief that a Java code would surely be slower than C/C++ code. To verify, we conducted a superficial benchmark. It is very superficial in that we tested only one image with one compression method, no sub-region, no subsampling, no parallelization, no reprojection, etc. But the results nevertheless question the above-cited belief. In that benchmark, Apache SIS and GDAL performances were equivalent. The numbers actually report Apache SIS as very slightly faster than GDAL, but we cannot conclude much because of all above-cited limitations, because there is a possibility of biais in time measurements (we tried to compensate it), because the differences are close to margin errors, and because Apache SIS spent 95% of its time in the native code of the `java.util.zip` package. Assuming that GDAL also uses the `zlib` library (we did not tested with `libdeflate`), 95% of this benchmark is actually measuring the same code. However for the remaining 5%, it suggests that the interfacing between Java and the native code in `java.nio` and `java.util.zip` packages can be as efficient as the interfacing between a C/C++ application and native libraries.
> > >
> > > More details are there: https://www.geomatys.com/2023/10/05/geotiff-reader-writer-performance-comparison/
> > >
> > >     Martin
> > >
> > >
> >

Re: GeoTIFF reader/writer performance comparison

Posted by Jia Yu <ji...@apache.org>.
Hi Martin,

This is great work! Looking forward to the SIS 1.5 release!

Thanks,
Jia

On Wed, Nov 8, 2023 at 11:28 PM Bertil Chapuis <bc...@gmail.com> wrote:
>
> Really interesting benchmark, thank you.
>
> > On 8 Nov 2023, at 23:04, Martin Desruisseaux <ma...@geomatys.com> wrote:
> >
> > Hello all
> >
> > Apache SIS 1.4 contains a Cloud Optimized GeoTIFF reader, and a GeoTIFF writer is currently available in 1.5-SNAPSHOT (not yet fully completed). Those reader/writer are in pure Java (ignoring native code provided by the JDK itself). There is a widespread belief that a Java code would surely be slower than C/C++ code. To verify, we conducted a superficial benchmark. It is very superficial in that we tested only one image with one compression method, no sub-region, no subsampling, no parallelization, no reprojection, etc. But the results nevertheless question the above-cited belief. In that benchmark, Apache SIS and GDAL performances were equivalent. The numbers actually report Apache SIS as very slightly faster than GDAL, but we cannot conclude much because of all above-cited limitations, because there is a possibility of biais in time measurements (we tried to compensate it), because the differences are close to margin errors, and because Apache SIS spent 95% of its time in the native code of the `java.util.zip` package. Assuming that GDAL also uses the `zlib` library (we did not tested with `libdeflate`), 95% of this benchmark is actually measuring the same code. However for the remaining 5%, it suggests that the interfacing between Java and the native code in `java.nio` and `java.util.zip` packages can be as efficient as the interfacing between a C/C++ application and native libraries.
> >
> > More details are there: https://www.geomatys.com/2023/10/05/geotiff-reader-writer-performance-comparison/
> >
> >     Martin
> >
> >
>

Re: GeoTIFF reader/writer performance comparison

Posted by Bertil Chapuis <bc...@gmail.com>.
Really interesting benchmark, thank you.

> On 8 Nov 2023, at 23:04, Martin Desruisseaux <ma...@geomatys.com> wrote:
> 
> Hello all
> 
> Apache SIS 1.4 contains a Cloud Optimized GeoTIFF reader, and a GeoTIFF writer is currently available in 1.5-SNAPSHOT (not yet fully completed). Those reader/writer are in pure Java (ignoring native code provided by the JDK itself). There is a widespread belief that a Java code would surely be slower than C/C++ code. To verify, we conducted a superficial benchmark. It is very superficial in that we tested only one image with one compression method, no sub-region, no subsampling, no parallelization, no reprojection, etc. But the results nevertheless question the above-cited belief. In that benchmark, Apache SIS and GDAL performances were equivalent. The numbers actually report Apache SIS as very slightly faster than GDAL, but we cannot conclude much because of all above-cited limitations, because there is a possibility of biais in time measurements (we tried to compensate it), because the differences are close to margin errors, and because Apache SIS spent 95% of its time in the native code of the `java.util.zip` package. Assuming that GDAL also uses the `zlib` library (we did not tested with `libdeflate`), 95% of this benchmark is actually measuring the same code. However for the remaining 5%, it suggests that the interfacing between Java and the native code in `java.nio` and `java.util.zip` packages can be as efficient as the interfacing between a C/C++ application and native libraries.
> 
> More details are there: https://www.geomatys.com/2023/10/05/geotiff-reader-writer-performance-comparison/
> 
>     Martin
> 
> 

