Posted to dev@asterixdb.apache.org by Riyafa Abdul Hameed <ri...@apache.org> on 2017/06/09 09:24:29 UTC

Support GeoJSON in AsterixDB

Hi all,

As the first step in resolving ASTERIXDB-1371, we plan to add support for
GeoJSON[1]. Initially, a datatype named 'Geometry' would be
implemented. Since we plan to use Esri-geometry-api[2] to achieve this, the
internal representation of Geometry objects needs to be in WKB (Well-Known
Binary) format.
GeoJSON is itself defined in JSON. As I understand it, JSON objects
and arrays are currently represented by the Record and OrderedList datatypes,
respectively, but they are not converted into WKB format internally. Thus it
wouldn't be possible to reuse them unless we write our own implementation
to convert these types to WKB format.
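
To make the target format concrete, here is a minimal sketch of what the WKB
encoding of a single 2D point looks like: a one-byte byte-order flag, a 32-bit
geometry type code (1 for Point), and two 64-bit doubles. The class and method
names are hypothetical and are not part of AsterixDB or the Esri library; a
real implementation would presumably delegate the encoding to the library.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not AsterixDB or Esri code.
final class WkbPointSketch {
    private static final byte LITTLE_ENDIAN_FLAG = 1; // WKB byte-order marker: 1 = little-endian
    private static final int WKB_POINT = 1;           // WKB geometry type code for Point

    /** Encodes a 2D point, e.g. GeoJSON "coordinates": [x, y], as a WKB byte array. */
    static byte[] encodePoint(double x, double y) {
        // 1 byte (order flag) + 4 bytes (type code) + 2 * 8 bytes (coordinates)
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + 8 + 8).order(ByteOrder.LITTLE_ENDIAN);
        buf.put(LITTLE_ENDIAN_FLAG);
        buf.putInt(WKB_POINT);
        buf.putDouble(x);
        buf.putDouble(y);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] wkb = encodePoint(102.0, 0.5);
        StringBuilder hex = new StringBuilder();
        for (byte b : wkb) {
            hex.append(String.format("%02x", b));
        }
        // Prints the 21-byte encoding as hex.
        System.out.println(hex);
    }
}
```

A Record holding {"type": "Point", "coordinates": [102.0, 0.5]} carries the
same information, but in a generic field-by-field layout rather than this
compact binary form, which is why a conversion step is needed.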

Hence my question is whether it is a good idea to change the JavaCC grammar
(grammar.jj) to parse Geometry types directly, which could then be
internally represented in the WKB format.

[1] https://tools.ietf.org/html/rfc7946
[2] https://github.com/Esri/geometry-api-java

Thank you.
Yours sincerely,
Riyafa

Re: Support GeoJSON in AsterixDB

Posted by Ahmed Eldawy <el...@ucr.edu>.
Hi,

As you said, the initial plan is to provide a parse_geojson function which
takes a JSON object as input and returns a geometry object, which you added
recently. The way it will work is like this:

   1. The input is a valid JSON file where one of the attributes is a
   GeoJSON object.
   2. Since GeoJSON is represented in standard JSON format, AsterixDB
   will parse it and store it as a complex object.
   3. You will provide a new parse_geojson function which takes an object
   as input and returns a geometry.
   4. At this stage, AsterixDB will not automatically detect that an object
   is GeoJSON. The user will explicitly provide this information by calling
   parse_geojson on the correct attribute.
   5. If the user calls parse_geojson on an object that is not in the
   standard GeoJSON format, it should throw an exception.
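
The steps above can be sketched roughly as follows. This is only an
illustration under stated assumptions: the already-parsed complex object is
modelled as a java.util.Map, the Geometry class is a stand-in for the new
datatype, and the class name, method name and exception type are all
hypothetical rather than actual AsterixDB APIs. The check covers only the
"type" member and the presence of an array member, not full RFC 7946
validation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names and the exception type are assumptions.
final class ParseGeoJsonSketch {
    // Geometry type names defined by RFC 7946.
    private static final Set<String> GEOMETRY_TYPES = new HashSet<>(Arrays.asList(
            "Point", "MultiPoint", "LineString", "MultiLineString",
            "Polygon", "MultiPolygon", "GeometryCollection"));

    /** Stand-in for the Geometry datatype; the real one would hold WKB. */
    static final class Geometry {
        final String type;
        final List<?> body;

        Geometry(String type, List<?> body) {
            this.type = type;
            this.body = body;
        }
    }

    /** Called explicitly by the user on the attribute that holds GeoJSON. */
    static Geometry parseGeoJson(Map<String, Object> obj) {
        Object type = obj.get("type");
        if (!(type instanceof String) || !GEOMETRY_TYPES.contains(type)) {
            throw new IllegalArgumentException("not a GeoJSON geometry: type=" + type);
        }
        // GeometryCollection carries "geometries"; all other types carry "coordinates".
        String member = type.equals("GeometryCollection") ? "geometries" : "coordinates";
        Object body = obj.get(member);
        if (!(body instanceof List)) {
            throw new IllegalArgumentException("missing array member '" + member + "'");
        }
        return new Geometry((String) type, (List<?>) body);
    }

    public static void main(String[] args) {
        Map<String, Object> point = new HashMap<>();
        point.put("type", "Point");
        point.put("coordinates", Arrays.asList(102.0, 0.5));
        System.out.println(parseGeoJson(point).type);

        try {
            // An ordinary JSON object that is not GeoJSON is rejected.
            parseGeoJson(Collections.<String, Object>singletonMap("name", "x"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the function is only invoked where the user asks for it, ordinary JSON
objects never pay for the GeoJSON check.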

The advantage of this *initial* approach is that it keeps the current JSON
parser intact. This will allow us to provide the geometry support without
worrying about the JSON parser.
Once this part is stable, we should add the feature of automatically
detecting a GeoJSON object by mapping it to a geometry attribute in a
datatype. I think this is more convenient and can also be more efficient, as
it can deal directly with the raw JSON input.



-- 
Best regards
Ahmed Eldawy

Re: Support GeoJSON in AsterixDB

Posted by Riyafa Abdul Hameed <ri...@apache.org>.
Hi,

The plan is to implement a function, *parse_geojson()*, which would take
JSON objects as input. My concern is that when the input (i.e. JSON objects)
is parsed within AsterixDB, it would be represented as *Records*. These
*Records* might then have to be converted to WKB or to the OGCGeometry type
of the Esri API. Is this the way to follow in this case?

Sorry about the ignorance.

Thank you.
Yours sincerely,
Riyafa

Re: Support GeoJSON in AsterixDB

Posted by Mike Carey <dt...@gmail.com>.
Agreed.  I also think we'll want to avoid an architecture where 
extending the system with a new JSON-based standard might require 
modifying its core components - the function approach seems more modular 
and cleaner.


Re: Support GeoJSON in AsterixDB

Posted by Till Westmann <ti...@apache.org>.
Hi,

I’m sorry for the late comment on this. I think that we should not directly
parse GeoJSON into WKB initially. As GeoJSON is valid JSON, we’d need to
a) determine whether it’s GeoJSON every time we parse some JSON and
b) do this independently of the user’s intention (maybe it’s just some JSON
   that gets returned and never stored or processed).
So I think that we should have a user action (e.g. a constructor function)
that clearly documents the user’s intention.

Cheers,
Till
