Posted to users@nifi.apache.org by Jairo Henao <ja...@gmail.com> on 2020/02/17 20:32:03 UTC

Split JSON using an expression to define the PATH

Hi all,

Is there any way to split a huge JSON document using the value of an
attribute as the JsonPath?

The SplitJson processor does not support this; the path must be hard-coded.


-- 
Jairo Henao

Re: Split JSON using an expression to define the PATH

Posted by Jairo Henao <ja...@gmail.com>.
Thanks Matt,

What you say helps me understand why EvaluateJsonPath doesn't support
expressions either.

I will try what you suggest. I appreciate your ideas.

On Tue, Feb 18, 2020 at 4:04 PM Matt Burgess <ma...@apache.org> wrote:

> Jairo,
>
> IIRC the reason we don't support Expression Language (EL) for the
> JSONPath expression is because the two DSLs use the same characters in
> different syntax, such as $. To support both, I believe the user would
> have to escape the JSONPath $ characters so the NiFi Expression
> Language lexer doesn't think it's a NiFi expression. We could try what
> some other languages do which is to catch those kinds of errors and
> proceed under the assumption that the character must be part of a
> JSONPath rather than an EL expression, but in an error condition how
> would it know that it was a JSONPath error vs a NiFi EL error? I
> believe the decision was made to keep things simple for the user and
> so EL is not currently supported for that field. Happy to continue the
> discussion though, and I welcome all opinions.
>
> In the meantime you may find (as Pierre suggested) that ForkRecord or
> SplitRecord would work, as you can use a RecordPath expression rather
> than a JSONPath expression. If the incoming JSON has a schema
> (inferred or explicit) and is a top-level array of JSON objects (or
> one JSON after another), then SplitRecord doesn't need a JSONPath as
> the JsonTreeReader will take care of reading in the individual
> records. If you need a JSONPath because you need to split on an array
> "further down" in the input, you can try ForkRecord with a "Mode" set
> to "Extract" and likely "Include Parent Fields" set to "false".
> However as Simon mentioned, any transformation over the entire JSON
> (whether it be to fork a nested array or JoltTransformJson) will
> likely have to read the entire file into memory.
>
> Regards,
> Matt
>
>
> On Tue, Feb 18, 2020 at 3:32 PM Jairo Henao <ja...@gmail.com> wrote:
> >
> > Hi,
> > Sorry, I should have given a little more details.
> >
> > My requirement is that I have a PG (Processor Group) that I defined as a
> > template, the PG contains several processors and one of them is a
> > SplitJSON, I want to receive the JsonPath to be applied in a flowfile
> > attribute.
> >
> > Apparently then, I must transform my JSON before entering the PG to
> > always find the same JsonPath.
> >
> > Do you consider it useful to make the change request in the processor so
> > that in the next versions it has that functionality?
> >
> > On Tue, Feb 18, 2020 at 2:02 PM Pierre Villard <pierre.villard.fr@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> We would need a bit more details about what you try to achieve (an
> >> example maybe?) but the record processors (SplitRecord, ForkRecord, etc)
> >> might be useful.
> >>
> >> Thanks,
> >> Pierre
> >>
> >> On Mon, Feb 17, 2020 at 11:55 PM Simon Bence <si...@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> Without knowing the actual use case: what if you would run the JSON
> >>> through JoltTransformJSON and convert it into a format would work well
> >>> with the SplitJson? It might help. (On the side note: as you mention
> >>> “huge” JSON, it might be a resource consuming operation)
> >>>
> >>> Regards,
> >>> Bence
> >>>
> >>> On 2020. Feb 17., at 21:32, Jairo Henao <ja...@gmail.com> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> Is there any way to be able to "split" a huge JSON, using as JsonPATH
> >>> the value of an attribute?
> >>>
> >>> SplitJson processor does not support this, it must be "hard-coded"
> >>>
> >>>
> >>> --
> >>> Jairo Henao
> >>>
> >>>
> >
> >
> > --
> > Jairo Henao
> >
>


-- 
Regards

Jairo Henao

*Chat Skype: jairo.henao.05*

Re: Split JSON using an expression to define the PATH

Posted by Matt Burgess <ma...@apache.org>.
Jairo,

IIRC the reason we don't support Expression Language (EL) for the
JSONPath expression is because the two DSLs use the same characters in
different syntax, such as $. To support both, I believe the user would
have to escape the JSONPath $ characters so the NiFi Expression
Language lexer doesn't think it's a NiFi expression. We could try what
some other languages do which is to catch those kinds of errors and
proceed under the assumption that the character must be part of a
JSONPath rather than an EL expression, but in an error condition how
would it know that it was a JSONPath error vs a NiFi EL error? I
believe the decision was made to keep things simple for the user and
so EL is not currently supported for that field. Happy to continue the
discussion though, and I welcome all opinions.
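
To make the escaping concern above concrete, here is a toy sketch in
Python. It is not NiFi's Expression Language lexer: the resolve function,
the "$$" escape rule and the strict rejection of a bare "$" are all
assumptions chosen only to show what a user would have to type if one
field had to carry both syntaxes.

```python
import re

# Hypothetical token rule: "$$" is an escaped dollar, "${name}" is an
# attribute reference, and any other "$" is treated as a syntax error.
_TOKEN = re.compile(r"\$\$|\$\{([^}]+)\}|\$")


def resolve(value: str, attributes: dict[str, str]) -> str:
    """Resolve attribute references in a property value (toy model only)."""

    def substitute(match: re.Match) -> str:
        if match.group(0) == "$$":
            return "$"  # escaped dollar becomes a literal one
        if match.group(1) is not None:
            return attributes[match.group(1)]  # ${name} -> attribute value
        raise ValueError(f"unescaped '$' at offset {match.start()}")

    return _TOKEN.sub(substitute, value)


# A hard-coded JSONPath would need its leading "$" escaped.
escaped = resolve("$$.store.book[*]", {})

# The path taken from a flowfile attribute (attribute name is made up).
from_attribute = resolve("${split.path}", {"split.path": "$.store.book[*]"})
```

Under this strict rule, a plain "$.store.book[*]" raises an error, and the
error message cannot say whether the user meant a JSONPath or a malformed
expression, which is the ambiguity described above.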

In the meantime you may find (as Pierre suggested) that ForkRecord or
SplitRecord would work, as you can use a RecordPath expression rather
than a JSONPath expression. If the incoming JSON has a schema
(inferred or explicit) and is a top-level array of JSON objects (or
one JSON after another), then SplitRecord doesn't need a JSONPath as
the JsonTreeReader will take care of reading in the individual
records. If you need a JSONPath because you need to split on an array
"further down" in the input, you can try ForkRecord with a "Mode" set
to "Extract" and likely "Include Parent Fields" set to "false".
However, as Simon mentioned, any transformation over the entire JSON
(whether forking a nested array or running JoltTransformJSON) will
likely have to read the entire file into memory.
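
As a rough illustration of what splitting on an array "further down" means
when the path arrives in an attribute, here is a small Python sketch. It
does not use NiFi or RecordPath: the fork_nested_array function, the
slash-separated path format and the "split.path" attribute name are all
hypothetical, and only plain object keys are supported.

```python
import json


def fork_nested_array(document: str, path: str) -> list[str]:
    """Return each element of the array found at `path` as its own JSON string.

    `path` is a simple slash-separated list of object keys, for example
    "/order/items". Parent fields are not copied into the results.
    """
    node = json.loads(document)  # loads the whole document into memory
    for key in path.strip("/").split("/"):
        node = node[key]
    if not isinstance(node, list):
        raise ValueError(f"{path!r} does not point at an array")
    return [json.dumps(element) for element in node]


# Stand-ins for a flowfile's content and attributes.
flowfile_content = '{"order": {"id": 7, "items": [{"sku": "a"}, {"sku": "b"}]}}'
flowfile_attributes = {"split.path": "/order/items"}

splits = fork_nested_array(flowfile_content, flowfile_attributes["split.path"])
```

Note that json.loads parses the complete input before anything is split,
which mirrors the memory caveat above for very large files.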

Regards,
Matt



Re: Split JSON using an expression to define the PATH

Posted by Jairo Henao <ja...@gmail.com>.
Hi,
Sorry, I should have given a few more details.

My requirement is this: I have a PG (Process Group) that I defined as a
template. The PG contains several processors, one of which is SplitJson,
and I want it to receive the JsonPath to apply in a flowfile attribute.

Apparently, then, I must transform my JSON before it enters the PG so that
the same JsonPath always applies.

Would it be useful to file a change request so that a future version of
the processor has this functionality?


-- 
Jairo Henao

Re: Split JSON using an expression to define the PATH

Posted by Pierre Villard <pi...@gmail.com>.
Hi,

We would need a bit more detail about what you are trying to achieve (an
example, maybe?), but the record processors (SplitRecord, ForkRecord, etc.)
might be useful.

Thanks,
Pierre


Re: Split JSON using an expression to define the PATH

Posted by Simon Bence <si...@gmail.com>.
Hi,

Without knowing the actual use case: what if you ran the JSON through JoltTransformJSON and converted it into a format that works well with SplitJson? It might help. (As a side note: since you mention “huge” JSON, it might be a resource-consuming operation.)

Regards,
Bence
