Posted to users@daffodil.apache.org by Attila Horvath <at...@gmail.com> on 2022/03/08 12:20:55 UTC
help processing missing optional data
ALCON
I'm having difficulty parsing an input ASCII file containing optional
records as follows:
Required line 1... fixed length <CR><LF>
Required line 2... fixed length <CR><LF>
Optional line 3... fixed length <CR><LF>
Optional comment... 1 of n <CR><LF>
Optional comment... 2 of n <CR><LF>
Optional comment... n of n <CR><LF>
Required line 4... fixed length <CR><LF>
:::
I've tried various <xs:choice ... /> and lookahead approaches
unsuccessfully and admittedly getting frustrated. <8(
For example, I'm able to parse 'Optional line 3' *when it exists* BUT I
can't construct a DFDL schema that processes the file successfully when
line 3 is not present.
Can someone pls point me to an example/link illustrating the best way to
process this if/when optional lines are NOT present.
Thx in advance
Attila
Re: help processing missing optional data
Posted by Steve Lawrence <sl...@apache.org>.
There are a few possible approaches, and which one fits depends on your
data and on how you tell a required line from an optional one.
In your data, how do you know whether an optional line is missing or not?
If you could post an example of actual data, we could probably suggest
a more specific approach.
That said, a general common approach is something like this:
<xs:element name="RequiredThing" ... />
<xs:element name="OptionalThing" minOccurs="0" ...>
<xs:annotation>
<xs:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:discriminator test="{ ... }"/>
</xs:appinfo>
</xs:annotation>
</xs:element>
This approach sets minOccurs of the optional thing to zero so that it
may or may not exist. We also add a discriminator to test if this is
actually the optional thing or not. With this approach, Daffodil will
speculatively try to parse the optional thing, and if it fails to parse
or the discriminator evaluates to false and determines this isn't
actually the optional thing, Daffodil will backtrack and try to parse
the next thing. The actual test depends on how you determine if a line
is an optional line or not.
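As a rough sketch of what that could look like for line-oriented data, assuming each line ends in CRLF and that optional lines can be recognized by a leading tag. The element names and the 'OPT3' and 'CMT' tags are hypothetical, since the actual data hasn't been posted, and the remaining DFDL properties (encoding and so on) are assumed to come from a default dfdl:format defined elsewhere in the schema:

```xml
<xs:sequence>
  <!-- Required line: always present. -->
  <xs:element name="Required2" type="xs:string"
              dfdl:lengthKind="delimited" dfdl:terminator="%CR;%LF;"/>

  <!-- Optional line: kept only if it starts with the hypothetical tag 'OPT3'. -->
  <xs:element name="Optional3" type="xs:string" minOccurs="0"
              dfdl:occursCountKind="implicit"
              dfdl:lengthKind="delimited" dfdl:terminator="%CR;%LF;">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:discriminator test="{ fn:starts-with(., 'OPT3') }"/>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>

  <!-- Zero or more comment lines: the array ends at the first line
       that does not start with the hypothetical tag 'CMT'. -->
  <xs:element name="Comment" type="xs:string"
              minOccurs="0" maxOccurs="unbounded"
              dfdl:occursCountKind="implicit"
              dfdl:lengthKind="delimited" dfdl:terminator="%CR;%LF;">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:discriminator test="{ fn:starts-with(., 'CMT') }"/>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>

  <!-- Required line: parsed from wherever the optional lines stopped. -->
  <xs:element name="Required4" type="xs:string"
              dfdl:lengthKind="delimited" dfdl:terminator="%CR;%LF;"/>
</xs:sequence>
```

If your lines are truly fixed length, the lengthKind and terminator properties would change, but the minOccurs plus discriminator structure stays the same.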
In the snippet you just posted, dfdl:lengthKind="pattern" behaves in a
slightly unintuitive way: a non-match is not an error, it is a successful
parse of zero length. Usually users want a non-match to be an error, so
you would need to do something like this to get that behavior:
<xs:element name="test" type="xs:string" minOccurs="0"
dfdl:lengthKind="pattern"
dfdl:lengthPattern="...">
<xs:annotation>
<xs:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:discriminator test="{ . != '' }"/>
</xs:appinfo>
</xs:annotation>
</xs:element>
Like above, the test field is optional because minOccurs=0. And we have a
discriminator to determine if we actually parsed the test field or if it
was something else. In this case, the test checks that the value of the
test element is not the empty string (i.e., that the pattern matched
something). If the value is the empty string, the discriminator fails,
Daffodil knows the test element doesn't actually exist, and it backtracks
and tries parsing the same data with whatever is next in the schema.
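Plugging in the pattern from your snippet, a minimal sketch might look like the following. The "nextLine" element is hypothetical and only stands in for whatever actually follows in your schema, and its delimited/CRLF properties are an assumption about your data:

```xml
<xs:sequence>
  <!-- Everything up to, but not including, the marker text. -->
  <xs:element name="test" type="xs:string" minOccurs="0"
              dfdl:lengthKind="pattern"
              dfdl:lengthPattern=".*?(?=(THIS IS A CONTINUATION MESSAGE, PART))">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <!-- A non-match yields a zero-length string, so this test is
             false and "test" is treated as absent. -->
        <dfdl:discriminator test="{ . != '' }"/>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>

  <!-- Hypothetical next element: parsed from the same starting position
       whenever "test" turns out not to exist. -->
  <xs:element name="nextLine" type="xs:string"
              dfdl:lengthKind="delimited" dfdl:terminator="%CR;%LF;"/>
</xs:sequence>
```

One thing to watch with this test: if the marker text sits at the very start of the data, the pattern matches zero characters, and that case is also treated as "test" being absent.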
- Steve
On 3/8/22 7:37 AM, Attila Horvath wrote:
> PS:...
>
> Case in point:...
>
> <xs:element name="test" type="xs:string" dfdl:lengthKind="pattern"
> dfdl:lengthPattern=".*?(?=(THIS IS A CONTINUATION MESSAGE, PART))"/>
> <xs:choice>
> <xs:choice dfdl:choiceDispatchKey="{./test}">
>
> When 'Optional line 3' is NOT there, I get an error on 2nd '<xs:choice.../>'
> because it resolves to an empty string.
>
> <8(
Re: help processing missing optional data
Posted by Attila Horvath <at...@gmail.com>.
PS:...
Case in point:...
<xs:element name="test" type="xs:string" dfdl:lengthKind="pattern"
dfdl:lengthPattern=".*?(?=(THIS IS A CONTINUATION MESSAGE, PART))"/>
<xs:choice>
<xs:choice dfdl:choiceDispatchKey="{./test}">
When 'Optional line 3' is NOT there, I get an error on 2nd '<xs:choice.../>'
because it resolves to an empty string.
<8(