Posted to users@daffodil.apache.org by John Dziurlaj <jo...@turnout.rocks> on 2022/10/17 19:52:59 UTC
Variable tokens in DFDL
I am attempting to create a DFDL schema for an HL7-derived language. HL7 has a feature where all the delimiters in a message are declared by the first few bytes of the input itself. For example, MSH|^~\& specifies the delimiters in this order: field separator |, component separator ^, repetition separator ~, escape character \, and subcomponent separator &.
Can a DFDL schema be produced that does not hardcode these tokens?
John Dziurłaj
Re: Variable tokens in DFDL
Posted by Steve Lawrence <sl...@apache.org>.
Yep. The key is to use DFDL variables.
You first need to define the variable using dfdl:defineVariable in the
schema annotation, providing the name, type, and optional default value,
e.g.:
<schema ...>
  <annotation>
    <appinfo source="http://www.ogf.org/dfdl/">
      ...
      <dfdl:defineVariable name="FieldSep" type="xs:string"
                           defaultValue="|"/>
      ...
    </appinfo>
  </annotation>
  ...
Then on the element that parses that delimiter you use dfdl:setVariable
to set it based on the parsed value, e.g.:
<element name="FieldSeparator" type="xs:string"
         dfdl:lengthKind="explicit" dfdl:length="1">
  <annotation>
    <appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:setVariable ref="FieldSep" value="{.}" />
    </appinfo>
  </annotation>
</element>
And finally, specify an expression that evaluates to that variable on
the appropriate sequence, e.g.:
<sequence dfdl:separator="{ $FieldSep }">
  ...
</sequence>
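Applied to the HL7 header from the question, the same steps might look
like the untested sketch below. Several things here are assumptions
made up for illustration: the CompSep variable, the ComponentSeparator
element, the "MSH" initiator, a schema with no target namespace, and a
default format that measures length in characters. Only two of the five
delimiters are shown; the rest would follow the same pattern.
<!-- in the schema-level appinfo: one variable per delimiter -->
<dfdl:defineVariable name="FieldSep" type="xs:string" defaultValue="|"/>
<!-- CompSep is a hypothetical name, not from the original reply -->
<dfdl:defineVariable name="CompSep" type="xs:string" defaultValue="^"/>
<!-- header: the characters right after "MSH" declare the delimiters -->
<sequence dfdl:initiator="MSH">
  <element name="FieldSeparator" type="xs:string"
           dfdl:lengthKind="explicit" dfdl:length="1">
    <annotation>
      <appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:setVariable ref="FieldSep" value="{.}" />
      </appinfo>
    </annotation>
  </element>
  <!-- hypothetical element for the second delimiter character -->
  <element name="ComponentSeparator" type="xs:string"
           dfdl:lengthKind="explicit" dfdl:length="1">
    <annotation>
      <appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:setVariable ref="CompSep" value="{.}" />
      </appinfo>
    </annotation>
  </element>
</sequence>
Each variable needs to be set before any expression reads it, so the
header elements have to be parsed ahead of the sequences that use
{ $FieldSep } or { $CompSep } as their separator.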
For a real world example, the EDIFACT schema does something very
similar. Here is the Daffodil development branch for that schema:
https://github.com/DFDLSchemas/EDIFACT/tree/daffodil-dev
Everything I mentioned is in the src/main/resource/EDIFACT-Common/
files. Though it is a bit more complicated since it defines new formats
that use the variable and then the sequence uses dfdl:ref to refer to
those formats (instead of directly setting the separator to the variable
like above), and it has a slightly more complicated setVariable
expression to conditionally set the variables. But the core idea is the
same.
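The indirection through a named format could be sketched roughly as
follows. This is not copied from the EDIFACT schema: the format name
FieldSeparated is hypothetical, and the unprefixed dfdl:ref assumes a
schema with no target namespace. The conditional setVariable
expression is not shown.
<!-- in the schema-level appinfo: a named format that reads the variable -->
<dfdl:defineFormat name="FieldSeparated">
  <dfdl:format separator="{ $FieldSep }" separatorPosition="infix" />
</dfdl:defineFormat>
<!-- the sequence picks up the separator through the named format -->
<sequence dfdl:ref="FieldSeparated">
  ...
</sequence>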