Posted to users@daffodil.apache.org by Steve Lawrence <sl...@apache.org> on 2020/07/07 12:15:09 UTC

Re: Here's how XSLT programs can use DFDL/Daffodil

Great! Thanks Mike!

The only downside to this that I see is that it requires Saxon-EE to use
reflexive extension functions, which takes a little bit of effort to get.

I'm wondering if it would be worth updating this to use Saxon's newer
integrated extension functions [1] rather than the reflexive extension
functions? It looks like it requires a bit more code to set up the
extension function and register it with Saxon, but it is supported in
Saxon-HE, which is much easier to get via Maven. And I'd like to
integrate this into a Daffodil regression suite, which would be easier
if all dependencies were in Maven.

I can take a look at making this update if this seems reasonable.

- Steve

[1]
http://www.saxonica.com/documentation/#!extensibility/integratedfunctions


On 7/6/20 7:04 PM, Beckerle, Mike wrote:
> I captured Roger's example from this email and put it on GitHub here with a test.
> 
> https://github.com/OpenDFDL/examples/tree/master/xslt-csv
> 
> This isn't part of Daffodil, but it's an interesting example combining XSLT with 
> Daffodil.
> 
> --------------------------------------------------------------------------------
> *From:* Roger L Costello <co...@mitre.org>
> *Sent:* Tuesday, June 2, 2020 12:10 PM
> *To:* users@daffodil.apache.org <us...@daffodil.apache.org>
> *Subject:* Here's how XSLT programs can use DFDL/Daffodil
> 
> Hi Folks,
> 
> Below is an XSLT program that processes XML-formatted CSV. The XSLT program 
> calls an external Java program, passing it the name of a DFDL file 
> (csv.dfdl.xsd) and the name of a CSV file (csv.txt). The Java program calls 
> Daffodil which produces XML-formatted CSV. The Java program returns the XML as a 
> string. The XSLT program uses the parse-xml() function to convert the string to 
> XML.
> 
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>                 xmlns:xs="http://www.w3.org/2001/XMLSchema"
>                 xmlns:dfdl="java:runDaffodil"
>                 version="3.0">
> 
>   <xsl:variable name="dfdl" select="'csv.dfdl.xsd'"/>
>   <xsl:variable name="input" select="'csv.txt'"/>
> 
>   <xsl:template match="/">
>     <!-- Convert the CSV text file to XML using DFDL -->
>     <xsl:variable name="csv-string" select="dfdl:dfdlParse($dfdl, $input)" as="xs:string"/>
>     <xsl:variable name="csv-xml" select="parse-xml($csv-string)" as="document-node()"/>
>     <!-- Process the XML-formatted CSV here -->
>     <xsl:variable name="numRecords" select="count($csv-xml//record)" as="xs:integer"/>
>     <!-- Output the number of records in the CSV file -->
>     <xsl:sequence select="$numRecords"/>
>     <!-- Output the model (field 3) and year (field 1) of each Chevy auto -->
>     <xsl:for-each select="$csv-xml//record[field[2] eq 'Chevy']">
>       <xsl:sequence select="(field[3]/data(), ' ', field[1]/data())"/>
>     </xsl:for-each>
>   </xsl:template>
> </xsl:stylesheet>
> 
> The XSLT program uses XSLT version 3.0. You can use any version except 1.0 (the 
> parse-xml() function is not present in XSLT 1.0). I used Saxon as the XSLT 
> processor.
> 
> The Java program calls Daffodil which parses the CSV file using the DFDL schema. 
> Here is the Java program:
> 
> import java.io.IOException;
> import java.net.URISyntaxException;
> import java.net.URL;
> 
> import org.jdom2.Document;
> import org.jdom2.output.XMLOutputter;
> 
> import org.apache.daffodil.japi.Compiler;
> import org.apache.daffodil.japi.Daffodil;
> import org.apache.daffodil.japi.DataProcessor;
> import org.apache.daffodil.japi.Diagnostic;
> import org.apache.daffodil.japi.ParseResult;
> import org.apache.daffodil.japi.ProcessorFactory;
> import org.apache.daffodil.japi.infoset.JDOMInfosetOutputter;
> import org.apache.daffodil.japi.io.InputSourceDataInputStream;
> 
> public class runDaffodil {
> 
>     public static String dfdlParse(String dfdl, String input)
>             throws IOException, URISyntaxException {
> 
>         URL dfdlURL = runDaffodil.class.getResource(dfdl);
>         URL inputURL = runDaffodil.class.getResource(input);
> 
>         //
>         // First, compile the DFDL Schema
>         //
>         Compiler c = Daffodil.compiler();
>         ProcessorFactory pf = c.compileSource(dfdlURL.toURI());
>         DataProcessor dp = pf.onPath("/");
> 
>         //
>         // Parse - parse data to XML
>         //
>         java.io.InputStream is = inputURL.openStream();
>         InputSourceDataInputStream dis = new InputSourceDataInputStream(is);
> 
>         //
>         // Setup JDOM outputter
>         //
>         JDOMInfosetOutputter outputter = new JDOMInfosetOutputter();
> 
>         //
>         // Do the parse
>         //
>         ParseResult res = dp.parse(dis, outputter);
> 
>         //
>         // Return the XML as a string
>         //
>         Document doc = outputter.getResult();
>         return new XMLOutputter().outputString(doc);
>     }
> }
> 


Re: Here's how XSLT programs can use DFDL/Daffodil

Posted by Steve Lawrence <sl...@apache.org>.
Note that the repo isn't closed. It's a public github repo at:

https://github.com/OpenDFDL/examples

But access is restricted and not controlled by Daffodil PPMC or anything
ASF related, so it is private in that sense.

The main thing we use OpenDFDL for is as a place to store examples of how
to use the Daffodil API that don't really belong in the Daffodil codebase.
I sort of view this as what another person/project might do if they
wanted to use the Daffodil API and publish it. This one just happens to
be controlled by PPMC members of Daffodil. Using Daffodil doesn't
require any of this stuff at all.

Though, I'm not against merging some stuff from OpenDFDL into a repo. Do
you know how other projects handle this kind of thing? Does it make
sense to have a "daffodil-examples" repository, or just directly include
it in our code base?

The other thing we've been using OpenDFDL for is things that have
incompatible licenses.

For example, we have a tool that uses the IBM DFDL implementation to run
tests in the Daffodil repository, as a way to test cross compatibility.
The IBM tool requires agreeing to IBM terms and downloading jars from
IBM's website. The terms are pretty restrictive and are certainly
category X, so it doesn't belong in Daffodil.

Another example is this XSLT example. That depends on Saxon-EE, which is
also category X.

I guess we could include these in an ASF repo, and just add
instructions on how to get the dependencies and build everything. But it
feels safer to just not include that stuff at all. None of these are
needed to use Daffodil, so it feels like a separate thing to me.

On 7/15/20 12:39 PM, Dave Fisher wrote:
> Hi -
> 
> I’m somewhat concerned about this private closed repository being used for daffodil examples. It really would be more helpful if it was more publicly accessible and perhaps part of the project’s repositories.
> 
> Please explain.
> 
> Regards,
> Dave
> 
>> On Jul 7, 2020, at 6:41 AM, Steve Lawrence <sl...@apache.org> wrote:
>>
>> All makes sense. I've created a handful of bugs in the OpenDFDL repo to
>> track these issues.
>>
>> On 7/7/20 8:39 AM, Beckerle, Mike wrote:
>>> I suggest adding an Issue/ticket to the OpenDFDL github for this.
>>>
>>> There are a few enhancements to this thing that would be helpful.
>>>
>>> As is, it only requires Saxon-PE, which is affordable for most usages.
>>>
>>> Another feature would be a cache of the compiled DFDL schema so that the penalty
>>> to compile it isn't being paid every time you run the XSLT. (Something many
>>> usages of Daffodil need - probably should be a feature supplied in a Daffodil
>>> library so that it's not being reinvented.) Daffodil compilation times are
>>> acceptable now for long-running processes, but not for quick start/stop things,
>>> as XSLT runs commonly are.
>>>
>>> We should look into whether Java's built-in XSLT could be used to do this also.
>>>
>>> Roger suggests you just need XSLT 2.0, and maybe Java's built-in XSLT is 2.0? I
>>> know it is not XSLT 3.0. But I wasn't able to quickly determine if it is stuck
>>> at version 1.0 or has been updated to 2.0.
>>>
>>> I am unclear if or how Java's XSLT supports extension functions, however. But I
>>> suspect it does. The trick is the parse-xml() function or some other way to construct
>>> the XML again that the XSLT operates on.
> 
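
The schema cache Mike mentions above could look roughly like the following. This is only a minimal sketch under stated assumptions: the class name CompiledSchemaCache and its compile-function parameter are hypothetical and not part of Daffodil, and the string-returning lambda in main() merely stands in for the real compile step (Daffodil.compiler(), compileSource(), onPath("/")) from Roger's program. It relies on ConcurrentHashMap.computeIfAbsent so that each schema name is compiled at most once per JVM.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/**
 * Hypothetical in-memory cache for compiled schemas, keyed by schema name.
 * P is whatever the compile step produces (a DataProcessor in the Daffodil case).
 */
final class CompiledSchemaCache<P> {
    private final Map<String, P> cache = new ConcurrentHashMap<>();
    private final Function<String, P> compileFn;

    CompiledSchemaCache(Function<String, P> compileFn) {
        this.compileFn = compileFn;
    }

    /** Returns the cached result, running the compile function on first request only. */
    P get(String schemaName) {
        return cache.computeIfAbsent(schemaName, compileFn);
    }

    public static void main(String[] args) {
        AtomicInteger compiles = new AtomicInteger();
        // Assumption: this lambda is a stand-in for the expensive DFDL schema compilation.
        CompiledSchemaCache<String> schemas = new CompiledSchemaCache<>(name -> {
            compiles.incrementAndGet();
            return "compiled:" + name;
        });
        schemas.get("csv.dfdl.xsd");
        schemas.get("csv.dfdl.xsd"); // second request is served from the cache
        System.out.println("compile count: " + compiles.get());
    }
}
```

An in-memory map like this only helps when several dfdlParse calls happen inside one JVM. A cache that survives separate start/stop runs would have to persist the compiled processor somewhere, which this sketch does not attempt.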


Re: Here's how XSLT programs can use DFDL/Daffodil

Posted by "Beckerle, Mike" <mb...@tresys.com>.
I'm certainly open to suggestions on how to do this better.

OpenDFDL is not private for access. It's just privately administered by me. It's public to access, and the materials are licensed under the Apache License 2.0.

OpenDFDL is about DFDL generally, not about Daffodil per se. Hence, the site could host tools/examples related not only to Daffodil, but also to other DFDL implementations (there are 2 now - IBM DFDL, and the ESA DFDL4Space implementation).

Some of the things on OpenDFDL are not possible to have in an ASF project repo, because they link with and depend upon proprietary software. For example, we have a cross-test rig that allows us to run Daffodil TDML tests (that are part of Daffodil) against a different implementation of DFDL, to test portability. This rig requires the IBM DFDL implementation, which is closed source. It does not incorporate or embed it, but it does require it in order to run, and won't compile without the IBM DFDL jar files present.

Other things are little ad-hoc examples, like hello-world things that exercise Daffodil APIs, and the more recent XSLT integration (which uses Saxon and requires the paid variety thereof).








Re: Here's how XSLT programs can use DFDL/Daffodil

Posted by Dave Fisher <wa...@apache.org>.
Hi -

I’m somewhat concerned about this private closed repository being used for daffodil examples. It really would be more helpful if it was more publicly accessible and perhaps part of the project’s repositories.

Please explain.

Regards,
Dave



Re: Here's how XSLT programs can use DFDL/Daffodil

Posted by Steve Lawrence <sl...@apache.org>.
All makes sense. I've created a handful of bugs in the OpenDFDL repo to
track these issues.
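
On Mike's question of which XSLT version Java's built-in processor implements, one way to check is to ask the processor itself via the standard system-property('xsl:version') function. The sketch below uses only the JAXP classes in javax.xml.transform; the class name XsltVersionProbe is made up for illustration. The JDK's bundled processor is derived from Xalan XSLTC, which implements XSLT 1.0, so on a stock JDK the probe would be expected to report version 1, but the answer depends on which TransformerFactory is found on the classpath (for example, Saxon would report something else).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper that asks the default JAXP processor which XSLT version it implements. */
final class XsltVersionProbe {
    // A one-template stylesheet that prints the processor's reported XSLT version as text.
    private static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select=\"system-property('xsl:version')\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Runs the probe stylesheet against a dummy document and returns the reported version. */
    static String supportedVersion() {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader("<dummy/>")), new StreamResult(out));
            return out.toString().trim();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT version probe failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Factory: " + TransformerFactory.newInstance().getClass().getName());
        System.out.println("XSLT version: " + supportedVersion());
    }
}
```

If the probe reports 1.0, the parse-xml() call in Roger's stylesheet would not be available, so the string returned from Java would have to be turned back into XML some other way.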

On 7/7/20 8:39 AM, Beckerle, Mike wrote:
> I suggest adding an Issue/ticket to the OpenDFDL github for this.
> 
> There are a few enhancements to this thing that would be helpful.
> 
> As is, it only requires Saxon-PE, which is affordable for most usages.
> 
> Another feature would be a cache of the compiled DFDL schema so that the penalty 
> to compile it isn't being paid every time you run the XSLT.  (Something many 
> usages of Daffodil need - probably should be a feature supplied in a Daffodil 
> library so that it's not being reinvented.) Daffodil compilation times are 
> acceptable now for long-running processes, but not for quick  start/stop things 
> as XSLT commonly is.
> 
> We should look into whether Java's built in XSLT could be used to do this also.
> 
> Roger suggests you just need XSLT 2.0, and maybe Java's built in XSLT is 2.0 ? I 
> know it is not XSLT 3.0. But I wasn't able to quickly determine if it is stuck 
> at version 1.0 or has been updated to 2.0.
> 
> I am unclear if or how Java's XSLT supports extension functions however. But I 
> suspect it does. The trick is the parse-xml() function or other way to construct 
> the XML again that the XSLT operates on.


Re: Here's how XSLT programs can use DFDL/Daffodil

Posted by "Beckerle, Mike" <mb...@tresys.com>.
I suggest opening an issue on the OpenDFDL GitHub for this.

A few enhancements to this example would be helpful.

As is, it only requires Saxon-PE, which is affordable for most usages.

Another feature would be a cache of the compiled DFDL schema, so that the compilation penalty isn't paid every time you run the XSLT. (Many usages of Daffodil need this; it should probably be supplied by a Daffodil library so that it isn't reinvented each time.) Daffodil compilation times are acceptable now for long-running processes, but not for quick start/stop things, as XSLT commonly is.
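
As a rough illustration of the caching idea, here is a minimal sketch of an in-memory cache keyed by schema path. The class name CompiledSchemaCache and the "compiler" function are hypothetical, not part of Daffodil. The sketch assumes the compile step (for example, the compileSource/onPath calls in Roger's program) can be wrapped in a function from schema path to compiled processor.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper, not a Daffodil class: remembers the result of an
// expensive "compile" step so that it runs at most once per schema path.
final class CompiledSchemaCache<P> {
    private final Map<String, P> cache = new ConcurrentHashMap<>();
    private final Function<String, P> compiler;

    // 'compiler' is an assumed wrapper around the real schema compilation.
    CompiledSchemaCache(Function<String, P> compiler) {
        this.compiler = compiler;
    }

    // Returns the cached processor, compiling only on the first request.
    P get(String schemaPath) {
        return cache.computeIfAbsent(schemaPath, compiler);
    }
}
```

A cache like this only helps when the same JVM runs several transforms. For the quick start/stop case, the compiled form would have to be persisted between runs, which this sketch does not attempt.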

We should look into whether Java's built-in XSLT could be used to do this also.

Roger suggests you just need XSLT 2.0, and maybe Java's built-in XSLT is 2.0? I know it is not XSLT 3.0, but I wasn't able to quickly determine whether it is stuck at version 1.0 or has been updated to 2.0.
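
One way to check might be to ask the processor itself. Here is a minimal sketch, using only the standard javax.xml.transform API, that runs a tiny stylesheet reporting system-property('xsl:version'). The class name XsltVersionProbe is made up for illustration. Note that TransformerFactory.newInstance() returns whichever processor is configured, so if Saxon is on the classpath it may answer instead of the JDK's built-in one.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical probe: reports the XSLT version the default JAXP processor claims.
final class XsltVersionProbe {
    static String probe() {
        // A version="1.0" stylesheet, so any conforming processor can run it.
        String xsl =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\">"
            + "<xsl:value-of select=\"system-property('xsl:version')\"/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            // The input document is irrelevant; any well-formed XML will do.
            t.transform(new StreamSource(new StringReader("<x/>")), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT probe failed", e);
        }
    }
}
```

Printing XsltVersionProbe.probe() from a main method with no third-party XSLT jars on the classpath should show what the built-in processor reports.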

I am unclear on whether, or how, Java's XSLT supports extension functions, but I suspect it does. The trick is the parse-xml() function, or some other way to reconstruct the XML that the XSLT then operates on.




________________________________
From: Steve Lawrence <sl...@apache.org>
Sent: Tuesday, July 7, 2020 8:15 AM
To: users@daffodil.apache.org <us...@daffodil.apache.org>
Subject: Re: Here's how XSLT programs can use DFDL/Daffodil

Great! Thanks Mike!

The only downside to this that I see is that it requires Saxon-EE to use
reflexive extension functions, which takes a little bit of effort to get.

I'm wondering if it would be worth updating this to use Saxon's newer
integrated extension functions [1] rather than the reflexive extension
functions? It looks like it requires a bit more code to setup the
extension function and register with Saxon, but it is supported in
Saxon-HE which is much easier to get via maven. And I'd like to
integrate this into a Daffodil regression suite, which would be easier
if all dependencies were in maven.

I can take a look at making this update if this seems reasonable.

- Steve

[1]
http://www.saxonica.com/documentation/#!extensibility/integratedfunctions


On 7/6/20 7:04 PM, Beckerle, Mike wrote:
> I captured Roger's example from this email and put it on GitHub here with a test.
>
> https://github.com/OpenDFDL/examples/tree/master/xslt-csv
>
> This isn't part of Daffodil, but it's an interesting example combining XSLT with
> Daffodil.
>
> --------------------------------------------------------------------------------
> *From:* Roger L Costello <co...@mitre.org>
> *Sent:* Tuesday, June 2, 2020 12:10 PM
> *To:* users@daffodil.apache.org <us...@daffodil.apache.org>
> *Subject:* Here's how XSLT programs can use DFDL/Daffodil
>
> Hi Folks,
>
> Below is an XSLT program that processes XML-formatted CSV. The XSLT program
> calls an external Java program, passing it the name of a DFDL file
> (csv.dfdl.xsd) and the name of a CSV file (csv.txt). The Java program calls
> Daffodil which produces XML-formatted CSV. The Java program returns the XML as a
> string. The XSLT program uses the parse-xml() function to convert the string to
> XML.
>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>                 xmlns:xs="http://www.w3.org/2001/XMLSchema"
>                 xmlns:dfdl="java:runDaffodil"
>                 version="3.0">
>
>     <xsl:variable name="dfdl" select="'csv.dfdl.xsd'"/>
>     <xsl:variable name="input" select="'csv.txt'"/>
>
>     <xsl:template match="/">
>         <!-- Convert the CSV text file to XML using DFDL -->
>         <xsl:variable name="csv-string" select="dfdl:dfdlParse($dfdl, $input)" as="xs:string"/>
>         <xsl:variable name="csv-xml" select="parse-xml($csv-string)" as="document-node()"/>
>         <!-- Process the XML-formatted CSV here -->
>         <xsl:variable name="numRecords" select="count($csv-xml//record)" as="xs:integer"/>
>         <!-- Output the number of records in the CSV file -->
>         <xsl:sequence select="$numRecords"/>
>         <!-- Output the model (field 3) and year (field 1) of each Chevy auto -->
>         <xsl:for-each select="$csv-xml//record[field[2] eq 'Chevy']">
>             <xsl:sequence select="(field[3]/data(), ' ', field[1]/data())"/>
>         </xsl:for-each>
>     </xsl:template>
> </xsl:stylesheet>
>
> The XSLT program uses XSLT version 3.0. You can use any version except 1.0 (the
> parse-xml() function is not present in XSLT 1.0). I used Saxon as the XSLT
> processor.
>
> The Java program calls Daffodil which parses the CSV file using the DFDL schema.
> Here is the Java program:
>
> import java.io.IOException;
> import java.net.URISyntaxException;
> import java.net.URL;
>
> import org.jdom2.Document;
> import org.jdom2.output.XMLOutputter;
>
> import org.apache.daffodil.japi.Compiler;
> import org.apache.daffodil.japi.Daffodil;
> import org.apache.daffodil.japi.DataProcessor;
> import org.apache.daffodil.japi.Diagnostic;
> import org.apache.daffodil.japi.ParseResult;
> import org.apache.daffodil.japi.ProcessorFactory;
> import org.apache.daffodil.japi.infoset.JDOMInfosetOutputter;
> import org.apache.daffodil.japi.io.InputSourceDataInputStream;
>
> public class runDaffodil {
>
>     public static String dfdlParse(String dfdl, String input) throws IOException, URISyntaxException {
>
>         URL dfdlURL = runDaffodil.class.getResource(dfdl);
>         URL inputURL = runDaffodil.class.getResource(input);
>
>         //
>         // First, compile the DFDL Schema
>         //
>         Compiler c = Daffodil.compiler();
>         ProcessorFactory pf = c.compileSource(dfdlURL.toURI());
>         DataProcessor dp = pf.onPath("/");
>
>         //
>         // Parse - parse data to XML
>         //
>         java.io.InputStream is = inputURL.openStream();
>         InputSourceDataInputStream dis = new InputSourceDataInputStream(is);
>
>         //
>         // Setup JDOM outputter
>         //
>         JDOMInfosetOutputter outputter = new JDOMInfosetOutputter();
>
>         //
>         // Do the parse
>         //
>         ParseResult res = dp.parse(dis, outputter);
>
>         //
>         // Return the XML as a string
>         //
>         Document doc = outputter.getResult();
>         return new XMLOutputter().outputString(doc);
>     }
> }
>