Posted to commits@daffodil.apache.org by ji...@apache.org on 2023/02/23 20:38:25 UTC

[daffodil-site] branch main updated: Rename daffodil-runtime2

This is an automated email from the ASF dual-hosted git repository.

jinterrante pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/daffodil-site.git


The following commit(s) were added to refs/heads/main by this push:
     new 4c1bfda  Rename daffodil-runtime2
4c1bfda is described below

commit 4c1bfdaafb8f0a17942383886dadb1e7743dfa34
Author: John Interrante <in...@research.ge.com>
AuthorDate: Tue Feb 21 15:21:56 2023 -0800

    Rename daffodil-runtime2
    
    Update places mentioning runtime2 as part of renaming
    daffodil-runtime2 to daffodil-codegen-c.
    
    3.1.0.md, 3.2.0.md, 3.2.1.md, 3.3.0.md, 3.4.0.md: Say just
    "code generator" in some places. Update link to daffodilc-todos.
    
    cli.md: Rename runtime2 to codegen-c.
    
    daffodilc-todos.adoc: Rename from runtime2-todos.adoc.  Rename
    runtime2 to codegen-c.  Shorten arrays section since we have now
    implemented arrays with finite bounds and have only unbounded arrays
    left to do.
    
    DAFFODIL-2796
---
 site/_releases/3.1.0.md                            |  13 +--
 site/_releases/3.2.0.md                            |   6 +-
 site/_releases/3.2.1.md                            |   2 +-
 site/_releases/3.3.0.md                            |   2 +-
 site/_releases/3.4.0.md                            |   2 +-
 site/cli.md                                        |   2 +-
 .../{runtime2-todos.adoc => daffodilc-todos.adoc}  | 103 +++++++--------------
 7 files changed, 46 insertions(+), 84 deletions(-)

diff --git a/site/_releases/3.1.0.md b/site/_releases/3.1.0.md
index 31b78e0..b1ed50c 100644
--- a/site/_releases/3.1.0.md
+++ b/site/_releases/3.1.0.md
@@ -21,12 +21,13 @@ binary-dist:
 scala-version: 2.12
 ---
 
-#### Code Generation Backend (runtime2)
+#### Code Generation Backend
 
-A new code generation back-end, called `runtime2`, is added supporting generating
-C code with limited parse and unparse capabilities. See the [Command Line
-Interface](/cli) page for commands and options to generate code. Visit
-[Runtime2 ToDos](/dev/design-notes/runtime2-todos/) for a list of limitations
+A new code generation backend, called `daffodil-runtime2`, is added
+supporting generating C code with limited parse and unparse
+capabilities. See the [Command Line Interface](/cli) page for commands
+and options to generate code. Visit [DaffodilC
+ToDos](/dev/design-notes/daffodilc-todos/) for a list of limitations
 and planned future work.
 
 * {% jira 2202 %} Code Gen Framework
@@ -170,7 +171,7 @@ The following dependencies have been added or updated
 * Jansi 2.3.2 <small>(update)</small>
 * Scallop 4.0.2 <small>(update)</small>
 
-**Code Generator (runtime2)**
+**Code Generator**
 
 * OS-Lib 0.7.6 <small>(new)</small>
 
diff --git a/site/_releases/3.2.0.md b/site/_releases/3.2.0.md
index 3ed3065..3b46b70 100644
--- a/site/_releases/3.2.0.md
+++ b/site/_releases/3.2.0.md
@@ -71,9 +71,9 @@ Java 17 is now a supported JVM. Note that Use of Java 17 requires use of Scala l
 * {% jira 2579 %} java compile compatibility issue with Java 16/17
 * {% jira 2558 %} Update GitHub Actions Workflow to use Java 17
 
-#### Code Generation Backend (runtime2)
+#### Code Generation Backend
 
-The C code generation back-end, called `runtime2` has had these enhancements.
+The C code generation backend called `daffodil-runtime2` has had these enhancements.
 
 * {% jira 2566 %} Extend C code generator to support hexBinary elements
 * {% jira 2578 %} Raise minimum C level to ISO C11 with GNU extensions
@@ -179,7 +179,7 @@ The following dependencies have been added or updated
 * Log4j core 2.14.1 <small>(new)</small>
 * Scallop 4.1.0 <small>(update)</small>
 
-**Code Generator (runtime2)**
+**Code Generator**
 
 * OS-Lib 0.7.8 <small>(update)</small>
 
diff --git a/site/_releases/3.2.1.md b/site/_releases/3.2.1.md
index 327ee59..3138392 100644
--- a/site/_releases/3.2.1.md
+++ b/site/_releases/3.2.1.md
@@ -67,7 +67,7 @@ The following dependencies have been added or updated
 * Log4j api 2.17.0 <small>(update)</small>
 * JDOM2 2.0.6.1 <small>(update)</small>
 
-**Code Generator (runtime2)**
+**Code Generator**
 
 * OS-Lib 0.8.0 <small>(update)</small>
 
diff --git a/site/_releases/3.3.0.md b/site/_releases/3.3.0.md
index da24ff7..3c605f6 100644
--- a/site/_releases/3.3.0.md
+++ b/site/_releases/3.3.0.md
@@ -78,7 +78,7 @@ The following dependencies have been added or updated:
 * Typesafe Config 1.4.2 <small>(update)</small>
 * Xerces 2.12.2 <small>(update)</small>
 
-**Code Generator (runtime2)**
+**Code Generator**
 
 * OS-Lib 0.8.1 <small>(update)</small>
 
diff --git a/site/_releases/3.4.0.md b/site/_releases/3.4.0.md
index 2b6495b..1a037ac 100644
--- a/site/_releases/3.4.0.md
+++ b/site/_releases/3.4.0.md
@@ -63,7 +63,7 @@ original data.
 
 * {% jira 2708 %} XML String feature in XML Text Infoset Inputter/Outputter
 
-#### C Code Generator (Runtime2) Updates
+#### C Code Generator Updates
 
 The C code generator backend now supports reading and writing N-bit booleans
 and integers, where N is an explicit length from 1 to 64 bits. Additional
diff --git a/site/cli.md b/site/cli.md
index 99e1df3..aef3669 100644
--- a/site/cli.md
+++ b/site/cli.md
@@ -55,7 +55,7 @@ Setting environment variables may be necessary to allow imports, includes, and r
 
 ``CC``
 
-   : If you need to specify which C compiler should be called when running TDML files with the ``test`` subcommand using Daffodil's runtime2 backend, you can set the ``CC`` environment variable. If that environment variable is not specified, then Daffodil will call the first C compiler driver command it finds within the ``PATH`` environment variable from the following list: "zig cc", "cc", "clang", "gcc" (in that order). The reason for "zig cc" coming first is because [zig cc](https://an [...]
+   : If you need to specify which C compiler should be called when running TDML files with the ``test`` subcommand using Daffodil's codegen-c backend, you can set the ``CC`` environment variable. If that environment variable is not specified, then Daffodil will call the first C compiler driver command it finds within the ``PATH`` environment variable from the following list: "zig cc", "cc", "clang", "gcc" (in that order). The reason for "zig cc" coming first is because [zig cc](https://a [...]
 
 ### Global Options
 
diff --git a/site/dev/design-notes/runtime2-todos.adoc b/site/dev/design-notes/daffodilc-todos.adoc
similarity index 76%
rename from site/dev/design-notes/runtime2-todos.adoc
rename to site/dev/design-notes/daffodilc-todos.adoc
index 1e52aff..660a143 100644
--- a/site/dev/design-notes/runtime2-todos.adoc
+++ b/site/dev/design-notes/daffodilc-todos.adoc
@@ -1,5 +1,5 @@
 :page-layout: page
-:keywords: dfdl-to-c backend code-generator runtime2
+:keywords: dfdl-to-c code generator backend
 // ///////////////////////////////////////////////////////////////////////////
 //
 // This file is written in https://asciidoctor.org/docs/what-is-asciidoc/[AsciiDoc]
@@ -11,12 +11,12 @@
 //
 // //////////////////////////////////////////////////////////////////////////
 
-== Runtime2 ToDos
+== C Code Generator ToDos
 
 === Overview
 
-We have built an initial DFDL-to-C backend
-and code generator for Apache Daffodil.
+We have built an initial DFDL-to-C code
+generator and backend for Apache Daffodil.
 Currently the C code generator can support
 binary boolean, integer, and real numbers,
 arrays of simple and complex elements,
@@ -97,7 +97,7 @@ doesn't integrate well with any of these systems
 unless we generate a child element with a generated name
 (which makes paths awkward, etc.).
 Hence, it seems better to say that
-the runtime2 DFDL subset doesn't allow anonymous choices
+codegen-c's DFDL subset doesn't allow anonymous choices
 and DFDL schema authors should write their schema like this:
 
 [source,xml]
@@ -150,70 +150,31 @@ every use of _choice very carefully.
 
 === Arrays
 
-Currently we create an ERD for an array with the array's name
-and the scalar type of its first element,
-but the ERD has no numChildren and the rest of its fields are NULL.
-Then in the parent element's ERD, we expand and inline the array
-into the parent element's offsets and childrenERDs
-with incrementing offsets for each array element
-and the same pointer to the same array ERD for each array element.
-We also expand and inline the array
-into the parent element's parseSelf and unparseSelf functions
-with as many parse and unparse calls as there are array elements.
-
-We need to change this approach to handle arrays
-having undetermined lengths at compile time.
-One possible approach might be to define an ERD for an array
-like an ERD for a complex element with one child.
-The typeCode might become ARRAY or remain COMPLEX,
-the numChildren would be 1,
-the offsets would be the offset of the first array element
-(allowing room to skip over an actual number of elements
-stored in the C struct to the offset of the actual array,
-or to point to memory allocated from the heap),
-the childrenERDs would be the ERD of the first array element,
-the parseSelf would be a function to parse all array members,
-and the unparseSelf would be a function to unparse all array members.
-These functions would know how to find the number of elements
-depending on dfdl:occursCountKind when parsing
-(fixed, implicit, parsed, expression, or stopValue)
-and depending on a count stored in the C struct when unparsing.
-These functions also would know how to loop as many times
-as needed to parse or unparse each array element using the
-first array element's ERD in childrenERDs every time.
-
-Note that we don't have to store a count
-of the actual number of array elements in the C struct
-for a dfdl:occursCountKind of fixed, expression, or stopValue.
-Fixed means the count is a known constant at compile time.
-Expression means the count is already stored in
-another C struct field which we just have to find
-via the expression when parsing and unparsing.
-StopValue means we only need to look inside the array
-for a stopValue when parsing and unparsing.
-However, we do need to store an actual count in the C struct
-for a dfdl:occursCountKind of implicit or parsed
-because we will have no other possible way
-to find the actual count when unparsing.
-Our C code also should allow the count to be zero
-without the code blowing up.
-
-If we want the C code to validate the array's count
-against the array's minOccurs and maxOccurs,
-we can inline the array's minOccurs and maxOccurs
-into the array's parseSelf and unparseSelf functions.
-However, we should allow the normal case to be no validation,
-since Daffodil must not enforce min/maxOccurs
-if the user wants to parse and unparse well-formed but invalid data
-for forensic analysis.
-However, we still can let min/maxOccurs influence the generated C code.
+We inline an array's minOccurs and maxOccurs
+into the array's parseSelf and unparseSelf functions
+and throw an error if the count isn't within these bounds.
+The reason why we perform these checks is because
+we inline arrays' maximum size into C structs
+and we put C structs into static memory to avoid using
+heap memory which may not be available on all platforms.
+
+In the normal case, Daffodil's Scala backend parses and unparses
+unbounded or well-formed arrays without enforcing min/maxOccurs
+for forensic analysis and easier debugging.
+If we want the C backend to parse and unparse unbounded arrays
+and well-formed arrays, we still can inline min/maxOccurs into
+the generated C code without enforcing their bounds.
 If maxOccurs is unbounded or the largest possible array size
-(maxOccurs - minOccurs) is larger than a heuristic or tunable,
-we should allocate storage for the array from the heap
-instead of declaring storage for the array inline in the C struct.
-The normal case should be to inline the array into the C struct
-with the array's maximum size since bare metal C and VHDL
-will not be able to allocate memory from a heap dynamically.
+(maxOccurs - minOccurs) is larger than a tunable,
+we would allocate storage for the array from the heap
+instead of inlining the array's storage into the C struct.
+If the array is small enough to inline into the C struct
+but the array needs more space than its inlined space,
+we can switch that array to heap memory at runtime as long as
+we track its heap/inline status in another C struct field.
+We still should keep inlining finite bounded arrays into C structs
+since some embedded systems will not be able to allocate memory
+from a heap dynamically.
 
 === Making infosets more efficient
 
@@ -297,7 +258,7 @@ like an offline log processor does.
 
 === Recovering after errors
 
-As we continue to build out runtime2,
+As we continue to build out codegen-c,
 we may need to distinguish more types of errors
 and allow backtracking and retrying.
 Right now we handle only parse/unparse and
@@ -344,7 +305,7 @@ but we would need a good use case requiring recovering after errors.
 === Validate "fixed" values in runtime1 too
 
 If we change runtime1 to validate "fixed" values
-like runtime2 does, then we can resolve 
+like codegen-c does, then we can resolve 
 https://issues.apache.org/jira/browse/DAFFODIL-117[DAFFODIL-117].
 
 === No match between choice dispatch key and choice branch keys
@@ -367,4 +328,4 @@ matching the unparsed choice branch,
 the writer must write an explicit dfdl:outputValueCalc
 expression to replace the choiceDispatchKey
 even though supporting dfdl:outputValueCalc
-in runtime2 is likely a distant goal.
+in codegen-c is likely a distant goal.
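
The rewritten Arrays section in the daffodilc-todos.adoc hunk above says that an array's maximum size is inlined into the C struct and that parseSelf and unparseSelf report an error when the count is outside minOccurs/maxOccurs. The following is a minimal sketch of that idea, not the code Daffodil actually generates: the names `record_t`, `parse_items`, `unparse_items`, `MIN_OCCURS`, and `MAX_OCCURS` are hypothetical, the bounds 1 and 4 are made up, and the "data stream" is simplified to a plain array of integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds standing in for a schema's minOccurs/maxOccurs. */
#define MIN_OCCURS ((size_t)1)
#define MAX_OCCURS ((size_t)4)

/* The array's storage is inlined at its maximum size, so a record can
 * live in static memory and no heap allocation is needed. */
typedef struct
{
    size_t  count;             /* actual number of elements in use */
    int32_t items[MAX_OCCURS]; /* inlined storage, maxOccurs slots */
} record_t;

/* Copy `count` elements into the record. Returns false and leaves the
 * record untouched when `count` is outside the inlined bounds. */
static bool parse_items(record_t *rec, const int32_t *input, size_t count)
{
    assert(rec != NULL && input != NULL);
    if (count < MIN_OCCURS || count > MAX_OCCURS)
    {
        return false; /* count not within minOccurs..maxOccurs */
    }
    rec->count = count;
    for (size_t i = 0; i < count; i++)
    {
        rec->items[i] = input[i];
    }
    return true;
}

/* Copy the record's elements out. Returns false when the stored count
 * is outside the bounds or the output buffer is too small. */
static bool unparse_items(const record_t *rec, int32_t *output,
                          size_t capacity, size_t *written)
{
    assert(rec != NULL && output != NULL && written != NULL);
    if (rec->count < MIN_OCCURS || rec->count > MAX_OCCURS ||
        rec->count > capacity)
    {
        return false;
    }
    for (size_t i = 0; i < rec->count; i++)
    {
        output[i] = rec->items[i];
    }
    *written = rec->count;
    return true;
}
```

The sketch covers only the finite-bounds case. The heap fallback for unbounded or oversized arrays that the section proposes would need an extra field recording whether an array's storage is inline or on the heap, and it is left out here.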