Posted to commits@daffodil.apache.org by "John Wass (Jira)" <ji...@apache.org> on 2021/01/21 20:52:00 UTC

[jira] [Closed] (DAFFODIL-1685) Full validation should create and initialize the validator before parsing/unparsing begins

     [ https://issues.apache.org/jira/browse/DAFFODIL-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Wass closed DAFFODIL-1685.
-------------------------------
    Fix Version/s: 3.1.0
       Resolution: Fixed

Resolved in [PR#431|https://github.com/apache/incubator-daffodil/pull/431], which restructured the Xerces validation to fit the new Validator API pattern.

> Full validation should create and initialize the validator before parsing/unparsing begins
> ------------------------------------------------------------------------------------------
>
>                 Key: DAFFODIL-1685
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-1685
>             Project: Daffodil
>          Issue Type: Improvement
>          Components: Back End, Performance
>    Affects Versions: 2.0.0
>            Reporter: Mike Beckerle
>            Assignee: John Wass
>            Priority: Major
>              Labels: verify
>             Fix For: 3.1.0
>
>
> In many applications, validation will be turned on.
> In 2.0.0-rc2, it was observed that parse time increases with the volume of non-DFDL comments/annotations in the schema. 
> This was with validation on. The likely explanation is that validation, which currently calls Xerces, constructs the validator during parsing, so that cost is counted as part of the cost of parsing. It may even be constructing the validator on every parse call.
> Now we're switching to Woodstox for XML parsing. It is also a validating parser, so we could try using it to speed up validation.
> Nevertheless, we should make sure that as much as possible is hoisted out of parse time.
> Certainly we should try creating the validator object once. The latest 2.0.0 code currently does this once per thread, because it does not assume that an initialized validator is thread safe. Perhaps we can determine whether it is and create only one validator, not one per thread.
> Within the same thread the same validator will be used, but across threads it will not.
> It is initialized on first use, which probably shows up as part of parse time, since initialization involves reading the entire extended schema, resolving all file references, etc. There is a lot of cost here.
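
To illustrate the hoisting described above, here is a minimal sketch using the standard JAXP validation API (javax.xml.validation). It is not Daffodil's Validator API and not the code from PR#431. The class name EagerValidator, the isValid method, and the one-element schema are hypothetical, chosen only for illustration. The sketch relies on the JAXP contract that a compiled Schema is thread safe while a Validator is not: the schema is compiled once in the constructor, before any parse, and each thread then gets its own cheap Validator derived from it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: pays the schema-compilation cost up front,
// not on the first (or every) parse call.
class EagerValidator {
    // Hypothetical stand-in for a real, much larger schema.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"n\" type=\"xs:int\"/>"
        + "</xs:schema>";

    // One Validator per thread, all derived from a single compiled Schema.
    private final ThreadLocal<Validator> validators;

    EagerValidator(String xsd) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // Expensive step: reads and compiles the schema. Done once, here.
            Schema compiled =
                factory.newSchema(new StreamSource(new StringReader(xsd)));
            // Schema is thread safe; Validator is not, so each thread
            // lazily creates its own from the shared Schema.
            this.validators = ThreadLocal.withInitial(compiled::newValidator);
        } catch (SAXException e) {
            throw new IllegalStateException("schema failed to compile", e);
        }
    }

    // Validates one document; no schema loading happens on this path.
    boolean isValid(String xml) {
        try {
            validators.get().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }
}
```

With this shape, constructing EagerValidator before parsing begins moves schema reading and compilation out of parse time, and repeated isValid calls on one thread reuse the same Validator.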



--
This message was sent by Atlassian Jira
(v8.3.4#803005)