
Posted to issues@maven.apache.org by "Henri Sivonen (JIRA)" <ji...@codehaus.org> on 2008/08/25 18:00:27 UTC

[jira] Created: (MAVENUPLOAD-2186) Validator.nu HTML parser 1.1.0 to Maven repo

Validator.nu HTML parser 1.1.0 to Maven repo
--------------------------------------------

Key: MAVENUPLOAD-2186
URL: http://jira.codehaus.org/browse/MAVENUPLOAD-2186
Project: Maven Upload Requests
Issue Type: Improvement
Reporter: Henri Sivonen
Attachments: htmlparser-1.1.0-bundle.jar

For reference, the previous version was MAVENUPLOAD-2006.

Changes:
* Made the SAX, DOM and XOM parser entry point constructors default to altering the infoset instead of throwing an exception when the input must be coerced into an XML 1.0 4th ed. plus Namespaces infoset.
* Isolated Java IO-dependent code from the parser core. The parser core now compiles on Google Web Toolkit.
* Refactored the tokenizer to use a switch branch per state instead of a method per state.
* Made various performance tweaks to the tokenizer.
* Implemented support for MathML and SVG foreign content. (Note that the SVG part is based on spec text that has been commented out from the spec at the request of the SVG WG.)
* Made the parser suspendable after any input character.
* Made it possible for custom TreeBuilder subclasses to request parser suspension. (Applications wishing to implement document.write() should provide their own TreeBuilder subclass and a document.write()-aware replacement of the Driver class. Look in the gwt-src/ directory for sample code.)
* Made changes to the parser core to make it more suitable for mechanical translation into other object-oriented programming languages that have C-like control structures but not necessarily a garbage collector (with focus on targeting C++). This work is not complete.
* Made the HTML serializer do the right thing when input represents a conforming XHTML+SVG+MathML tree. (Results may be bad for non-conforming input trees.)
* Developed sample programs for converting between HTML5 and XHTML5 when the input is known to be conforming.
* Provided an XML serializer so that the sample code no longer depends on the Xalan serializer.
* Improved API documentation.
* Fixed bugs in the tokenizer, tree builder and the input stream character encoding decoder.
* Made coercion to an XML infoset work according to the HTML5 spec.
* Added ID uniqueness checking.
* Various other fixes.
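
To illustrate the "switch branch per state" and "suspendable after any input character" items above, here is a minimal sketch of the general technique. It is not the Validator.nu tokenizer: the class name, the three states and the token strings are all hypothetical, and the grammar is a toy that only knows plain text and simple start tags. The point is that when the state lives in a field and each state is a branch of one switch, the caller can stop feeding input after any character and resume later.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy tokenizer (hypothetical; not the Validator.nu implementation). */
public class SwitchTokenizerSketch {
    // Hypothetical states for a tiny text/start-tag grammar.
    private static final int DATA = 0;
    private static final int TAG_OPEN = 1;
    private static final int TAG_NAME = 2;

    // All progress is kept in fields, so feed() can return after any
    // character and pick up again on the next call.
    private int state = DATA;
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> tokens = new ArrayList<>();

    /** Consumes one chunk of input; chunks may be split at any character. */
    public void feed(String chunk) {
        for (int i = 0; i < chunk.length(); i++) {
            char c = chunk.charAt(i);
            switch (state) { // one branch per state, not one method per state
                case DATA:
                    if (c == '<') {
                        state = TAG_OPEN;
                    } else {
                        buffer.append(c);
                    }
                    break;
                case TAG_OPEN:
                    if (Character.isLetter(c)) {
                        flush("text:"); // emit any pending text first
                        buffer.append(Character.toLowerCase(c));
                        state = TAG_NAME;
                    } else {
                        // Not a tag after all: keep '<' and c as text.
                        buffer.append('<').append(c);
                        state = DATA;
                    }
                    break;
                case TAG_NAME:
                    if (c == '>') {
                        flush("tag:");
                        state = DATA;
                    } else {
                        buffer.append(Character.toLowerCase(c));
                    }
                    break;
                default:
                    throw new IllegalStateException("unknown state " + state);
            }
        }
    }

    /** Signals end of input; an unfinished tag name is discarded. */
    public List<String> finish() {
        if (state == TAG_OPEN) {
            buffer.append('<'); // a trailing '<' is just text
        }
        if (state != TAG_NAME) {
            flush("text:");
        }
        return tokens;
    }

    private void flush(String prefix) {
        if (buffer.length() > 0) {
            tokens.add(prefix + buffer);
            buffer.setLength(0);
        }
    }
}
```

Feeding the same input in one call or split across several calls at arbitrary points yields the same tokens, which is the property a document.write()-aware driver would rely on.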

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.codehaus.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Closed: (MAVENUPLOAD-2186) Validator.nu HTML parser 1.1.0 to Maven repo

Posted by "Carlos Sanchez (JIRA)" <ji...@codehaus.org>.

     [ http://jira.codehaus.org/browse/MAVENUPLOAD-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carlos Sanchez closed MAVENUPLOAD-2186.
---------------------------------------

      Assignee: Carlos Sanchez
    Resolution: Fixed

