Posted to j-users@xerces.apache.org by Andy Clark <an...@apache.org> on 2002/04/14 13:55:49 UTC
[Announce] NekoHTML 0.4 Parser for Xerces2 Available
== About ==
NekoHTML is a simple HTML scanner and tag balancer that enables
application programmers to parse HTML documents and access the
information using standard XML interfaces. The parser can scan
HTML files and "fix up" many common mistakes that human (and
computer) authors make in writing HTML documents. NekoHTML adds
missing parent elements; automatically closes elements with
optional end tags; and can handle mismatched inline element tags.
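To make the "fix up" idea concrete, here is a toy sketch in Java of stack-based tag balancing. It is not NekoHTML's code or API; the class name TagBalancerSketch, the balance() method, and the tiny set of optional-end-tag elements are all made up for illustration, and the sketch ignores attributes, comments, and missing parent elements.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy illustration only; this is NOT NekoHTML's algorithm or API.
final class TagBalancerSketch {
    // Hypothetical, deliberately tiny set of elements with optional end tags.
    private static final Set<String> OPTIONAL_END =
            new HashSet<>(Arrays.asList("p", "li"));

    // Matches only attribute-free tags such as <p> or </ul>; anything else
    // (including tags with attributes) is passed through as text.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)>");

    static String balance(String html) {
        StringBuilder out = new StringBuilder();
        Deque<String> open = new ArrayDeque<>(); // stack of open element names
        Matcher m = TAG.matcher(html);
        int pos = 0;
        while (m.find()) {
            out.append(html, pos, m.start()); // copy text before the tag
            pos = m.end();
            String name = m.group(2).toLowerCase(Locale.ROOT);
            boolean isEndTag = !m.group(1).isEmpty();
            if (!isEndTag) {
                // A new <p> or <li> closes an open sibling of the same name.
                if (OPTIONAL_END.contains(name) && name.equals(open.peek())) {
                    out.append("</").append(open.pop()).append('>');
                }
                open.push(name);
                out.append('<').append(name).append('>');
            } else if (open.contains(name)) {
                // Close every element still open above the matching one,
                // then the matching element itself.
                String top;
                do {
                    top = open.pop();
                    out.append("</").append(top).append('>');
                } while (!top.equals(name));
            }
            // An end tag with no matching open element is dropped.
        }
        out.append(html.substring(pos));
        // Close whatever is still open at end of input.
        while (!open.isEmpty()) {
            out.append("</").append(open.pop()).append('>');
        }
        return out.toString();
    }
}
```

For example, TagBalancerSketch.balance("<ul><li>one<li>two</ul>") returns "<ul><li>one</li><li>two</li></ul>", and the mismatched inline input "<b><i>x</b></i>" comes back as "<b><i>x</i></b>". The real parser reports XNI events rather than producing a string, so treat this only as a picture of the balancing idea.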
NekoHTML is written using the Xerces Native Interface (XNI), which
is the foundation of the Xerces2 implementation. This enables you
to use the NekoHTML parser with existing XNI tools without
modifying or rewriting code.
The NekoHTML parser is available under an Apache-style licence.
== This Release ==
Changes from the last release include:
* added properties to control case of element and attribute
names;
* changed behavior of parser so that only known HTML elements
have their names modified according to the properties -- all
unknown tags are left as-is;
* added property to set default encoding;
* added feature to augment infoset to report "synthesized" events;
* added feature to be able to report errors and localized the
error messages;
* implemented the locator so that location information can be
reported; and
* fixed element information so that more elements are properly
scanned as "special".
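The name-case rule in the second item above can be pictured with a small Java sketch. Everything in it is an assumption made for illustration: the class NameCaseSketch, the normalize() method, the mode strings "upper" and "lower", and the short list of known elements are hypothetical and are not taken from the NekoHTML settings documentation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy illustration only; names and mode strings here are hypothetical.
final class NameCaseSketch {
    // Hypothetical short list standing in for the set of known HTML elements.
    private static final Set<String> KNOWN =
            new HashSet<>(Arrays.asList("html", "head", "body", "p", "b", "i"));

    // Returns the element name as it would be reported under the given mode.
    static String normalize(String name, String mode) {
        // Unknown tags are left exactly as written.
        if (!KNOWN.contains(name.toLowerCase(Locale.ROOT))) {
            return name;
        }
        switch (mode) {
            case "upper":
                return name.toUpperCase(Locale.ROOT);
            case "lower":
                return name.toLowerCase(Locale.ROOT);
            default:
                return name; // any other mode: keep the name as written
        }
    }
}
```

With this sketch, normalize("Body", "upper") gives "BODY", while an unknown name such as "myTag" is returned unchanged in every mode.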
In addition, new documentation was written to demonstrate how
to take advantage of the new features and properties.
== Other ==
I thought I was going to just put in a few new features people
were asking for and then release it. But then I wanted to add
another feature and another and... Well, you get the idea. :)
So this release has very little change in terms of behavior
but includes a bunch of useful new features.
Have fun!
--
Andy Clark * andyc@apache.org