Posted to dev@harmony.apache.org by Miguel Montes <mi...@gmail.com> on 2006/08/23 02:17:43 UTC

[classlib][html]

Hi:
We are working on the HTML parser and need a working DTD. The current
implementation of DTD.read(), which is based on serialization, has some
problems, and I think we should have a well-defined binary format. I
suggest the following ASN.1 format; if there is consensus on it, we could
contribute the code to read and write it.
I would like to hear the opinions of Stepan and anyone else who has worked
with ASN.1 before.

BDTD ::= SEQUENCE {
       Name UTF8String,
       Entity SET OF HTMLEntity,
       Element SET OF HTMLElement
}

HTMLEntity ::= SEQUENCE {
       Name UTF8String,
       Value INTEGER,
       General BOOLEAN DEFAULT FALSE,
       Parameter BOOLEAN DEFAULT FALSE,
       Data UTF8String
}

HTMLElement ::= SEQUENCE {
       Index INTEGER,
       Name UTF8String,
       Type INTEGER,
       OStart BOOLEAN,
       OEnd BOOLEAN,
       Exclusions SET OF INTEGER,
       Inclusions SET OF INTEGER,
       Attributes SET OF HTMLElementAttributes OPTIONAL,
       ContentModel HTMLContentModel
}

HTMLContentModel ::= SEQUENCE OF SEQUENCE {
       Type INTEGER,
       Index INTEGER
}

HTMLElementAttributes ::= SEQUENCE {
       Name UTF8String,
       Type INTEGER,
       Modifier INTEGER,
       DefaultValue UTF8String OPTIONAL,
       PossibleValues SET OF UTF8String OPTIONAL
}
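
To make the proposal a bit more concrete, here is a minimal sketch of how
an HTMLContentModel value could be written out in Java. It assumes DER as
the encoding rules, which the definitions above do not fix, and the class
and method names (ContentModelDerSketch, contentModelEntry, and so on) are
hypothetical and not existing Harmony code.

```java
import java.io.ByteArrayOutputStream;
import java.math.BigInteger;

// Hypothetical sketch: DER-encodes HTMLContentModel values by hand.
// Assumes DER encoding rules; none of these names exist in Harmony.
final class ContentModelDerSketch {

    static final int TAG_INTEGER = 0x02;  // universal INTEGER
    static final int TAG_SEQUENCE = 0x30; // universal SEQUENCE, constructed

    /** Wraps content in a tag-length-value triple with a definite length. */
    static byte[] tlv(int tag, byte[] content) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(tag);
        int len = content.length;
        if (len < 128) {
            out.write(len); // short form: a single length byte
        } else {
            // long form: 0x80 | number of length bytes, then the length itself
            byte[] lenBytes = BigInteger.valueOf(len).toByteArray();
            int skip = (lenBytes[0] == 0) ? 1 : 0; // drop the sign padding byte
            out.write(0x80 | (lenBytes.length - skip));
            out.write(lenBytes, skip, lenBytes.length - skip);
        }
        out.write(content, 0, content.length);
        return out.toByteArray();
    }

    /** INTEGER: minimal two's-complement bytes, which BigInteger produces. */
    static byte[] integer(long value) {
        return tlv(TAG_INTEGER, BigInteger.valueOf(value).toByteArray());
    }

    static byte[] concat(byte[]... parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] p : parts) {
            out.write(p, 0, p.length);
        }
        return out.toByteArray();
    }

    /** One element of HTMLContentModel: SEQUENCE { Type, Index }. */
    static byte[] contentModelEntry(int type, int index) {
        return tlv(TAG_SEQUENCE, concat(integer(type), integer(index)));
    }

    /** HTMLContentModel: SEQUENCE OF the entries, each given as {type, index}. */
    static byte[] contentModel(int[][] entries) {
        byte[][] encoded = new byte[entries.length][];
        for (int i = 0; i < entries.length; i++) {
            encoded[i] = contentModelEntry(entries[i][0], entries[i][1]);
        }
        return tlv(TAG_SEQUENCE, concat(encoded));
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints 3006020101020105
        System.out.println(hex(contentModelEntry(1, 5)));
    }
}
```

The other types would follow the same tag-length-value pattern. One thing
to keep in mind is that DER requires the elements of a SET OF to be sorted
by their encoded bytes, so a writer for Entity, Element and Attributes
would need a sorting step that this sketch does not show.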
-- 
Miguel Montes