Posted to j-dev@xerces.apache.org by "Jacob Kjome (JIRA)" <xe...@xml.apache.org> on 2006/09/30 16:58:21 UTC

[jira] Created: (XERCESJ-1200) change HTMLDocumentImpl.getElementById(String) to first attempt lookup from "identifiers" map, otherwise fall back to old behavior

change HTMLDocumentImpl.getElementById(String) to first attempt lookup from "identifiers" map, otherwise fall back to old behavior
----------------------------------------------------------------------------------------------------------------------------------

Key: XERCESJ-1200
URL: http://issues.apache.org/jira/browse/XERCESJ-1200
Project: Xerces2-J
Issue Type: Improvement
Components: DOM (HTML)
Affects Versions: 2.8.1
Reporter: Jacob Kjome

As discussed on the mailing list [1], I'd like HTMLDocumentImpl.getElementById(String) to first attempt an optimized lookup of the element by calling the superclass's getElementById(String), which looks it up in the "identifiers" hashmap. If the element is found, it is returned. If not, the method falls back to the pre-existing behavior, which walks the DOM looking for an element whose "id" attribute has the value provided.
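
For illustration only, the "registered lookup first, tree walk second" idea can be sketched with nothing but the standard org.w3c.dom API. This is not the attached patch: the class IdLookupSketch and its methods are hypothetical names, and the sketch uses whatever DOM implementation the JDK provides rather than HTMLDocumentImpl. It assumes an "id" attribute only takes the fast path when it has been marked as an ID (here via the DOM Level 3 call setIdAttribute).

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper, not the attached patch: registered-ID lookup first,
// then a fallback walk over the tree.
final class IdLookupSketch {

    /** Returns the element with the given id, preferring the registered-ID path. */
    static Element findById(Document doc, String id) {
        Element registered = doc.getElementById(id);
        if (registered != null) {
            return registered;
        }
        // Fallback: walk the tree looking at plain "id" attributes.
        return walk(doc.getDocumentElement(), id);
    }

    /** Depth-first search for the first element whose "id" attribute equals id. */
    static Element walk(Node node, String id) {
        if (node == null) {
            return null;
        }
        if (node instanceof Element && id.equals(((Element) node).getAttribute("id"))) {
            return (Element) node;
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            Element hit = walk(c, id);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    /** Builds a tiny document: one "id" attribute marked as an ID, one left unmarked. */
    static Document sample() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement("html");
        doc.appendChild(root);

        Element fast = doc.createElement("div");
        fast.setAttribute("id", "registered");
        root.appendChild(fast);
        fast.setIdAttribute("id", true);   // mark the attribute as an ID

        Element slow = doc.createElement("div");
        slow.setAttribute("id", "unregistered");
        root.appendChild(slow);            // never marked as an ID
        return doc;
    }

    public static void main(String[] args) {
        Document doc = sample();
        System.out.println(findById(doc, "registered"));
        System.out.println(findById(doc, "unregistered"));
        System.out.println(findById(doc, "missing"));
    }
}
```

Either path returns the same kind of result, so callers cannot tell which one served the request; the only difference is the cost of the lookup.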

There's a difference between what I proposed on the mailing list and what I propose here. On the mailing list, I proposed adding IDs to the "identifiers" map as they were found via the fallback method. The idea was that the next time the ID was looked up, it would be pulled from the "identifiers" map and, therefore, be optimized. The main problem pointed out with this was the inconsistency between the ID'ness of the "id" attribute at document load time and after the first call to getElementById(String), as well as the same inconsistency after a call to normalizeDocument().

This proposal is more limited in scope. If the ID exists in the "identifiers" map, the element is returned; otherwise the method just falls back to the pre-existing behavior. It is up to the HTML document parser to properly set up the ID'ness of the "id" attribute so that IDs are either registered or not from the get-go. Note that this can easily be accomplished using Andy Clark's NekoHTML "filters" capability [2].

I have not addressed the issue of calls to normalizeDocument() here. That seems to be a more complicated issue, so I'd rather not tackle that here. However, I don't think it is much of a problem. Why would someone want to revalidate an HTML Document? There's no DTD, so there's nothing to validate. In fact, I currently use normalizeDocument() without loss of ID'ness of "id" attributes. I have the following properties set...

config.setParameter("namespaces", Boolean.FALSE);
config.setParameter("well-formed", Boolean.FALSE);

In any case, my proposal makes no proactive modification of the ID'ness of "id" attributes. The parser either sets them up for optimization or it doesn't, and there's always the fallback to pre-existing behavior.
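
As a rough sketch of the normalizeDocument() settings mentioned above, the following shows where the "config" object could come from (Document.getDomConfig()) and how one might check whether a registered ID is still found afterwards. The class NormalizeSketch and its method names are hypothetical, the document is built with the JDK's DOM rather than HTMLDocumentImpl, and support for setting these two parameters to false is optional in DOM Level 3, so the sketch asks canSetParameter first. It reports whatever the local implementation does and asserts nothing about the outcome.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch: normalize a small document with the two parameters
// from the message above and report whether a registered ID is still found.
final class NormalizeSketch {

    /** Creates an empty document using the JDK's default DOM implementation. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Sets a parameter only if the implementation reports that it can be set. */
    static boolean trySet(DOMConfiguration config, String name, Object value) {
        if (config.canSetParameter(name, value)) {
            config.setParameter(name, value);
            return true;
        }
        return false;
    }

    /** Returns true if getElementById still finds the element after normalizeDocument(). */
    static boolean idSurvivesNormalize() {
        Document doc = newDocument();
        Element root = doc.createElement("html");
        doc.appendChild(root);
        Element div = doc.createElement("div");
        div.setAttribute("id", "main");
        root.appendChild(div);
        div.setIdAttribute("id", true);    // register "main" as an ID

        DOMConfiguration config = doc.getDomConfig();
        trySet(config, "namespaces", Boolean.FALSE);
        trySet(config, "well-formed", Boolean.FALSE);

        doc.normalizeDocument();
        return doc.getElementById("main") != null;
    }

    public static void main(String[] args) {
        System.out.println("ID still found after normalizeDocument(): " + idSurvivesNormalize());
    }
}
```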

Patch coming up....

[1] - http://marc.theaimsgroup.com/?t=115890536600001&r=1&w=4
[2] - http://people.apache.org/~andyc/neko/doc/html/filters.html

Jake

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org

[jira] Assigned: (XERCESJ-1200) change HTMLDocumentImpl.getElementById(String) to first attempt lookup from "identifiers" map, otherwise fall back to old behavior

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.

     [ http://issues.apache.org/jira/browse/XERCESJ-1200?page=all ]

Michael Glavassevich reassigned XERCESJ-1200:
---------------------------------------------

    Assignee: Michael Glavassevich


[jira] Updated: (XERCESJ-1200) change HTMLDocumentImpl.getElementById(String) to first attempt lookup from "identifiers" map, otherwise fall back to old behavior

Posted by "Jacob Kjome (JIRA)" <xe...@xml.apache.org>.

     [ http://issues.apache.org/jira/browse/XERCESJ-1200?page=all ]

Jacob Kjome updated XERCESJ-1200:
---------------------------------

    Attachment: HTMLDocumentImpl.java.patch

Provides optimized lookup of element using document.getElementById(String) in the case that the parser set up the ID'ness of "id" attributes.  If not, just falls back to pre-existing behavior.


[jira] Resolved: (XERCESJ-1200) change HTMLDocumentImpl.getElementById(String) to first attempt lookup from "identifiers" map, otherwise fall back to old behavior

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.

     [ http://issues.apache.org/jira/browse/XERCESJ-1200?page=all ]

Michael Glavassevich resolved XERCESJ-1200.
-------------------------------------------

    Resolution: Fixed

Thanks, Jake. I've committed your patch to SVN. Even if the HTML Document doesn't have a DocumentType node, an application could still validate against a DTD in memory by setting up its DOMConfiguration like this:

config.setParameter("validate", Boolean.TRUE);
config.setParameter("schema-type", javax.xml.XMLConstants.XML_DTD_NS_URI);
config.setParameter("schema-location", "file:///C:/my.dtd");

It's probably very unlikely that anyone is doing this with an HTML DOM.  Perhaps something to revisit in the future if this becomes a real problem for users.


[jira] Updated: (XERCESJ-1200) change HTMLDocumentImpl.getElementById(String) to first attempt lookup from "identifiers" map, otherwise fall back to old behavior

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.

     [ http://issues.apache.org/jira/browse/XERCESJ-1200?page=all ]

Michael Glavassevich updated XERCESJ-1200:
------------------------------------------

    Fix Version/s: 2.9.0
