Posted to dev@tika.apache.org by "Daniel Bonniot de Ruisselet (Created) (JIRA)" <ji...@apache.org> on 2011/12/20 10:01:33 UTC
[jira] [Created] (TIKA-820) Locator is unset for HTML parser
Locator is unset for HTML parser
--------------------------------
Key: TIKA-820
URL: https://issues.apache.org/jira/browse/TIKA-820
Project: Tika
Issue Type: Bug
Components: general, parser
Reporter: Daniel Bonniot de Ruisselet
Attachments: text-locator.patch
The HtmlParser does not call setDocumentLocator(Locator locator) on the user's content handler.
Patch and unit test attached.
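For context, here is a minimal sketch of why a user's handler cares about the locator. It is not taken from the attached patch and does not use Tika at all: the JDK's own SAX parser stands in for HtmlParser, and the class names LocatorUsageSketch and LineTrackingHandler are made up for illustration. The handler keeps the Locator passed to setDocumentLocator and reads a line number from it on each start tag; if the parser never makes that call, the handler can only report "unknown".

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

public class LocatorUsageSketch {

    /** Hypothetical user handler: records each element name with its line number. */
    static class LineTrackingHandler extends DefaultHandler {
        private Locator locator; // stays null if the parser never supplies one
        final List<String> seen = new ArrayList<>();

        @Override
        public void setDocumentLocator(Locator locator) {
            this.locator = locator;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            // Fall back to "unknown" when no locator was ever set.
            String where = (locator == null) ? "unknown" : String.valueOf(locator.getLineNumber());
            seen.add(qName + "@" + where);
        }
    }

    /** Parses the given XML with the JDK SAX parser (a stand-in for Tika's HtmlParser). */
    static List<String> parse(String xml) throws Exception {
        LineTrackingHandler handler = new LineTrackingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.seen;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<html>\n<body>\n<p>hi</p>\n</body>\n</html>"));
    }
}
```

With a parser that does supply a locator, each entry carries a line number instead of "unknown".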
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (TIKA-820) Locator is unset for HTML parser
Posted by "Chris A. Mattmann (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris A. Mattmann updated TIKA-820:
-----------------------------------
Fix Version/s: 1.2 (was: 1.1)
- push out to 1.2
> Locator is unset for HTML parser
> --------------------------------
>
> Key: TIKA-820
> URL: https://issues.apache.org/jira/browse/TIKA-820
> Project: Tika
> Issue Type: Bug
> Components: general, parser
> Affects Versions: 1.0
> Reporter: Daniel Bonniot de Ruisselet
> Labels: patch
> Fix For: 1.2
>
> Attachments: text-locator.patch
>
>
> The HtmlParser does not call setDocumentLocator(Locator locator) on the user's content handler.
> Patch and unit test attached.
[jira] [Commented] (TIKA-820) Locator is unset for HTML parser
Posted by "Daniel Bonniot de Ruisselet (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173046#comment-13173046 ]
Daniel Bonniot de Ruisselet commented on TIKA-820:
--------------------------------------------------
Note that the exact line/column values do not seem entirely accurate, but that's a separate issue.
[jira] [Updated] (TIKA-820) Locator is unset for HTML parser
Posted by "Daniel Bonniot de Ruisselet (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Bonniot de Ruisselet updated TIKA-820:
---------------------------------------------
Fix Version/s: 1.1
Affects Version/s: 1.0
> Locator is unset for HTML parser
> --------------------------------
>
> Key: TIKA-820
> URL: https://issues.apache.org/jira/browse/TIKA-820
> Project: Tika
> Issue Type: Bug
> Components: general, parser
> Affects Versions: 1.0
> Reporter: Daniel Bonniot de Ruisselet
> Labels: patch
> Fix For: 1.1
>
> Attachments: text-locator.patch
>
>
> The HtmlParser does not call setDocumentLocator(Locator locator) on the user's content handler.
> Patch and unit test attached.
[jira] [Commented] (TIKA-820) Locator is unset for HTML parser
Posted by "Daniel Bonniot de Ruisselet (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460573#comment-13460573 ]
Daniel Bonniot de Ruisselet commented on TIKA-820:
--------------------------------------------------
Hi Ken - Thanks for looking at the patch. I have no idea whether this is the only missing delegating call; it just seemed wrong to me not to do it in TextContentHandler.
> Locator is unset for HTML parser
> --------------------------------
>
> Key: TIKA-820
> URL: https://issues.apache.org/jira/browse/TIKA-820
> Project: Tika
> Issue Type: Bug
> Components: general, parser
> Affects Versions: 1.0
> Reporter: Daniel Bonniot de Ruisselet
> Assignee: Ken Krugler
> Labels: patch
> Fix For: 1.3
>
> Attachments: text-locator.patch
>
>
> The HtmlParser does not call setDocumentLocator(Locator locator) on the user's content handler.
> Patch and unit test attached.
[jira] [Updated] (TIKA-820) Locator is unset for HTML parser
Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris A. Mattmann updated TIKA-820:
-----------------------------------
Fix Version/s: 1.3 (was: 1.2)
- push to 1.3
> Locator is unset for HTML parser
> --------------------------------
>
> Key: TIKA-820
> URL: https://issues.apache.org/jira/browse/TIKA-820
> Project: Tika
> Issue Type: Bug
> Components: general, parser
> Affects Versions: 1.0
> Reporter: Daniel Bonniot de Ruisselet
> Labels: patch
> Fix For: 1.3
>
> Attachments: text-locator.patch
>
>
> The HtmlParser does not call setDocumentLocator(Locator locator) on the user's content handler.
> Patch and unit test attached.
[jira] [Updated] (TIKA-820) Locator is unset for HTML parser
Posted by "Daniel Bonniot de Ruisselet (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Bonniot de Ruisselet updated TIKA-820:
---------------------------------------------
Attachment: text-locator.patch
Fix+test patch.
[jira] [Commented] (TIKA-820) Locator is unset for HTML parser
Posted by "Ken Krugler (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432246#comment-13432246 ]
Ken Krugler commented on TIKA-820:
----------------------------------
Hi Daniel - I took a quick look at your patch, and had a question. It looks like the change was for TextContentHandler to call setDocumentLocator on its delegate; is this the only case in Tika where a ContentHandler wasn't delegating the method call properly? Thanks!
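The delegation in question can be illustrated with a minimal sketch. The class below is a hypothetical stand-in, not Tika's actual TextContentHandler and not the contents of text-locator.patch: it wraps a delegate ContentHandler, forwards only character data, and also forwards setDocumentLocator. It uses only the JDK's org.xml.sax classes; the names LocatorDelegationSketch, ForwardingTextHandler and RecordingHandler are assumptions made for illustration.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.LocatorImpl;

public class LocatorDelegationSketch {

    /** Hypothetical text-only wrapper; NOT Tika's real TextContentHandler. */
    static class ForwardingTextHandler extends DefaultHandler {
        private final ContentHandler delegate;

        ForwardingTextHandler(ContentHandler delegate) {
            this.delegate = delegate;
        }

        // Without this override, DefaultHandler's no-op implementation
        // swallows the locator and the delegate never sees it.
        @Override
        public void setDocumentLocator(Locator locator) {
            delegate.setDocumentLocator(locator);
        }

        // Character data is passed through; element events are dropped
        // by the inherited no-op methods.
        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            delegate.characters(ch, start, length);
        }
    }

    /** Stand-in for the user's handler: remembers the locator it is given. */
    static class RecordingHandler extends DefaultHandler {
        Locator locator;

        @Override
        public void setDocumentLocator(Locator locator) {
            this.locator = locator;
        }
    }

    public static void main(String[] args) {
        RecordingHandler user = new RecordingHandler();
        ContentHandler wrapper = new ForwardingTextHandler(user);

        // LocatorImpl is the JDK's simple mutable Locator implementation.
        LocatorImpl locator = new LocatorImpl();
        locator.setLineNumber(3);
        locator.setColumnNumber(7);
        wrapper.setDocumentLocator(locator);

        System.out.println("locator reached user handler: " + (user.locator != null));
    }
}
```

Removing the setDocumentLocator override from the wrapper reproduces the reported symptom in miniature: the user's handler is left with a null locator.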
> Locator is unset for HTML parser
> --------------------------------
>
> Key: TIKA-820
> URL: https://issues.apache.org/jira/browse/TIKA-820
> Project: Tika
> Issue Type: Bug
> Components: general, parser
> Affects Versions: 1.0
> Reporter: Daniel Bonniot de Ruisselet
> Labels: patch
> Fix For: 1.3
>
> Attachments: text-locator.patch
>
>
> The HtmlParser does not call setDocumentLocator(Locator locator) on the user's content handler.
> Patch and unit test attached.
[jira] [Updated] (TIKA-820) Locator is unset for HTML parser
Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris A. Mattmann updated TIKA-820:
-----------------------------------
- push to 1.3
> Locator is unset for HTML parser
> --------------------------------
>
> Key: TIKA-820
> URL: https://issues.apache.org/jira/browse/TIKA-820
> Project: Tika
> Issue Type: Bug
> Components: general, parser
> Affects Versions: 1.0
> Reporter: Daniel Bonniot de Ruisselet
> Labels: patch
> Fix For: 1.3
>
> Attachments: text-locator.patch
>
>
> The HtmlParser does not call setDocumentLocator(Locator locator) on the user's content handler.
> Patch and unit test attached.
[jira] [Assigned] (TIKA-820) Locator is unset for HTML parser
Posted by "Ken Krugler (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ken Krugler reassigned TIKA-820:
--------------------------------
Assignee: Ken Krugler
> Locator is unset for HTML parser
> --------------------------------
>
> Key: TIKA-820
> URL: https://issues.apache.org/jira/browse/TIKA-820
> Project: Tika
> Issue Type: Bug
> Components: general, parser
> Affects Versions: 1.0
> Reporter: Daniel Bonniot de Ruisselet
> Assignee: Ken Krugler
> Labels: patch
> Fix For: 1.3
>
> Attachments: text-locator.patch
>
>
> The HtmlParser does not call setDocumentLocator(Locator locator) on the user's content handler.
> Patch and unit test attached.