Posted to dev@tika.apache.org by "Markus Jelsma (Created) (JIRA)" <ji...@apache.org> on 2011/12/21 14:49:30 UTC

[jira] [Created] (TIKA-825) Extract rel attr with LinkContentHandler

Extract rel attr with LinkContentHandler
----------------------------------------

                 Key: TIKA-825
                 URL: https://issues.apache.org/jira/browse/TIKA-825
             Project: Tika
          Issue Type: Improvement
          Components: parser
            Reporter: Markus Jelsma
            Priority: Minor


For Nutch we need to extract URLs, but we also need each link's rel attribute so we can check it for the nofollow value. I've patched the code to return this information in the Link object. It's been tested, and I can now read the rel value in Nutch.
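
A rough sketch of the idea, for illustration only. This is not the patch itself and not Tika's actual LinkContentHandler: it is a stand-in built on the JDK's SAX parser alone, it assumes well-formed XHTML input, and the names RelLinkSketch, Link, extract and isNoFollow are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for a link-collecting handler; NOT Tika's LinkContentHandler.
class RelLinkSketch extends DefaultHandler {

    // Hypothetical value object: the link target plus its raw rel attribute.
    static final class Link {
        final String uri;
        final String rel; // empty string when the attribute is absent

        Link(String uri, String rel) {
            this.uri = uri;
            this.rel = rel;
        }

        // rel holds a space-separated token list, so check each token.
        boolean isNoFollow() {
            for (String token : rel.trim().split("\\s+")) {
                if (token.equalsIgnoreCase("nofollow")) {
                    return true;
                }
            }
            return false;
        }
    }

    private final List<Link> links = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // The default SAXParserFactory is not namespace-aware, so match on qName.
        if ("a".equalsIgnoreCase(qName)) {
            String href = atts.getValue("href");
            if (href != null) {
                String rel = atts.getValue("rel");
                links.add(new Link(href, rel == null ? "" : rel));
            }
        }
    }

    List<Link> getLinks() {
        return links;
    }

    // Parses a well-formed XHTML string and returns the anchors found in it.
    static List<Link> extract(String xhtml) throws Exception {
        RelLinkSketch handler = new RelLinkSketch();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xhtml)), handler);
        return handler.getLinks();
    }
}
```

A caller such as Nutch could then skip any link for which isNoFollow() returns true, instead of re-parsing the markup to find the rel value.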

Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Closed] (TIKA-825) Extract rel attr with LinkContentHandler

Posted by "Markus Jelsma (Closed) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma closed TIKA-825.
------------------------------

    Resolution: Duplicate

For some reason this issue was created twice. Closing this copy as a duplicate.
                
> Extract rel attr with LinkContentHandler
> ----------------------------------------
>
>                 Key: TIKA-825
>                 URL: https://issues.apache.org/jira/browse/TIKA-825
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Markus Jelsma
>            Priority: Minor
>
> For Nutch we need to extract URLs, but we also need each link's rel attribute so we can check it for the nofollow value. I've patched the code to return this information in the Link object. It's been tested, and I can now read the rel value in Nutch.
> Thoughts?
