Posted to dev@tika.apache.org by "Nick Burch (Created) (JIRA)" <ji...@apache.org> on 2011/10/18 12:56:10 UTC

[jira] [Created] (TIKA-755) Add getDetector() method to TikaConfig

Add getDetector() method to TikaConfig
--------------------------------------

                 Key: TIKA-755
                 URL: https://issues.apache.org/jira/browse/TIKA-755
             Project: Tika
          Issue Type: Improvement
          Components: config
    Affects Versions: 0.10
            Reporter: Nick Burch
            Assignee: Nick Burch
             Fix For: 1.0


As discussed on the mailing list, we should add a getDetector() method to TikaConfig. This would return a DefaultDetector created with the same classloader as the DefaultParser.

As part of this, we should update the Tika class to get the DefaultDetector from the TikaConfig, rather than creating it internally. We should also switch the Tika class so that it no longer creates its own AutoDetectParser, and instead uses the DefaultParser from TikaConfig.

Discussion is:
http://mail-archives.apache.org/mod_mbox/tika-dev/201110.mbox/%3Calpine.DEB.2.00.1110171330160.7762@urchin.earth.li%3E

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (TIKA-755) Add getDetector() method to TikaConfig

Posted by "Jukka Zitting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129971#comment-13129971 ] 

Jukka Zitting commented on TIKA-755:
------------------------------------

Hmm, I looked at the interaction between Tika and TikaConfig, and actually I now think it's better if we leave the AutoDetectParser instantiation there.

This way the TikaConfig class is responsible for producing composite parsers and detectors based on explicit configuration or the default classloading mechanism, and the Tika facade (or the AutoDetectParser class directly) remains responsible for adding autodetection and other extra features on top of the basic configuration.

So I guess we can resolve this as fixed.
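
The split of responsibilities described above can be illustrated with a small, self-contained sketch. Everything in it is a hypothetical stand-in written for this illustration: DetectorConfigSketch, Config, Facade and the two interfaces are not the real Tika classes, and the hard-wired lambdas only take the place of whatever the real classloading mechanism would discover. The point is just the shape: the config object builds the detector and parser from one ClassLoader, and the facade layers autodetection on top of what the config hands out.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-ins for illustration only; NOT the real Tika API.
final class DetectorConfigSketch {

    /** Minimal detector contract: guess a media type from the leading bytes. */
    interface Detector {
        String detect(byte[] prefix);
    }

    /** Minimal parser contract: turn bytes of a known media type into text. */
    interface Parser {
        String parse(byte[] data, String mediaType);
    }

    /** Plays the role of the config class: builds both components from one ClassLoader. */
    static final class Config {
        private final ClassLoader loader;
        private final Detector detector;
        private final Parser parser;

        Config(ClassLoader loader) {
            this.loader = loader;
            // Assumption: a real implementation would discover services through
            // `loader`; simple lambdas are hard-wired here to keep the sketch small.
            this.detector = prefix ->
                    startsWith(prefix, "%PDF") ? "application/pdf" : "text/plain";
            this.parser = (data, mediaType) -> mediaType + ":" + data.length;
        }

        ClassLoader getClassLoader() { return loader; }
        Detector getDetector() { return detector; }
        Parser getParser() { return parser; }
    }

    /** Plays the role of the facade: adds autodetection on top of the config. */
    static final class Facade {
        private final Detector detector;
        private final Parser parser;

        Facade(Config config) {
            // Both components come from the config rather than being created here.
            this.detector = config.getDetector();
            this.parser = config.getParser();
        }

        String parseToString(byte[] data) {
            // Autodetection step: pick the media type, then delegate to the parser.
            return parser.parse(data, detector.detect(data));
        }
    }

    /** Returns true if `data` begins with the ASCII bytes of `magic`. */
    static boolean startsWith(byte[] data, String magic) {
        byte[] m = magic.getBytes(StandardCharsets.US_ASCII);
        if (data.length < m.length) {
            return false;
        }
        for (int i = 0; i < m.length; i++) {
            if (data[i] != m[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Config config = new Config(ClassLoader.getSystemClassLoader());
        Facade facade = new Facade(config);
        System.out.println(facade.parseToString("hello".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

Because the facade only asks the config for its detector and parser, swapping in a config built from a different ClassLoader changes both components together, which is the consistency the issue description asks for.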
                


[jira] [Commented] (TIKA-755) Add getDetector() method to TikaConfig

Posted by "Nick Burch (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129744#comment-13129744 ] 

Nick Burch commented on TIKA-755:
---------------------------------

In r1185658 I've updated TikaConfig to create a DefaultDetector based on the available MimeTypes and/or ClassLoader. I've also changed the Tika class and AutoDetectParser to both get the Detector from TikaConfig, rather than creating their own DefaultDetector internally.

However, various things broke when I switched the Tika class from using AutoDetectParser to using the Parser from TikaConfig, so I haven't made that change.
                


[jira] [Resolved] (TIKA-755) Add getDetector() method to TikaConfig

Posted by "Nick Burch (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Burch resolved TIKA-755.
-----------------------------

    Resolution: Fixed
    
