Posted to dev@tika.apache.org by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2007/09/21 08:42:50 UTC

[jira] Updated: (TIKA-23) Decouple Parser from ParserConfig

     [ https://issues.apache.org/jira/browse/TIKA-23?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated TIKA-23:
------------------------------

    Attachment: TIKA-23.patch

The attached patch removes the LiusConfig and ParserConfig dependencies from the Parser class. They are replaced by two plain properties, namespace (String) and contents (List<Content>), which ParserFactory injects from the ParserConfig object it uses to instantiate a new Parser.
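
For readers without the patch at hand, the shape of the change can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the class names come from the description above, but every constructor, getter, and setter shown here is an assumption made for the example, and Content is reduced to a hypothetical one-field stand-in.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the real Content class, which carries more state.
class Content {
    private final String name;

    Content(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Hypothetical config holder: only the two values the parser needs are modelled.
class ParserConfig {
    private final String nameSpace;
    private final List<Content> contents;

    ParserConfig(String nameSpace, List<Content> contents) {
        this.nameSpace = nameSpace;
        this.contents = contents;
    }

    String getNameSpace() {
        return nameSpace;
    }

    List<Content> getContents() {
        return contents;
    }
}

// After the change the parser holds plain properties and no config reference.
class Parser {
    private String namespace;
    private List<Content> contents;

    void setNamespace(String namespace) {
        this.namespace = namespace;
    }

    String getNamespace() {
        return namespace;
    }

    void setContents(List<Content> contents) {
        this.contents = contents;
    }

    List<Content> getContents() {
        return contents;
    }
}

// The factory is the only place that still reads the config object.
class ParserFactory {
    static Parser getParser(ParserConfig config) {
        Parser parser = new Parser();
        parser.setNamespace(config.getNameSpace());
        // The same list instance is handed over, so it stays shared with the config.
        parser.setContents(config.getContents());
        return parser;
    }
}
```

In this sketch only ParserFactory knows about ParserConfig, so the Parser class can be restructured without touching the configuration classes.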

Note that the content list is still shared with the ParserConfig instance, so two separate Parser instances can still interfere with each other. I left this unchanged to keep the scope of this patch small, but it is high on my list of things to change in the near future.
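
The interference described above can be reproduced with a small sketch. Everything in it is hypothetical: the Parser here is a minimal stand-in that records parsed values as strings rather than Content objects, and giving each parser its own copy of the list is only one possible remedy, not necessarily the one a later patch will use.

```java
import java.util.ArrayList;
import java.util.List;

class SharedContentsSketch {

    // Hypothetical minimal parser that stores parsed values in its contents list.
    static class Parser {
        private List<String> contents;

        void setContents(List<String> contents) {
            this.contents = contents;
        }

        List<String> getContents() {
            return contents;
        }

        void parse(String value) {
            contents.add(value);
        }
    }

    // Both parsers receive the same list instance, as in the current patch.
    static boolean sharedListInterferes() {
        List<String> configContents = new ArrayList<>();
        Parser a = new Parser();
        Parser b = new Parser();
        a.setContents(configContents);
        b.setContents(configContents);
        a.parse("from A");
        // Parser b sees the value that parser a stored.
        return b.getContents().contains("from A");
    }

    // Each parser receives its own copy of the list (one possible remedy).
    static boolean copiedListInterferes() {
        List<String> configContents = new ArrayList<>();
        Parser a = new Parser();
        Parser b = new Parser();
        a.setContents(new ArrayList<>(configContents));
        b.setContents(new ArrayList<>(configContents));
        a.parse("from A");
        // Parser b has a separate list, so it does not see the value.
        return b.getContents().contains("from A");
    }
}
```

Note that such a copy is shallow: it separates the lists but not the elements, so if the real Content objects are mutable and hold parsed values, they would need to be copied as well.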

This change should have no functional effect, and all the test cases pass after applying it.

> Decouple Parser from ParserConfig
> ---------------------------------
>
>                 Key: TIKA-23
>                 URL: https://issues.apache.org/jira/browse/TIKA-23
>             Project: Tika
>          Issue Type: Improvement
>          Components: general
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Minor
>         Attachments: TIKA-23.patch
>
>
> Instead of starting from scratch with the new Parser interface design I proposed on the mailing list, I'd like to work from the current codebase, iteratively refactoring it.
> The first problem I see with the current Parser design is the tight coupling with the ParserConfig (and even LiusConfig) classes. Config objects are used both by ParserFactory when creating the parser instances and by the parser objects when parsing content. In fact, the parser classes even use the ParserConfig instances as containers of parsed content.
> This coupling makes it quite difficult to apply any structural changes to the parser classes, so as a first step I'd like to propose a change that breaks this coupling.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.