Posted to dev@lucene.apache.org by "Alexandre Rafalovitch (JIRA)" <ji...@apache.org> on 2016/10/05 01:23:20 UTC

[jira] [Created] (SOLR-9601) DIH: Radically simplify Tika example to only show relevant configuration

Alexandre Rafalovitch created SOLR-9601:
-------------------------------------------

             Summary: DIH: Radically simplify Tika example to only show relevant configuration
                 Key: SOLR-9601
                 URL: https://issues.apache.org/jira/browse/SOLR-9601
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
          Components: contrib - DataImportHandler, contrib - Solr Cell (Tika extraction)
    Affects Versions: 6.x, master (7.0)
            Reporter: Alexandre Rafalovitch
            Assignee: Alexandre Rafalovitch


Solr DIH examples are legacy examples that show how DIH works. However, they include full configurations that may obscure the teaching points. Full configurations are no longer needed here, as we have 3 full-blown examples in the configsets.

Specifically for Tika, the field type definitions were at some point simplified so that fewer support files are needed in the configuration directory. This, however, means that we now have field definitions with the same names as those in other examples, but with different definitions.

Importantly, Tika does not use most (any?) of those modified definitions. They are there just for completeness. Similarly, the solrconfig.xml includes the extract handler even though we are demonstrating a different path of using Tika. Somebody grepping through the config files may get confused about which configuration aspects contribute to which behavior.

I am planning to significantly simplify the configuration and schema of the Tika example to **only** show the DIH Tika extraction path. It will end up as a very short and focused example.
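
For illustration only, a DIH configuration limited to the Tika extraction path could look roughly like the sketch below. It is not the final example from this issue: the baseDir path, the file name pattern, and the target field names (id, author, title, text) are assumptions chosen for the sketch, and the schema would need matching fields.

```xml
<!-- Sketch of a DIH data config that only exercises the Tika path.
     Paths and field names are assumed for illustration. -->
<dataConfig>
  <!-- Tika reads binary files, so a binary data source is used -->
  <dataSource type="BinFileDataSource"/>
  <document>
    <!-- Outer entity lists the files; it is not itself a document -->
    <entity name="file" processor="FileListEntityProcessor" dataSource="null"
            baseDir="/path/to/docs" fileName=".*pdf" rootEntity="false">
      <field column="file" name="id"/>
      <!-- Inner entity hands each listed file to Tika -->
      <entity name="pdf" processor="TikaEntityProcessor"
              url="${file.fileAbsolutePath}" format="text">
        <field column="Author" name="author" meta="true"/>
        <field column="title" name="title" meta="true"/>
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

With a config along these lines, the only Tika-related piece the solrconfig.xml would need is a DataImportHandler request handler pointing at this file; the extract handler would play no part.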



