Posted to dev@lucene.apache.org by "James Dyer (Updated) (JIRA)" <ji...@apache.org> on 2011/12/13 20:54:31 UTC

[jira] [Updated] (SOLR-2549) DIH LineEntityProcessor support for delimited & fixed-width files

     [ https://issues.apache.org/jira/browse/SOLR-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Dyer updated SOLR-2549:
-----------------------------

    Attachment: SOLR-2549.patch

A long time ago someone on the users' list asked for better support for delimited files.  This version of the patch supports most of the same features as the CSVRequestHandler, using the same CSV parser and most of the same parameter names.

The reason to use DIH instead of CSVRequestHandler would be cases where the flat file needs to be joined to other entities, where the data needs to be cached, and/or where transformers need to be applied.

This patch also retains the same support for fixed-width files.

The unit tests have been enhanced to test these new possibilities.
                
> DIH LineEntityProcessor support for delimited & fixed-width files
> -----------------------------------------------------------------
>
>                 Key: SOLR-2549
>                 URL: https://issues.apache.org/jira/browse/SOLR-2549
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>    Affects Versions: 4.0
>            Reporter: James Dyer
>            Priority: Minor
>         Attachments: SOLR-2549.patch, SOLR-2549.patch, SOLR-2549.patch
>
>
> Provides support for fixed-width and delimited files without needing to write a Transformer.
> The following XML properties are supported with this version of LineEntityProcessor:
> For fixed-width files:
>  - colDef[#]
> For delimited files:
>  - fieldDelimiterRegex
>  - firstLineHasFieldnames
>  - delimitedFieldNames
>  - delimitedFieldTypes
> These properties are described in the API documentation.  See patch.
> When combined with the cache improvements from SOLR-2382, this allows you to join a flat file entity with other entities (SQL, etc.).
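
To make the two file styles listed above concrete, here is a minimal Java sketch of the general technique: splitting a delimited line with a regex and pairing the values with field names, and slicing a fixed-width line by column offsets. This is not the code from SOLR-2549.patch. The class name, method names, and the column-range convention (0-based start, exclusive end) are assumptions made for illustration; the actual syntax of colDef[#], fieldDelimiterRegex and delimitedFieldNames is defined in the patch's API documentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative sketch only: hypothetical names, not the LineEntityProcessor code from the patch.
final class FlatFileLineSketch {

    // Delimited: split the line on a regex and pair each value with a field name by position.
    static Map<String, String> parseDelimited(String line, String delimiterRegex, String[] fieldNames) {
        // A limit of -1 keeps trailing empty columns instead of dropping them.
        String[] values = Pattern.compile(delimiterRegex).split(line, -1);
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < fieldNames.length && i < values.length; i++) {
            row.put(fieldNames[i], values[i]);
        }
        return row;
    }

    // Fixed width: ranges[i] = {start, end}, 0-based, end exclusive (an assumed convention).
    static Map<String, String> parseFixedWidth(String line, String[] fieldNames, int[][] ranges) {
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < fieldNames.length; i++) {
            // Clamp to the line length so short lines yield empty values instead of exceptions.
            int start = Math.min(ranges[i][0], line.length());
            int end = Math.min(ranges[i][1], line.length());
            row.put(fieldNames[i], line.substring(start, end).trim());
        }
        return row;
    }

    public static void main(String[] args) {
        String[] names = {"id", "name", "version"};

        // Pipe-delimited line; the pipe is escaped because the delimiter is a regex.
        System.out.println(parseDelimited("1|Solr|4.0", "\\|", names));

        // Fixed-width line built with padding: 3 + 10 + 3 characters.
        String fixed = String.format("%-3s%-10s%-3s", "001", "Solr", "4.0");
        System.out.println(parseFixedWidth(fixed, names, new int[][] {{0, 3}, {3, 13}, {13, 16}}));
    }
}
```

Either method returns one map per line, which is roughly the shape of row a DIH entity hands to transformers or to a join with another entity.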
