Posted to solr-user@lucene.apache.org by Noble Paul നോബിള്‍ नोब्ळ् <no...@corp.aol.com> on 2009/06/18 19:20:35 UTC

Re: PlainTextEntityProcessor not putting any text into a field in index

Can you log the row and see what the plainText field actually contains
(using LogTransformer)?
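
A minimal sketch of what that could look like, assuming the inner "pt"
entity from your config. The logTemplate wording and the "info" level are
only illustrative choices; LogTransformer goes last in the chain so it sees
the row after the other transformers have run:

```xml
<entity
    name="pt"
    processor="PlainTextEntityProcessor"
    url="${f.fileAbsolutePath}"
    transformer="RegexTransformer,TemplateTransformer,LogTransformer"
    logTemplate="plainText for ${f.fileAbsolutePath}: ${pt.plainText}"
    logLevel="info"
    >
  <field column="plainText" name="text"/>
  <field column="datasource" template="textfiles" />
</entity>
```

If the logged value is empty, the text is being lost before the field
mapping; if it is populated, the problem is more likely on the schema side.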

On Thu, Jun 18, 2009 at 8:54 PM, Jay Hill<ja...@gmail.com> wrote:
> I'm having some trouble getting the PlainTextEntityProcessor to populate a
> field in an index. I'm using the TemplateTransformer to fill 2 fields, and
> have a timestamp field in schema.xml, and these fields make it into the
> index. Only the plainText data is missing. Here is my configuration:
>
> <dataConfig>
>    <dataSource type="FileDataSource" encoding="UTF-8" />
>    <document>
>        <entity
>            name="f"
>            processor="FileListEntityProcessor"
>            baseDir="/Users/jayhill/test/dir"
>            fileName=".*txt"
>            recursive="true"
>            rootEntity="true"
>            >
>
>            <entity
>                name="pt"
>                processor="PlainTextEntityProcessor"
>                url="${f.fileAbsolutePath}"
>                transformer="RegexTransformer,TemplateTransformer"
>                >
>                <field column="plainText" name="text"/>
>                <field column="datasource" template="textfiles" />
>            </entity>
>
>        </entity>
>    </document>
> </dataConfig>
>
> I've tried adding "plainText" as a field in schema.xml, but that didn't work
> either.
>
> When I look at what the PlainTextEntityProcessor class is doing I see that
> it has correctly parsed the file and has the text in a StringWriter:
>    row.put(PLAIN_TEXT, sw.toString());
> I just don't know how to get that text into a field in the index.
>
> Any pointers appreciated.
>
> -Jay
>



-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com