Posted to solr-user@lucene.apache.org by Noble Paul നോബിള് नोब्ळ् <no...@corp.aol.com> on 2009/06/18 19:20:35 UTC
Re: PlainTextEntityProcessor not putting any text into a field in index
Can you log the row and check what the plainText field actually contains
(using LogTransformer)?
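Something like the following, as a minimal sketch based on the inner entity from your config. The logTemplate wording and the "info" level are only illustrative choices, and I am assuming the row value can be referenced as ${pt.plainText}:

```xml
<!-- Inner entity from the quoted config, with LogTransformer appended
     so that each row is logged after the other transformers have run. -->
<entity
    name="pt"
    processor="PlainTextEntityProcessor"
    url="${f.fileAbsolutePath}"
    transformer="RegexTransformer,TemplateTransformer,LogTransformer"
    logTemplate="plainText for ${f.fileAbsolutePath}: ${pt.plainText}"
    logLevel="info"
    >
  <field column="plainText" name="text"/>
  <field column="datasource" template="textfiles" />
</entity>
```

If the logged value is empty, the problem is upstream of the field mapping; if the text shows up in the log, the mapping to the "text" field in schema.xml is the place to look.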
On Thu, Jun 18, 2009 at 8:54 PM, Jay Hill<ja...@gmail.com> wrote:
> I'm having some trouble getting the PlainTextEntityProcessor to populate a
> field in an index. I'm using the TemplateTransformer to fill 2 fields, and
> have a timestamp field in schema.xml, and these fields make it into the
> index. Only the plainText data is missing. Here is my configuration:
>
> <dataConfig>
>   <dataSource type="FileDataSource" encoding="UTF-8" />
>   <document>
>     <entity
>         name="f"
>         processor="FileListEntityProcessor"
>         baseDir="/Users/jayhill/test/dir"
>         fileName=".*txt"
>         recursive="true"
>         rootEntity="true"
>         >
>
>       <entity
>           name="pt"
>           processor="PlainTextEntityProcessor"
>           url="${f.fileAbsolutePath}"
>           transformer="RegexTransformer,TemplateTransformer"
>           >
>         <field column="plainText" name="text"/>
>         <field column="datasource" template="textfiles" />
>       </entity>
>
>     </entity>
>   </document>
> </dataConfig>
>
> I've tried adding "plainText" as a field in schema.xml, but that didn't work
> either.
>
> When I look at what the PlainTextEntityProcessor class is doing I see that
> it has correctly parsed the file and has the text in a StringWriter:
> row.put(PLAIN_TEXT, sw.toString());
> I just don't know how to get that text into a field in the index.
>
> Any pointers appreciated.
>
> -Jay
>
--
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com