Posted to users@solr.apache.org by Scott Derrick <sc...@tnstaafl.net> on 2021/10/29 17:22:08 UTC

XPathEntityProcessor not indexing all instances of nodes defined in config?

I have the following in my config file:

        <entity name="meta"
           dataSource="myfilereader"
           processor="XPathEntityProcessor"
           url="${jcurrent.fileAbsolutePath}"
           stream="false"
           forEach="/TEI/teiHeader/fileDesc"
           xsl="xslt/meta.xsl"
           >
           <field column="note" xpath="/TEI/teiHeader//note" flatten="true" />
           <field column="annotator" xpath="/TEI/teiHeader//annotator" />
           <field column="scribe" xpath="/TEI/teiHeader//scribe" />
           <field column="recipient" xpath="/TEI/teiHeader//recipient" />
        </entity>


The XML file contains (snippet):

<notesStmt>
      <note type="manuscript_description">Handwritten by Mary Baker Eddy.</note>
      <note type="editorial">This document is a draft of Eddy's poem, "<title level="a">Woman's Rights</title>."  See page 21 of Eddy's <title level="m">Poems</title> to read the published version.</note>
</notesStmt>

The XPath "/TEI/teiHeader//note" matches several "note" nodes, but only the last one is stored and searchable. The same thing happens wherever an XPath matches multiple nodes: only the last node is stored.

Is there something wrong with my config file?

thanks,

Scott

Re: XPathEntityProcessor not indexing all instances of nodes defined in config?

Posted by Scott Derrick <sc...@tnstaafl.net>.
I figured it out!

I needed to make the note field multivalued in the schema, like so:

   <field name="note" type="text_general" multiValued="true" indexed="true" stored="true"/>

Now Solr stores the different node values in an array, like so:

"note":["Handwritten by Mary Baker Eddy.",
        "This document is a draft of Eddy's poem, \" \n Woman's Rights \n .\" See page 21 of Eddy's \n Poems \n to read the published version.",
        "A10004 is a draft of Eddy's poem \" Woman's Rights ,\" published on page 21 of Poems .",
        "Jesus Christ"],
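
Since the same symptom showed up for every XPath that matches more than one node, the other columns in the entity (annotator, scribe, recipient) would presumably need the same treatment. A minimal sketch follows. It assumes those fields use the same text_general type and the same indexed/stored settings as the note field, which the thread does not actually state:

```xml
<!-- Sketch only: type and indexed/stored values are assumed to mirror the
     "note" field above; adjust them to match the real schema. -->
<field name="annotator" type="text_general" multiValued="true" indexed="true" stored="true"/>
<field name="scribe"    type="text_general" multiValued="true" indexed="true" stored="true"/>
<field name="recipient" type="text_general" multiValued="true" indexed="true" stored="true"/>
```

After a schema change like this, the documents generally have to be reindexed before the additional values appear.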

Scott
