Posted to solr-user@lucene.apache.org by Matthias Fischer <Ma...@doubleslash.de> on 2015/10/16 13:40:04 UTC

Nested entities not imported / do not show up in search?

Hello everybody,

I am trying to import from an Oracle DB 11g2 via DIH using SOLR 5.3.1. 
In my relational DB there are company addresses (table tb_firmen_adressen) and branches (table tb_branchen). They have an n:m relationship using the join table tb_firmen_branchen. 
Now I would like to find companies by their name and in each company result I would like to see the associated branches.
However I only get the companies without the nested entries. As a newbie I'd highly appreciate some help, as there are no errors or warnings in the log file and I could not find any helpful hints in the documentation or elsewhere on the internet concerning my problem.

Here is my data config:

    <dataConfig>
        <dataSource name="jdbc" driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@//xxxxx.xxxxxxx:1521/pde11" user="myuser" password="mysecret"/>
        <document>
            <entity name="firma" pk="fa.EBI_NR" query="
                SELECT fa.EBI_NR, fa.NAMENSZEILE_1, fa.NAMENSZEILE_2, fa.NAMENSZEILE_3
                FROM tb_firmen_adressen fa
                WHERE rownum &lt; 10000
            ">

                <field name="firma_ebi_nr"        column="EBI_NR" />
                <field name="firma_namenszeile_1" column="NAMENSZEILE_1" />
                <field name="firma_namenszeile_2" column="NAMENSZEILE_2" />
                <field name="firma_namenszeile_3" column="NAMENSZEILE_3" />

                <entity name="firma_branche" child="true" query="
                    SELECT b.EBC_CODE AS EBC_CODE
                    FROM
                        tb_firmen_branchen fb
                            JOIN tb_branchen b ON fb.EBC_CODE = b.EBC_CODE
                    WHERE fb.EBI_NR='${firma.firma_ebi_nr}'
                ">
                    <field name="branche_ebc_code" column="EBC_CODE" />
                    <!-- I would like to add more fields later here once I get it to work -->
                </entity>

            </entity>
        </document>
    </dataConfig>


And here are the relevant lines from my schema file:

    <uniqueKey>firma_ebi_nr</uniqueKey>

    <field name="firma_ebi_nr"        type="long"         required="true" indexed="true" stored="true"/>
    <field name="firma_namenszeile_1" type="text_general" indexed="true" stored="true"/>
    <field name="firma_namenszeile_2" type="text_general" indexed="true" stored="true"/>
    <field name="firma_namenszeile_3" type="text_general" indexed="true" stored="true"/>
    <field name="branche_ebc_code"    type="long"         indexed="true" stored="true"/>
     


After restarting solr and calling http://localhost:8983/solr/jcg/dataimport?command=full-import I get "Indexing completed. Added/Updated: 9999 documents. Deleted 0 documents."
So basically it seems to work, but my search results look like this:

{
  "responseHeader":{
    "status":0,
    "QTime":71,
    "params":{
      "q":"Der Bunte",
      "defType":"edismax",
      "indent":"true",
      "qf":"firma_namenszeile_1",
      "wt":"json"}},
  "response":{"numFound":85,"start":0,"docs":[
      {
        "firma_ebi_nr":123123123,
        "firma_namenszeile_1":"Der Bunte Laden",
        "_version_":1515185579421073408},
      {
     ...
}

Why are there no company branches inside the company records? What's wrong with my configuration? Any help is appreciated!

Kind regards
Matthias Fischer


Re: Nested entities not imported / do not show up in search?

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Hello Matthias,
I confirm that uniqueKey handling is a little bit odd right now. I hope to
improve it in the future. For now, you need to use one of the workarounds
discussed.

On Fri, Oct 16, 2015 at 8:11 AM, Matthias Fischer <
Matthias.Fischer@doubleslash.de> wrote:

> Thank you, Andrea, for answering so quickly.
>
> However I got further errors. I also had to change
> "<uniqueKey>firma_ebi_nr</uniqueKey>" to "<uniqueKey>id</uniqueKey>". But
> it still does not work properly. It seems that an id is auto generated for
> the company documents but not for the nested ones (the business branches).
> Any ideas how to fix this?
>
> 2015-10-16 12:49:29.650 WARN  (Thread-17) [   x:jcg] o.a.s.h.d.SolrWriter Error creating document :
> SolrInputDocument(
>     fields: [firma_ebi_nr=317709682, firma_namenszeile_1=Example Company, id=3c7f7421-9d51-4056-a2a0-eebab87a546a, _version_=1515192078460518400, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a],
>     children: [
>            SolrInputDocument(fields: [branche_ebc_code=7, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
>            SolrInputDocument(fields: [branche_ebc_code=47000, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
>            SolrInputDocument(fields: [branche_ebc_code=47700, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
>            SolrInputDocument(fields: [branche_ebc_code=47790, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
>            SolrInputDocument(fields: [branche_ebc_code=47791, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a])])
> org.apache.solr.common.SolrException: [doc=null] missing required field: id
>         at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:198)
>         at org.apache.solr.update.AddUpdateCommand$1.next(AddUpdateCommand.java:191)
>         at org.apache.solr.update.AddUpdateCommand$1.next(AddUpdateCommand.java:166)
>         at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:259)
>         at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:413)
>         at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1316)
>         at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:235)
>         at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:163)
>         at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>         at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>         at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
>         at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
>         at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
>         at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
>         at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>         at org.apache.solr.update.processor.AbstractDefaultValueUpdateProcessorFactory$DefaultValueUpdateProcessor.processAdd(AbstractDefaultValueUpdateProcessorFactory.java:94)
>         at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:71)
>         at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:259)
>         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:524)
>         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
>         at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
>         at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
>         at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
>         at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
>         at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
>
> Kind regards,
> Matthias
>
> -----Original Message-----
> From: Andrea Gazzarini [mailto:a.gazzarini@gmail.com]
> Sent: Friday, 16 October 2015 13:59
> To: solr-user@lucene.apache.org
> Subject: Re: Nested entities not imported / do not show up in search?
>
> Hi Matthias,
> you should use <entity-name>.<column-name> in your expressions. So for
> example, here
>
> WHERE fb.EBI_NR='${firma.firma_ebi_nr}'
>
> should be
>
> WHERE fb.EBI_NR='${firma.EBI_NR}'
>
> Best,
> Andrea
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mk...@griddynamics.com>

Re: Nested entities not imported / do not show up in search?

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
On Mon, Oct 19, 2015 at 2:48 AM, Matthias Fischer <
Matthias.Fischer@doubleslash.de> wrote:

> Ok, thanks for your advice so far. I can import companies with their
> nested entities (business branches) now. But I wonder whether there is a
> way to query for company name patterns and get the business branches nested
> inside the respective companies.


Please check the [child] document transformer at
https://cwiki.apache.org/confluence/display/solr/Transforming+Result+Documents
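
A minimal sketch of what such a request could look like, built in Python. The [child] transformer needs a parentFilter that matches all parent documents; the schema in this thread has no such marker field, so the `type:company` filter below is an assumption (a hypothetical field you would have to add to the company documents at import time). The helper names are illustrative only.

```python
from urllib.parse import urlencode

# Core URL as used elsewhere in this thread.
SOLR_SELECT = "http://localhost:8983/solr/jcg/select"


def child_transformer_params(name_prefix, child_limit=10):
    """Return query parameters asking Solr to nest child docs inside each parent hit."""
    return {
        # Prefix search on the company name, as in the queries above.
        "q": f"firma_namenszeile_1:{name_prefix}*",
        # Assumption: parent (company) documents carry a hypothetical
        # field type=company; parentFilter must match every parent document.
        "fl": f"*,[child parentFilter=type:company limit={child_limit}]",
        "wt": "xml",
        "indent": "true",
    }


def child_transformer_url(name_prefix, child_limit=10):
    """Return the full, URL-encoded select request."""
    return SOLR_SELECT + "?" + urlencode(child_transformer_params(name_prefix, child_limit))


if __name__ == "__main__":
    print(child_transformer_url("Must"))
```

The prefix is inserted into the query as-is, so user input would still need Solr query escaping before being passed in.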


> Using the following query I only get the companies without their nested
> entities:
>
> http://localhost:8983/solr/jcg/select?q=firma_namenszeile_1%3AMu*&wt=xml&indent=true
>
> I can use the firma_ebi_nr (the company id) and get the associated
> branches by issuing the following query:
>
> http://localhost:8983/solr/jcg/select?q={!child%20of=%22firma_ebi_nr:123123%22}firma_ebi_nr:123123
> This results in a flat list of associated business branches. However I
> would like to search a company by name and in the result I would like to
> see all associated business branches nested inside the respective company.
> Is this possible or do I need to issue the second query above for each
> company search result in order to get the nested entities?
>
> Example of what I would like to achieve:
>
>
> http://localhost:8983/solr/jcg/select?q=firma_namenszeile_1%3AMust*&wt=xml&indent=true
>
> <response>
>     <lst name="responseHeader">
>         <int name="status">0</int>
>         <int name="QTime">1</int>
>         <lst name="params">
>             <str name="q">firma_namenszeile_1:Must*</str>
>             <str name="indent">true</str>
>             <str name="wt">xml</str>
>         </lst>
>     </lst>
>     <result name="response" numFound="2" start="0">
>     <doc>
>         <long name="firma_ebi_nr">123123</long>
>         <str name="firma_namenszeile_1">Musterfirma</str>
>         <str name="id">ac8d5627-b17a-bbbb-8926-8d5a80680ee4</str>
>         <long name="_version_">1515205299087081472</long>
>
>         <!-- nested branches -->
>         <doc>
>             <long name="branche_ebc_code">6</long>
>         </doc>
>         <doc>
>             <long name="branche_ebc_code">43000</long>
>         </doc>
>         <doc>
>             <long name="branche_ebc_code">43900</long>
>         </doc>
>
>      </doc>
>      ....
> </response>
>
>
> Is this possible? Or maybe there is a better way than nested entities? An
> alternative I could think of is to join companies and branches in the JDBC
> import. But this would result in duplicate companies in the search result
> (one for each associated branch). My goal is to have a suggest field where
> the user can type a company name pattern and get a list of matching
> companies including the associated branches. Any suggestions?
>
> Kind regards,
> Matthias
>
> -----Original Message-----
> From: Andrea Gazzarini [mailto:a.gazzarini@gmail.com]
> Sent: Friday, 16 October 2015 17:24
> To: solr-user@lucene.apache.org
> Subject: Re: Nested entities not imported / do not show up in search?
>
> Hi Matthias,
> I guess the company.id field is not unique, so you need a "compound"
> uniqueKey on Solr, which is not strictly possible. As a consequence, the
> (company) UUID is probably created before the index phase by an
> UpdateRequestProcessor [1], so you should check your solrconfig.xml and, if
> I'm right, check whether the same strategy could be used for the nested
> entities.
>
> Andrea
>
> [1]
>
> http://lucene.apache.org/solr/5_2_1/solr-core/org/apache/solr/update/processor/UUIDUpdateProcessorFactory.html
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mk...@griddynamics.com>

AW: AW: Nested entities not imported / do not show up in search?

Posted by Matthias Fischer <Ma...@doubleslash.de>.
Thanks, Andrea, your answer does make sense! Obviously as a SOLR newbie I am still thinking too much in terms of traditional databases ;-)

Kind regards
Matthias

-----Original Message-----
From: Andrea Gazzarini [mailto:a.gazzarini@gmail.com]
Sent: Monday, 19 October 2015 12:05
To: solr-user@lucene.apache.org
Subject: Re: AW: Nested entities not imported / do not show up in search?

Most probably my answer makes no sense because I don't know the overall context, but why don't you import branches and companies as flat documents with a "type" attribute ("company" or "branch") and an "owner" field that is populated only for branches, holding the company id? Then you could autocomplete on the company name (fq=type:"company"). Once a company is selected, it would just be a matter of another query with two fq parameters: type:"branch" and owner:<selected company id>.

Andrea
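
The two-step lookup described above could be sketched roughly as follows. This assumes the flat documents really are indexed with the "type" and "owner" fields suggested in this message (neither exists in the schema posted earlier), and the function names are purely illustrative.

```python
from urllib.parse import urlencode

# Core URL as used elsewhere in this thread.
SOLR_SELECT = "http://localhost:8983/solr/jcg/select"


def company_suggest_url(name_prefix):
    """Step 1: autocomplete on the company name, restricted to company documents."""
    params = [
        ("q", f"firma_namenszeile_1:{name_prefix}*"),
        ("fq", "type:company"),  # assumed "type" field from the suggestion above
        ("wt", "json"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


def branches_url(company_id):
    """Step 2: fetch the branch documents owned by the selected company."""
    params = [
        ("q", "*:*"),
        ("fq", "type:branch"),           # assumed "type" field
        ("fq", f"owner:{company_id}"),   # assumed "owner" field holding the company id
        ("wt", "json"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    print(company_suggest_url("Must"))
    print(branches_url(123123))
```

This costs one extra request per selected company, but keeps both document types flat and avoids the child-document uniqueKey issue entirely.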

Re: AW: Nested entities not imported / do not show up in search?

Posted by Andrea Gazzarini <a....@gmail.com>.
My answer may miss the mark because I don't know the overall context, but
why don't you import branches and companies as flat documents, each with a
"type" attribute ("company" or "branch") and an "owner" field that is
populated only for branches and holds the company id? Then you could
autocomplete on the company name (fq=type:"company"). Once a company is
selected, it would just be a matter of another query with two fq clauses:
type:"branch" and owner:<selected company id>
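The flat layout described above could be imported with two independent root entities in the DIH data config. This is only a sketch under stated assumptions: the `type` and `owner` fields would have to be added to the schema, the uniqueKey is assumed to be a string field named `id`, and the `company_` / `branch_` id prefixes are illustrative choices, not something taken from this thread. Table and column names are the ones from the original post.

```xml
<document>
    <!-- One flat document per company; TYPE is a constant marker column -->
    <entity name="firma" query="
        SELECT 'company_' || fa.EBI_NR AS ID, 'company' AS TYPE,
               fa.EBI_NR, fa.NAMENSZEILE_1
        FROM tb_firmen_adressen fa
    ">
        <field name="id"                  column="ID" />
        <field name="type"                column="TYPE" />
        <field name="firma_ebi_nr"        column="EBI_NR" />
        <field name="firma_namenszeile_1" column="NAMENSZEILE_1" />
    </entity>

    <!-- One flat document per company/branch pair; OWNER holds the company id -->
    <entity name="firma_branche" query="
        SELECT 'branch_' || fb.EBI_NR || '_' || b.EBC_CODE AS ID, 'branch' AS TYPE,
               fb.EBI_NR AS OWNER, b.EBC_CODE
        FROM tb_firmen_branchen fb
            JOIN tb_branchen b ON fb.EBC_CODE = b.EBC_CODE
    ">
        <field name="id"               column="ID" />
        <field name="type"             column="TYPE" />
        <field name="owner"            column="OWNER" />
        <field name="branche_ebc_code" column="EBC_CODE" />
    </entity>
</document>
```

Because both entities sit directly under `<document>`, neither depends on the other, so no `${...}` variable is needed; the link between a branch and its company lives only in the `owner` value.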

Andrea
On 19 Oct 2015 11:48, "Matthias Fischer" <Ma...@doubleslash.de>
wrote:

> Ok, thanks for your advice so far. I can import companies with their
> nested entities (business branches) now. But I wonder whether there is a
> way to query for company name patterns and get the business branches nested
> inside the respective companies. Using the following query I only get the
> companies without their nested entities:
>
> http://localhost:8983/solr/jcg/select?q=firma_namenszeile_1%3AMu*&wt=xml&indent=true
>
> I can use the firma_ebi_nr (the company id) and get the associated
> branches by issuing the following query:
>
> http://localhost:8983/solr/jcg/select?q={!child%20of=%22firma_ebi_nr:123123%22}firma_ebi_nr:123123
> This results in a flat list of associated business branches. However I
> would like to search a company by name and in the result I would like to
> see all associated business branches nested inside the respective company.
> Is this possible or do I need to issue the second query above for each
> company search result in order to get the nested entities?
>
> Example of what I would like to achieve:
>
>
> http://localhost:8983/solr/jcg/select?q=firma_namenszeile_1%3AMust*&wt=xml&indent=true
>
> <response>
>     <lst name="responseHeader">
>         <int name="status">0</int>
>         <int name="QTime">1</int>
>         <lst name="params">
>             <str name="q">firma_namenszeile_1:Must*</str>
>             <str name="indent">true</str>
>             <str name="wt">xml</str>
>         </lst>
>     </lst>
>     <result name="response" numFound="2" start="0">
>     <doc>
>         <long name="firma_ebi_nr">123123</long>
>         <str name="firma_namenszeile_1">Musterfirma</str>
>         <str name="id">ac8d5627-b17a-bbbb-8926-8d5a80680ee4</str>
>         <long name="_version_">1515205299087081472</long>
>
>         <!-- nested branches -->
>         <doc>
>             <long name="branche_ebc_code">6</long>
>         </doc>
>         <doc>
>             <long name="branche_ebc_code">43000</long>
>         </doc>
>         <doc>
>             <long name="branche_ebc_code">43900</long>
>         </doc>
>
>      </doc>
>      ....
> </response>
>
>
> Is this possible? Or maybe there is a better way than nested entities? An
> alternative I could think of is to join companies and branches in the JDBC
> import. But this would result in duplicate companies in the search result
> (one for each associated branch). My goal is to have a suggest field where
> the user can type a company name pattern and get a list of matching
> companies including the associated branches. Any suggestions?
>
> Kind regards,
> Matthias
>

AW: Nested entities not imported / do not show up in search?

Posted by Matthias Fischer <Ma...@doubleslash.de>.
Ok, thanks for your advice so far. I can import companies with their nested entities (business branches) now. But I wonder whether there is a way to query for company name patterns and get the business branches nested inside the respective companies. Using the following query I only get the companies without their nested entities:
http://localhost:8983/solr/jcg/select?q=firma_namenszeile_1%3AMu*&wt=xml&indent=true

I can use the firma_ebi_nr (the company id) and get the associated branches by issuing the following query:
http://localhost:8983/solr/jcg/select?q={!child%20of=%22firma_ebi_nr:123123%22}firma_ebi_nr:123123
This results in a flat list of associated business branches. However I would like to search a company by name and in the result I would like to see all associated business branches nested inside the respective company.
Is this possible or do I need to issue the second query above for each company search result in order to get the nested entities?

Example of what I would like to achieve:

http://localhost:8983/solr/jcg/select?q=firma_namenszeile_1%3AMust*&wt=xml&indent=true

<response>
    <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">1</int>
        <lst name="params">
            <str name="q">firma_namenszeile_1:Must*</str>
            <str name="indent">true</str>
            <str name="wt">xml</str>
        </lst>
    </lst>
    <result name="response" numFound="2" start="0">
    <doc>
        <long name="firma_ebi_nr">123123</long>
        <str name="firma_namenszeile_1">Musterfirma</str>
        <str name="id">ac8d5627-b17a-bbbb-8926-8d5a80680ee4</str>
        <long name="_version_">1515205299087081472</long>

        <!-- nested branches -->
        <doc>
            <long name="branche_ebc_code">6</long>
        </doc>
        <doc>
            <long name="branche_ebc_code">43000</long>
        </doc>
        <doc>
            <long name="branche_ebc_code">43900</long>
        </doc>

     </doc>
     ....
</response>


Is this possible? Or maybe there is a better way than nested entities? An alternative I could think of is to join companies and branches in the JDBC import. But this would result in duplicate companies in the search result (one for each associated branch). My goal is to have a suggest field where the user can type a company name pattern and get a list of matching companies including the associated branches. Any suggestions?

Kind regards,
Matthias
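
The nested result asked for above is what Solr's `[child]` document transformer is meant to produce: it attaches each returned parent's child documents to that parent in the response. The fragment below is only a sketch for solrconfig.xml under assumptions not confirmed in this thread: the handler name `/companies` is made up, and `type:company` assumes a hypothetical `type` field that is set on every company document and on no branch document, because `parentFilter` has to match all parent documents in the index.

```xml
<!-- Hypothetical search handler: returns companies with their branches nested -->
<requestHandler name="/companies" class="solr.SearchHandler">
    <lst name="defaults">
        <!-- search company names by default -->
        <str name="df">firma_namenszeile_1</str>
        <!-- parentFilter must match ALL parent (company) documents;
             "type" is an assumed field, not part of the schema shown in this thread -->
        <str name="fl">*,[child parentFilter=type:company limit=100]</str>
    </lst>
</requestHandler>
```

The same `fl` value can also be passed directly on a /select request instead of being configured as a default.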

-----Original Message-----
From: Andrea Gazzarini [mailto:a.gazzarini@gmail.com]
Sent: Friday, 16 October 2015 17:24
To: solr-user@lucene.apache.org
Subject: Re: Nested entities not imported / do not show up in search?

Hi Matthias,
I guess the company.id field is not unique, so you would need a "compound"
uniqueKey in Solr, which is not strictly possible. As a consequence, the
(company) UUID is probably created before the index phase by an UpdateRequestProcessor [1], so you should check your solrconfig.xml and, if I'm right, check whether the same strategy could be used for the nested entities.

Andrea

[1]
http://lucene.apache.org/solr/5_2_1/solr-core/org/apache/solr/update/processor/UUIDUpdateProcessorFactory.html

2015-10-16 17:11 GMT+02:00 Matthias Fischer <Matthias.Fischer@doubleslash.de>:

> Thank you, Andrea, for answering so quickly.
>
> However I got further errors. I also had to change 
> "<uniqueKey>firma_ebi_nr</uniqueKey>" to "<uniqueKey>id</uniqueKey>". 
> But it still does not work properly. It seems that an id is auto 
> generated for the company documents but not for the nested ones (the business branches).
> Any ideas how to fix this?
>
> 2015-10-16 12:49:29.650 WARN  (Thread-17) [   x:jcg] o.a.s.h.d.SolrWriter
> Error creating document :
> SolrInputDocument(
>     fields: [firma_ebi_nr=317709682, firma_namenszeile_1=Example 
> Company, id=3c7f7421-9d51-4056-a2a0-eebab87a546a, 
> _version_=1515192078460518400, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a],
>     children: [
>            SolrInputDocument(fields: [branche_ebc_code=7, 
> _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
>            SolrInputDocument(fields: [branche_ebc_code=47000, 
> _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
>            SolrInputDocument(fields: [branche_ebc_code=47700, 
> _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
>            SolrInputDocument(fields: [branche_ebc_code=47790, 
> _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
>            SolrInputDocument(fields: [branche_ebc_code=47791,
> _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a])])
> org.apache.solr.common.SolrException: [doc=null] missing required field: id
>         at
> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:198)
>         at
> org.apache.solr.update.AddUpdateCommand$1.next(AddUpdateCommand.java:191)
>         at
> org.apache.solr.update.AddUpdateCommand$1.next(AddUpdateCommand.java:166)
>         at
> org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:259)
>         at
> org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:413)
>         at
> org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1316)
>         at
> org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:235)
>         at
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:163)
>         at
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>         at
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>         at
> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
>         at
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
>         at
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
>         at
> org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
>         at
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>         at
> org.apache.solr.update.processor.AbstractDefaultValueUpdateProcessorFactory$DefaultValueUpdateProcessor.processAdd(AbstractDefaultValueUpdateProcessorFactory.java:94)
>         at
> org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:71)
>         at
> org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:259)
>         at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:524)
>         at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
>         at
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
>         at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
>         at
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
>         at
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
>         at
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
>
> Kind regards,
> Matthias
>
> -----Original Message-----
> From: Andrea Gazzarini [mailto:a.gazzarini@gmail.com]
> Sent: Friday, 16 October 2015 13:59
> To: solr-user@lucene.apache.org
> Subject: Re: Nested entities not imported / do not show up in search?
>
> Hi Matthias,
> you should use <entity-name>.<column-name> in your expressions. So for 
> example, here
>
> WHERE fb.EBI_NR='${firma.firma_ebi_nr}'
>
> should be
>
> WHERE fb.EBI_NR='${firma.EBI_NR}'
>
> Best,
> Andrea
>
> 2015-10-16 13:40 GMT+02:00 Matthias Fischer <Matthias.Fischer@doubleslash.de>:
>
> > Hello everybody,
> >
> > I am trying to import from an Oracle DB 11g2 via DIH using SOLR 5.3.1.
> > In my relational DB there are company addresses (table
> > tb_firmen_adressen) and branches (table tb_branchen). They have an 
> > n:m relationship using the join table tb_firmen_branchen.
> > Now I would like to find companies by their name and in each company 
> > result I would like to see the associated branches.
> > However I only get the companies without the nested entries. As a 
> > newbie I'd highly appreciate some help as there are no errors or 
> > warnings in the log file and I could not find any helpful hints in 
> > the documentation or elsewhere on the internet concerning my problem.
> >
> > Here is my data config:
> >
> >     <dataConfig>
> >         <dataSource name="jdbc" driver="oracle.jdbc.driver.OracleDriver"
> >                     url="jdbc:oracle:thin:@//xxxxx.xxxxxxx:1521/pde11"
> >                     user="myuser" password="mysecret"/>
> >     <document>
> >         <entity name="firma" pk="fa.EBI_NR" query="
> >             SELECT fa.EBI_NR, fa.NAMENSZEILE_1, fa.NAMENSZEILE_2, fa.NAMENSZEILE_3
> >             FROM tb_firmen_adressen fa
> >             WHERE rownum &lt; 10000
> >         ">
> >
> >             <field name="firma_ebi_nr"          column="EBI_NR" />
> >             <field name="firma_namenszeile_1"   column="NAMENSZEILE_1" />
> >             <field name="firma_namenszeile_2"   column="NAMENSZEILE_2" />
> >             <field name="firma_namenszeile_3"   column="NAMENSZEILE_3" />
> >
> >             <entity name="firma_branche" child="true" query="
> >                 SELECT b.EBC_CODE AS EBC_CODE
> >                 FROM
> >                     tb_firmen_branchen fb
> >                         JOIN tb_branchen b ON fb.EBC_CODE = b.EBC_CODE
> >                 WHERE fb.EBI_NR='${firma.firma_ebi_nr}'
> >             ">
> >                 <field name="branche_ebc_code" column="EBC_CODE" />
> >                 <!-- I would like to add more fields later here once I get it to work -->
> >             </entity>
> >
> >         </entity>
> >     </document>
> >     </dataConfig>
> >
> >
> > And here are the relevant lines from my schema file:
> >
> >     <uniqueKey>firma_ebi_nr</uniqueKey>
> >
> >      <field name="firma_ebi_nr"        type="long"         required="true" indexed="true" stored="true"/>
> >      <field name="firma_namenszeile_1" type="text_general" indexed="true" stored="true"/>
> >      <field name="firma_namenszeile_2" type="text_general" indexed="true" stored="true"/>
> >      <field name="firma_namenszeile_3" type="text_general" indexed="true" stored="true"/>
> >      <field name="branche_ebc_code"    type="long"         indexed="true" stored="true"/>
> >
> >
> >
> > After restarting solr and calling
> > http://localhost:8983/solr/jcg/dataimport?command=full-import I get 
> > "Indexing completed. Added/Updated: 9999 documents. Deleted 0 documents."
> > So basically it seems to work, but my search results look like this:
> >
> > {
> >   "responseHeader":{
> >     "status":0,
> >     "QTime":71,
> >     "params":{
> >       "q":"Der Bunte",
> >       "defType":"edismax",
> >       "indent":"true",
> >       "qf":"firma_namenszeile_1",
> >       "wt":"json"}},
> >   "response":{"numFound":85,"start":0,"docs":[
> >       {
> >         "firma_ebi_nr":123123123,
> >         "firma_namenszeile_1":"Der Bunte Laden",
> >         "_version_":1515185579421073408},
> >       {
> >      ...
> > }
> >
> > Why are there no company branches inside the company records? What's 
> > wrong with my configuration? Any help is appreciated!
> >
> > Kind regards
> > Matthias Fischer
> >
> >
>

Re: Nested entities not imported / do not show up in search?

Posted by Andrea Gazzarini <a....@gmail.com>.
Hi Matthias,
I guess the company.id field is not unique, so you would need a "compound"
uniqueKey in Solr, which is not strictly possible. As a consequence, the
(company) UUID is probably created before the index phase by an
UpdateRequestProcessor [1], so you should check your solrconfig.xml and, if
I'm right, check whether the same strategy could be used for the nested entities.

Andrea

[1]
http://lucene.apache.org/solr/5_2_1/solr-core/org/apache/solr/update/processor/UUIDUpdateProcessorFactory.html
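
As an alternative to an update processor, the nested documents could be given their own id directly in the child entity's SQL. The log quoted in this thread shows the company document receiving a generated id while the branch documents arrive without one, so the sketch below builds one explicitly. It assumes the uniqueKey is a string field named `id` (as changed earlier in the thread); the `ID` alias and the `<EBI_NR>_<EBC_CODE>` composite format are illustrative choices, not something confirmed here. The parent reference uses the corrected `${firma.EBI_NR}` form.

```xml
<entity name="firma_branche" child="true" query="
    SELECT fb.EBI_NR || '_' || b.EBC_CODE AS ID,
           b.EBC_CODE AS EBC_CODE
    FROM tb_firmen_branchen fb
        JOIN tb_branchen b ON fb.EBC_CODE = b.EBC_CODE
    WHERE fb.EBI_NR='${firma.EBI_NR}'
">
    <!-- every child document now carries its own value for the required id field -->
    <field name="id"               column="ID" />
    <field name="branche_ebc_code" column="EBC_CODE" />
</entity>
```

Combining the company number with the branch code keeps the id unique even though the same branch code is shared by many companies.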

2015-10-16 17:11 GMT+02:00 Matthias Fischer <Matthias.Fischer@doubleslash.de>:

> Thank you, Andrea, for answering so quickly.
>
> However I got further errors. I also had to change
> "<uniqueKey>firma_ebi_nr</uniqueKey>" to "<uniqueKey>id</uniqueKey>". But
> it still does not work properly. It seems that an id is auto generated for
> the company documents but not for the nested ones (the business branches).
> Any ideas how to fix this?
>
> 2015-10-16 12:49:29.650 WARN  (Thread-17) [   x:jcg] o.a.s.h.d.SolrWriter
> Error creating document :
> SolrInputDocument(
>     fields: [firma_ebi_nr=317709682, firma_namenszeile_1=Example Company,
> id=3c7f7421-9d51-4056-a2a0-eebab87a546a, _version_=1515192078460518400,
> _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a],
>     children: [
>            SolrInputDocument(fields: [branche_ebc_code=7,
> _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
>            SolrInputDocument(fields: [branche_ebc_code=47000,
> _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
>            SolrInputDocument(fields: [branche_ebc_code=47700,
> _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
>            SolrInputDocument(fields: [branche_ebc_code=47790,
> _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
>            SolrInputDocument(fields: [branche_ebc_code=47791,
> _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a])])
> org.apache.solr.common.SolrException: [doc=null] missing required field: id
>         at
> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:198)
>         at
> org.apache.solr.update.AddUpdateCommand$1.next(AddUpdateCommand.java:191)
>         at
> org.apache.solr.update.AddUpdateCommand$1.next(AddUpdateCommand.java:166)
>         at
> org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:259)
>         at
> org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:413)
>         at
> org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1316)
>         at
> org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:235)
>         at
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:163)
>         at
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>         at
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>         at
> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
>         at
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
>         at
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
>         at
> org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
>         at
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>         at
> org.apache.solr.update.processor.AbstractDefaultValueUpdateProcessorFactory$DefaultValueUpdateProcessor.processAdd(AbstractDefaultValueUpdateProcessorFactory.java:94)
>         at
> org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:71)
>         at
> org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:259)
>         at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:524)
>         at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
>         at
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
>         at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
>         at
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
>         at
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
>         at
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
>
> Kind regards,
> Matthias
>
> -----Ursprüngliche Nachricht-----
> Von: Andrea Gazzarini [mailto:a.gazzarini@gmail.com]
> Gesendet: Freitag, 16. Oktober 2015 13:59
> An: solr-user@lucene.apache.org
> Betreff: Re: Nested entities not imported / do not show up in search?
>
> Hi Matthias,
> you should use <entity-name>.<column-name> in your expressions. So for
> example, here
>
> WHERE fb.EBI_NR='${firma.firma_ebi_nr}'
>
> should be
>
> WHERE fb.EBI_NR='${firma.EBI_NR}'
>
> Best,
> Andrea
>
> 2015-10-16 13:40 GMT+02:00 Matthias Fischer <
> Matthias.Fischer@doubleslash.de
> >:
>
> > Hello everybody,
> >
> > I am trying to import from an Oracle DB 11g2 via DIH using SOLR 5.3.1.
> > In my relational DB there are company addresses (table
> > tb_firmen_adressen) and branches (table tb_branchen). They have an n:m
> > relationship using the join table tb_firmen_branchen.
> > Now I would like to find companies by their name and in each company
> > result I would like to see the associated branches.
> > However I only get the companies without the nested entries. As a
> > newbie I'd highly appreciate some help as there are no errors or
> > warnings in the log file and I could not find any helpful hints in the
> > documentation or elsewhere in the internet concerning my problem.
> >
> > Here is my data config:
> >
> >     <dataConfig>
> >         <dataSource name="jdbc" driver="oracle.jdbc.driver.OracleDriver"
> > url="jdbc:oracle:thin:@//xxxxx.xxxxxxx:1521/pde11" user="myuser"
> > password="mysecret"/>
> >     <document>
> >         <entity name="firma" pk="fa.EBI_NR" query="
> >             SELECT fa.EBI_NR, fa.NAMENSZEILE_1, fa.NAMENSZEILE_2,
> > fa.NAMENSZEILE_3
> >             FROM tb_firmen_adressen fa
> >             WHERE rownum &lt; 10000
> >         ">
> >
> >             <field name="firma_ebi_nr"                  column="EBI_NR"
> />
> >             <field name="firma_namenszeile_1"   column="NAMENSZEILE_1" />
> >             <field name="firma_namenszeile_2"   column="NAMENSZEILE_2" />
> >             <field name="firma_namenszeile_3"   column="NAMENSZEILE_3" />
> >
> >             <entity name="firma_branche" child="true" query="
> >                 SELECT b.EBC_CODE AS EBC_CODE
> >                 FROM
> >                     tb_firmen_branchen fb
> >                         JOIN tb_branchen b ON fb.EBC_CODE = b.EBC_CODE
> >                 WHERE fb.EBI_NR='${firma.firma_ebi_nr}'
> >             ">
> >                 <field name="branche_ebc_code" column="EBC_CODE" />
> >                 <!-- I would like to add more fields later here once I
> > get it to work -->
> >             </entity>
> >
> >         </entity>
> >     </document>
> >     </dataConfig>
> >
> >
> > And here are the relevant lines from my schema file:
> >
> >     <uniqueKey>firma_ebi_nr</uniqueKey>
> >
> >      <field name="firma_ebi_nr"                 type="long"
> >  required="true"         indexed="true"  stored="true"/>
> >      <field name="firma_namenszeile_1"  type="text_general"
> >              indexed="true"  stored="true"/>
> >      <field name="firma_namenszeile_2"  type="text_general"
> >              indexed="true"  stored="true"/>
> >      <field name="firma_namenszeile_3"  type="text_general"
> >              indexed="true"  stored="true"/>
> >      <field name="branche_ebc_code"             type="long"
> >                      indexed="true"  stored="true"/>
> >
> >
> >
> > After restarting solr and calling
> > http://localhost:8983/solr/jcg/dataimport?command=full-import I get
> > "Indexing completed. Added/Updated: 9999 documents. Deleted 0 documents."
> > So basically it seams to work, but my search results look like this:
> >
> > {
> >   "responseHeader":{
> >     "status":0,
> >     "QTime":71,
> >     "params":{
> >       "q":"Der Bunte",
> >       "defType":"edismax",
> >       "indent":"true",
> >       "qf":"firma_namenszeile_1",
> >       "wt":"json"}},
> >   "response":{"numFound":85,"start":0,"docs":[
> >       {
> >         "firma_ebi_nr":123123123,
> >         "firma_namenszeile_1":"Der Bunte Laden",
> >         "_version_":1515185579421073408},
> >       {
> >      ...
> > }
> >
> > Why are there no company branches inside the company records? What's
> > wrong with my configuration? Any help is appreciated!
> >
> > Kind regards
> > Matthias Fischer
> >
> >
>

Re: Nested entities not imported / do not show up in search?

Posted by Matthias Fischer <Ma...@doubleslash.de>.
Thank you, Andrea, for answering so quickly.

However, I now get further errors. I also had to change "<uniqueKey>firma_ebi_nr</uniqueKey>" to "<uniqueKey>id</uniqueKey>", but it still does not work properly. It seems that an id is auto-generated for the company documents but not for the nested ones (the business branches). Any ideas how to fix this?

2015-10-16 12:49:29.650 WARN  (Thread-17) [   x:jcg] o.a.s.h.d.SolrWriter Error creating document : 
SolrInputDocument(
    fields: [firma_ebi_nr=317709682, firma_namenszeile_1=Example Company, id=3c7f7421-9d51-4056-a2a0-eebab87a546a, _version_=1515192078460518400, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a], 
    children: [
           SolrInputDocument(fields: [branche_ebc_code=7, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]), 
           SolrInputDocument(fields: [branche_ebc_code=47000, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]), 
           SolrInputDocument(fields: [branche_ebc_code=47700, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]), 
           SolrInputDocument(fields: [branche_ebc_code=47790, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]), 
           SolrInputDocument(fields: [branche_ebc_code=47791, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a])])
org.apache.solr.common.SolrException: [doc=null] missing required field: id
        at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:198)
        at org.apache.solr.update.AddUpdateCommand$1.next(AddUpdateCommand.java:191)
        at org.apache.solr.update.AddUpdateCommand$1.next(AddUpdateCommand.java:166)
        at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:259)
        at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:413)
        at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1316)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:235)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:163)
        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
        at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
        at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
        at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
        at org.apache.solr.update.processor.AbstractDefaultValueUpdateProcessorFactory$DefaultValueUpdateProcessor.processAdd(AbstractDefaultValueUpdateProcessorFactory.java:94)
        at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:71)
        at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:259)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:524)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)

Kind regards,
Matthias
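
One possible way around the "missing required field: id" error, sketched under assumptions: let the child entity's own query supply an id, so that the nested documents do not depend on the update processor. The column alias BRANCHE_ID and the concatenation scheme are hypothetical, and the sketch assumes that "id" is a string field and that EBI_NR combined with EBC_CODE identifies a branch assignment uniquely.

    <entity name="firma_branche" child="true" query="
        SELECT fb.EBI_NR || '_' || b.EBC_CODE AS BRANCHE_ID,
               b.EBC_CODE AS EBC_CODE
        FROM tb_firmen_branchen fb
            JOIN tb_branchen b ON fb.EBC_CODE = b.EBC_CODE
        WHERE fb.EBI_NR='${firma.EBI_NR}'
    ">
        <!-- BRANCHE_ID is a hypothetical alias, e.g. 317709682_47000 -->
        <field name="id" column="BRANCHE_ID" />
        <field name="branche_ebc_code" column="EBC_CODE" />
    </entity>

Whether this fits depends on the rest of the schema, which is not shown in full here.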


Re: Nested entities not imported / do not show up in search?

Posted by Andrea Gazzarini <a....@gmail.com>.
Hi Matthias,
you should use <entity-name>.<column-name> in your expressions, i.e. the
column name returned by the parent query, not the Solr field name. So for
example, here

WHERE fb.EBI_NR='${firma.firma_ebi_nr}'

should be

WHERE fb.EBI_NR='${firma.EBI_NR}'

Best,
Andrea
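
Applied to the data config from the original post, the child entity would then look roughly like this. Only the variable reference in the WHERE clause changes; everything else is copied from the posted config.

    <entity name="firma_branche" child="true" query="
        SELECT b.EBC_CODE AS EBC_CODE
        FROM tb_firmen_branchen fb
            JOIN tb_branchen b ON fb.EBC_CODE = b.EBC_CODE
        WHERE fb.EBI_NR='${firma.EBI_NR}'
    ">
        <!-- ${firma.EBI_NR} refers to the EBI_NR column of the parent entity "firma" -->
        <field name="branche_ebc_code" column="EBC_CODE" />
    </entity>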
