Posted to solr-user@lucene.apache.org by Peter Blokland <pe...@desk.nl> on 2016/11/14 15:19:49 UTC

DIH problem with multiple (types of) resources

hi,

I'm porting an old data-import configuration from 4.x to 6.3.0. a minimal config
that shows the problem is this:

<dataConfig>
  <dataSource name="web" 
              type="BinURLDataSource" />

  <dataSource name="db" 
              type="JdbcDataSource" 
              driver="com.mysql.jdbc.Driver" 
              url="jdbc:mysql://localhost/meulboek" 
              user="foo" 
              password="bar"/>

  <document>

    <entity dataSource="db" name="page" query="select id as pid from pages">
      <entity name="html" dataSource="web" processor="TikaEntityProcessor" url="http://site/nl/${page.pid}" format="text">
        <field column="text" name="_text_"/>
      </entity>
    </entity>

    <entity datasource="db" name="edition" query="select edition from editions">
      <field name="_text_" column="edition"/>
    </entity>

  </document>

</dataConfig>


when I try to do a full import with this, I get :

2016-11-14 12:31:52.173 INFO  (Thread-68) [   x:meulboek] o.a.s.u.p.LogUpdateProcessorFactory [meulboek]  webapp=/solr path=/dataimport params={core=meulboek&optimize=false&indent=on&commit=true&clean=true&wt=json&command=full-import&_=1479122291861&verbose=true} status=0 QTime=11{deleteByQuery=*:* (-1550976769832517632),add=[ed99517c-ece9-40c6-9682-c9ec74173241 (1550976769976172544), 9283532a-2395-43eb-bcb8-fd30c5ebfd08 (1550976770348417024), 87b75d5c-a12a-4538-bc29-ceb13d6a9d1c (1550976770455371776), 476b5da3-3752-4867-bdb3-4264403c5c2d (1550976770787770368), 71cdaadb-62ba-4753-ad1b-01ba7fd75bfa (1550976770875850752), 02f41269-4a28-4001-aab9-7b1feb51e332 (1550976770954493952), 6216ec48-2abd-465b-8d6b-60907c7f49db (1550976771047817216), 4317b308-dc88-47e1-9240-0d7d94646de6 (1550976771136946176), 159ee092-2f72-45f6-970e-9dfd6d635bdf (1550976771221880832), bdfa48c4-23e2-483f-9b63-e0c5753d60a5 (1550976771336175616)]} 0 1465
2016-11-14 12:31:52.173 ERROR (Thread-68) [   x:meulboek] o.a.s.h.d.DataImporter Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Exception in invoking url null Processing Document # 11
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:270)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:475)
        at org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:458)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Exception in invoking url null Processing Document # 11
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:416)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
        ... 4 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Exception in invoking url null Processing Document # 11
        at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
        at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:89)
        at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:38)
        at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
        at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:244)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
        ... 6 more
Caused by: java.net.MalformedURLException: no protocol: nullselect edition from editions
        at java.net.URL.<init>(URL.java:593)
        at java.net.URL.<init>(URL.java:490)
        at java.net.URL.<init>(URL.java:439)
        at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:81)
        ... 12 more


note that this failure occurs with the second entity, and judging from this
line :

Caused by: java.net.MalformedURLException: no protocol: nullselect edition from editions

it seems Solr tries to use the datasource named "web" (the BinURLDataSource)
instead of the configured "db" datasource (the JdbcDataSource). am I doing
something wrong, or is this a bug?

-- 
CUL8R, Peter.

www.desk.nl

Your excuse is: Communist revolutionaries taking over the server room and demanding all the computers in the building or they shoot the sysadmin. Poor misguided fools.

Re: DIH problem with multiple (types of) resources

Posted by Peter Blokland <pe...@desk.nl>.
hi,

On Tue, Nov 15, 2016 at 02:54:49AM +1100, Alexandre Rafalovitch wrote:

>>     <entity dataSource="db" name="page" query="select id as pid from pages">
>>     <entity datasource="db" name="edition" query="select edition from editions">
 
> Attribute names are case sensitive as far as I remember. Try
> 'dataSource' for the second definition.

oh wow... that's sneaky. in the old version the case didn't seem to matter,
but now it certainly does. thx :)

-- 
CUL8R, Peter.

www.desk.nl

Your excuse is: It is a layer 8 problem

Re: DIH problem with multiple (types of) resources

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
On 15 November 2016 at 02:19, Peter Blokland <pe...@desk.nl> wrote:
>     <entity dataSource="db" name="page" query="select id as pid from pages">
....
>     <entity datasource="db" name="edition" query="select edition from editions">

Attribute names are case sensitive as far as I remember. Try
'dataSource' for the second definition.
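
A minimal sketch of that change, assuming the rest of the posted config stays
as it is. With the lowercase attribute name the entity seems not to pick up
"db", which would explain why the SQL query string ends up in
BinURLDataSource in the stack trace:

```xml
<!-- Second entity from the posted config. The only change is the attribute
     name: "datasource" becomes "dataSource" (capital S). -->
<entity dataSource="db" name="edition" query="select edition from editions">
  <field name="_text_" column="edition"/>
</entity>
```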

Regards,
   Alex.

----
Solr Example reading group is starting November 2016, join us at
http://j.mp/SolrERG
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/