Posted to solr-user@lucene.apache.org by Spadez <ja...@hotmail.com> on 2012/11/17 23:49:30 UTC

Solr Delta Import Handler not working

Hi,

These are the exact steps I have taken to try to get the delta import
handler working. If I can provide any more information to help, let me know.
I have spent all of Friday night and today on this and I am ready to throw
in the towel. Where have I gone wrong?

*Added these lines to solrconfig.xml:*
/<requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">/home/solr/data-config.xml</str>
    </lst>
  </requestHandler>/

*Then my data-config.xml looks like this:*
/<dataConfig>
  <dataSource type="FileDataSource" />
  <document>
    <entity
      name="document"
      processor="FileListEntityProcessor"
      baseDir="/var/lib/data"
      fileName=".*.xml$"
      recursive="false"
      rootEntity="false"
      dataSource="null">
      <entity
        processor="XPathEntityProcessor"
        url="${document.fileAbsolutePath}"
        useSolrAddSchema="true"
        stream="true">
      </entity>
    </entity>
  </document>
</dataConfig>/

*Then in my /var/lib/data folder I have a data.xml file that looks like
this:*
/<add>
<doc>
	<field name="id">123</field>
	<field name="description">This is my long description</field>
	<field name="company">Google</field>
	<field name="location_name">England</field>
	<field name="date">2007-12-31 22:29:59</field>
	<field name="source">Google</field>
	<field name="url">www.google.com</field>
	<field name="latlng">45.17614,45.17614</field>
</doc>
</add>/

*Finally, I ran this command:*
/http://localhost:8080/solr/dataimport?command=delta-import&clean=false/

*And I get this result (failed):*
/<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
</lst>
<lst name="initArgs">
<lst name="defaults">
<str name="config">/opt/solr/example/solr/conf/data-config.xml</str>
</lst>
</lst>
<str name="command">delta-import</str>
<str name="status">idle</str>
<str name="importResponse"/>
<lst name="statusMessages">
<str name="Time Elapsed">0:15:9.543</str>
<str name="Total Requests made to DataSource">0</str>
<str name="Total Rows Fetched">0</str>
<str name="Total Documents Processed">0</str>
<str name="Total Documents Skipped">0</str>
<str name="Delta Dump started">2012-11-17 17:32:56</str>
<str name="Identifying Delta">2012-11-17 17:32:56</str>
<str name="">*Indexing failed*. Rolled back all changes.</str>
<str name="Rolledback">2012-11-17 17:32:56</str>
</lst>
<str name="WARNING">
This response format is experimental. It is likely to change in the future.
</str>
</response>/
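
A status response like the one above can also be checked by a script instead of by eye. This is only a minimal Python sketch, assuming the response keeps the XML shape shown in this message; the function name `summarize_dih_status` and the trimmed `SAMPLE` string are illustrative and not part of Solr.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response shown above (illustrative sample only).
SAMPLE = """<response>
<str name="command">delta-import</str>
<str name="status">idle</str>
<lst name="statusMessages">
<str name="Total Rows Fetched">0</str>
<str name="Total Documents Processed">0</str>
<str name="">Indexing failed. Rolled back all changes.</str>
</lst>
</response>"""


def summarize_dih_status(xml_text):
    """Return (messages, failed) for a dataimport status response."""
    root = ET.fromstring(xml_text)
    messages = {}
    for lst in root.findall("lst"):
        if lst.get("name") == "statusMessages":
            for item in lst.findall("str"):
                messages[item.get("name")] = (item.text or "").strip()
    # In the sample, the failure notice is the entry with an empty name.
    failed = any("failed" in text.lower() for text in messages.values())
    return messages, failed


messages, failed = summarize_dih_status(SAMPLE)
print(messages["Total Rows Fetched"], failed)
```

Note that the HTTP-level `status` of 0 in the response header only says the request was accepted; the import outcome is in `statusMessages`.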





--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Delta-Import-Handler-not-working-tp4020897.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr Delta Import Handler not working

Posted by Lance Norskog <go...@gmail.com>.
|  dataSource="null"

I think this should not be here. The data source should default to the <dataSource> element declared at the top of the config. Also, 'rootEntity=true' should be on the XPathEntityProcessor entity, because you are adding each file as one document.

----- Original Message -----
| From: "Spadez" <ja...@hotmail.com>
| To: solr-user@lucene.apache.org
| Sent: Sunday, November 18, 2012 7:34:34 AM
| Subject: Re: Solr Delta Import Handler not working

Re: Solr Delta Import Handler not working

Posted by Spadez <ja...@hotmail.com>.
Update! Thank you to Lance for the help. Based on your suggestion I have
fixed up a few things.

*My data-config.xml now has the fileName pattern fixed and rootEntity="true":*
/<dataConfig>
  <dataSource type="FileDataSource" />
  <document>
    <entity
      name="document"
      processor="FileListEntityProcessor"
      baseDir="/var/lib/employ"
      fileName="^.*\.xml$"
      recursive="false"
      rootEntity="true"
      dataSource="null">
      <entity
        processor="XPathEntityProcessor"
        url="${document.fileAbsolutePath}"
        useSolrAddSchema="true"
        stream="true">
      </entity>
    </entity>
  </document>
</dataConfig>/

*My data.xml has a corrected date format with "T":*
/<add>
<doc>
        <field name="id">123</field>
        <field name="title">Delta Import 2</field>
        <field name="description">This is my long description</field>
        <field name="truncated_description">This is</field>
        <field name="company">Google</field>
        <field name="location_name">England</field>
        <field name="date">2007-12-31T22:29:59</field>
        <field name="source">Google</field>
        <field name="url">www.google.com</field>
        <field name="latlng">45.17614,45.17614</field>
</doc>
</add>/




Re: Solr Delta Import Handler not working

Posted by Spadez <ja...@hotmail.com>.
Thank you for the reply. I will try your corrections. I realised that the
details are actually logged in the Tomcat log, and this is what it says:

/Nov 18, 2012 10:09:40 AM org.apache.catalina.startup.Catalina start
INFO: Server startup in 3005 ms
Nov 18, 2012 10:09:47 AM org.apache.solr.handler.dataimport.DataImporter doDeltaImport
INFO: Starting Delta Import
Nov 18, 2012 10:09:47 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={clean=false&command=delta-import} status=0 QTime=10
Nov 18, 2012 10:09:47 AM org.apache.solr.handler.dataimport.SimplePropertiesWriter readIndexerProperties
WARNING: Unable to read: dataimport.properties
Nov 18, 2012 10:09:47 AM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Starting delta collection.
Nov 18, 2012 10:09:47 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Running ModifiedRowKey() for Entity: 80953050238262
Nov 18, 2012 10:09:47 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed ModifiedRowKey for Entity: 80953050238262 rows obtained : 0
Nov 18, 2012 10:09:47 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed DeletedRowKey for Entity: 80953050238262 rows obtained : 0
Nov 18, 2012 10:09:47 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed parentDeltaQuery for Entity: 80953050238262
Nov 18, 2012 10:09:47 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Running ModifiedRowKey() for Entity: document
Nov 18, 2012 10:09:47 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed ModifiedRowKey for Entity: document rows obtained : 0
Nov 18, 2012 10:09:47 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed DeletedRowKey for Entity: document rows obtained : 0
Nov 18, 2012 10:09:47 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed parentDeltaQuery for Entity: document
Nov 18, 2012 10:09:47 AM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Delta Import completed successfully
Nov 18, 2012 10:09:47 AM org.apache.solr.handler.dataimport.DocBuilder execute
INFO: Time taken = 0:0:0.37
Nov 18, 2012 10:09:47 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {} 0 10/

It seems to be saying it executed successfully? I don't see any data in my
Solr index though!
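
One way to read a log like this: every "rows obtained" line reports 0, so the delta phase appears to have completed without finding anything to index. The following is a minimal Python sketch for pulling those counts out of a log, assuming the line format shown in this message; the `LOG` excerpt, the `ROWS` pattern and the helper `delta_row_counts` are illustrative names, not part of Solr.

```python
import re

# Short excerpt in the same format as the Tomcat log above (illustrative).
LOG = """INFO: Running ModifiedRowKey() for Entity: document
INFO: Completed ModifiedRowKey for Entity: document rows obtained : 0
INFO: Completed DeletedRowKey for Entity: document rows obtained : 0
INFO: Delta Import completed successfully"""

# Captures the phase name, the entity name and the reported row count.
ROWS = re.compile(r"Completed (\w+) for Entity: (\S+) rows obtained : (\d+)")


def delta_row_counts(log_text):
    """Return a list of (entity, phase, rows) tuples found in the log."""
    counts = []
    for line in log_text.splitlines():
        match = ROWS.search(line)
        if match:
            counts.append((match.group(2), match.group(1), int(match.group(3))))
    return counts


counts = delta_row_counts(LOG)
nothing_found = all(rows == 0 for _, _, rows in counts)
print(counts, nothing_found)
```

If `nothing_found` is true, "completed successfully" and an empty index are consistent with each other: the import ran, but it had no rows to add.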




Re: Solr Delta Import Handler not working

Posted by Lance Norskog <go...@gmail.com>.
I think this means the pattern did not match any files:
<str name="Total Rows Fetched">0</str>

The wiki example includes a '^' at the beginning of the filename pattern. Together with the trailing '$', this anchors the pattern so that it has to match the complete file name.
http://wiki.apache.org/solr/DataImportHandler#Transformers_Example
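
Whether the '^' changes anything for a given file name can be checked quickly. Below is a minimal Python sketch, assuming the fileName pattern is applied with "find" semantics (a match anywhere in the name); Python's `re.search` stands in for Java's regex engine here, which behaves the same for patterns this simple. The sample names and the helper `matching` are hypothetical.

```python
import re

ORIGINAL = r".*.xml$"    # pattern from the first data-config.xml
ANCHORED = r"^.*\.xml$"  # anchored pattern with the dot escaped

# Hypothetical file names used only for this comparison.
names = ["data.xml", "dataxxml", "data.xml.bak"]


def matching(pattern, candidates):
    """Return the candidates in which the pattern is found."""
    return [name for name in candidates if re.search(pattern, name)]


original_hits = matching(ORIGINAL, names)
anchored_hits = matching(ANCHORED, names)
print(original_hits)
print(anchored_hits)
```

Under that assumption the original pattern already finds data.xml; its unescaped dot only makes it looser (it also accepts "dataxxml"). The anchored pattern is tidier, but it may not by itself explain the zero rows.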

More:
Add rootEntity="true". It cannot hurt to be explicit.

The date format needs a 'T' instead of a space:
http://en.wikipedia.org/wiki/ISO_8601
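
For converting existing values, here is a minimal Python sketch. Solr's date field type additionally expects a trailing 'Z' (UTC), as in 1995-12-31T23:59:59Z, so the sketch appends it; that the source timestamps are already in UTC is an assumption, and the helper name `to_solr_date` is made up for illustration.

```python
from datetime import datetime, timezone


def to_solr_date(text):
    """Convert 'YYYY-MM-DD HH:MM:SS' to 'YYYY-MM-DDTHH:MM:SSZ'.

    Assumes the input timestamp is already expressed in UTC.
    """
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


converted = to_solr_date("2007-12-31 22:29:59")
print(converted)
```

If the timestamps are in local time instead, they would need converting to UTC before the 'Z' is added.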

Cheers!

----- Original Message -----
| From: "Spadez" <ja...@hotmail.com>
| To: solr-user@lucene.apache.org
| Sent: Saturday, November 17, 2012 2:49:30 PM
| Subject: Solr Delta Import Handler not working