Posted to solr-user@lucene.apache.org by sc...@asia.com on 2010/07/31 09:29:45 UTC
DIH: Rows fetch OK, Total Documents Failed??
Hi,
I'm a bit lost with this. I'm trying to import a new XML file via DIH: all rows are fetched, but no documents are indexed. I can't find any log entry or error.
Any ideas?
Here is the STATUS:
<str name="command">status</str>
<str name="status">idle</str>
<str name="importResponse"/>
<lst name="statusMessages">
<str name="Total Requests made to DataSource">1</str>
<str name="Total Rows Fetched">7554</str>
<str name="Total Documents Skipped">0</str>
<str name="Full Dump Started">2010-07-31 10:14:33</str>
<str name="Total Documents Processed">0</str>
<str name="Total Documents Failed">7554</str>
<str name="Time taken ">0:0:4.720</str>
</lst>
My xml file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<products>
<product>
<title>Moniteur VG1930wm 19 LCD Viewsonic</title>
<url>http://xxxxx.com/abc?a(12073231)p(2822679)prod(89042332277)ttid(5)url(http%3A%2F%2Fwww.ffdsssd.com%2Fproductinformation%2F%7E66297%7E%2Fproduct.htm%26sender%3D2003)</url>
<content>Moniteur VG1930wm 19 LCD Viewsonic VG1930WM</content>
<price>247.57</price>
<category>Ecrans</category>
</product>
etc...
and my dataconfig:
<dataConfig>
<dataSource type="URLDataSource" />
<document>
<entity name="products"
url="file:///home/john/Desktop/src.xml"
processor="XPathEntityProcessor"
forEach="/products/product"
transformer="DateFormatTransformer">
<field column="id" xpath="/products/product/url" commonField="true" />
<field column="title" xpath="/products/product/title" commonField="true" />
<field column="category" xpath="/products/product/category" />
<field column="content" xpath="/products/product/content" />
<field column="price" xpath="/products/product/price" />
</entity>
</document>
</dataConfig>
RE: Rows fetch OK, Total Documents Failed??
Posted by Michael Griffiths <mg...@am-ind.com>.
Check your schema.xml; one of the fields is probably marked required="true", and it isn't matching any field extracted by DIH. Keep in mind that field names in schema.xml are case-sensitive.
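For illustration, here is a minimal sketch of the kind of schema.xml declarations worth checking. The field names mirror the columns in the posted dataconfig; the field types and the choice of `id` as uniqueKey are assumptions, not taken from the poster's actual schema.

```xml
<!-- Hypothetical schema.xml excerpt; types and uniqueKey choice are assumptions. -->
<fields>
  <!-- required="true": a document arriving without a value for this field is rejected. -->
  <field name="id" type="string" indexed="true" stored="true" required="true" />
  <field name="title" type="text" indexed="true" stored="true" />
  <field name="category" type="string" indexed="true" stored="true" />
  <field name="content" type="text" indexed="true" stored="true" />
  <field name="price" type="float" indexed="true" stored="true" />
</fields>

<!-- The uniqueKey field needs a value on every document. -->
<uniqueKey>id</uniqueKey>
```

If the schema instead declares a required field under a different name or case (for example `ID`, or a field that the dataconfig never maps), the DIH columns would not populate it, which would be consistent with every fetched row being counted as failed.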
Re: DIH: Rows fetch OK, Total Documents Failed??
Posted by Alexey Serba <as...@gmail.com>.
Do you have any required fields or uniqueKey in your schema.xml? Do
you provide values for all these fields?
AFAIU, you don't need the commonField attribute on the id and title fields. I
don't think that's your problem, but anyway...
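As a sketch of that suggestion, the dataconfig from the original post with commonField dropped would look roughly like this. Everything else is copied unchanged from what was posted, and this change alone is not expected to fix the failed documents.

```xml
<dataConfig>
  <dataSource type="URLDataSource" />
  <document>
    <entity name="products"
            url="file:///home/john/Desktop/src.xml"
            processor="XPathEntityProcessor"
            forEach="/products/product"
            transformer="DateFormatTransformer">
      <!-- commonField removed: each <product> record carries its own url and title -->
      <field column="id" xpath="/products/product/url" />
      <field column="title" xpath="/products/product/title" />
      <field column="category" xpath="/products/product/category" />
      <field column="content" xpath="/products/product/content" />
      <field column="price" xpath="/products/product/price" />
    </entity>
  </document>
</dataConfig>
```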