Posted to solr-user@lucene.apache.org by Vijayant Kumar <vi...@websitetoolbox.com> on 2010/02/01 14:42:06 UTC

Problem in indexing on large data set by Dataimporthandler in solr

> Hi,
>
> I am trying to index a large data set in Solr using the
> DataImportHandler.
>
> It works fine for a small data set, but indexing the large data set
> produces an error.
>
> I am using Solr 1.3 and MySQL (mysql Ver 14.7 Distrib 4.1.20, for
> redhat-linux-gnu (i686)).
>
>
> Earlier I was getting a Java heap space error:
>
> Exception in thread "Thread-16" java.lang.OutOfMemoryError: Java heap space
>
> I solved that by starting Solr with java -Xmx512M -Xms512M -jar start.jar
>
> But now it gives a MySQL communications link failure error:
>
> com.mysql.jdbc.CommunicationsException: Communications link failure due to
> underlying exception:
>
> ** BEGIN NESTED EXCEPTION **
>
> java.io.EOFException
>
> STACKTRACE:
>
> java.io.EOFException
>         at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:1913)
>
> I added transactionIsolation="TRANSACTION_READ_COMMITTED" and
> holdability="CLOSE_CURSORS_AT_COMMIT" to the data source configuration, as
> suggested by the Solr wiki page, but it still does not work.
>
> Here is my data-config.xml file:
>
> <?xml version="1.0" encoding="UTF-8" ?>
> <dataConfig>
>   <dataSource name="jdbc" driver="com.mysql.jdbc.Driver"
>               url="jdbc:mysql://localhost/databasename"
>               user="xxx" password="****" batchSize="-1" readOnly="true"
>               autoCommit="false" transactionIsolation="TRANSACTION_READ_COMMITTED"
>               holdability="CLOSE_CURSORS_AT_COMMIT"/>
>   <document name="feature">
>     <entity name="posts" dataSource="jdbc" pk="pid"
>             query="SELECT uid, pid, message FROM xxx"
>             transformer="TemplateTransformer">
>       <add>
>         <doc>
>           <field column="pid" name="pid"/>
>           <field column="uid" name="uid"/>
>           <field column="message" name="message"/>
>         </doc>
>       </add>
>     </entity>
>   </document>
> </dataConfig>
>
>
>
> My table contains around 1.5 crore (15 million) rows on my local machine.
> The production table holds about four times as much data.
>
>
>
> Thank you,
> Vijayant Kumar
> Software Engineer
> Website Toolbox Inc.
> http://www.websitetoolbox.com
> 1-800-921-7803 x211
>




Re: Problem in indexing on large data set by Dataimporthandler in solr

Posted by Vijayant Kumar <vi...@websitetoolbox.com>.
Hi Erik,

Thanks for your suggestion. I updated the Solr version and the problem
is resolved.


> Can you give it a shot on Solr 1.4 instead?  DIH has had numerous
> enhancements/fixes since 1.3.
>
> 	Erik


-- 

Thank you,
Vijayant Kumar
Software Engineer
Website Toolbox Inc.
http://www.websitetoolbox.com
1-800-921-7803 x211


Re: Problem in indexing on large data set by Dataimporthandler in solr

Posted by Erik Hatcher <er...@gmail.com>.
Can you give it a shot on Solr 1.4 instead?  DIH has had numerous  
enhancements/fixes since 1.3.

	Erik
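
For readers who hit the same error: the thread only confirms that
upgrading Solr fixed the import. The following is a minimal sketch of
how a streaming data-config.xml for this table might look, not the
configuration the poster ended up with. It assumes that entity fields
are declared as direct children of <entity> (so the <add>/<doc>
wrappers are dropped), that the unused TemplateTransformer can go, and
that the MySQL driver is Connector/J 5.1 or later, where the
netTimeoutForStreamingResults URL property exists. The value 3600 is
purely illustrative.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
  <!-- batchSize="-1" asks DIH's JdbcDataSource to stream rows from MySQL
       rather than buffer the whole result set in the JVM heap. -->
  <!-- netTimeoutForStreamingResults (seconds) is an assumed mitigation for
       long streaming reads; 3600 is an illustrative value only. -->
  <dataSource name="jdbc" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/databasename?netTimeoutForStreamingResults=3600"
              user="xxx" password="****"
              batchSize="-1" readOnly="true"/>
  <document name="feature">
    <entity name="posts" dataSource="jdbc" pk="pid"
            query="SELECT uid, pid, message FROM xxx">
      <!-- Fields sit directly under the entity; names match the original post. -->
      <field column="pid" name="pid"/>
      <field column="uid" name="uid"/>
      <field column="message" name="message"/>
    </entity>
  </document>
</dataConfig>
```

The transactionIsolation and holdability attributes from the original
post can be kept if needed; they are left out here only to keep the
sketch short.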

