Posted to solr-user@lucene.apache.org by Alucard <al...@gmail.com> on 2011/06/13 13:13:43 UTC
How to optimize DIH Full import if I am using SQLite and have 70,000,
700,000 or 7 million records?
Hi all
As far as I know, using DIH will read all the documents from the database
(I am using SQLite v3) into memory.
If I have a lot of records (say 7 million), will it put all 7 million
records in memory? How can I avoid that?
There is a piece of documentation that says to set
responseBuffering="adaptive" (MSSQL)
or batchSize="-1" (MySQL), but there are no such attributes for SQLite. Can
we use those parameters? What other parameters can SQLite users use?
Thank you in advance.
Ellery
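[For reference, a DIH data-config.xml for SQLite would look something like the sketch below. This is a minimal sketch assuming the Xerial sqlite-jdbc driver (driver class org.sqlite.JDBC); the database path, table, and field names are made up for illustration. Note that attributes like batchSize are passed through to the JDBC driver, so whether they have any effect depends entirely on the driver.]

```xml
<dataConfig>
  <!-- Driver class and URL follow the Xerial sqlite-jdbc conventions.
       The path, table, and field names here are hypothetical. -->
  <dataSource driver="org.sqlite.JDBC"
              url="jdbc:sqlite:/path/to/data.db"/>
  <document>
    <entity name="doc" query="SELECT id, title FROM docs">
      <field column="id"    name="id"/>
      <field column="title" name="title"/>
    </entity>
  </document>
</dataConfig>
```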
Re: How to optimize DIH Full import if I am using SQLite and have 70,000, 700,000 or 7 million records?
Posted by alucard001 <al...@gmail.com>.
Thank you Shalin.
But when I google "sqlite jdbc driver documentation", there are not many pages that describe those parameters.
The most relevant one is this: http://www.xerial.org/trac/Xerial/wiki/SQLiteJDBC
Can you please tell me which resources I can reference to?
Thank you again.
Ellery
Shalin Shekhar Mangar <sh...@gmail.com> wrote on 2011/06/13 at 20:43:
> On Mon, Jun 13, 2011 at 4:43 PM, Alucard <al...@gmail.com> wrote:
>
>>
>> As far as I know, using DIH will read all the documents from the database
>> (I am using SQLite v3) into memory.
>
>
> That is incorrect. DIH does not read all rows into memory; only the set
> of rows needed to create a Solr document is kept in memory at any given
> time. The documents are streamed.
>
>
>>
>> If I have a lot of records (say 7 million), will it put all 7 million
>> records in memory? How can I avoid that?
>>
>> There is a piece of documentation that says to set
>> responseBuffering="adaptive" (MSSQL)
>> or batchSize="-1" (MySQL), but there are no such attributes for SQLite. Can
>> we use those parameters? What other parameters can SQLite users use?
>>
>>
> Those parameters are JDBC-driver-specific settings; e.g., the MySQL JDBC
> driver reads all rows into memory unless you set batchSize="-1".
>
> You'll have to look at the SQLite JDBC driver's docs to see whether it
> reads rows into memory or has a switch to stream rows one at a time to
> the client.
>
> --
> Regards,
> Shalin Shekhar Mangar.
Re: How to optimize DIH Full import if I am using SQLite and have 70,000, 700,000 or 7 million records?
Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
On Mon, Jun 13, 2011 at 4:43 PM, Alucard <al...@gmail.com> wrote:
>
> As far as I know, using DIH will read all the documents from the database
> (I am using SQLite v3) into memory.
That is incorrect. DIH does not read all rows into memory; only the set
of rows needed to create a Solr document is kept in memory at any given
time. The documents are streamed.
>
> If I have a lot of records (say 7 million), will it put all 7 million
> records in memory? How can I avoid that?
>
> There is a piece of documentation that says to set
> responseBuffering="adaptive" (MSSQL)
> or batchSize="-1" (MySQL), but there are no such attributes for SQLite. Can
> we use those parameters? What other parameters can SQLite users use?
>
>
Those parameters are JDBC-driver-specific settings; e.g., the MySQL JDBC
driver reads all rows into memory unless you set batchSize="-1".
You'll have to look at the SQLite JDBC driver's docs to see whether it
reads rows into memory or has a switch to stream rows one at a time to
the client.
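[As an illustration of the distinction being drawn here (this is not DIH or the JDBC driver itself), SQLite's own API hands rows to the client one at a time. A minimal sketch using Python's stdlib sqlite3 module, with a made-up docs table:]

```python
# Sketch: SQLite can stream rows to the client one at a time.
# Iterating the cursor steps through the result set lazily instead of
# materializing all rows in client memory first. Whether a given JDBC
# driver exposes this streaming behaviour is driver-specific.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO docs (title) VALUES (?)",
                 [(f"doc {i}",) for i in range(100_000)])
conn.commit()

count = 0
for row in conn.execute("SELECT id, title FROM docs"):
    count += 1          # each iteration fetches only the next row

print(count)  # prints 100000
```

[If the SQLite JDBC driver steps through the result set the same way, a plain SELECT should already stream; that is exactly the kind of detail to confirm in the driver's docs.]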
--
Regards,
Shalin Shekhar Mangar.