You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by Tanya Bompi <ta...@gmail.com> on 2018/09/18 22:48:39 UTC

SOLR 7.0 DIH out of memory issue with sqlserver

Hi,
  I have the SOLR 7.0 setup with the DataImportHandler connecting to the
sql server db. I keep getting OutOfMemory: Java Heap Space when doing a
full import. The size of the records is around 3 million so not very huge.
I tried the following steps and nothing helped thus far.

1. Setting the "responseBuffering=adaptive;selectMethod=Cursor" in the jdbc
connection string.
2. Setting the batchSize="-1" which hasnt helped
3. Increasing the heap size at solr startup by issuing the command \solr
start -m 1024m -p 8983
Increasing the heap size further doesnt start SOLR instance itself.

I am wondering what could be causing the issue and how to resolve this.
Below is the data-config :

<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"

url="jdbc:sqlserver://SQL10;IntegratedSecurity=True;databaseName=SnapshotData;responseBuffering=adaptive;selectMethod=Cursor"

  user="yyyy"
              />
  <document>

Thanks,
Tanya

Re: SOLR 7.0 DIH out of memory issue with sqlserver

Posted by Erick Erickson <er...@gmail.com>.
Tanya:

Good to hear. You probably want to configure hard <autoCommit> as
well, and in your case perhaps with openSearcher=true

Indexing is only half the problem. It's quite possible that what's
happening is your index is just growing and that's pushing the
boundaries of Java heap. What I'm thinking is that DIH may be
importing small batches, but as more docs are added the underlying
index grows by enough to hit an OOM with batch N+1.

You could just move the indexing off to a client which would give you
a way to see which it is (index growth or jdbc config). Here's a
sample:
https://lucidworks.com/2012/02/14/indexing-with-solrj/

Best,
Erick
On Wed, Sep 19, 2018 at 1:22 PM Tanya Bompi <ta...@gmail.com> wrote:
>
> Hi Erick,
>   Thank you for the follow-up. I have resolved the issue with the increase
> in heapSize and I am able to set the SOLR VM to initialize with a  3G heap
> size and the subset of 1 mil records was fetched successfully. Although it
> fails with the entire 3 mil records. So something is off with the adaptive
> buffering setting as I see  it is not helping. And I also set the
> autosoftcommit param. I might have to increase the heap size further to see
> if it helps. I will keep posted if my issue doesnt resolve.
>
> Thanks,
> Tanya
>
> On Wed, Sep 19, 2018 at 8:22 AM Erick Erickson <er...@gmail.com>
> wrote:
>
> > Has this ever worked? IOW, is this something that's changed or has
> > just never worked?
> >
> > The obvious first step is to start Solr with more than 1G of memory.
> > Solr _likes_ memory and a 1G heap is quite small. But you say:
> > "Increasing the heap size further doesnt start SOLR instance itself.".
> > How much RAM do you have on your machine? What other programs are
> > running? You should be able to increase the heap and start Solr if you
> > have the RAM on your machine so I'd figure out what's behind that
> > issue first. I regularly start Solr with 16 or 32G of memory on my
> > local machines, I know of installations running Solr with 60G heaps so
> > this points to something really odd about your environment.
> >
> > When you "increase it further", exactly _how_ does Solr fail to start?
> > What appears in the Solr logs? etc. Really, troubleshoot that issue
> > first I'd recommend.
> >
> > If DIH still needs a ridiculous amount of memory, it's usually the
> > JDBC driver trying to read all the rows into memory at once and you'll
> > have to explore the jdbc driver settings in detail.
> >
> > Best,
> > Erick
> > On Tue, Sep 18, 2018 at 11:16 PM Tanya Bompi <ta...@gmail.com>
> > wrote:
> > >
> > > Hi,
> > >   I am using the Microsoft Jdbc driver 6.4 version in Solr 7.4.0 . I have
> > > tried removing the selectMethod=Cursor and still it runs out of heap
> > space.
> > > Do we have anyone who has faced similar issue.
> > >
> > > Thanks
> > > Tanya
> > >
> > >
> > > On Tue, Sep 18, 2018 at 6:38 PM Shawn Heisey <ap...@elyograg.org>
> > wrote:
> > >
> > > > On 9/18/2018 4:48 PM, Tanya Bompi wrote:
> > > > >    I have the SOLR 7.0 setup with the DataImportHandler connecting
> > to the
> > > > > sql server db. I keep getting OutOfMemory: Java Heap Space when
> > doing a
> > > > > full import. The size of the records is around 3 million so not very
> > > > huge.
> > > > > I tried the following steps and nothing helped thus far.
> > > >
> > > > See this wiki page:
> > > >
> > > > https://wiki.apache.org/solr/DataImportHandlerFaq
> > > >
> > > > You already have the suggested fix -- setting responseBuffering to
> > > > adaptive.  You might try upgrading the driver.  If that doesn't work,
> > > > you're probably going to need to talk to Microsoft about what you need
> > > > to do differently on the JDBC url.
> > > >
> > > > I did find this page:
> > > >
> > > >
> > > >
> > https://docs.microsoft.com/en-us/sql/connect/jdbc/using-adaptive-buffering?view=sql-server-2017
> > > >
> > > > This says that when using adaptive buffering, you should avoid using
> > > > selectMethod=cursor.  So you should try removing that parameter.
> > > >
> > > > Thanks,
> > > > Shawn
> > > >
> > > >
> >

Re: SOLR 7.0 DIH out of memory issue with sqlserver

Posted by Tanya Bompi <ta...@gmail.com>.
Hi Erick,
  Thank you for the follow-up. I have resolved the issue with the increase
in heapSize and I am able to set the SOLR VM to initialize with a  3G heap
size and the subset of 1 mil records was fetched successfully. Although it
fails with the entire 3 mil records. So something is off with the adaptive
buffering setting as I see  it is not helping. And I also set the
autosoftcommit param. I might have to increase the heap size further to see
if it helps. I will keep posted if my issue doesnt resolve.

Thanks,
Tanya

On Wed, Sep 19, 2018 at 8:22 AM Erick Erickson <er...@gmail.com>
wrote:

> Has this ever worked? IOW, is this something that's changed or has
> just never worked?
>
> The obvious first step is to start Solr with more than 1G of memory.
> Solr _likes_ memory and a 1G heap is quite small. But you say:
> "Increasing the heap size further doesnt start SOLR instance itself.".
> How much RAM do you have on your machine? What other programs are
> running? You should be able to increase the heap and start Solr if you
> have the RAM on your machine so I'd figure out what's behind that
> issue first. I regularly start Solr with 16 or 32G of memory on my
> local machines, I know of installations running Solr with 60G heaps so
> this points to something really odd about your environment.
>
> When you "increase it further", exactly _how_ does Solr fail to start?
> What appears in the Solr logs? etc. Really, troubleshoot that issue
> first I'd recommend.
>
> If DIH still needs a ridiculous amount of memory, it's usually the
> JDBC driver trying to read all the rows into memory at once and you'll
> have to explore the jdbc driver settings in detail.
>
> Best,
> Erick
> On Tue, Sep 18, 2018 at 11:16 PM Tanya Bompi <ta...@gmail.com>
> wrote:
> >
> > Hi,
> >   I am using the Microsoft Jdbc driver 6.4 version in Solr 7.4.0 . I have
> > tried removing the selectMethod=Cursor and still it runs out of heap
> space.
> > Do we have anyone who has faced similar issue.
> >
> > Thanks
> > Tanya
> >
> >
> > On Tue, Sep 18, 2018 at 6:38 PM Shawn Heisey <ap...@elyograg.org>
> wrote:
> >
> > > On 9/18/2018 4:48 PM, Tanya Bompi wrote:
> > > >    I have the SOLR 7.0 setup with the DataImportHandler connecting
> to the
> > > > sql server db. I keep getting OutOfMemory: Java Heap Space when
> doing a
> > > > full import. The size of the records is around 3 million so not very
> > > huge.
> > > > I tried the following steps and nothing helped thus far.
> > >
> > > See this wiki page:
> > >
> > > https://wiki.apache.org/solr/DataImportHandlerFaq
> > >
> > > You already have the suggested fix -- setting responseBuffering to
> > > adaptive.  You might try upgrading the driver.  If that doesn't work,
> > > you're probably going to need to talk to Microsoft about what you need
> > > to do differently on the JDBC url.
> > >
> > > I did find this page:
> > >
> > >
> > >
> https://docs.microsoft.com/en-us/sql/connect/jdbc/using-adaptive-buffering?view=sql-server-2017
> > >
> > > This says that when using adaptive buffering, you should avoid using
> > > selectMethod=cursor.  So you should try removing that parameter.
> > >
> > > Thanks,
> > > Shawn
> > >
> > >
>

Re: SOLR 7.0 DIH out of memory issue with sqlserver

Posted by Erick Erickson <er...@gmail.com>.
Has this ever worked? IOW, is this something that's changed or has
just never worked?

The obvious first step is to start Solr with more than 1G of memory.
Solr _likes_ memory and a 1G heap is quite small. But you say:
"Increasing the heap size further doesnt start SOLR instance itself.".
How much RAM do you have on your machine? What other programs are
running? You should be able to increase the heap and start Solr if you
have the RAM on your machine so I'd figure out what's behind that
issue first. I regularly start Solr with 16 or 32G of memory on my
local machines, I know of installations running Solr with 60G heaps so
this points to something really odd about your environment.

When you "increase it further", exactly _how_ does Solr fail to start?
What appears in the Solr logs? etc. Really, troubleshoot that issue
first I'd recommend.

If DIH still needs a ridiculous amount of memory, it's usually the
JDBC driver trying to read all the rows into memory at once and you'll
have to explore the jdbc driver settings in detail.

Best,
Erick
On Tue, Sep 18, 2018 at 11:16 PM Tanya Bompi <ta...@gmail.com> wrote:
>
> Hi,
>   I am using the Microsoft Jdbc driver 6.4 version in Solr 7.4.0 . I have
> tried removing the selectMethod=Cursor and still it runs out of heap space.
> Do we have anyone who has faced similar issue.
>
> Thanks
> Tanya
>
>
> On Tue, Sep 18, 2018 at 6:38 PM Shawn Heisey <ap...@elyograg.org> wrote:
>
> > On 9/18/2018 4:48 PM, Tanya Bompi wrote:
> > >    I have the SOLR 7.0 setup with the DataImportHandler connecting to the
> > > sql server db. I keep getting OutOfMemory: Java Heap Space when doing a
> > > full import. The size of the records is around 3 million so not very
> > huge.
> > > I tried the following steps and nothing helped thus far.
> >
> > See this wiki page:
> >
> > https://wiki.apache.org/solr/DataImportHandlerFaq
> >
> > You already have the suggested fix -- setting responseBuffering to
> > adaptive.  You might try upgrading the driver.  If that doesn't work,
> > you're probably going to need to talk to Microsoft about what you need
> > to do differently on the JDBC url.
> >
> > I did find this page:
> >
> >
> > https://docs.microsoft.com/en-us/sql/connect/jdbc/using-adaptive-buffering?view=sql-server-2017
> >
> > This says that when using adaptive buffering, you should avoid using
> > selectMethod=cursor.  So you should try removing that parameter.
> >
> > Thanks,
> > Shawn
> >
> >

Re: SOLR 7.0 DIH out of memory issue with sqlserver

Posted by Tanya Bompi <ta...@gmail.com>.
Hi,
  I am using the Microsoft Jdbc driver 6.4 version in Solr 7.4.0 . I have
tried removing the selectMethod=Cursor and still it runs out of heap space.
Do we have anyone who has faced similar issue.

Thanks
Tanya


On Tue, Sep 18, 2018 at 6:38 PM Shawn Heisey <ap...@elyograg.org> wrote:

> On 9/18/2018 4:48 PM, Tanya Bompi wrote:
> >    I have the SOLR 7.0 setup with the DataImportHandler connecting to the
> > sql server db. I keep getting OutOfMemory: Java Heap Space when doing a
> > full import. The size of the records is around 3 million so not very
> huge.
> > I tried the following steps and nothing helped thus far.
>
> See this wiki page:
>
> https://wiki.apache.org/solr/DataImportHandlerFaq
>
> You already have the suggested fix -- setting responseBuffering to
> adaptive.  You might try upgrading the driver.  If that doesn't work,
> you're probably going to need to talk to Microsoft about what you need
> to do differently on the JDBC url.
>
> I did find this page:
>
>
> https://docs.microsoft.com/en-us/sql/connect/jdbc/using-adaptive-buffering?view=sql-server-2017
>
> This says that when using adaptive buffering, you should avoid using
> selectMethod=cursor.  So you should try removing that parameter.
>
> Thanks,
> Shawn
>
>

Re: SOLR 7.0 DIH out of memory issue with sqlserver

Posted by Shawn Heisey <ap...@elyograg.org>.
On 9/18/2018 4:48 PM, Tanya Bompi wrote:
>    I have the SOLR 7.0 setup with the DataImportHandler connecting to the
> sql server db. I keep getting OutOfMemory: Java Heap Space when doing a
> full import. The size of the records is around 3 million so not very huge.
> I tried the following steps and nothing helped thus far.

See this wiki page:

https://wiki.apache.org/solr/DataImportHandlerFaq

You already have the suggested fix -- setting responseBuffering to 
adaptive.  You might try upgrading the driver.  If that doesn't work, 
you're probably going to need to talk to Microsoft about what you need 
to do differently on the JDBC url.

I did find this page:

https://docs.microsoft.com/en-us/sql/connect/jdbc/using-adaptive-buffering?view=sql-server-2017

This says that when using adaptive buffering, you should avoid using 
selectMethod=cursor.  So you should try removing that parameter.

Thanks,
Shawn