Posted to solr-user@lucene.apache.org by Son Nguyen <so...@trancorp.com> on 2013/05/06 18:32:59 UTC

Solr Cloud with large synonyms.txt

Hello,

I'm building a Solr Cloud cluster (version 4.1.0) with 2 shards and one ZooKeeper instance (the ZooKeeper, version 3.4.5, runs on a different machine).
I've tried to start with a 1.7MB synonyms.txt, but got a "ConnectionLossException":
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /configs/solr1/synonyms.txt
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1266)
        at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:270)
        at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:267)
        at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
        at org.apache.solr.common.cloud.SolrZkClient.setData(SolrZkClient.java:267)
        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:436)
        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:315)
        at org.apache.solr.cloud.ZkController.uploadToZK(ZkController.java:1135)
        at org.apache.solr.cloud.ZkController.uploadConfigDir(ZkController.java:955)
        at org.apache.solr.core.CoreContainer.initZooKeeper(CoreContainer.java:285)
        ... 43 more

I did some research on the internet and found out that this is because the ZooKeeper znode size limit is 1MB. I tried to increase the system property "jute.maxbuffer", but it didn't work.
Does anyone have experience dealing with this?

Thanks,
Son

RE: Solr Cloud with large synonyms.txt

Posted by Roman Chyla <ro...@gmail.com>.
David, have you seen the finite state automaton the synonym lookup is built
on? The lookup is very efficient and fast. You have a point, though; it is
going to fail for someone.
Roman
On 8 May 2013 03:11, "David Parks" <da...@yahoo.com> wrote:

> I can see your point, though I think edge cases would be one concern, if
> someone *can* create a very large synonyms file, someone *will* create that
> file.  What  would you set the zookeeper max data size to be? 50MB? 100MB?
> Someone is going to do something bad if there's nothing to tell them not
> to.
> Today solr cloud just crashes if you try to create a modest sized synonyms
> file, clearly at a minimum some zookeeper settings should be configured out
> of the box.  Any reasonable setting you come up with for zookeeper is
> virtually guaranteed to fail for some percentage of users over a reasonably
> sized user-base (which solr has).
>
> What if I plugged in a 200MB synonyms file just for testing purposes (I
> don't care about performance implications)?  I don't think most users would
> catch the footnote in the docs that calls out a max synonyms file size.
>
> Dave
>
>
> -----Original Message-----
> From: Mark Miller [mailto:markrmiller@gmail.com]
> Sent: Tuesday, May 07, 2013 11:53 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Cloud with large synonyms.txt
>
> I'm not so worried about the large file in zk issue myself.
>
> The concern is that you start storing and accessing lots of large files in
> ZK. This is not what it was made for, and everything stays in RAM, so they
> guard against this type of usage.
>
> We are talking about a config file that is loaded on Core load though. It's
> uploaded and read very rarely. On modern hardware and networks, making that
> file 5MB rather than 1MB is not going to ruin your day. It just won't. Solr
> does not use ZooKeeper heavily - in a steady state cluster, it doesn't read
> or write from ZooKeeper at all to any degree that registers. I'm going to
> have to see problems loading these larger config files from ZooKeeper
> before
> I'm worried that it's a problem.
>
> - Mark
>
> On May 7, 2013, at 12:21 PM, Son Nguyen <so...@trancorp.com> wrote:
>
> > Mark,
> >
> > I tried to set that property on both ZK (I have only one ZK instance) and
> Solr, but it still didn't work.
> > But I read somewhere that ZK is not really designed for keeping large
> data
> files, so this solution - increasing jute.maxbuffer (if I can implement it)
> should be just temporary.
> >
> > Son
> >
> > -----Original Message-----
> > From: Mark Miller [mailto:markrmiller@gmail.com]
> > Sent: Tuesday, May 07, 2013 9:35 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Solr Cloud with large synonyms.txt
> >
> >
> > On May 7, 2013, at 10:24 AM, Mark Miller <ma...@gmail.com> wrote:
> >
> >>
> >> On May 6, 2013, at 12:32 PM, Son Nguyen <so...@trancorp.com> wrote:
> >>
> >>> I did some researches on internet and found out that because Zookeeper
> znode size limit is 1MB. I tried to increase the system property
> "jute.maxbuffer" but it won't work.
> >>> Does anyone have experience of dealing with it?
> >>
> >> Perhaps hit up the ZK list? They doc it as simply raising
> jute.maxbuffer,
> though you have to do it for each ZK instance.
> >>
> >> - Mark
> >>
> >
> > "the system property must be set on all servers and clients otherwise
> problems will arise."
> >
> > Make sure you try passing it both to ZK *and* to Solr.
> >
> > - Mark
> >
>
>

RE: Solr Cloud with large synonyms.txt

Posted by David Parks <da...@yahoo.com>.
I can see your point, though I think edge cases would be one concern: if
someone *can* create a very large synonyms file, someone *will* create that
file.  What would you set the ZooKeeper max data size to be? 50MB? 100MB?
Someone is going to do something bad if there's nothing to tell them not to.
Today Solr Cloud just crashes if you try to create a modestly sized synonyms
file; clearly, at a minimum, some ZooKeeper settings should be configured out
of the box.  Any reasonable setting you come up with for ZooKeeper is
virtually guaranteed to fail for some percentage of users over a reasonably
sized user base (which Solr has).

What if I plugged in a 200MB synonyms file just for testing purposes (I
don't care about performance implications)?  I don't think most users would
catch the footnote in the docs that calls out a max synonyms file size.

Dave


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com] 
Sent: Tuesday, May 07, 2013 11:53 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Cloud with large synonyms.txt

I'm not so worried about the large file in zk issue myself.

The concern is that you start storing and accessing lots of large files in
ZK. This is not what it was made for, and everything stays in RAM, so they
guard against this type of usage.

We are talking about a config file that is loaded on Core load though. It's
uploaded and read very rarely. On modern hardware and networks, making that
file 5MB rather than 1MB is not going to ruin your day. It just won't. Solr
does not use ZooKeeper heavily - in a steady state cluster, it doesn't read
or write from ZooKeeper at all to any degree that registers. I'm going to
have to see problems loading these larger config files from ZooKeeper before
I'm worried that it's a problem.

- Mark

On May 7, 2013, at 12:21 PM, Son Nguyen <so...@trancorp.com> wrote:

> Mark,
> 
> I tried to set that property on both ZK (I have only one ZK instance) and
Solr, but it still didn't work.
> But I read somewhere that ZK is not really designed for keeping large data
files, so this solution - increasing jute.maxbuffer (if I can implement it)
should be just temporary.
> 
> Son
> 
> -----Original Message-----
> From: Mark Miller [mailto:markrmiller@gmail.com] 
> Sent: Tuesday, May 07, 2013 9:35 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Cloud with large synonyms.txt
> 
> 
> On May 7, 2013, at 10:24 AM, Mark Miller <ma...@gmail.com> wrote:
> 
>> 
>> On May 6, 2013, at 12:32 PM, Son Nguyen <so...@trancorp.com> wrote:
>> 
>>> I did some researches on internet and found out that because Zookeeper
znode size limit is 1MB. I tried to increase the system property
"jute.maxbuffer" but it won't work.
>>> Does anyone have experience of dealing with it?
>> 
>> Perhaps hit up the ZK list? They doc it as simply raising jute.maxbuffer,
though you have to do it for each ZK instance.
>> 
>> - Mark
>> 
> 
> "the system property must be set on all servers and clients otherwise
problems will arise."
> 
> Make sure you try passing it both to ZK *and* to Solr.
> 
> - Mark
> 


Re: Solr Cloud with large synonyms.txt

Posted by Mark Miller <ma...@gmail.com>.
I'm not so worried about the large file in zk issue myself.

The concern is that you start storing and accessing lots of large files in ZK. This is not what it was made for, and everything stays in RAM, so they guard against this type of usage.

We are talking about a config file that is loaded on Core load though. It's uploaded and read very rarely. On modern hardware and networks, making that file 5MB rather than 1MB is not going to ruin your day. It just won't. Solr does not use ZooKeeper heavily - in a steady state cluster, it doesn't read or write from ZooKeeper at all to any degree that registers. I'm going to have to see problems loading these larger config files from ZooKeeper before I'm worried that it's a problem.

- Mark

On May 7, 2013, at 12:21 PM, Son Nguyen <so...@trancorp.com> wrote:

> Mark,
> 
> I tried to set that property on both ZK (I have only one ZK instance) and Solr, but it still didn't work.
> But I read somewhere that ZK is not really designed for keeping large data files, so this solution - increasing jute.maxbuffer (if I can implement it) should be just temporary.
> 
> Son
> 
> -----Original Message-----
> From: Mark Miller [mailto:markrmiller@gmail.com] 
> Sent: Tuesday, May 07, 2013 9:35 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Cloud with large synonyms.txt
> 
> 
> On May 7, 2013, at 10:24 AM, Mark Miller <ma...@gmail.com> wrote:
> 
>> 
>> On May 6, 2013, at 12:32 PM, Son Nguyen <so...@trancorp.com> wrote:
>> 
>>> I did some researches on internet and found out that because Zookeeper znode size limit is 1MB. I tried to increase the system property "jute.maxbuffer" but it won't work.
>>> Does anyone have experience of dealing with it?
>> 
>> Perhaps hit up the ZK list? They doc it as simply raising jute.maxbuffer, though you have to do it for each ZK instance.
>> 
>> - Mark
>> 
> 
> "the system property must be set on all servers and clients otherwise problems will arise."
> 
> Make sure you try passing it both to ZK *and* to Solr.
> 
> - Mark
> 


RE: Solr Cloud with large synonyms.txt

Posted by Son Nguyen <so...@trancorp.com>.
Mark,

I tried to set that property on both ZK (I have only one ZK instance) and Solr, but it still didn't work.
But I read somewhere that ZK is not really designed for keeping large data files, so this solution of increasing jute.maxbuffer (if I can get it to work) should only be temporary.

Son

-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com] 
Sent: Tuesday, May 07, 2013 9:35 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Cloud with large synonyms.txt


On May 7, 2013, at 10:24 AM, Mark Miller <ma...@gmail.com> wrote:

> 
> On May 6, 2013, at 12:32 PM, Son Nguyen <so...@trancorp.com> wrote:
> 
>> I did some researches on internet and found out that because Zookeeper znode size limit is 1MB. I tried to increase the system property "jute.maxbuffer" but it won't work.
>> Does anyone have experience of dealing with it?
> 
> Perhaps hit up the ZK list? They doc it as simply raising jute.maxbuffer, though you have to do it for each ZK instance.
> 
> - Mark
> 

"the system property must be set on all servers and clients otherwise problems will arise."

Make sure you try passing it both to ZK *and* to Solr.

- Mark


Re: Solr Cloud with large synonyms.txt

Posted by Mark Miller <ma...@gmail.com>.
On May 7, 2013, at 10:24 AM, Mark Miller <ma...@gmail.com> wrote:

> 
> On May 6, 2013, at 12:32 PM, Son Nguyen <so...@trancorp.com> wrote:
> 
>> I did some researches on internet and found out that because Zookeeper znode size limit is 1MB. I tried to increase the system property "jute.maxbuffer" but it won't work.
>> Does anyone have experience of dealing with it?
> 
> Perhaps hit up the ZK list? They doc it as simply raising jute.maxbuffer, though you have to do it for each ZK instance.
> 
> - Mark
> 

"the system property must be set on all servers and clients otherwise problems will arise."

Make sure you try passing it both to ZK *and* to Solr.

- Mark
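[Editor's note: as a concrete, hedged illustration of setting the property on both sides, the sketch below uses a 4MB value, example install paths, and assumes Solr 4.x is started via the bundled Jetty's start.jar; all of those are assumptions for illustration, not recommendations from the thread.]

```
# ZooKeeper side: zkServer.sh sources conf/java.env if it exists,
# so the property can be added there (repeat on every ensemble member).
echo 'SERVER_JVMFLAGS="-Djute.maxbuffer=4194304"' >> /opt/zookeeper/conf/java.env

# Solr side (4.x with the bundled Jetty): pass the same value to the
# client JVM, since the limit is enforced on both ends of the connection.
java -Djute.maxbuffer=4194304 -DzkHost=zkhost:2181 \
     -Dbootstrap_confdir=./solr/collection1/conf \
     -jar start.jar
```

The value must be at least as large as the biggest znode you intend to write, and, per Mark's quote from the ZK docs, it must match on all servers and clients.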


Re: Solr Cloud with large synonyms.txt

Posted by Mark Miller <ma...@gmail.com>.
On May 6, 2013, at 12:32 PM, Son Nguyen <so...@trancorp.com> wrote:

> I did some researches on internet and found out that because Zookeeper znode size limit is 1MB. I tried to increase the system property "jute.maxbuffer" but it won't work.
> Does anyone have experience of dealing with it?

Perhaps hit up the ZK list? They doc it as simply raising jute.maxbuffer, though you have to do it for each ZK instance.

- Mark


RE: Solr Cloud with large synonyms.txt

Posted by Son Nguyen <so...@trancorp.com>.
Jan,

Thank you for your answer.
I've opened a JIRA issue with your suggestion.
https://issues.apache.org/jira/browse/SOLR-4793

Son

-----Original Message-----
From: Jan Høydahl [mailto:jan.asf@cominvent.com] 
Sent: Tuesday, May 07, 2013 4:16 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Cloud with large synonyms.txt

Hi,

SolrCloud is designed with an assumption that you should be able to upload your whole disk-based conf folder into ZK, and that you should be able to add an empty Solr node to a cluster and have it download all config from ZK. So a splitting strategy for large files, handled automatically by ZkSolrResourceLoader, could be one way forward, i.e. store synonyms.txt as e.g. __001_synonyms.txt, __002_synonyms.txt, and so on.

Feel free to open a JIRA issue for this so we can get a proper resolution.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 7 May 2013 at 09:55, Roman Chyla <ro...@gmail.com> wrote:

> We have synonym files bigger than 5MB so even with compression that
> would be probably failing (not using solr cloud yet)
> Roman
> On 6 May 2013 23:09, "David Parks" <da...@yahoo.com> wrote:
> 
>> Wouldn't it make more sense to only store a pointer to a synonyms 
>> file in zookeeper? Maybe just make the synonyms file accessible via 
>> http so other boxes can copy it if needed? Zookeeper was never meant 
>> for storing significant amounts of data.
>> 
>> 
>> -----Original Message-----
>> From: Jan Høydahl [mailto:jan.asf@cominvent.com]
>> Sent: Tuesday, May 07, 2013 4:35 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr Cloud with large synonyms.txt
>> 
>> See discussion here
>> http://lucene.472066.n3.nabble.com/gt-1MB-file-to-Zookeeper-td3958614.html
>> 
>> One idea was compression. Perhaps if we add gzip support to 
>> SynonymFilter it can read synonyms.txt.gz which would then fit larger 
>> raw dicts?
>> 
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> 
>> On 6 May 2013 at 18:32, Son Nguyen <so...@trancorp.com> wrote:
>> 
>>> Hello,
>>> 
>>> I'm building a Solr Cloud (version 4.1.0) with 2 shards and a 
>>> Zookeeper
>> (the Zookeeer is on different machine, version 3.4.5).
>>> I've tried to start with a 1.7MB synonyms.txt, but got a "ConnectionLossException":
>>> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /configs/solr1/synonyms.txt
>>>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>>>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>>>        at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1266)
>>>        at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:270)
>>>        at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:267)
>>>        at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
>>>        at org.apache.solr.common.cloud.SolrZkClient.setData(SolrZkClient.java:267)
>>>        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:436)
>>>        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:315)
>>>        at org.apache.solr.cloud.ZkController.uploadToZK(ZkController.java:1135)
>>>        at org.apache.solr.cloud.ZkController.uploadConfigDir(ZkController.java:955)
>>>        at org.apache.solr.core.CoreContainer.initZooKeeper(CoreContainer.java:285)
>>>        ... 43 more
>>> 
>>> I did some researches on internet and found out that because 
>>> Zookeeper
>> znode size limit is 1MB. I tried to increase the system property 
>> "jute.maxbuffer" but it won't work.
>>> Does anyone have experience of dealing with it?
>>> 
>>> Thanks,
>>> Son
>> 
>> 


Re: Solr Cloud with large synonyms.txt

Posted by Jan Høydahl <ja...@cominvent.com>.
Hi,

SolrCloud is designed with an assumption that you should be able to upload your whole disk-based conf folder into ZK, and that you should be able to add an empty Solr node to a cluster and have it download all config from ZK. So a splitting strategy for large files, handled automatically by ZkSolrResourceLoader, could be one way forward, i.e. store synonyms.txt as e.g. __001_synonyms.txt, __002_synonyms.txt, and so on.

Feel free to open a JIRA issue for this so we can get a proper resolution.
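[Editor's note: to make the idea concrete, here is a rough offline sketch of the same splitting strategy using GNU coreutils `split`; the chunk size, prefix, and `__0NN_synonyms.txt` naming are assumptions for illustration, not an actual ZkSolrResourceLoader convention.]

```shell
# Split a large synonyms.txt into numbered, line-aligned chunks,
# each small enough for ZooKeeper's default 1MB znode limit.
workdir=$(mktemp -d)
cd "$workdir"

# Build a sample synonyms.txt larger than 1MB (made-up entries).
seq 1 60000 | awk '{printf "term%d, synonym%d, alias%d\n", $1, $1, $1}' > synonyms.txt

# -C 900k: at most 900KB of whole lines per chunk, so no synonym
# mapping is ever cut in half; -d gives numeric suffixes.
split -C 900k -d --additional-suffix=_synonyms.txt synonyms.txt __0

ls __0*_synonyms.txt

# Concatenating the chunks in suffix order reproduces the original
# file byte for byte, which is what a loader would rely on.
cat __0*_synonyms.txt | cmp - synonyms.txt && echo "chunks reassemble cleanly"
```

Line-aligned splitting is the important part: each chunk remains a valid synonyms fragment on its own, and the set as a whole round-trips losslessly.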

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 7 May 2013 at 09:55, Roman Chyla <ro...@gmail.com> wrote:

> We have synonym files bigger than 5MB so even with compression that would
> be probably failing (not using solr cloud yet)
> Roman
> On 6 May 2013 23:09, "David Parks" <da...@yahoo.com> wrote:
> 
>> Wouldn't it make more sense to only store a pointer to a synonyms file in
>> zookeeper? Maybe just make the synonyms file accessible via http so other
>> boxes can copy it if needed? Zookeeper was never meant for storing
>> significant amounts of data.
>> 
>> 
>> -----Original Message-----
>> From: Jan Høydahl [mailto:jan.asf@cominvent.com]
>> Sent: Tuesday, May 07, 2013 4:35 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr Cloud with large synonyms.txt
>> 
>> See discussion here
>> http://lucene.472066.n3.nabble.com/gt-1MB-file-to-Zookeeper-td3958614.html
>> 
>> One idea was compression. Perhaps if we add gzip support to SynonymFilter
>> it
>> can read synonyms.txt.gz which would then fit larger raw dicts?
>> 
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> 
>> On 6 May 2013 at 18:32, Son Nguyen <so...@trancorp.com> wrote:
>> 
>>> Hello,
>>> 
>>> I'm building a Solr Cloud (version 4.1.0) with 2 shards and a Zookeeper
>> (the Zookeeer is on different machine, version 3.4.5).
>>> I've tried to start with a 1.7MB synonyms.txt, but got a "ConnectionLossException":
>>> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /configs/solr1/synonyms.txt
>>>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>>>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>>>        at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1266)
>>>        at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:270)
>>>        at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:267)
>>>        at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
>>>        at org.apache.solr.common.cloud.SolrZkClient.setData(SolrZkClient.java:267)
>>>        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:436)
>>>        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:315)
>>>        at org.apache.solr.cloud.ZkController.uploadToZK(ZkController.java:1135)
>>>        at org.apache.solr.cloud.ZkController.uploadConfigDir(ZkController.java:955)
>>>        at org.apache.solr.core.CoreContainer.initZooKeeper(CoreContainer.java:285)
>>>        ... 43 more
>>> 
>>> I did some researches on internet and found out that because Zookeeper
>> znode size limit is 1MB. I tried to increase the system property
>> "jute.maxbuffer" but it won't work.
>>> Does anyone have experience of dealing with it?
>>> 
>>> Thanks,
>>> Son
>> 
>> 


RE: Solr Cloud with large synonyms.txt

Posted by Roman Chyla <ro...@gmail.com>.
We have synonym files bigger than 5MB, so even with compression that would
probably be failing (we're not using Solr Cloud yet).
Roman
On 6 May 2013 23:09, "David Parks" <da...@yahoo.com> wrote:

> Wouldn't it make more sense to only store a pointer to a synonyms file in
> zookeeper? Maybe just make the synonyms file accessible via http so other
> boxes can copy it if needed? Zookeeper was never meant for storing
> significant amounts of data.
>
>
> -----Original Message-----
> From: Jan Høydahl [mailto:jan.asf@cominvent.com]
> Sent: Tuesday, May 07, 2013 4:35 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Cloud with large synonyms.txt
>
> See discussion here
> http://lucene.472066.n3.nabble.com/gt-1MB-file-to-Zookeeper-td3958614.html
>
> One idea was compression. Perhaps if we add gzip support to SynonymFilter
> it
> can read synonyms.txt.gz which would then fit larger raw dicts?
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> On 6 May 2013 at 18:32, Son Nguyen <so...@trancorp.com> wrote:
>
> > Hello,
> >
> > I'm building a Solr Cloud (version 4.1.0) with 2 shards and a Zookeeper
> (the Zookeeer is on different machine, version 3.4.5).
> > I've tried to start with a 1.7MB synonyms.txt, but got a "ConnectionLossException":
> > Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /configs/solr1/synonyms.txt
> >        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> >        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> >        at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1266)
> >        at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:270)
> >        at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:267)
> >        at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
> >        at org.apache.solr.common.cloud.SolrZkClient.setData(SolrZkClient.java:267)
> >        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:436)
> >        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:315)
> >        at org.apache.solr.cloud.ZkController.uploadToZK(ZkController.java:1135)
> >        at org.apache.solr.cloud.ZkController.uploadConfigDir(ZkController.java:955)
> >        at org.apache.solr.core.CoreContainer.initZooKeeper(CoreContainer.java:285)
> >        ... 43 more
> >
> > I did some researches on internet and found out that because Zookeeper
> znode size limit is 1MB. I tried to increase the system property
> "jute.maxbuffer" but it won't work.
> > Does anyone have experience of dealing with it?
> >
> > Thanks,
> > Son
>
>

RE: Solr Cloud with large synonyms.txt

Posted by David Parks <da...@yahoo.com>.
Wouldn't it make more sense to only store a pointer to a synonyms file in
zookeeper? Maybe just make the synonyms file accessible via http so other
boxes can copy it if needed? Zookeeper was never meant for storing
significant amounts of data.
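[Editor's note: a hedged sketch of the pointer approach David describes. The znode would hold only a tiny descriptor (URL plus checksum) while the file itself lives on any HTTP-reachable host; the URL, field names, and choice of SHA-256 below are invented for illustration and are not part of any Solr feature.]

```shell
workdir=$(mktemp -d)
cd "$workdir"

# The large file that would NOT go into ZooKeeper (made-up entries):
seq 1 60000 | awk '{printf "term%d, synonym%d\n", $1, $1}' > synonyms.txt
sum=$(sha256sum synonyms.txt | cut -d' ' -f1)

# The small descriptor that WOULD go into the znode instead:
printf 'url=http://config-host/conf/synonyms.txt\nsha256=%s\n' "$sum" > pointer.txt
wc -c pointer.txt    # a couple hundred bytes, far under the 1MB znode limit

# Each Solr node would fetch the URL (e.g. with curl -O) and verify
# the download against the checksum before loading it:
echo "$sum  synonyms.txt" | sha256sum -c -
```

The checksum matters: it lets every node confirm it loaded the same file the pointer was written for, which ZooKeeper would otherwise have guaranteed by holding the bytes itself.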


-----Original Message-----
From: Jan Høydahl [mailto:jan.asf@cominvent.com] 
Sent: Tuesday, May 07, 2013 4:35 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr Cloud with large synonyms.txt

See discussion here
http://lucene.472066.n3.nabble.com/gt-1MB-file-to-Zookeeper-td3958614.html

One idea was compression. Perhaps if we add gzip support to SynonymFilter it
can read synonyms.txt.gz which would then fit larger raw dicts?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 6 May 2013 at 18:32, Son Nguyen <so...@trancorp.com> wrote:

> Hello,
> 
> I'm building a Solr Cloud (version 4.1.0) with 2 shards and a Zookeeper
(the Zookeeer is on different machine, version 3.4.5).
> I've tried to start with a 1.7MB synonyms.txt, but got a "ConnectionLossException":
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /configs/solr1/synonyms.txt
>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>        at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1266)
>        at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:270)
>        at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:267)
>        at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
>        at org.apache.solr.common.cloud.SolrZkClient.setData(SolrZkClient.java:267)
>        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:436)
>        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:315)
>        at org.apache.solr.cloud.ZkController.uploadToZK(ZkController.java:1135)
>        at org.apache.solr.cloud.ZkController.uploadConfigDir(ZkController.java:955)
>        at org.apache.solr.core.CoreContainer.initZooKeeper(CoreContainer.java:285)
>        ... 43 more
> 
> I did some researches on internet and found out that because Zookeeper
znode size limit is 1MB. I tried to increase the system property
"jute.maxbuffer" but it won't work.
> Does anyone have experience of dealing with it?
> 
> Thanks,
> Son


Re: Solr Cloud with large synonyms.txt

Posted by Jan Høydahl <ja...@cominvent.com>.
See discussion here http://lucene.472066.n3.nabble.com/gt-1MB-file-to-Zookeeper-td3958614.html

One idea was compression. Perhaps if we add gzip support to SynonymFilter it can read synonyms.txt.gz which would then fit larger raw dicts?
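[Editor's note: a quick feasibility check of the compression idea. Nothing here touches SynonymFilter itself, which today cannot read .gz files; that is the proposed enhancement. The sample dictionary is made up, and synthetic repetitive data compresses far better than real dictionaries would.]

```shell
# Generate a sample dictionary of comma-separated synonym lines,
# then compare raw vs gzipped size against the 1MB znode limit.
workdir=$(mktemp -d)
cd "$workdir"

seq 1 60000 | awk '{printf "term%d, synonym%d, alias%d\n", $1, $1, $1}' > synonyms.txt
gzip -9 -c synonyms.txt > synonyms.txt.gz   # -c keeps the original file

raw=$(wc -c < synonyms.txt)
packed=$(wc -c < synonyms.txt.gz)
echo "raw=${raw} bytes, gzipped=${packed} bytes"

# Round trip: decompressing must reproduce the original exactly,
# which is what a gzip-aware SynonymFilter would rely on.
gunzip -c synonyms.txt.gz | cmp - synonyms.txt && echo "round trip OK"
```

Roman's caveat later in the thread still applies: for real dictionaries well past 5MB, even a good compression ratio may not get under the 1MB limit.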

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 6 May 2013 at 18:32, Son Nguyen <so...@trancorp.com> wrote:

> Hello,
> 
> I'm building a Solr Cloud (version 4.1.0) with 2 shards and a Zookeeper (the Zookeeer is on different machine, version 3.4.5).
> I've tried to start with a 1.7MB synonyms.txt, but got a "ConnectionLossException":
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /configs/solr1/synonyms.txt
>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>        at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1266)
>        at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:270)
>        at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:267)
>        at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
>        at org.apache.solr.common.cloud.SolrZkClient.setData(SolrZkClient.java:267)
>        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:436)
>        at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:315)
>        at org.apache.solr.cloud.ZkController.uploadToZK(ZkController.java:1135)
>        at org.apache.solr.cloud.ZkController.uploadConfigDir(ZkController.java:955)
>        at org.apache.solr.core.CoreContainer.initZooKeeper(CoreContainer.java:285)
>        ... 43 more
> 
> I did some researches on internet and found out that because Zookeeper znode size limit is 1MB. I tried to increase the system property "jute.maxbuffer" but it won't work.
> Does anyone have experience of dealing with it?
> 
> Thanks,
> Son