Posted to solr-user@lucene.apache.org by ku3ia <de...@gmail.com> on 2012/11/08 11:19:13 UTC

Replicated zookeeper

Hi!

I'm trying to set up SolrCloud with a replicated ZooKeeper ensemble, but I've
run into a problem.

I'm using Jetty 8 (not embedded), ZooKeeper 3.3.6, SolrCloud 4.0 built from
the branch, and Ubuntu 12.04 LTS.
My configs are:

Four Jetty instances running on ports 8080, 8081, 8082 and 8083

Jetty1.sh:
JAVA_OPTIONS="$JAVA_OPTIONS \
    -Djava.util.logging.config.file=$JETTY_HOME/etc/logging.properties \
    -XX:+DisableExplicitGC \
    -XX:PermSize=96M -XX:MaxPermSize=96M -Xmx512M -Xms512M \
    -XX:NewSize=96M -XX:MaxNewSize=96M \
    -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled \
    -XX:CMSInitiatingOccupancyFraction=50 -XX:GCTimeRatio=9 \
    -XX:MinHeapFreeRatio=25 -XX:MaxHeapFreeRatio=25 \
    -verbose:gc -XX:+PrintGCTimeStamps -Xloggc:$JETTY_HOME/logs/gc.log \
    -Dsolr.solr.home=/opt/search4/solr/1 \
    -Dbootstrap_confdir=/opt/search4/solr/1/collection1/conf \
    -Dcollection.configName=sm -DnumShards=2 \
    -DzkHost=10.112.1.2:2181,10.112.1.2:2182,10.112.1.2:2183"

Jetty2.sh (3 and 4 are the same except for the solr.solr.home value):
JAVA_OPTIONS="$JAVA_OPTIONS \
    -Djava.util.logging.config.file=$JETTY_HOME/etc/logging.properties \
    -XX:+DisableExplicitGC \
    -XX:PermSize=96M -XX:MaxPermSize=96M -Xmx512M -Xms512M \
    -XX:NewSize=96M -XX:MaxNewSize=96M \
    -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled \
    -XX:CMSInitiatingOccupancyFraction=50 -XX:GCTimeRatio=9 \
    -XX:MinHeapFreeRatio=25 -XX:MaxHeapFreeRatio=25 \
    -verbose:gc -XX:+PrintGCTimeStamps -Xloggc:$JETTY_HOME/logs/gc.log \
    -Dsolr.solr.home=/opt/search4/solr/2 \
    -DzkHost=10.112.1.2:2181,10.112.1.2:2182,10.112.1.2:2183"

My solr.xml files:
solr.xml (8080 port)
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="collection1"
host="10.112.1.2" hostPort="8080" hostContext="${hostContext:}"
zkClientTimeout="${zkClientTimeout:15000}">
    <core name="collection1" instanceDir="collection1" shard="shard1" />
  </cores>
</solr>

solr.xml (8081 port)
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="collection1"
host="10.112.1.2" hostPort="8081" hostContext="${hostContext:}"
zkClientTimeout="${zkClientTimeout:15000}">
    <core name="collection1" instanceDir="collection1" shard="shard2" />
  </cores>
</solr>

solr.xml (8082 port)
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="collection1"
host="10.112.1.2" hostPort="8082" hostContext="${hostContext:}"
zkClientTimeout="${zkClientTimeout:15000}">
    <core name="collection1" instanceDir="collection1" shard="shard1" />
  </cores>
</solr>

solr.xml (8083 port)
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="collection1"
host="10.112.1.2" hostPort="8083" hostContext="${hostContext:}"
zkClientTimeout="${zkClientTimeout:15000}">
    <core name="collection1" instanceDir="collection1" shard="shard2" />
  </cores>
</solr>

My ZooKeeper configs (identical except for dataDir and clientPort):
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/search4/zookeeper/1/data
clientPort=2181

# zookeeper ensemble
server.1=10.112.1.2:2888:3888
server.2=10.112.1.2:2889:3889
server.3=10.112.1.2:2890:3890
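
Since the three configs differ only in dataDir and clientPort, they could be generated from one template. This is a minimal sketch, not the setup actually used here. The client ports 2181 to 2183 come from the zkHost value above. The data directories for instances 2 and 3, the zoo.cfg location, and the function name write_zoo_cfg are assumptions made for illustration.

```shell
#!/bin/sh
# Write zoo.cfg for instance $2 of the ensemble under base directory $1.
# Layout assumption (hypothetical): <base>/<n>/data holds the data,
# <base>/<n>/zoo.cfg holds the config, and client ports count up from 2181.
write_zoo_cfg() {
    base=$1
    n=$2
    mkdir -p "$base/$n/data"
    cat > "$base/$n/zoo.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=$base/$n/data
clientPort=$((2180 + n))

# zookeeper ensemble
server.1=10.112.1.2:2888:3888
server.2=10.112.1.2:2889:3889
server.3=10.112.1.2:2890:3890
EOF
}

# Example (not run automatically):
#   for n in 1 2 3; do write_zoo_cfg /opt/search4/zookeeper "$n"; done
```

Because all three servers share one IP address, each server.N line needs its own pair of quorum and election ports, as in the config above.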

I put a myid file in the dataDir of each ZooKeeper instance, started all
three, and then started Jetty.
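
The myid step, plus a quick check that the ensemble actually formed a quorum, can be sketched as follows. Each myid file holds only the number N from the matching server.N line. The directory layout for instances 2 and 3, the helper names, and the use of nc (netcat) are assumptions. The "stat" four-letter command is a standard ZooKeeper command whose output includes a "Mode:" line.

```shell
#!/bin/sh
# Write a myid file (containing just the server number) into each data dir.
# Layout assumption (hypothetical): <base>/<n>/data for n = 1..count.
write_myids() {
    base=$1
    count=$2
    n=1
    while [ "$n" -le "$count" ]; do
        mkdir -p "$base/$n/data"
        echo "$n" > "$base/$n/data/myid"
        n=$((n + 1))
    done
}

# Extract the "Mode:" value (leader, follower or standalone) from the output
# of ZooKeeper's "stat" four-letter command, read on stdin.
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# Once the servers are started, report the mode of each client port.
# Assumes nc is installed; prints "unreachable" when no mode comes back.
check_ensemble() {
    host=$1
    shift
    for port in "$@"; do
        mode=$(printf 'stat' | nc -w 2 "$host" "$port" 2>/dev/null | zk_mode)
        echo "$host:$port ${mode:-unreachable}"
    done
}

# Example (not run automatically):
#   write_myids /opt/search4/zookeeper 3
#   check_ensemble 10.112.1.2 2181 2182 2183
```

A healthy three-node ensemble should report one leader and two followers. A server that reports standalone is not reading its server.N lines.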

Everything looks fine and SolrCloud runs normally: I have two leaders, on
ports 8080 (shard1) and 8081 (shard2). But when I shut down the first JVM
(port 8080), the Solr instance in the third JVM doesn't become the leader,
and I see these errors in its log (third JVM, port 8082):

Nov 08, 2012 11:00:40 AM org.apache.solr.cloud.ShardLeaderElectionContext waitForReplicasToComeUp
INFO: Waiting until we see more replicas up: total=2 found=1 timeoutin=118104
Nov 08, 2012 11:00:41 AM org.apache.solr.cloud.RecoveryStrategy doRecovery
INFO: Starting Replication Recovery. core=collection1
Nov 08, 2012 11:00:41 AM org.apache.solr.client.solrj.impl.HttpClientUtil createClient
INFO: Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
Nov 08, 2012 11:00:41 AM org.apache.solr.common.SolrException log
SEVERE: Error while trying to recover. core=collection1:org.apache.solr.client.solrj.SolrServerException: Server refused connection at: http://10.112.1.2:8080/solr
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:406)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
        at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:199)
        at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:388)
        at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:220)
Caused by: org.apache.http.conn.HttpHostConnectException: Connection to http://10.112.1.2:8080 refused
        at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:158)
        at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:150)
        at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:121)
        at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:575)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:425)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
        ... 4 more
Caused by: java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
        at java.net.Socket.connect(Socket.java:579)
        at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:123)
        at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:148)
        ... 12 more

Nov 08, 2012 11:00:41 AM org.apache.solr.cloud.RecoveryStrategy doRecovery
SEVERE: Recovery failed - trying again... core=collection1
Nov 08, 2012 11:00:41 AM org.apache.solr.cloud.ShardLeaderElectionContext waitForReplicasToComeUp
INFO: Waiting until we see more replicas up: total=2 found=1 timeoutin=117601
Nov 08, 2012 11:00:41 AM org.apache.solr.cloud.ShardLeaderElectionContext waitForReplicasToComeUp
INFO: Waiting until we see more replicas up: total=2 found=1 timeoutin=117098
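
The repeated INFO lines above show the election on this node waiting for a second replica (total=2, found=1) while the timeoutin value counts down. To follow that countdown in a longer log, the three numbers can be pulled out with a small filter. This is only a sketch: the function name election_waits is made up, and it assumes grep supports the -o option (GNU and BSD grep do).

```shell
#!/bin/sh
# Print "total found timeoutin" for each "Waiting until we see more replicas
# up" message on stdin. Newlines are flattened first so that messages wrapped
# across two lines (as in mail archives) are still matched.
election_waits() {
    tr '\n' ' ' |
        grep -oE 'total=[0-9]+ +found=[0-9]+ +timeoutin=[0-9]+' |
        sed -E 's/[a-z]+=//g; s/ +/ /g'
}

# Example (not run automatically; the log path is hypothetical):
#   election_waits < "$JETTY_HOME/logs/solr.log"
```

Fed the log excerpt above, the filter prints one line per message: "2 1 118104", "2 1 117601" and "2 1 117098".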

But when I run the non-replicated, embedded ZooKeeper, there are no errors in
the logs.
When I turn off the second JVM app (8082), see the attachment:

<http://lucene.472066.n3.nabble.com/file/n4018984/Untitled.png> 

All shards have empty segments, numFound=0.
Please advise what I'm doing wrong.

Thanks




--
View this message in context: http://lucene.472066.n3.nabble.com/Replicated-zookeeper-tp4018984.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Replicated zookeeper

Posted by ku3ia <de...@gmail.com>.
>>When I turn off second JVM app (8082) - see attach
A small mistake: I'm turning off the second JVM app, but its port is 8081, not
8082. The attachment is correct.



--
View this message in context: http://lucene.472066.n3.nabble.com/Replicated-zookeeper-tp4018984p4019020.html