Posted to solr-user@lucene.apache.org by roySolr <ro...@gmail.com> on 2013/03/14 14:22:38 UTC
Advice: solrCloud + DIH
Hello,
I need some advice about my SolrCloud cluster and the DIH. I have a cluster
with 3 cloud servers. Every server runs a Solr instance and a ZooKeeper
instance. I start Solr with the -DzkHost parameter. It works great; I send
updates with curl (XML) like this:
curl http://ip:SOLRport/solr/update -H "Content-Type: text/xml" --data-binary \
  '<add><doc><field name="id">223232</field><field name="content">test</field></doc></add>'
Solr has 2 million docs in the index. Now I want an extra field: content2. I
added it to my schema and uploaded the config to the cluster again with
-Dbootstrap_confdir and -Dcollection.configName. It was replicated to the
whole cluster.
Now I need a re-index to add the field to every doc. I have a database with
all the data and want to use the DIH full-import (this is how I did it in
previous Solr versions). When I run it, it indexes at 3 docs/s (really
slow). When I run Solr standalone (not SolrCloud), it indexes at 600 docs/s.
What's the best way to do a full re-index with SolrCloud? Does SolrCloud
support DIH?
Thanks
--
View this message in context: http://lucene.472066.n3.nabble.com/Advice-solrCloud-DIH-tp4047339.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Advice: solrCloud + DIH
Posted by "Rollin.R.Ma (lab.sh04.Newegg) 41099" <Ro...@newegg.com>.
I get about 2000 docs/s, which is close to embedded Solr, and it can be tuned further.
Yes, you can find out which node is the leader, but you need to understand how the shards are partitioned.
Re: Advice: solrCloud + DIH
Posted by rulinma <ru...@gmail.com>.
Yes, you can find out which node is the leader, but you need to understand how the shards are partitioned.
Re: Advice: solrCloud + DIH
Posted by roySolr <ro...@gmail.com>.
Thanks for the support so far.
I was running the dataimport on a replica! Now I start it on the leader and
it indexes at 590 docs/s. I think all docs were going to another node and
then coming back.
Is there a way to find the leader? If there is, I can detect the leader with
a script and start the DIH every night on the right server.
Roy
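One possible shape for such a script, as a minimal Python sketch rather than a confirmed recipe. It assumes the cluster state is available as JSON in the Solr 4.x clusterstate.json layout (collection, then "shards", then "replicas", with the leader marked by "leader":"true"). How that JSON is fetched is left out on purpose, the sample data is made up, and the "/dataimport" handler path is an assumption that depends on solrconfig.xml.

```python
import json
from urllib.parse import urlencode


def find_leaders(cluster_state, collection):
    """Return {shard_name: replica_info} for each replica flagged as leader."""
    leaders = {}
    for shard_name, shard in cluster_state[collection]["shards"].items():
        for replica in shard.get("replicas", {}).values():
            # The flag is stored as the string "true", not a JSON boolean.
            if str(replica.get("leader", "")).lower() == "true":
                leaders[shard_name] = replica
    return leaders


def full_import_url(replica, handler="/dataimport"):
    """Build a DIH full-import URL for the core behind one replica.

    The handler path is an assumption; use whatever name the
    DataImportHandler is registered under in solrconfig.xml.
    """
    base = replica["base_url"].rstrip("/")
    query = urlencode({"command": "full-import"})
    return f"{base}/{replica['core']}{handler}?{query}"


# Hypothetical cluster state for illustration only (made-up addresses).
SAMPLE_STATE = json.loads("""
{"collection1": {"shards": {"shard1": {"replicas": {
  "core_node1": {"state": "active", "core": "collection1",
                 "node_name": "10.0.0.1:8983_solr",
                 "base_url": "http://10.0.0.1:8983/solr"},
  "core_node2": {"state": "active", "core": "collection1",
                 "node_name": "10.0.0.2:8983_solr",
                 "base_url": "http://10.0.0.2:8983/solr",
                 "leader": "true"}
}}}}}
""")

if __name__ == "__main__":
    for shard, replica in find_leaders(SAMPLE_STATE, "collection1").items():
        print(shard, full_import_url(replica))
```

Because leadership can move between nodes, the lookup would have to run immediately before each nightly import instead of being cached.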
Re: Advice: solrCloud + DIH
Posted by Mark Miller <ma...@gmail.com>.
On Mar 14, 2013, at 9:22 AM, roySolr <ro...@gmail.com> wrote:
> Hello,
>
> When I run it, it indexes at 3 docs/s (really
> slow). When I run Solr standalone (not SolrCloud), it indexes at 600 docs/s.
>
> What's the best way to do a full re-index with SolrCloud? Does SolrCloud
> support DIH?
>
> Thanks
>
SolrCloud supports DIH, but not fully and happily. DIH is set up to work pretty nicely with non-SolrCloud setups, where it loads pretty quickly. With SolrCloud a few things can happen. One is that you might be running DIH on a replica rather than a leader (and which node is the leader can change without your consent); in that case every doc goes to another node and then comes back. SolrCloud also really works best with multiple indexing threads, and DIH will only use one, to my knowledge.
Still, at 3 docs/s, something sounds wrong. That's too slow.
- Mark
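As an illustration of the multi-threaded alternative to DIH, here is a rough Python sketch, not something taken from this thread. It reads rows from any iterable, groups them into batches, renders each batch as the same <add> XML used in the curl example, and posts the batches from several threads. All function names, the batch size, and the worker count are hypothetical. The `send` callable is injected, so the HTTP part (`make_sender`) can be swapped out or left unused.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from xml.sax.saxutils import escape, quoteattr


def build_add_xml(docs):
    """Render a list of {field: value} dicts as one Solr <add> message."""
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            # Escape values so characters like < and & stay valid XML.
            parts.append(f"<field name={quoteattr(name)}>{escape(str(value))}</field>")
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)


def batches(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def make_sender(update_url):
    """Return a function that POSTs one XML message to a Solr update URL."""
    def send(xml):
        req = urllib.request.Request(
            update_url,
            data=xml.encode("utf-8"),
            headers={"Content-Type": "text/xml"},
        )
        with urllib.request.urlopen(req) as resp:
            return resp.status
    return send


def index_parallel(rows, send, batch_size=500, workers=4):
    """Send batches concurrently and return the number of batches sent.

    Executor.map submits every batch up front, so for very large inputs
    the rows should be fed in manageable slices.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda b: send(build_add_xml(b)),
                                batches(rows, batch_size)))
    return len(results)
```

A call might look like `index_parallel(rows_from_database, make_sender("http://ip:SOLRport/solr/update"))`, where `rows_from_database` stands for whatever iterator yields the database rows as dicts.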
Re: Advice: solrCloud + DIH
Posted by rulinma <ru...@gmail.com>.
3 docs/s is very low. In my test with 4 nodes I get more than 1000 docs/s
at about 4 KB per doc with SolrCloud. Every leader has a replica.
I am tuning it to reach 3000 docs/s. 3 docs/s is too slow.
Thanks!