Posted to solr-user@lucene.apache.org by subinalex <al...@gmail.com> on 2016/11/21 09:16:21 UTC

Re-Indexing 143 million rows

Hi Team,

I have indexed 143 million rows (docs) into Solr. It takes around 3 hours to
index. I used csvUpdateHandler and indexed the CSV file by remote streaming.
Now, when I re-index the same CSV data, it still takes 3+ hours.

Ideally, since there are no changes in the _id values, it should have finished
quickly, right?

Please provide some insights on this.

Regards,
Subin



--
View this message in context: http://lucene.472066.n3.nabble.com/Re-Indexing-143-million-rows-tp4306622.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Re-Indexing 143 million rows

Posted by subinalex <al...@gmail.com>.
Thanks a lot, Erick. :-)


Re: Re-Indexing 143 million rows

Posted by Erick Erickson <er...@gmail.com>.
In a word, "no". Resending the same document will

1> delete the old version (based on ID)
2> index the document just sent.

When a document comes in, Solr can't assume that
"nothing's changed". What if you changed your schema?

So I'd expect the second run to take at least as long as the first.
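
To make the two steps above concrete, here is a toy sketch in Python. It is
not Solr's actual code: ToyIndex, _analyze, and the counters are hypothetical
names made up for illustration. The only point is that an add for an existing
ID still does the full per-document work, plus a delete.

```python
class ToyIndex:
    """Toy stand-in for an index keyed on a unique ID (hypothetical, not Solr)."""

    def __init__(self):
        self.docs = {}          # id -> analyzed document
        self.analyze_calls = 0  # how many times full indexing work was done
        self.deletes = 0        # how many old versions were removed

    def _analyze(self, doc):
        # Stand-in for analyzing every field according to the current schema.
        self.analyze_calls += 1
        return {k: str(v).lower().split() for k, v in doc.items() if k != "id"}

    def add(self, doc):
        doc_id = doc["id"]
        # Step 1: delete the old version, based on ID only.
        if doc_id in self.docs:
            del self.docs[doc_id]
            self.deletes += 1
        # Step 2: index the document just sent. The old and new content are
        # never compared, so the work is the same as for a brand-new document.
        self.docs[doc_id] = self._analyze(doc)


rows = [{"id": str(i), "title": f"Row number {i}"} for i in range(1000)]
index = ToyIndex()

for row in rows:                 # first indexing run
    index.add(row)
first_run = index.analyze_calls

for row in rows:                 # re-index exactly the same rows
    index.add(row)
second_run = index.analyze_calls - first_run

print(first_run, second_run, index.deletes, len(index.docs))
```

In this sketch the second run does as much analysis work as the first and
adds one delete per document, while the number of stored documents stays the
same.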

Best,
Erick

On Mon, Nov 21, 2016 at 1:16 AM, subinalex <al...@gmail.com> wrote:
> Hi Team,
>
> I have indexed 143 million rows (docs) into Solr. It takes around 3 hours to
> index. I used csvUpdateHandler and indexed the CSV file by remote streaming.
> Now, when I re-index the same CSV data, it still takes 3+ hours.
>
> Ideally, since there are no changes in the _id values, it should have finished
> quickly, right?
>
> Please provide some insights on this.
>
> Regards,
> Subin