Posted to solr-user@lucene.apache.org by karthik <km...@gmail.com> on 2011/06/13 22:39:13 UTC

Adding documents in a batch using Solrj

Hi Everyone,

I am trying to use Solrj to add documents to my Solr index. While experimenting with the implementation, I noticed that when we add documents to Solr in a batch, the response that comes back contains only a status and a QTime. I am using Solr 3.1 right now.
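
For illustration, here is a minimal sketch of what the add looks like on my side (assuming SolrJ 3.1's CommonsHttpSolrServer; the URL and the Item bean are just placeholders I made up for this example):

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.beans.Field;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.UpdateResponse;

    public class BatchAddSketch {

        // Made-up bean; the field names would have to match your schema.
        public static class Item {
            @Field public String id;
            @Field public String name;
            public Item(String id, String name) { this.id = id; this.name = name; }
        }

        public static void main(String[] args) throws Exception {
            // Placeholder URL; point it at your own Solr 3.1 instance.
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

            List<Item> batch = new ArrayList<Item>();
            for (int i = 1; i <= 100; i++) {
                batch.add(new Item(String.valueOf(i), "document " + i));
            }

            UpdateResponse rsp = server.addBeans(batch);
            // All I get back is the overall status and the elapsed time;
            // there is no per-document information in the response.
            System.out.println("status=" + rsp.getStatus() + " qtime=" + rsp.getQTime());

            server.commit();
        }
    }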

I came across the following scenario that I would like to handle carefully:

When one of the documents in the batch causes an exception, the documents after it don't make it into the index. For example, if 100 documents are being added and doc 56 has a problem (due to schema restrictions, etc.), then docs 57-100 never reach the index. Even to get docs 1-55 indexed, I have to place the commit outside the exception-handling block around the addBeans() call, roughly as in the sketch below.
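
Concretely, the only way I've found to keep docs 1-55 is something like this (continuing the sketch above; I catch a broad Exception here because the exact exception type depends on how the add fails):

    try {
        server.addBeans(batch);   // blows up part-way through if e.g. doc 56 is rejected
    } catch (Exception e) {
        // Docs after the bad one never reach Solr; the ones before it
        // are in the index but uncommitted at this point.
        System.err.println("batch add failed part-way through: " + e.getMessage());
    }
    // Commit outside the try/catch so the documents that did make it in are kept.
    server.commit();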

In this scenario I would like Solr (or Solrj) to return the IDs of the documents that were indexed successfully, or the IDs of the ones that failed. I would also like documents 57-100 to be processed rather than dropped abruptly because doc 56 had an issue.
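
The only client-side workaround I can think of right now is to retry the batch one document at a time when the bulk add fails, so that I can at least collect the IDs that were rejected and still index the rest. A rough sketch, again using the made-up Item bean from above (re-adding the documents that already went in should be harmless as long as the schema has a uniqueKey, since they just get overwritten):

    List<String> failedIds = new ArrayList<String>();
    try {
        server.addBeans(batch);
    } catch (Exception batchFailure) {
        // Fall back to one-by-one adds so a single bad document
        // doesn't take the rest of the batch down with it.
        for (Item item : batch) {
            try {
                server.addBean(item);
            } catch (Exception docFailure) {
                failedIds.add(item.id);
            }
        }
    }
    server.commit();
    System.out.println("rejected ids: " + failedIds);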

I'm not sure whether there is a way to get these details or this behavior right now. If there isn't, I can try to take a crack at developing a patch, though I would need a lot more help in that case ;-)

Thanks in advance.

-- karthik

Re: Adding documents in a batch using Solrj

Posted by Erick Erickson <er...@gmail.com>.
Have fun. Note that the intent is to have the logging/record-keeping in the superclass (whose name escapes me), and each update type should be able to use that...

Best
Erick


Re: Adding documents in a batch using Solrj

Posted by karthik <km...@gmail.com>.
Thanks Erick. Will certainly take a look.

I am looking to do this for binary objects, since that is what I started with.

-- karthik


Re: Adding documents in a batch using Solrj

Posted by Erick Erickson <er...@gmail.com>.
Take a look at SOLR-445; I started down this road a while ago but then got distracted. If you'd like to pick it up and take it further, feel free. I haven't applied that patch in a while, so I don't know how easily it will apply.

When I last left it, it did much of what you're asking for with XML documents fed to Solr, and I was going to get to some of the other input types but haven't yet. That is what committing it was waiting on.

Best
Erick
