Posted to solr-user@lucene.apache.org by Swetha Shenoy <ss...@gmail.com> on 2012/06/13 23:48:44 UTC

Regarding number of documents

Hi,

I have a data config file that contains the data import query. If I run
the import query directly against MySQL, I get a certain number of results.
I assumed that a full-import would add the same number of documents to the
index, but that is not the case: the number of documents added to the index
is lower than the number of rows in the MySQL query result. Can anyone tell
me whether my assumption is correct, and why the number of documents would
be off?

Thanks,
Swetha

Re: Regarding number of documents

Posted by Swetha Shenoy <ss...@gmail.com>.
Thanks all, for your inputs.

We found the problem. Certain entries were missing from the index, but not
from the MySQL query results, because we had some custom transformers in
the data config that skipped entries when a particular field was missing.
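
For anyone hitting the same symptom, one rough way to estimate how many rows
such a transformer would drop is to count the source rows where the field is
missing. The sketch below is only a toy model in Python, not DIH code; the
field name `title` and the sample rows are made up for illustration.

```python
# Toy model (not DIH code): count source rows that a "skip when field is
# missing" transformer would drop. Field name and rows are hypothetical.

def skip_if_missing(row, required_field):
    """Return the row unchanged, or None to signal that it should be skipped."""
    value = row.get(required_field)
    if value is None or value == "":
        return None
    return row

rows = [
    {"id": 1, "title": "first"},
    {"id": 2, "title": None},      # field is NULL  -> skipped
    {"id": 3},                     # field absent   -> skipped
    {"id": 4, "title": "fourth"},
]

indexed = [r for r in rows if skip_if_missing(r, "title") is not None]
skipped = len(rows) - len(indexed)
print(f"rows from query: {len(rows)}, indexed: {len(indexed)}, skipped: {skipped}")
```

If the skipped count matches the gap between the MySQL row count and the
number of indexed documents, the transformer is a likely cause.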

Re: Regarding number of documents

Posted by Erick Erickson <er...@gmail.com>.
Here's a quick thing to check. Delete your index and do a fresh import. Then
go to the admin statistics page and check the "numDocs" and "maxDoc" entries.
If they're different, it means that some of your documents have been deleted.

Deleted you say? What's that about? Well, if more than one record has the
same <uniqueKey> (see schema.xml), then the first doc is overwritten by the
second. But this is really a delete of the old doc followed by an add.

NOTE: This won't show any difference if you optimize, so don't optimize for this
test.

The fact that the document count isn't changing even after you add new
entries probably means you're indexing documents with the same <uniqueKey>.

Hope this helps
Erick
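
The overwrite behaviour described above can be illustrated with a small toy
model. This is a sketch in Python, not Solr code: the class `ToyIndex` and the
sample keys are hypothetical, and it only mimics the idea that re-adding an
existing key is a delete of the old document followed by an add.

```python
# Toy model of overwrite-by-uniqueKey (not Solr code). An add whose key already
# exists marks the old document as deleted and appends the new one.

class ToyIndex:
    def __init__(self):
        self.slots = []   # every document ever added, including deleted ones
        self.live = {}    # unique key -> position of the live document

    def add(self, key, doc):
        # For an existing key this replaces the live pointer, so the old
        # document stays in `slots` but is no longer counted as live.
        self.live[key] = len(self.slots)
        self.slots.append(doc)

    @property
    def num_docs(self):   # live documents only
        return len(self.live)

    @property
    def max_doc(self):    # live plus deleted documents
        return len(self.slots)

index = ToyIndex()
for key in ["a", "b", "a", "c", "b"]:   # hypothetical keys; "a" and "b" repeat
    index.add(key, {"id": key})

print(index.num_docs, index.max_doc)
```

Five adds with only three distinct keys leave the two counters different,
which is the signal to look for on the statistics page.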

Re: Regarding number of documents

Posted by Swetha Shenoy <ss...@gmail.com>.
I am running a full-import. DIH reported that 1125 documents were added
after indexing. This number did not change even after I added the new
entries.

How do I check the ID for an entry and query it against Solr?

Re: Regarding number of documents

Posted by Gora Mohanty <go...@mimirtech.com>.
On 14 June 2012 04:51, Swetha Shenoy <ss...@gmail.com> wrote:
> That makes sense. But I added a new entry that showed up in the MySQL
> results and not in the Solr search results. The count of documents also did
> not increase after the addition. How can a new entry show up in MySQL
> results and not as a new document?

Sorry, but this is not very clear: Are you running a
full-import, or a delta-import, after adding the new
entry in MySQL? By any chance, does the new entry
have an ID that already exists in the Solr index?

What is the number of records that DIH reports
after an import is completed?

Regards,
Gora

Re: Regarding number of documents

Posted by Jack Krupansky <ja...@basetechnology.com>.
Check the ID for that latest record and try to query it in Solr.
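
As a rough sketch of what such a lookup could look like, the Python snippet
below builds a query URL for a single ID. The base URL, the unique key field
name `id`, and the ID value are all assumptions; substitute the values from
your own setup and schema.xml.

```python
from urllib.parse import urlencode

def solr_id_query_url(base_url, id_field, id_value):
    """Build a select URL that looks up one document by its unique key."""
    # Quote the value so IDs containing special characters are matched literally.
    params = {"q": f'{id_field}:"{id_value}"', "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"

# Hypothetical host, field name, and ID value.
url = solr_id_query_url("http://localhost:8983/solr", "id", "1126")
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client,
shows whether a document with that ID is in the index.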

One way to get multiple records with the same key from an RDBMS query is via
a join. In that case, several of the records could have the same value in the
column(s) that you are using for your unique key field in Solr.
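
A quick way to test for this is to compare the total row count of the import
query with the count of distinct key values. The sketch below uses an
in-memory SQLite database from Python purely for illustration (the original
setup is MySQL); the tables `item` and `tag` and their contents are made up.

```python
import sqlite3

# In-memory database with hypothetical tables: one item can have many tags.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tag  (item_id INTEGER, tag TEXT);
    INSERT INTO item VALUES (1, 'one'), (2, 'two');
    INSERT INTO tag  VALUES (1, 'x'), (1, 'y'), (2, 'z');
""")

# The join repeats item 1 once per tag, so rows outnumber distinct IDs.
join = "FROM item JOIN tag ON tag.item_id = item.id"
total_rows = conn.execute(f"SELECT COUNT(*) {join}").fetchone()[0]
distinct_ids = conn.execute(f"SELECT COUNT(DISTINCT item.id) {join}").fetchone()[0]
print(total_rows, distinct_ids)
```

If the two numbers differ for the real import query, rows that share a key
would collapse into a single document when indexed.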

-- Jack Krupansky


Re: Regarding number of documents

Posted by Swetha Shenoy <ss...@gmail.com>.
That makes sense. But I added a new entry that showed up in the MySQL
results and not in the Solr search results. The count of documents also did
not increase after the addition. How can a new entry show up in MySQL
results and not as a new document?

Re: Regarding number of documents

Posted by Afroz Ahmad <ah...@gmail.com>.
Could it be that you are getting records that are not unique? If so, then
Solr would just overwrite the non-unique documents.

Thanks
Afroz

Re: Regarding number of documents

Posted by Swetha Shenoy <ss...@gmail.com>.
Note: I don't see any errors in the logs when I run the import.
