Posted to solr-user@lucene.apache.org by stetogias <st...@gmail.com> on 2012/03/08 15:02:10 UTC

Understanding update handler statistics

Hi,

I am trying to understand the update handler statistics.
This is what I have:

commits : 2824
autocommit maxDocs : 10000
autocommit maxTime : 1000ms
autocommits : 41
optimizes : 822
rollbacks : 0
expungeDeletes : 0
docsPending : 0
adds : 0
deletesById : 0
deletesByQuery : 0
errors : 0
cumulative_adds : 17457
cumulative_deletesById : 1959
cumulative_deletesByQuery : 0
cumulative_errors : 0 

My problem is with the cumulative part.

If, for instance, I commit after each add and delete operation, then
the sum of cumulative_adds, cumulative_deletes* and cumulative_errors
should match the commits number. Is that right?

Another question: are these stats counted since the Solr instance started or
since the update handler started? As far as I understand, the two can differ.

and from this part:
docsPending : 0
adds : 0
deletesById : 0
deletesByQuery : 0
errors : 0

I understand that if I had docsPending I should also have pending adds
and pending deletes*, but how could I have errors?

thanks
stelios


--
View this message in context: http://lucene.472066.n3.nabble.com/Understanding-update-handler-statistics-tp3809743p3809743.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Understanding update handler statistics

Posted by stetogias <st...@gmail.com>.
OK, after doing some tests it seems the "normal" stats (adds/deletes) count
operations that are still pending, and the cumulative stats count everything
since the instance came up.

What you said about the error stats also makes sense.
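
The behaviour observed in these tests can be pictured with a small toy model. This is only a sketch of the counters as described in this thread, not Solr's actual code; the class name `UpdateStats` and its methods are hypothetical, and only the counter names are taken from the stats page.

```python
from dataclasses import dataclass, field

# Counter names as shown on the stats page; everything else here is a toy model.
PENDING_KEYS = ("adds", "deletesById", "deletesByQuery", "errors")


@dataclass
class UpdateStats:
    commits: int = 0
    pending: dict = field(default_factory=lambda: dict.fromkeys(PENDING_KEYS, 0))
    cumulative: dict = field(default_factory=lambda: dict.fromkeys(PENDING_KEYS, 0))

    def record(self, key: str, count: int = 1) -> None:
        """Count an operation in both the pending and the cumulative counter."""
        self.pending[key] += count
        self.cumulative[key] += count

    def commit(self) -> None:
        """A commit clears the pending counters; the cumulative ones keep growing."""
        self.commits += 1
        for key in self.pending:
            self.pending[key] = 0


stats = UpdateStats()
stats.record("adds", 500)
stats.record("deletesById", 20)
stats.commit()
stats.record("adds", 3)  # added after the commit, so still pending

print(stats.pending["adds"], stats.cumulative["adds"], stats.commits)  # prints: 3 503 1
```

In this model a stats page showing adds : 0 next to a large cumulative_adds simply means that nothing has been added since the last commit.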

thanks
stelios

On 8 March 2012 15:58, Shawn Heisey-4 [via Lucene] <ml-node+s472066n3809864h53@n3.nabble.com> wrote:

> I'm fairly sure that adds and deletes refer to the number of documents
> added or deleted.  You can have many documents added and/or deleted for
> each commit.  I would not expect the sums to match, unless you are
> adding or deleting only one document at a time and doing a commit after
> every one.  I hope you're not doing that, unless you're using trunk with
> the near-realtime feature and doing soft commits, with which I have no
> experience.  Normally doing a commit after every document would be too
> much of a load for good performance, unless there is a relatively long
> time period between each add or delete.
>
> Your question about errors - that probably tracks the number of times
> that the update handler returned an error response, though I don't
> really know.  If I'm right, then that number, like commits, has little
> to do with the number of documents.
>
> Thanks,
> Shawn



Importing dynamicField data on the fly

Posted by Mark Beeby <mb...@cambridge.org>.
Hello Everyone, 

I'm trying to work out whether, and how, dynamicFields can be imported
from a dynamic data source through the DataImportHandler configuration.
Currently the DataImportHandler configuration file requires me to name, in
advance, every single field I want to map, but at that stage I do not
necessarily know the full set of dynamic fields.

Here's my example schema.xml dynamic field definition:

    <dynamicField name="*_sortable" type="alphaOnlySort" indexed="true" stored="true"/>
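
For readers unfamiliar with dynamic fields, the sketch below illustrates how a pattern such as `*_sortable` selects field names. It assumes, as I understand the schema rules, that the wildcard may appear only at the start or the end of the pattern; the function `matches_dynamic_field` and the sample field names are hypothetical and this is not Solr's own matching code.

```python
def matches_dynamic_field(field_name: str, pattern: str) -> bool:
    """Return True if field_name matches a pattern with one leading or trailing '*'."""
    if pattern.startswith("*"):
        # "*_sortable" matches any name ending in "_sortable"
        return field_name.endswith(pattern[1:])
    if pattern.endswith("*"):
        # "attr_*" matches any name starting with "attr_"
        return field_name.startswith(pattern[:-1])
    # No wildcard: only an exact name matches
    return field_name == pattern


for name in ("title_sortable", "author_sortable", "title"):
    print(name, matches_dynamic_field(name, "*_sortable"))
```

Only the first two names match, so any new `*_sortable` field arriving from a source would be accepted by the schema without a schema change.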

My DataImportHandler import configuration file looks like this:

    <dataConfig>
        <dataSource name="Gateway1Source" type="HttpDataSource"
                    baseUrl="http://acproplatforms.internal/feeds.xml" encoding="UTF-8"
                    connectionTimeout="15000" readTimeout="15000"/>
        <document name="feeds">
            <entity name="feed" processor="XPathEntityProcessor"
                    stream="true" forEach="/gateway/feedItem/" url="">
                <field column="type" xpath="/gateway/feedItem/type"/>
                ...
            </entity>
        </document>
    </dataConfig>

I have looked, very optimistically, at Script Transformers
(transformer="script:importDynamics"), specifically hoping that the row
passed to the transformer function would hold the dynamic field content.
That was obviously wishful thinking: if those fields had made it into the
row, they would already have fallen through to the index.

Has anyone managed to import into dynamic fields without knowing in advance
what they were going to be in the data source?
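
To make the goal concrete, here is a minimal sketch of the behaviour being asked for, done outside DataImportHandler in a small client-side script. It is not a DataImportHandler feature. The sample feed is hypothetical: only the /gateway/feedItem/type path comes from the configuration above, and the other element names and the helper `item_to_doc` are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the /gateway/feedItem shape; all element names
# other than "type" are invented for this illustration.
SAMPLE_FEED = """
<gateway>
  <feedItem>
    <type>journal</type>
    <title_sortable>Example Journal</title_sortable>
    <publisher_sortable>Example Press</publisher_sortable>
    <internalNote>not wanted</internalNote>
  </feedItem>
</gateway>
"""


def item_to_doc(item, suffix="_sortable", fixed=("type",)):
    """Keep the known fixed fields plus every child whose name ends with suffix."""
    doc = {}
    for child in item:
        if child.tag in fixed or child.tag.endswith(suffix):
            doc[child.tag] = (child.text or "").strip()
    return doc


root = ET.fromstring(SAMPLE_FEED)
docs = [item_to_doc(item) for item in root.findall("feedItem")]
print(docs)
```

The field names are discovered from the data itself, so a new `*_sortable` element in a source needs no change to the mapping.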

To give you an idea of why I want this: we have an application aggregating
web services from many sources. Some of them contain patterns of fields that
I know we'll want, and whose data types I know, but new fields matching those
patterns are added quite frequently. Apart from the field mappings here, it
seems Solr has already done the hard work needed to achieve this!

Kindest Regards,
Mark 





Re: Understanding update handler statistics

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/8/2012 7:02 AM, stetogias wrote:
> Hi,
>
> Trying to understand the update handler statistics
> so I have this:
>
> commits : 2824
> autocommit maxDocs : 10000
> autocommit maxTime : 1000ms
> autocommits : 41
> optimizes : 822
> rollbacks : 0
> expungeDeletes : 0
> docsPending : 0
> adds : 0
> deletesById : 0
> deletesByQuery : 0
> errors : 0
> cumulative_adds : 17457
> cumulative_deletesById : 1959
> cumulative_deletesByQuery : 0
> cumulative_errors : 0
>
> my problem is with the cumulative part.
>
> If for instance I am doing a commit after each add and delete operation then
> the sum of cumulative_adds plus
> cumulative_deletes plus cumulative_errors should match the commit number.
> is that right?
> And another question, these stats are since SOLR instance startup or since
> update handler startup, these
> can differ as far as I understand...
>
> and from this part:
> docsPending : 0
> adds : 0
> deletesById : 0
> deletesByQuery : 0
> errors : 0
>
> I understand that if I had docsPending I should have adds(pending)
> deletes*(pending) but how could I have errors...

I'm fairly sure that adds and deletes refer to the number of documents 
added or deleted.  You can have many documents added and/or deleted for 
each commit.  I would not expect the sums to match, unless you are 
adding or deleting only one document at a time and doing a commit after 
every one.  I hope you're not doing that, unless you're using trunk with 
the near-realtime feature and doing soft commits, with which I have no 
experience.  Normally doing a commit after every document would be too 
much of a load for good performance, unless there is a relatively long 
time period between each add or delete.
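
The point about documents versus commits can be checked with a little arithmetic. The sketch below uses the figures from the original post; the helper `simulate` is hypothetical and only counts operations, it does not talk to Solr.

```python
import math
from typing import Tuple


def simulate(num_docs: int, docs_per_commit: int) -> Tuple[int, int]:
    """Return (documents_added, commits) when committing every docs_per_commit documents."""
    commits = math.ceil(num_docs / docs_per_commit)
    return num_docs, commits


# Figures reported in the original post.
reported = {
    "commits": 2824,
    "cumulative_adds": 17457,
    "cumulative_deletesById": 1959,
    "cumulative_deletesByQuery": 0,
    "cumulative_errors": 0,
}

operations = sum(v for k, v in reported.items() if k.startswith("cumulative_"))
print(operations)                                   # 19416
print(operations == reported["commits"])            # False
print(round(operations / reported["commits"], 1))   # 6.9 operations per commit on average

print(simulate(1000, 1))    # (1000, 1000): one commit per document, counts match
print(simulate(1000, 100))  # (1000, 10): batched, far fewer commits than documents
```

The counts only line up in the one-document-per-commit case, which is why the sums in the reported stats are not expected to match the commits figure.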

Your question about errors - that probably tracks the number of times 
that the update handler returned an error response, though I don't 
really know.  If I'm right, then that number, like commits, has little 
to do with the number of documents.

Thanks,
Shawn