Posted to solr-user@lucene.apache.org by KRIS MUSSHORN <mu...@comcast.net> on 2016/10/05 19:17:19 UTC

bash to get doc count

Will someone please tell me why this stores the text "numDocs" instead of returning the number of docs in the core? 

#!/bin/bash 
DOC_COUNT=`wget -O- -q $SOLR_HOST'admin/cores?action=STATUS&core='$SOLR_CORE_NAME'&wt=json&indent=true' | grep numDocs | tr -d '0-9'` 

TIA 

Kris 

Re: bash to get doc count

Posted by Walter Underwood <wa...@gmail.com>.
If you have the jq command, that will be cleaner than using tr.

Also, you can get the number of documents with a query for *:* instead of using the admin API.

This same question was asked on Sep. 19th. This was my answer then.

====
Do a search. The URL will look something like this:

  /solr/core-name/select?q=*:*&rows=0&wt=json

That will return something like this:

  {"responseHeader":{"status":0,"QTime":1},"response":{"numFound":287176,"start":0,"docs":[]}}

Filter that response through this:

  jq .response.numFound

And you’ll get the number of documents in the core.
====
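
Put together in bash, the steps above might look like the sketch below. It runs jq over the sample response quoted above, so it needs no live server. The function name num_found is made up for illustration, and the commented-out wget line assumes $SOLR_HOST ends in "/solr/", as the original script implies.

```shell
#!/bin/bash
# Hypothetical helper: read a Solr select response on stdin, print numFound.
num_found() {
  jq '.response.numFound'
}

# Sample response copied from the message above.
sample='{"responseHeader":{"status":0,"QTime":1},"response":{"numFound":287176,"start":0,"docs":[]}}'

DOC_COUNT=$(printf '%s\n' "$sample" | num_found)
echo "$DOC_COUNT"

# Against a live server (assumption: SOLR_HOST looks like http://host:8983/solr/):
# DOC_COUNT=$(wget -O- -q "${SOLR_HOST}${SOLR_CORE_NAME}/select?q=*:*&rows=0&wt=json" | num_found)
```

rows=0 keeps the response small, since only the count is needed and no documents have to come back.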

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Oct 5, 2016, at 12:22 PM, Alan Woodward <al...@flax.co.uk> wrote:
> 
> tr -d '0-9' is removing all numbers from the line, which I’m guessing is the opposite of what you want?
> 
> Alan Woodward
> www.flax.co.uk
> 
> 
>> On 5 Oct 2016, at 20:17, KRIS MUSSHORN <mu...@comcast.net> wrote:
>> 
>> Will someone please tell me why this stores the text "numDocs" instead of returning the number of docs in the core? 
>> 
>> #!/bin/bash 
>> DOC_COUNT=`wget -O- -q $SOLR_HOST'admin/cores?action=STATUS&core='$SOLR_CORE_NAME'&wt=json&indent=true' | grep numDocs | tr -d '0-9'` 
>> 
>> TIA 
>> 
>> Kris 
> 


Re: bash to get doc count

Posted by Comcast <mu...@comcast.net>.
So what would be the right -?



Re: bash to get doc count

Posted by Alan Woodward <al...@flax.co.uk>.
tr -d '0-9' is removing all numbers from the line, which I’m guessing is the opposite of what you want?

Alan Woodward
www.flax.co.uk


> On 5 Oct 2016, at 20:17, KRIS MUSSHORN <mu...@comcast.net> wrote:
> 
> Will someone please tell me why this stores the text "numDocs" instead of returning the number of docs in the core? 
> 
> #!/bin/bash 
> DOC_COUNT=`wget -O- -q $SOLR_HOST'admin/cores?action=STATUS&core='$SOLR_CORE_NAME'&wt=json&indent=true' | grep numDocs | tr -d '0-9'` 
> 
> TIA 
> 
> Kris 


Re: bash to get doc count

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/5/2016 1:17 PM, KRIS MUSSHORN wrote:
> Will someone please tell me why this stores the text "numDocs" instead
> of returning the number of docs in the core? #!/bin/bash
> DOC_COUNT=`wget -O- -q
> $SOLR_HOST'admin/cores?action=STATUS&core='$SOLR_CORE_NAME'&wt=json&indent=true'
> | grep numDocs | tr -d '0-9'`

The "-d" option on the "tr" command means "delete" ... so the final
command in that pipe says "delete all numbers."  I think that is
probably the exact opposite of what you're trying to do.

If your "tr" command supports the "-c" option to operate on the
complement of the set, you can add that option before "-d".   Then
the command should delete everything *except* the numbers, and I think
it will do what you're after.  It seemed to work for me, at least.
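
To see the difference between the two forms without a Solr server, here is a small sketch that runs both on one line of text. The sample line is an assumption about what "grep numDocs" passes along from the indented JSON, and the number in it is only illustrative.

```shell
#!/bin/bash
# A line shaped like what `grep numDocs` might emit (illustrative value).
line='        "numDocs":287176,'

# -d alone deletes the digits, leaving only the label and punctuation.
without_digits=$(printf '%s\n' "$line" | tr -d '0-9')

# -c complements the set, so -cd deletes everything that is NOT a digit.
DOC_COUNT=$(printf '%s\n' "$line" | tr -cd '0-9')

echo "tr -d  gives: $without_digits"
echo "tr -cd gives: $DOC_COUNT"
```

One caveat: if grep matches more than one line, or the matching line contains other digits, "tr -cd" will run all of them together into a single number.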

Another suggestion you've gotten is to use jq to parse the JSON
directly.  That would be very effective, and would continue to work even
if Solr's indented JSON output format were to change.  It does assume
the presence of a tool that's less common, though.
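
For the admin API response itself, a jq version might look like the sketch below. The layout of the sample JSON (status, then the core name, then index, then numDocs) is my assumption about the CoreAdmin STATUS output and should be checked against a real response; the helper name core_num_docs and the core name "mycore" are made up. The helper also checks for jq first, since it may not be installed.

```shell
#!/bin/bash
# Hypothetical helper: read a CoreAdmin STATUS response on stdin and
# print numDocs for the core named in $1.
core_num_docs() {
  if ! command -v jq >/dev/null 2>&1; then
    echo "jq is not installed" >&2
    return 1
  fi
  # --arg passes the core name into the jq filter as the variable $core.
  jq --arg core "$1" '.status[$core].index.numDocs'
}

# Trimmed sample with an assumed layout and an illustrative count.
sample='{"responseHeader":{"status":0,"QTime":1},"status":{"mycore":{"name":"mycore","index":{"numDocs":287176,"maxDoc":287176}}}}'

DOC_COUNT=$(printf '%s\n' "$sample" | core_num_docs mycore)
echo "$DOC_COUNT"
```

Because the filter addresses the field by path, it does not depend on indent=true or on how the JSON is broken into lines.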

Thanks,
Shawn


Re: bash to get doc count

Posted by KRIS MUSSHORN <mu...@comcast.net>.
P.S. $SOLR_HOST and $SOLR_CORE_NAME are set correctly. 


Kris 
