Posted to solr-user@lucene.apache.org by "Buttler, David" <bu...@llnl.gov> on 2012/08/15 02:13:00 UTC

Duplicated facet counts in solr 4 beta: user error

Here are my steps:

1)      Download apache-solr-4.0.0-BETA

2)      Untar into a directory

3)      cp -r example example2

4)      cp -r example exampleB

5)      cp -r example example2B

6)      cd example;  java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar

7)      cd example2; java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar

8)      cd exampleB; java -Djetty.port=8900 -DzkHost=localhost:9983 -jar start.jar

9)      cd example2B; java -Djetty.port=7500 -DzkHost=localhost:9983 -jar start.jar

10)   cd example/exampledocs; java -Durl=http://localhost:8983/solr/collection1/update -jar post.jar *.xml

http://localhost:8983/solr/collection1/select?q=*:*&wt=xml&fq=cat:%22electronics%22
14 results returned

This is correct.  Now let's try a slightly more circuitous route by running through the Solr tutorial first.


1)      Download apache-solr-4.0.0-BETA

2)      Untar into a directory

3)      cd example; java  -jar start.jar

4)      cd example/exampledocs; java -Durl=http://localhost:8983/solr/collection1/update -jar post.jar *.xml

5)      kill jetty server

6)      cp -r example example2

7)      cp -r example exampleB

8)      cp -r example example2B

9)      cd example;  java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar

10)   cd example2; java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar

11)   cd exampleB; java -Djetty.port=8900 -DzkHost=localhost:9983 -jar start.jar

12)   cd example2B; java -Djetty.port=7500 -DzkHost=localhost:9983 -jar start.jar

13)   cd example/exampledocs; java -Durl=http://localhost:8983/solr/collection1/update -jar post.jar *.xml

With the same query as above, 22 results are returned.

Looking at this, it is fairly clear what is happening: the index built during the tutorial was copied along with the example directory and was not cleaned up before running the cloud examples.

Adding the debug=query parameter to the query URL produces the following:
<lst name="debug">
<str name="rawquerystring">*:*</str>
<str name="querystring">*:*</str>
<str name="parsedquery">MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">*:*</str>
<str name="QParser">LuceneQParser</str>
<arr name="filter_queries">
<str>cat:"electronics"</str>
</arr>
<arr name="parsed_filter_queries">
<str>cat:electronics</str>
</arr>
</lst>

So, Erick's diagnosis is correct: pilot error.  However, the straightforward path through the tutorial and on to SolrCloud makes it easy to make this mistake. Maybe a small warning on the SolrCloud page would help?
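
One way to avoid the stale index is to clear the tutorial's index data before starting the cloud nodes. The following is only a sketch: the clean_index function name is made up for illustration, and it assumes the index lives under solr/collection1/data inside each example directory, which should be checked against your own checkout. Stop all Jetty instances before running it.

```shell
#!/bin/sh
# Remove index data left behind by the single-node tutorial run so that
# each copied example directory starts SolrCloud with an empty index.
# ASSUMPTION: the index lives in <dir>/solr/collection1/data; verify this
# path in your own checkout before deleting anything.
clean_index() {
    for dir in "$@"; do
        data_dir="$dir/solr/collection1/data"
        if [ -d "$data_dir" ]; then
            rm -rf "$data_dir"
            echo "removed $data_dir"
        else
            echo "no index data in $dir"
        fi
    done
}

# Usage, from the directory that contains the example directories:
# clean_index example example2 exampleB example2B
```

Only the data directory is removed; the conf directory that -Dbootstrap_confdir points at is left alone.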

Now, running a delete operation fixes things:
cd example/exampledocs;
java -Dcommit=false -Ddata=args -jar post.jar "<delete><query>*:*</query></delete>"
causes the number of results to be zero.  So, let's reload the data:
java -Durl=http://localhost:8983/solr/collection1/update -jar post.jar *.xml
now the number of results for our query
http://localhost:8983/solr/collection1/select?q=*:*&wt=xml&fq=cat:%22electronics%22
is back to the correct 14 results.
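
To compare the counts without reading the XML by eye, the numFound attribute can be pulled out of the response on the command line. This is a rough sketch, not part of the Solr distribution: num_found is a hypothetical helper name, and it assumes the XML response carries numFound="N" on its result element.

```shell
#!/bin/sh
# Print the numFound value from a Solr XML response read on standard input.
# Only the first match is printed.
num_found() {
    sed -n 's/.*numFound="\([0-9][0-9]*\)".*/\1/p' | head -n 1
}

# Usage against a running node (URL taken from the steps above; rows=0 is
# added here only to skip returning the documents themselves):
# curl -s 'http://localhost:8983/solr/collection1/select?q=*:*&wt=xml&rows=0&fq=cat:%22electronics%22' | num_found
```

Appending distrib=false to the same URL on each port (8983, 7574, 8900, 7500) should show what each individual core holds, which makes leftover documents easier to spot.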

Dave

PS: apologies for hijacking the thread earlier.

Re: Duplicated facet counts in solr 4 beta: user error

Posted by Erick Erickson <er...@gmail.com>.
I see you updated the Wiki, thanks!

Erick


Re: Duplicated facet counts in solr 4 beta: user error

Posted by Erick Erickson <er...@gmail.com>.
No problem, and thanks for posting the resolution....

If you have the time and energy, anyone can edit the Wiki if you
create a logon, so any clarification you'd like to provide to keep
others from having this problem would be most welcome!

Best
Erick
