Posted to solr-user@lucene.apache.org by MitchK <mi...@web.de> on 2010/03/29 18:40:12 UTC

Absolutely empty resultset regardless of what I am searching for

Hello guys,

my analysis.jsp shows me the right results. That means, everything seems to
be parsed the right way and there are some matches.

However, when I try this live, there are never any matching documents. When I
check whether there is anything in my index, I get the expected result:
everything is indexed.

What am I doing wrong here?

An example looks like:
select/?indent=on&debugQuery=on&q=introduction&start=0&rows=10

The result looks like:
-------------------
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">16</int>
    <lst name="params">
      <str name="debugQuery">on</str>
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">introduction</str>
      <str name="rows">10</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
  <lst name="debug">
    <str name="rawquerystring">introduction</str>
    <str name="querystring">introduction</str>
    <str name="parsedquery">title:introduction</str>
    <str name="parsedquery_toString">title:introduction</str>
    <lst name="explain"/>
    <str name="QParser">LuceneQParser</str>
    <lst name="timing">
      <double name="time">0.0</double>
      <lst name="prepare">
        <double name="time">0.0</double>
        <lst name="org.apache.solr.handler.component.QueryComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.FacetComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.HighlightComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.StatsComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.DebugComponent">
          <double name="time">0.0</double>
        </lst>
      </lst>
      <lst name="process">
        <double name="time">0.0</double>
        <lst name="org.apache.solr.handler.component.QueryComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.FacetComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.HighlightComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.StatsComponent">
          <double name="time">0.0</double>
        </lst>
        <lst name="org.apache.solr.handler.component.DebugComponent">
          <double name="time">0.0</double>
        </lst>
      </lst>
    </lst>
  </lst>
</response>
--------------------
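
The two values that matter in this dump are numFound (0) and parsedquery
(title:introduction). A minimal sketch of pulling them out with Python's
standard library, assuming the response has been saved as a string; the sample
is a trimmed copy of the response above, and the helper name summarize is made
up for illustration:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response posted above.
SAMPLE_RESPONSE = """<response>
  <lst name="responseHeader">
    <int name="status">0</int>
  </lst>
  <result name="response" numFound="0" start="0"/>
  <lst name="debug">
    <str name="parsedquery">title:introduction</str>
  </lst>
</response>"""


def summarize(response_xml):
    """Return (numFound, parsedquery) from a Solr XML response string."""
    root = ET.fromstring(response_xml)
    num_found = int(root.find("result").get("numFound"))
    # parsedquery is only present when debugQuery=on was requested.
    parsed = root.find("lst[@name='debug']/str[@name='parsedquery']")
    return num_found, (parsed.text if parsed is not None else None)


num_found, parsed_query = summarize(SAMPLE_RESPONSE)
print(num_found, parsed_query)
```

The parsed query shows which field an unqualified term is searched in, which
is the first thing to compare against the field that was actually populated.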

Thank you!
-- 
View this message in context: http://n3.nabble.com/Absolutely-empty-resultset-regardless-of-what-I-am-searching-for-tp683866p683866.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Absolutely empty resultset regardless of what I am searching for

Posted by MitchK <mi...@web.de>.
The problem was an incorrect incrementToken implementation.

Now TermsComponent as well as Luke show the expected responses for every
field.

However, what could be wrong when some terms of a field are not searchable?
Here is my query: solr/select/?q=titleSemantic:Me

Let me show you a response of TermsComponent:

------------------------------------------------
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="terms">
    <lst name="titleSemantic">
      <int name="contemplat">1</int>
      <int name="me">1</int>
      <int name="snakes">1</int>
      <int name="te">1</int>
      <int name="your">1</int>
    </lst>
  </lst>
</response>
---------------------------------
The schema.xml:

 <field name="titleProcessed" type="Processed" indexed="true" stored="false"/>
 <field name="titleSynonyms" type="Synonym" indexed="true" stored="false"/>
 <field name="titleSemantic" type="Semantic" indexed="true" stored="false"/>

 <uniqueKey>ID</uniqueKey>

 <defaultSearchField>titleMain</defaultSearchField>
---------------------------------
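
One thing worth checking here, offered as an assumption rather than a
diagnosis: the indexed term is the lowercase "me", while the query asks for
"Me". If the query-side analysis of the Semantic type does not lowercase, the
term would be looked up verbatim and miss. A small sketch that compares a
query term with the terms listed by TermsComponent (the sample string copies
the response above; indexed_terms is a hypothetical helper):

```python
import xml.etree.ElementTree as ET

# Copy of the TermsComponent response posted above (header omitted).
TERMS_RESPONSE = """<response>
  <lst name="terms">
    <lst name="titleSemantic">
      <int name="contemplat">1</int>
      <int name="me">1</int>
      <int name="snakes">1</int>
      <int name="te">1</int>
      <int name="your">1</int>
    </lst>
  </lst>
</response>"""


def indexed_terms(response_xml, field):
    """Map each indexed term of `field` to its document frequency."""
    root = ET.fromstring(response_xml)
    path = "lst[@name='terms']/lst[@name='%s']/int" % field
    return {node.get("name"): int(node.text) for node in root.findall(path)}


terms = indexed_terms(TERMS_RESPONSE, "titleSemantic")
for candidate in ("Me", "me"):
    # An exact, case-sensitive comparison, as a term lookup would do.
    print(candidate, candidate in terms)
```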

Thank you!
This is the first time I am setting up a Solr server on my own, so sorry if
some of my questions are stupid ;).

Kind regards
- Mitch
-- 
View this message in context: http://n3.nabble.com/Absolutely-empty-resultset-regardless-of-what-I-am-searching-for-tp683866p686081.html

Re: Absolutely empty resultset regardless of what I am searching for

Posted by MitchK <mi...@web.de>.
I have tried another thing: I skipped the problematic ID 56 and changed the
query to SELECT... FROM... WHERE ID BETWEEN 60 AND 100. Now it works for
rows 60-83.

That means my title field has indexed the terms of rows 60-83.
Row 84 contains no problematic characters.

What a bother!
-- 
View this message in context: http://n3.nabble.com/Absolutely-empty-resultset-regardless-of-what-I-am-searching-for-tp683866p685724.html

Re: Absolutely empty resultset regardless of what I am searching for

Posted by MitchK <mi...@web.de>.
So, the CSV update fails. I have no idea what's wrong with that file, but it
complains that something is wrongly encapsulated. It seems to be the same
error as the one mentioned here:
http://markmail.org/message/lo3vztzezyuc6h3w#query:solr%20csv%20invalid%20char%20between%20encapsulated%20token%20end%20delimiter+page:1+mid:lo3vztzezyuc6h3w+state:results

However, this doesn't help and is material for another topic. I only mention
it because I said I would try it.

Being busy and confused yesterday, I sent you outdated schema.xml
information, sorry.
I should drink some more coffee, right?
There is one modification to the xml-schema. It's a custom filter, a special
variant of the stopword-filter.
This filter cuts the input-phrases in a special way. That means, whenever a
stopword occurs, it deletes words related to the stopword as well.
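
To make that behaviour concrete, here is a toy model of such a filter. It is
not the actual filter: a Python generator stands in for a Lucene TokenFilter,
and "words related to the stopword" is assumed, purely for illustration, to
mean the single word that follows it. The function name and the stopword set
are made up.

```python
def drop_stopwords_and_neighbors(tokens, stopwords):
    """Yield tokens, skipping every stopword and the token right after it.

    Assumption: "related" means "the next token". The loop always consumes
    the whole input, so tokens after a removed span are still emitted.
    """
    skip_next = False
    for token in tokens:
        is_stop = token.lower() in stopwords
        if is_stop or skip_next:
            # A stopword arms the skip for the following token;
            # a skipped non-stopword disarms it again.
            skip_next = is_stop
            continue
        yield token


title = "contemplating the snakes of your life".split()
print(list(drop_stopwords_and_neighbors(title, {"the", "of"})))
```

The point of the sketch is the control flow: dropping a token must not end the
stream, it only moves on to the next input token.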

Since I commented out this filter, the search works as one would expect
without my custom filter.
I know that some of you do consulting for Lucene/Solr. Have you ever seen
such an error?
I mean, I don't know much about the architecture of lucene/solr and I am
absolutely unable to reproduce the error on my own, since I don't know where
to start.

The only thing I have got is that nothing after the 55th row is searchable.
Furthermore: if I change my sql-statement to something like WHERE ID between
56 AND 100, ONLY the 56th line is searchable.

Thank you very much!
Mitch
-- 
View this message in context: http://n3.nabble.com/Absolutely-empty-resultset-regardless-of-what-I-am-searching-for-tp683866p685697.html

Re: Absolutely empty resultset regardless of what I am searching for

Posted by Erik Hatcher <er...@gmail.com>.
On Mar 30, 2010, at 2:14 AM, MitchK wrote:

>
> Excuse my language-correctness yesterday - I think one can see the long
> trying-to-fix-a-bug-work? :)
>
> Erick, I have found out how to delete the whole index - with the help of
> some http-requests.
> Afterwards I have started to reindex the data again - and Solr used the
> newest schema-information after a restart.
> However, the problem is the same.
>
> The most confusing thing is, that searching over the unique-ID works great.
>
> Here is some information about custom changes in the solr-config. Everything
> else is similar to the solrconfig of the example-directory.
>
>  <requestHandler name="/dataimport/mysql"
> class="org.apache.solr.handler.dataimport.DataImportHandler">
>    <lst name="defaults">
>      <str name="config">db-data-config.xml</str>
>    </lst>
>  </requestHandler>
>
> The db-data-config.xml
> <dataConfig>
>    <dataSource driver="com.mysql.jdbc.Driver"
> url="jdbc:mysql://localhost:3306/testdb" user="testuser" password="password"
> batchSize="-1"/>
>    <document>
>            <entity name="defaults" query="SELECT ID,Title FROM testdaten">
>            <field column="title" name="titleProcessed" />

I believe the column is case-sensitive, so you'll need to say  
column="Title".

	Erik
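
A toy model of why the case would matter, assuming (as suggested above) that
the column name is matched case-sensitively against what the SELECT returns.
This is not DataImportHandler code; the row contents and the helper map_row
are made up for illustration.

```python
# One row as it might come back for "SELECT ID,Title FROM testdaten".
row = {"ID": 56, "Title": "an example title"}

# column -> Solr field name, as written in the posted db-data-config.xml
field_mapping = {"title": "titleProcessed", "ID": "ID"}


def map_row(row, mapping):
    """Copy each mapped column into its Solr field; columns not found in the row are skipped."""
    return {field: row[column] for column, field in mapping.items() if column in row}


print(map_row(row, field_mapping))  # "title" does not match "Title"

# Same mapping with the column spelled exactly as in the SELECT.
fixed_mapping = {"Title": "titleProcessed", "ID": "ID"}
print(map_row(row, fixed_mapping))
```

Under that assumption the document would still be indexed and findable by ID,
but its titleProcessed field would stay empty.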


Re: Absolutely empty resultset regardless of what I am searching for

Posted by MitchK <mi...@web.de>.
Please excuse my poor English yesterday. I think it showed how long I had
been trying to fix this bug :)

Erick, I have found out how to delete the whole index with the help of
some HTTP requests.
Afterwards I reindexed the data, and Solr used the newest schema
information after a restart.
However, the problem is the same.
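
For reference, a sketch of what such requests can look like. The post does
not say which requests were used; this shows the standard Solr XML update
messages for delete-by-query and commit. The host, port and path in
UPDATE_URL are assumptions, and the requests are only built, not sent.

```python
import urllib.request

UPDATE_URL = "http://localhost:8983/solr/update"  # assumption: default example setup


def update_request(xml_body):
    """Build (but do not send) a POST request carrying a Solr XML update message."""
    return urllib.request.Request(
        UPDATE_URL,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


delete_all = update_request("<delete><query>*:*</query></delete>")
commit = update_request("<commit/>")

# Sending is left commented out so that running this sketch changes nothing:
# for request in (delete_all, commit):
#     urllib.request.urlopen(request).close()
print(delete_all.get_method(), delete_all.full_url)
```

The delete only becomes visible to searches after the commit.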

The most confusing thing is that searching by the unique ID works fine.

Here is some information about custom changes in the solr-config. Everything
else is similar to the solrconfig of the example-directory.

  <requestHandler name="/dataimport/mysql"
class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">db-data-config.xml</str>
    </lst>
  </requestHandler>

The db-data-config.xml
<dataConfig>
    <dataSource driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost:3306/testdb"
                user="testuser" password="password"
                batchSize="-1"/>
    <document>
        <entity name="defaults" query="SELECT ID,Title FROM testdaten">
            <field column="title" name="titleProcessed" />
            <field column="ID" name="ID" />
        </entity>
    </document>
</dataConfig>

Kind regards,
- Mitch
-- 
View this message in context: http://n3.nabble.com/Absolutely-empty-resultset-regardless-of-what-I-am-searching-for-tp683866p685200.html

Re: Absolutely empty resultset regardless of what I am searching for

Posted by MitchK <mi...@web.de>.
I have now used TermsComponent to check what is really indexed.

Well, one title field has only a few terms indexed (as I mentioned
earlier, it only covers up to 55 rows of the RDBMS), while the other
fields (which are based on the same filter, but with another
special-word.txt) index every term.
However, regardless of which field I search on, it makes no
difference: every row after the 55th is unsearchable.

Any suggestions would be great!

If I can't solve the problem, I will try to export the whole data as csv and
try it again, although I don't think that this will help, because the stored
fields store the expected values...
-- 
View this message in context: http://n3.nabble.com/Absolutely-empty-resultset-regardless-of-what-I-am-searching-for-tp683866p684679.html

Re: Absolutely empty resultset regardless of what I am searching for

Posted by MitchK <mi...@web.de>.
Luke now reports:
The top terms of my synonyms field have a frequency of up to 800,000,
while my processed title has a maximum frequency of 7...
What the hell???

However, I can't search for any of the top synonyms.
I am only able to search within the first 55 documents of my index.

What might be wrong when analysis.jsp shows the right results but the
real index does not?
-- 
View this message in context: http://n3.nabble.com/Absolutely-empty-resultset-regardless-of-what-I-am-searching-for-tp683866p684418.html

Re: Absolutely empty resultset regardless of what I am searching for

Posted by MitchK <mi...@web.de>.
I used this page to import the data from my database:
 solr/admin/dataimport.jsp?handler=/dataimport
I have restarted my Solr server a few times and re-imported the data
many times.
Furthermore, I have tried to delete everything with the help of the post.jar
from the tutorial.
I noticed that it deletes only a few thousand documents instead
of emptying the whole index.
This was the last thing I did. Now I am reindexing again.

I have a unique ID, called ID; it is the primary key of my
database table.
Perhaps I am misunderstanding your post, but what do you mean by "a
unique key that is replacing documents"?

Thank you
- Mitch
-- 
View this message in context: http://n3.nabble.com/Absolutely-empty-resultset-regardless-of-what-I-am-searching-for-tp683866p684387.html

Re: Absolutely empty resultset regardless of what I am searching for

Posted by Erick Erickson <er...@gmail.com>.
Perhaps a silly question, but did you recreate your index after you made
your schema changes? Or did you delete a bunch of documents in the meantime?
Or do you have a unique key defined in your schema that is replacing
documents? The fact that Luke is giving you unexpected results is a red flag
that your index isn't in the state you *think* it's in....
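
To illustrate the unique-key point with a toy model (a plain dict standing in
for the index, not Solr code; the documents are made up): adding a document
whose key already exists replaces the earlier one, so the number of documents
sent can be larger than the number that ends up in the index.

```python
def add_documents(index, docs, unique_key="ID"):
    """Add docs to a dict-based toy index; a repeated key overwrites the earlier doc."""
    for doc in docs:
        index[doc[unique_key]] = doc
    return index


index = {}
sent = [
    {"ID": 1, "title": "first"},
    {"ID": 2, "title": "second"},
    {"ID": 1, "title": "first, sent again"},  # same key as the first document
]
add_documents(index, sent)
print(len(sent), "sent,", len(index), "in the index")
```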

Best
Erick

On Mon, Mar 29, 2010 at 1:13 PM, MitchK <mi...@web.de> wrote:

>
> EDIT:
> The query I showed was not the one I meant... please excuse me, I have tested a
> lot and I am a little bit confused :-).
>
> The right query is, of course:
>
> select/?q=titleProcessed:life&start=0&rows=10&indent=on
> --
> View this message in context:
> http://n3.nabble.com/Absolutely-empty-resultset-regardless-of-what-I-am-searching-for-tp683866p684350.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: Absolutely empty resultset regardless of what I am searching for

Posted by MitchK <mi...@web.de>.
EDIT:
The query I showed was not the one I meant... please excuse me, I have tested a
lot and I am a little bit confused :-).

The right query is, of course:

select/?q=titleProcessed:life&start=0&rows=10&indent=on 
-- 
View this message in context: http://n3.nabble.com/Absolutely-empty-resultset-regardless-of-what-I-am-searching-for-tp683866p684350.html

Re: Absolutely empty resultset regardless of what I am searching for

Posted by MitchK <mi...@web.de>.
Hoss,

thank you for your response.

/select?q=*:*
This returns results as expected. 

I have found the reason why introduction didn't match: a wrong copyField.
*rolleyes*
However, this seems to bring more problems to light: now the first few
rows from my database seem to be searchable, but the rest are not.
The thing is, I have two stored (as well as indexed) fields: ID and
title.
If I search for the ID of a document that I can't find via its title, it
produces a match. If I search for the title, it returns nothing.

Is there any way to see what exactly is indexed?
Luke seems to report wrong results: it says that "life" is one of
the most frequent terms (398 times) in my index, but if I search for "life"
(sounds great, doesn't it?) it returns only ONE match.

select/?q=titleProcessed:live&start=0&rows=10&indent=on

Here is my schema.xml:
Please note that I have made a modification: titleProcessed means the
same as "title" from my first post. The mistake is NOT that title is now a
string type.

<field name="title" type="string" indexed="true" stored="true"/>
<field name="synonymTitle" type="Synonym" indexed="true" stored="false"/>
<field name="titleProcessed" type="text" indexed="true" stored="false"/>



<copyField source="title" dest="titleProcessed"/> 
<copyField source="title" dest="titleSynonym"/> 
<copyField source="title" dest="titleProcessed"/> 

<fieldType name="Synonym" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="Synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="0"
            catenateNumbers="0"
            catenateAll="0"
            splitOnCaseChange="1"/>
  </analyzer>
</fieldType>

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

---------------
Could there be a problem because the fields are already tokenized?

Kind regards
- Mitch
-- 
View this message in context: http://n3.nabble.com/Absolutely-empty-resultset-regardless-of-what-I-am-searching-for-tp683866p684344.html

Re: Absolutely empty resultset regardless of what I am searching for

Posted by Chris Hostetter <ho...@fucit.org>.
: my analysis.jsp shows me the right results. That means, everything seems to
: be parsed the right way and there are some matches.

analysis.jsp can tell you what the tokens will look like *if* a document is
indexed with the current config -- but it doesn't know if there are any
documents in your index, or if you changed the config after indexing.

what does /select?q=*:*  return?
how about /admin/luke?fl=title   ?
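
A small sketch of building those two requests as full URLs. The base URL is
an assumption (the default example setup); the paths and parameters are the
ones asked about above, and solr_url is a made-up helper name.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8983/solr"  # assumption: default example setup


def solr_url(path, **params):
    """Join BASE_URL, a handler path and URL-encoded query parameters."""
    # safe="*:" keeps the match-all query readable instead of percent-encoding it.
    return "%s%s?%s" % (BASE_URL, path, urlencode(params, safe="*:"))


match_all_url = solr_url("/select", q="*:*")   # does the index contain any documents?
luke_url = solr_url("/admin/luke", fl="title")  # which terms are indexed for "title"?
print(match_all_url)
print(luke_url)
```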

: select/?indent=on&debugQuery=on&q=introduction&start=0&rows=10
		...
: <str name="parsedquery">title:introduction</str>

... i assume "title" is in fact the field you expect "introduction" to 
match on?

what does your schema.xml look like?, etc...

	http://wiki.apache.org/solr/UsingMailingLists


-Hoss