Posted to solr-user@lucene.apache.org by Andreas Niekler <an...@informatik.uni-leipzig.de> on 2013/05/28 11:10:23 UTC

Paging with all Hits

Hello,

I indexed some monographs with Solr. Within each document I have a
multi-valued field where I store the paragraphs. When I search for a
specific term within the monographs, I get the whole monograph as a
result object. The individual hits can be accessed via the highlight
component. This prevents server-side paging over all the hits within one
monograph.

Is there a way to page results within a multi-valued field instead of
over whole documents? Can I show each single value of a multi-valued
field as a result?

Or can I page the highlighted results (all of them) without showing the
documents?

Thank you very much

-- 
Andreas Niekler, Dipl. Ing. (FH)
NLP Group | Department of Computer Science
University of Leipzig
Johannisgasse 26 | 04103 Leipzig

mail: aniekler@informatik.uni-leipzig.de

Re: Paging with all Hits

Posted by Jack Krupansky <ja...@basetechnology.com>.
:)

-- Jack Krupansky


Re: Paging with all Hits

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
<counter-rant>
I feel that the strength of Jack's rant is somewhat unprovoked by
the original question. I also feel that the rant itself is worth being
printed and framed :-)

But more than anything else, I feel that supposedly-known limitations
of Solr/Lucene are not actually exposed all that much. Certainly, for
myself, I did not see those iron-clad BEWARE OF THE DRAGONS signs
anywhere on the Wiki or otherwise. I feel that they are more like Zen
aspects that one learns by reading between the lines of various forum
messages and by thinking through the presentations such as Adrian
Trenaman's (on Gilt's experience).

Maybe the books are supposed to do that, but even they, I feel, are
failing to do it perfectly (including my own, I am sure).

Just a thought.
</counter-rant>

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)

On Tue, May 28, 2013 at 9:19 AM, Jack Krupansky <ja...@basetechnology.com> wrote:
> Yes, the fact that multi-valued fields are not first-class Lucene/Solr
> objects is a problem, but the limitations were all known in advance and no
> guarantees were made, so you don't have much of an excuse now, other than to
> lament the fact that somebody conned you into believing that multi-valued
> fields were some kind of magic elixir, a magic "escape hatch" to a world
> where the limits of Lucene and Solr don't apply. Sigh.

Re: Paging with all Hits

Posted by Jack Krupansky <ja...@basetechnology.com>.
Dynamic and multi-valued fields are both powerful but dangerous features.
Yes, they offer wonderful capabilities if used in moderation, but
expecting them to be "get out of jail free / go past go as many times as
you want" cards that let you ignore the limits of Solr and do anything you
want is a really bad idea.

Yes, the fact that multi-valued fields are not first-class Lucene/Solr 
objects is a problem, but the limitations were all known in advance and no 
guarantees were made, so you don't have much of an excuse now, other than to 
lament the fact that somebody conned you into believing that multi-valued 
fields were some kind of magic elixir, a magic "escape hatch" to a world 
where the limits of Lucene and Solr don't apply. Sigh.

Multi-valued fields are great for storing a "few" related items, even
"dozens" of them, typically modest-length strings. But storing "hundreds" or
"thousands", or storing large bulky items (e.g., the entire text of a
page), is a really bad idea. Sure, maybe it does work, at least for
some cases, for some people, some of the time, but that shouldn't be the
criterion for building a robust production application.

A couple of the serious limitations of multi-valued fields are that
individual elements cannot be "addressed", whether to insert, delete, or
move them, and that there is no indication of which element matched. Sorry,
but Lucene and Solr do not have "sub-documents", which is the "get out of
jail free" card that a lot of people expect with multi-valued (and dynamic)
fields.

If you want an object to be a first-class object, make it a separate Solr 
document. Bite the bullet, and live with it.
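
For the monograph case, a rough sketch of what "a separate Solr document"
could look like, assuming one document per paragraph. The field names
(monograph_id, title, paragraph_no, text) and both helper functions are
hypothetical and would have to match your own schema; start and rows are
Solr's standard paging parameters. Nothing here talks to a server, it only
builds the documents and the query string.

```python
from urllib.parse import urlencode


def paragraphs_to_docs(monograph_id, title, paragraphs):
    """Turn one monograph into one Solr document per paragraph.

    All field names are hypothetical; adapt them to your schema.
    """
    return [
        {
            "id": f"{monograph_id}_p{n}",   # unique key per paragraph
            "monograph_id": monograph_id,   # link back to the parent monograph
            "title": title,
            "paragraph_no": n,
            "text": text,
        }
        for n, text in enumerate(paragraphs, start=1)
    ]


def page_query(term, page, per_page=10):
    """Build query parameters that page over paragraph-level hits.

    The term is passed through as-is (no escaping of query syntax).
    """
    return urlencode({
        "q": f"text:{term}",
        "start": (page - 1) * per_page,   # offset of the first hit on this page
        "rows": per_page,                 # number of hits per page
        "fl": "id,monograph_id,paragraph_no,text",
        "wt": "json",
    })


docs = paragraphs_to_docs(
    "mono42", "Some Monograph", ["First paragraph.", "Second paragraph."]
)
print(docs[1]["id"])
print(page_query("solr", page=3))
```

With paragraphs as documents, every hit is a paragraph, so ordinary paging
applies. If the hits should still be collected per monograph, Solr's result
grouping (group=true with group.field set to the parent id field) may be
worth a look.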

-- Jack Krupansky
