Posted to solr-user@lucene.apache.org by george young <ge...@gmail.com> on 2010/03/14 23:44:04 UTC

some hyphenated words not found

I have a nearly generic out-of-box installation of Solr.  When I
search on a short text document containing a few hyphenated words, I
get hits on *some* of the words, but not all.  I'm quite puzzled as to
why.  I've checked that the text is only plain ASCII.  How can I find
out what's wrong?  In the file below, Solr finds life-long, but not
love-lorn.

Here's the file:
This is a small sample document just to insure that a type *.doc can
be accessed by XXXXXXXXX Documentation.
It is sung to the moon by a love-lorn loon,
who fled from the mocking throng O!
It’s the song of a merryman, moping mum,
whose soul was sad and whose glance was glum. Misery me — lack-a-day-dee!
He sipped no sup, and he craved no crumb,
As he sighed for the love of a ladye!
Who sipped no sup, and who craved no crumb,
As he sighed for the love of a ladye.
Heighdy! heighdy! Misery me — lack-a-day-dee!
He sipped no sup, and he craved no crumb,
As he sighed for the love of a ladye!

I have a song to sing, O!
Sing me your song, O!

It is sung with the ring
Of the songs maids sing
Who love with a love life-long, O!
It's the song of a merrymaid, peerly proud,
Who loved a lord, and who laughed aloud
At the moan of the merryman, moping mum,
Whose soul was sad, and whose glance was glum,
Who sipped no sup, and who craved no crumb,
As he sighed for the love of a ladye!
Heighdy! heighdy!
Misery me — lack-a-day-dee!
He sipped no sup, and he craved no crumb,
As he sighed for the love of a ladye!


-- 
georgeryoung@gmail.com

Re: some hyphenated words not found

Posted by Lance Norskog <go...@gmail.com>.
Look at the terms in the index with the analysis.jsp file, or with Luke.

The difference here is that love-lorn is a separate phrase, but
life-long has a comma after it. Try inserting a space before the
comma.
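
One rough way to picture what to look for in analysis.jsp or Luke is sketched below. This is a toy, not Solr's analyzer: it assumes a plain whitespace tokenizer with no further filtering (an assumption, since the actual field type was not posted), and the function name whitespace_tokens is made up for illustration. It prints the token that each hyphenated word ends up in.

```python
# Toy illustration only: NOT Solr's analysis chain.
# Assumption: tokens are split on whitespace and nothing else.

def whitespace_tokens(text):
    """Split text on runs of whitespace, keeping punctuation attached."""
    return text.split()

# The two lines from the sample file that contain the words in question.
lines = [
    "It is sung to the moon by a love-lorn loon,",
    "Who love with a love life-long, O!",
]

for line in lines:
    # Keep only the tokens that contain a hyphen.
    hyphenated = [tok for tok in whitespace_tokens(line) if "-" in tok]
    print(hyphenated)
```

Under that assumption the two words land in differently shaped tokens (love-lorn versus life-long with the comma attached), which is the kind of difference the real analysis output would confirm or rule out.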



-- 
Lance Norskog
goksron@gmail.com

Re: some hyphenated words not found

Posted by Erick Erickson <er...@gmail.com>.
Look carefully at the analyzers you have defined for index
and query. And if you're still puzzled, please post them.
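
As a sketch of why the two analyzers matter: a query only hits when the terms produced at query time also exist among the terms produced at index time. The two functions below are hypothetical stand-ins, not the analyzers from any Solr schema; one splits on hyphens and the other does not.

```python
import re

# Hypothetical stand-ins for an index-time and a query-time analyzer.
# Neither reproduces a real Solr field type.

def index_analyzer(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def query_analyzer(text):
    """Lowercase and split on whitespace only, so hyphens stay in the term."""
    return text.lower().split()

def matches(doc, query):
    """True when every query term is present among the indexed terms."""
    indexed = set(index_analyzer(doc))
    return all(term in indexed for term in query_analyzer(query))

doc = "It is sung to the moon by a love-lorn loon,"
print(index_analyzer("love-lorn"))   # terms stored for the hyphenated word
print(query_analyzer("love-lorn"))   # term the query looks up
print(matches(doc, "love-lorn"))     # mismatched chains: no hit
print(matches(doc, "love lorn"))     # query terms that do exist in the index
```

With these mismatched stand-ins the hyphenated query finds nothing even though the word is in the document, so comparing the index and query chains side by side is a reasonable first check.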

Erick
