Posted to user@lucy.apache.org by Saurabh Vasekar <sv...@listenlogic.com> on 2012/06/15 02:02:52 UTC

[lucy-user] Proximity Search support in apache Lucy (~)

Hello,

Apache Lucene supports proximity search queries. For example, the query
"jakarta apache" ~10 matches documents in which "apache" and "jakarta" occur
within 10 words of each other. Is proximity search supported in Lucy as
well? If not, do I need to implement my own query parser to add it? Also,
which wildcard characters does Apache Lucy support? I set the query string
to "*", expecting it to retrieve every document, but it returned nothing.
How should I write the query so that it retrieves all of the content?

Thank you.

Re: [lucy-user] Proximity Search support in apache Lucy (~)

Posted by Peter Karman <pe...@peknet.com>.
Saurabh Vasekar wrote on 6/15/12 6:16 PM:

> 
> Error -
> 
> No field specified for term ' jakarta apache' -- set a default_field in Parser
> or Dialect at program_name.pl
> 
> I think my code is correct. I am not able to figure out the error.
> 


Did you try doing what the error message suggests? Set a default field in your
Parser:

 my $queryparser = Search::Query->parser(
          dialect       => 'Lucy',
          default_field => [ 'field1' ],
 );

I see that the documentation/synopsis is lacking. I have fixed that on GitHub
and will release a new version soon.


-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: [lucy-user] Proximity Search support in apache Lucy (~)

Posted by Saurabh Vasekar <sv...@listenlogic.com>.
Hi Peter,

Thanks a lot!

I tried the code you suggested below, but the program throws an error.
My code is as follows:

my $queryparser = Search::Query->parser( dialect => 'Lucy');

my $proximity_query = $queryparser->parse( '"jakarta apache"~4');

print("$proximity_query\n");

my $lucy_proximity_query = $proximity_query->as_lucy_query();   # error occurs on this line

my $searcher = Lucy::Search::IndexSearcher->new(
     index => $path_to_index,
    );

my $hits = $searcher->hits(
      query => $lucy_proximity_query,
      offset => $offset,
      num_wanted => $page_size,
);

Error -

No field specified for term ' jakarta apache' -- set a default_field in
Parser or Dialect at program_name.pl

I think my code is correct. I am not able to figure out the error.

Thank you.



On Thu, Jun 14, 2012 at 7:42 PM, Peter Karman <pe...@peknet.com> wrote:

> Saurabh Vasekar wrote on 6/14/12 7:02 PM:
> > Hello,
> >
> > Apache Lucene supports the Proximity Search queries. e.g. search query
> > "jakarta apache" ~10  would search for "apache" and "jakarta" within 10
> > words of each other in a document. Is the proximity search supported in
> > Lucy also? If it is not supported do I implement the query parser to
> > incorporate the proximity search? Also what other wildcard characters are
> > supported in apache Lucy? I assigned "*" to query string. Ideally it should
> > retrieve all the contents in the document. But it did not retrieve
> > anything. How should I assign the "*" to the query string so that it
> > retrieves the entire content.
> >
>
>
> http://search.cpan.org/dist/LucyX-Search-WildcardQuery/
> http://search.cpan.org/dist/Lucy/lib/LucyX/Search/ProximityQuery.pod
>
>
> The built-in Lucy QueryParser has no syntax for either of those. One example
> that does support wildcard and proximity syntax is:
>
>  http://search.cpan.org/dist/Search-Query-Dialect-Lucy/
>
> So you could do (UNTESTED):
>
>  my $queryparser = Search::Query->parser( dialect => 'Lucy' );
>  my $everything_query = $queryparser->parse('?*');
>  my $proximity_query  = $queryparser->parse('"jakarta apache" ~10');
>
>  my $lucy_everything_query = $everything_query->as_lucy_query();
>  my $everything_hits = $lucy_searcher->hits( query => $lucy_everything_query );
>
>  my $lucy_proximity_query = $proximity_query->as_lucy_query();
>  my $proximity_hits = $lucy_searcher->hits( query => $lucy_proximity_query );
>
> --
> Peter Karman  .  http://peknet.com/  .  peter@peknet.com
>

Re: [lucy-user] Proximity Search support in apache Lucy (~)

Posted by Peter Karman <pe...@peknet.com>.
Saurabh Vasekar wrote on 6/14/12 7:02 PM:
> Hello,
> 
> Apache Lucene supports the Proximity Search queries. e.g. search query
> "jakarta apache" ~10  would search for "apache" and "jakarta" within 10
> words of each other in a document. Is the proximity search supported in
> Lucy also? If it is not supported do I implement the query parser to
> incorporate the proximity search? Also what other wildcard characters are
> supported in apache Lucy? I assigned "*" to query string. Ideally it should
> retrieve all the contents in the document. But it did not retrieve
> anything. How should I assign the "*" to the query string so that it
> retrieves the entire content.
> 


http://search.cpan.org/dist/LucyX-Search-WildcardQuery/
http://search.cpan.org/dist/Lucy/lib/LucyX/Search/ProximityQuery.pod


The built-in Lucy QueryParser has no syntax for either of those. One example
that does support wildcard and proximity syntax is:

 http://search.cpan.org/dist/Search-Query-Dialect-Lucy/

So you could do (UNTESTED):

 my $queryparser = Search::Query->parser( dialect => 'Lucy' );
 my $everything_query = $queryparser->parse('?*');
 my $proximity_query  = $queryparser->parse('"jakarta apache" ~10');

 my $lucy_everything_query = $everything_query->as_lucy_query();
 my $everything_hits = $lucy_searcher->hits( query => $lucy_everything_query );

 my $lucy_proximity_query = $proximity_query->as_lucy_query();
 my $proximity_hits = $lucy_searcher->hits( query => $lucy_proximity_query );
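
For anyone unsure what a "~10" proximity match means, here is a rough
plain-Perl sketch of the idea. It does not use Lucy or Search::Query at all.
The helper name terms_within, the naive tokenizer, and the sample sentence are
all made up for illustration, and Lucene and Lucy may count the distance
between terms differently than this simplified version does.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lucy): true if $term_a and $term_b occur
# within $within token positions of each other in $text.
sub terms_within {
    my ( $text, $term_a, $term_b, $within ) = @_;

    # Naive tokenizer: lowercase, split on runs of non-word characters.
    my @tokens = grep { length } split /\W+/, lc $text;

    # Record every position at which each term appears.
    my ( @pos_a, @pos_b );
    for my $i ( 0 .. $#tokens ) {
        push @pos_a, $i if $tokens[$i] eq lc $term_a;
        push @pos_b, $i if $tokens[$i] eq lc $term_b;
    }

    # Any pair of positions close enough counts as a match.
    for my $pa (@pos_a) {
        for my $pb (@pos_b) {
            return 1 if abs( $pa - $pb ) <= $within;
        }
    }
    return 0;
}

# Made-up sample text: "apache" is token 1 and "jakarta" is token 8.
my $doc = 'The Apache Software Foundation grew out of the Jakarta project';

print terms_within( $doc, 'jakarta', 'apache', 10 ) ? "match\n" : "no match\n";
print terms_within( $doc, 'jakarta', 'apache', 3 )  ? "match\n" : "no match\n";
```

In this sample the two terms are 7 positions apart, so the first call matches
and the second does not. A real index works from stored term positions rather
than re-tokenizing the text on every query.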

-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com