Posted to java-user@lucene.apache.org by sqzaman <sq...@gmail.com> on 2010/01/05 05:55:45 UTC

Single "A" parsing problem

Hi,
I am using Java Lucene 2.9.1.
My problem is that when I parse the following query:
name: zaman AND name:15 name:A
the last "A" is skipped after parsing.
I get:
query = (+name: zaman +name:15)

Why is A missing?

Can anybody tell me the reason?

I need quick feedback.
-- 
View this message in context: http://old.nabble.com/Single-%22A%22-parsing-problem-tp27023764p27023764.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Problem with PhraseQuery

Posted by Simon Willnauer <si...@googlemail.com>.

Hi Andre,

You are using StandardAnalyzer for indexing, but you search with an
un-analyzed string, "Lucene" (q.add(new Term("title","Lucene"));).
If you pass this string to the query parser, the query string will be
analyzed (which will most likely result in a lowercased string). The
analyzed query will then match and retrieve your documents. Try
searching with lowercased terms:

>    PhraseQuery q = new PhraseQuery();
>    q.setSlop(1);
>    q.add(new Term("title","lucene"));
>    q.add(new Term("title","for"));

simon
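
A minimal sketch of the mismatch described above, in plain Java rather than Lucene. The class ToyAnalysis, its analyze method, and the short stop list are all made up for illustration; they only imitate what an analyzer does (lowercasing and stop word removal). The point is that a term added verbatim as "Lucene" cannot equal the lowercased term stored at index time. Note also that "for" is, as far as I know, in StandardAnalyzer's default English stop list, so it may be dropped at index time as well and the phrase may need adjusting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy stand-in for an analyzer: NOT Lucene's StandardAnalyzer.
final class ToyAnalysis {
    // Small illustrative stop list (an assumption, not Lucene's full default set).
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "for", "in", "the"));

    // Split on non-alphanumerics, lowercase each token, and drop stop words.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.split("[^A-Za-z0-9]+")) {
            String term = token.toLowerCase(Locale.ROOT);
            if (!term.isEmpty() && !STOP_WORDS.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> indexed = analyze("Lucene for Dummies");
        System.out.println(indexed);                     // terms kept by the toy analyzer
        System.out.println(indexed.contains("Lucene"));  // raw, un-analyzed term
        System.out.println(indexed.contains("lucene"));  // lowercased term
    }
}
```

In this toy model the raw term is absent and the lowercased term is present, which is the same reason the hand-built PhraseQuery needs terms in their analyzed form.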

Problem with PhraseQuery

Posted by Mário André <ma...@infonet.com.br>.

Hi,
I need to search for a phrase containing a particular sequence of terms, so I
am using Java Lucene 3.0, more specifically the PhraseQuery class.
I am using the code below, but the PhraseQuery does not work. The search only
works when I use the QueryParser.
Is there some problem, or how can I use PhraseQuery in Lucene 3.0?
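import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class Main1 {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) throws IOException, ParseException {
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);

        // 1. create the index
        Directory index = new RAMDirectory();
        IndexWriter w = new IndexWriter(index, analyzer, true,
                IndexWriter.MaxFieldLength.UNLIMITED);

        addDoc(w, "Lucene in Action for about ten days here in brazil kkkkkkk "
                + "eeeeeeee nnnn xxxx tttt dddd jjjjjjj k");
        addDoc(w, "Lucene for Dummies");
        addDoc(w, "Lucene for Dummies");
        addDoc(w, "Managing Gigabytes");

        w.close();

        // 2. query
        // Query q = new QueryParser(Version.LUCENE_CURRENT, "title",
        //         analyzer).parse(querystr);

        // Hand-built PhraseQuery (the version that does not work)
        PhraseQuery q = new PhraseQuery();
        q.setSlop(1);
        q.add(new Term("title", "Lucene"));
        q.add(new Term("title", "for"));

        // 3. search
        int hitsPerPage = 20;
        IndexSearcher searcher = new IndexSearcher(index, true);
        TopScoreDocCollector collector =
                TopScoreDocCollector.create(hitsPerPage, true);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        // 4. display results
        System.out.println("Found " + hits.length + " hits.");
        for (int i = 0; i < hits.length; ++i) {
            int docId = hits[i].doc;
            Document d = searcher.doc(docId);
            System.out.println((i + 1) + ". " + d.get("title"));
        }

        searcher.close();
    }

    // Adds one document with a single stored, analyzed "title" field.
    private static void addDoc(IndexWriter w, String value) throws IOException {
        Document doc = new Document();
        doc.add(new Field("title", value, Field.Store.YES,
                Field.Index.ANALYZED));
        w.addDocument(doc);
    }
}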

---------------------------------------------------------------------
Mário André
Instituto Federal de Educação, Ciência e Tecnologia de Sergipe - IFS
Mestrando em MCC - Universidade Federal de Alagoas - UFAL
http://www.marioandre.com.br/
Skype: mario-fa
----------------------------------------------------------------------------
----------



Re: Single "A" parsing problem

Posted by sqzaman <sq...@gmail.com>.



Hi Philip,
Thank you very much for all your help.
I have solved my problem.

Best regards,
zaman

Re: Single "A" parsing problem

Posted by sqzaman <sq...@gmail.com>.



Hi,
Thank you very much for your reply.
Can you tell me how to create an instance of the following analyzer so that it
will not drop my 'A' or other such characters?
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT)

Please give me an example.

Thanks in advance,
sqzaman
 

Re: Single "A" parsing problem

Posted by Philip Puffinburger <pp...@tlcdelivers.com>.

That depends on what you are trying to do.   

You could create the StandardAnalyzer and pass in your own stop word set, but that would use that stop word set for all your analyzed fields.    
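
To illustrate the idea of passing in your own stop word set, here is a rough sketch in plain Java. It is a toy model, not Lucene code: CustomStopSetDemo and its analyze method are hypothetical names, and the whitespace splitting is far simpler than real tokenization. In Lucene itself, I believe StandardAnalyzer has a constructor that takes a Version and a Set of stop words, but please check the Javadoc for your release before relying on that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy analyzer with a caller-supplied stop set; a simplified model, not Lucene code.
final class CustomStopSetDemo {
    // Split on whitespace, lowercase, and drop anything in the given stop set.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> terms = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            String term = token.toLowerCase(Locale.ROOT);
            if (!term.isEmpty() && !stopWords.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Illustrative stop list only (an assumption, not Lucene's default set).
        Set<String> someStops = new HashSet<>(Arrays.asList("a", "an", "the"));
        System.out.println(analyze("zaman 15 A", someStops));
        // With an empty stop set nothing is filtered, so "a" survives.
        System.out.println(analyze("zaman 15 A", Collections.<String>emptySet()));
    }
}
```

The trade-off is the one mentioned above: whatever stop set is chosen applies to every field that goes through this analyzer.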

There is a PerFieldAnalyzerWrapper (I think that is the name) where you can set up different analyzers per field.    

In a project that I work on we wrote our own analyzer that looks at the field and applies different filters based on the field (some use stop words, some stem, etc).   So our note fields use stop words and stem, while our name fields don't.
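
The per-field approach can be sketched roughly as follows, again in plain Java and only as a toy model. ToyPerFieldAnalyzer, keepAll, and dropStopWords are hypothetical names, and the "analyzers" here are just functions from text to a list of terms. The addAnalyzer name is chosen to resemble what I recall of PerFieldAnalyzerWrapper, but this is not that class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Toy per-field dispatcher; a simplified model, not Lucene's PerFieldAnalyzerWrapper.
final class ToyPerFieldAnalyzer {
    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> byField = new HashMap<>();

    ToyPerFieldAnalyzer(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Register a field-specific analyzer that overrides the default.
    void addAnalyzer(String field, Function<String, List<String>> analyzer) {
        byField.put(field, analyzer);
    }

    // Use the field's own analyzer if one was registered, else the default.
    List<String> analyze(String field, String text) {
        return byField.getOrDefault(field, defaultAnalyzer).apply(text);
    }

    // Lowercase and split on whitespace, keeping every token.
    static List<String> keepAll(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Same as keepAll, but drops a small illustrative stop list.
    static List<String> dropStopWords(String text) {
        Set<String> stop = new HashSet<>(Arrays.asList("a", "an", "the"));
        List<String> terms = keepAll(text);
        terms.removeIf(stop::contains);
        return terms;
    }

    public static void main(String[] args) {
        ToyPerFieldAnalyzer analyzer =
                new ToyPerFieldAnalyzer(ToyPerFieldAnalyzer::dropStopWords);
        analyzer.addAnalyzer("name", ToyPerFieldAnalyzer::keepAll);
        System.out.println(analyzer.analyze("name", "zaman 15 A")); // name field keeps "a"
        System.out.println(analyzer.analyze("note", "zaman 15 A")); // other fields drop it
    }
}
```

Whichever mechanism is used, the same per-field setup would need to be applied both when indexing and when parsing queries, otherwise the terms will not line up.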



Re: Single "A" parsing problem

Posted by sqzaman <sq...@gmail.com>.



Hi,
Yes, I am using the standard analyzer.
Please tell me how I can solve this problem.

Best regards,
sqzaman

Re: Single "A" parsing problem

Posted by Philip Puffinburger <pp...@tlcdelivers.com>.

I'm going to take a guess that you are using the StandardAnalyzer or another analyzer that removes stop words. 'a' is a stop word, so it is removed.
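
A rough sketch of why the clause disappears, in plain Java and under simplifying assumptions: ClauseDropDemo and keepClauses are made-up names, the stop list is a tiny illustrative one, and real query parsing is considerably more involved. The idea is only that each clause value is lowercased and checked against the stop list, and a clause whose value is filtered away leaves nothing to search for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model of stop word removal during query parsing; not Lucene's QueryParser.
final class ClauseDropDemo {
    // Illustrative stop list only (an assumption, not Lucene's default set).
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the"));

    // Lowercase each clause value and keep the clause only if the term survives.
    static List<String> keepClauses(String field, List<String> values) {
        List<String> kept = new ArrayList<>();
        for (String value : values) {
            String term = value.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(term)) {
                kept.add(field + ":" + term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // "A" lowercases to the stop word "a", so its clause is not kept.
        System.out.println(keepClauses("name", Arrays.asList("zaman", "15", "A")));
    }
}
```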


