Posted to java-user@lucene.apache.org by Adrien RUFFIE <a....@e-deal.com> on 2012/12/17 18:41:59 UTC

Lucene-analyzer 3.3.0 and Lucene snowball 3.0.1

Hello all,

I am starting work on an application, and nobody knows why Lucene-analyzer 3.3.0.jar and Lucene snowball 3.0.1.jar are both included.
Do they do the same thing?

If I exclude the Lucene-snowball jar from my application, how can I be sure that it is not used (the idea being that Lucene-analyzer is used instead)?

Thanks and best regards,

Kind regards,

Adrien Ruffié
Direct line: +33 1 73 03 29 50
Tel: +33 1 73 03 29 80

E-DEAL
Innover la Relation Client

-----Original Message-----
From: Vitaly_Artemov@McAfee.com [mailto:Vitaly_Artemov@McAfee.com]
Sent: Monday, 17 December 2012 17:46
To: java-user@lucene.apache.org
Subject: Lucene 4.0.0 - find offsets for phrase queries

Hi all,
I am using Lucene 4.0 and trying to find offsets for phrase queries.
My code works when I search for a single word, but when I call it with a phrase I don't get any offsets:
termsEnum.seekExact returns false for phrase queries.

reader = DirectoryReader.open( mIndexDir );
IndexSearcher searcher = new IndexSearcher(reader);
QueryParser parser = new QueryParser(Version.LUCENE_40, mField, mAnalyzer);
Query query = parser.parse(aQuery);

// Collect the top 100 hits for the parsed query
TopScoreDocCollector collector = TopScoreDocCollector.create(100, true);
searcher.search(query, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;

for (int i = 0; i < hits.length; ++i) {
    int docId = hits[i].doc;

    Document d = searcher.doc(docId);

    // Term vector of the "contents" field; null if none was stored for this document
    Terms tfvector = reader.getTermVector(docId, "contents");

    if (tfvector != null) {
        TermsEnum termsEnum = tfvector.iterator(null);

        // Seeks the whole lower-cased query string as one single term
        if (termsEnum.seekExact(new BytesRef(aQuery.toLowerCase()), false)) {
            DocsAndPositionsEnum dpEnum = null;
            dpEnum = termsEnum.docsAndPositions(null, dpEnum);

            if (dpEnum != null) {
                int freq = dpEnum.freq();
                int maxOcc = 20;

                // Print at most maxOcc occurrences of the term
                while (freq-- > 0 && maxOcc-- > 0) {
                    dpEnum.nextPosition();
                    System.out.println("Start offset " + dpEnum.startOffset() + " End offset " + dpEnum.endOffset());
                }
            }
        }
    }
}

What is the problem?

Thanks in advance, Vitaly.
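
A likely explanation, sketched in plain Java rather than with the Lucene API: a term vector holds one entry per analyzed term, so a multi-word phrase passed to seekExact as a single BytesRef matches no entry. The sketch below models a term vector as a map from term to occurrences (position, start offset, end offset) and assumes a simple lower-casing, whitespace-splitting analysis. The names PhraseOffsets, Occ, index, find and phraseOffsets are hypothetical; they only illustrate looking up each word separately and keeping the runs whose positions are consecutive.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PhraseOffsets {
    /** One occurrence of a term: token position plus character offsets. */
    static final class Occ {
        final int pos, start, end;
        Occ(int pos, int start, int end) { this.pos = pos; this.start = start; this.end = end; }
    }

    /** Builds a toy "term vector" (assumption: lower-cased, whitespace-separated terms). */
    static Map<String, List<Occ>> index(String text) {
        Map<String, List<Occ>> tv = new HashMap<>();
        int pos = 0, i = 0, n = text.length();
        while (i < n) {
            while (i < n && Character.isWhitespace(text.charAt(i))) i++;
            int start = i;
            while (i < n && !Character.isWhitespace(text.charAt(i))) i++;
            if (i > start) {
                String term = text.substring(start, i).toLowerCase();
                tv.computeIfAbsent(term, k -> new ArrayList<>()).add(new Occ(pos++, start, i));
            }
        }
        return tv;
    }

    /** Returns the occurrence at the given token position, or null if there is none. */
    static Occ find(List<Occ> occs, int pos) {
        if (occs == null) return null;
        for (Occ o : occs) if (o.pos == pos) return o;
        return null;
    }

    /** Returns {startOffset, endOffset} for every occurrence of the phrase. */
    static List<int[]> phraseOffsets(Map<String, List<Occ>> tv, String phrase) {
        String[] words = phrase.trim().toLowerCase().split("\\s+");
        List<int[]> result = new ArrayList<>();
        List<Occ> first = tv.get(words[0]);
        if (first == null) return result;
        for (Occ head : first) {
            Occ last = head;
            boolean match = true;
            // Each following word must sit at the next consecutive position.
            for (int w = 1; w < words.length && match; w++) {
                last = find(tv.get(words[w]), head.pos + w);
                match = last != null;
            }
            if (match) result.add(new int[] { head.start, last.end });
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<Occ>> tv = index("The quick brown fox saw another quick brown dog");
        // The whole phrase is not a key, which mirrors seekExact returning false.
        System.out.println(tv.containsKey("quick brown"));
        for (int[] span : phraseOffsets(tv, "quick brown")) {
            System.out.println("Start offset " + span[0] + " End offset " + span[1]);
        }
    }
}
```

In real Lucene code the terms present in the vector depend on the analyzer used at index time, so the phrase would need to be split with that same analyzer before seeking each term.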

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

RE: Lucene-analyzer 3.3.0 and Lucene snowball 3.0.1

Posted by Adrien RUFFIE <a....@e-deal.com>.

Hello Steve,

Sorry about the first point.

No, it is an old version of Solr, 3.3.0, and Maven/Ivy/Gradle is not used; the libraries were added by hand by another developer.

Do you have any idea whether I can remove snowball without any problems?
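
One way to check this before deleting anything is to ask the JVM which jar a given class is actually loaded from. The following is a minimal sketch using only the standard library; the class name WhichJar and the method locate are hypothetical, and the Lucene class name passed in main is an assumption that should be checked against the contents of the two jars.

```java
import java.net.URL;
import java.security.CodeSource;

public class WhichJar {
    /** Describes where the JVM would load the named class from. */
    static String locate(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return className + " -> JDK/bootstrap class (no code source)";
            }
            URL location = src.getLocation();
            return className + " -> " + location;
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " -> not loadable from this classpath";
        }
    }

    public static void main(String[] args) {
        // Assumed class name; replace it with a class that really is in your jars.
        String[] names = args.length > 0 ? args
                : new String[] { "org.apache.lucene.analysis.snowball.SnowballFilter" };
        for (String name : names) {
            System.out.println(locate(name));
        }
    }
}
```

Run with the same classpath as the application, this shows which jar currently wins when a class exists in both. Running it again after removing the snowball jar shows whether the class still resolves from the analyzer jar.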

Kind regards,

Adrien Ruffié
Direct line: +33 1 73 03 29 50
Tel: +33 1 73 03 29 80

E-DEAL
Innover la Relation Client

-----Original Message-----
From: Steve Rowe [mailto:sarowe@gmail.com]
Sent: Monday, 17 December 2012 19:11
To: java-user@lucene.apache.org
Subject: Re: Lucene-analyzer 3.3.0 and Lucene snowball 3.0.1

Hi Adrien,

Three comments and a question:

1. From <http://people.apache.org/~hossman/#threadhijack>:

   When starting a new discussion on a mailing list, please do not reply to 
   an existing message, instead start a fresh email.  Even if you change the 
   subject line of your email, other mail headers still track which thread 
   you replied to and your question is "hidden" in that thread and gets less 
   attention.   It makes following discussions in the mailing list archives 
   particularly difficult.

2. If you're beginning a new application, start with a more modern version of Lucene: 4.0.

3. If you must use an older version of Lucene, don't mix versions!

Question: where do your Lucene jars come from?  That is, did you download the binary distribution?  Or are you using Maven/Ivy to download from a Maven repository?  Or … ?

Steve


Re: Lucene-analyzer 3.3.0 and Lucene snowball 3.0.1

Posted by Steve Rowe <sa...@gmail.com>.

Hi Adrien,

Three comments and a question:

1. From <http://people.apache.org/~hossman/#threadhijack>:

   When starting a new discussion on a mailing list, please do not reply to 
   an existing message, instead start a fresh email.  Even if you change the 
   subject line of your email, other mail headers still track which thread 
   you replied to and your question is "hidden" in that thread and gets less 
   attention.   It makes following discussions in the mailing list archives 
   particularly difficult.

2. If you're beginning a new application, start with a more modern version of Lucene: 4.0.

3. If you must use an older version of Lucene, don't mix versions!
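
When jars have been copied in by hand, point 3 can be checked mechanically. The sketch below groups Lucene jar file names by the version embedded in the name and reports when more than one version is present. It assumes the common `lucene-<module>-<version>.jar` naming; the class name MixedVersions, the method byVersion and the file names in main are hypothetical examples, not taken from any actual lib directory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MixedVersions {
    // Matches names such as lucene-core-3.3.0.jar; group 1 = module, group 2 = version.
    private static final Pattern LUCENE_JAR =
            Pattern.compile("(lucene-[a-z-]+?)-(\\d+(?:\\.\\d+)*)\\.jar");

    /** Groups Lucene jar names by the version embedded in the file name. */
    static Map<String, List<String>> byVersion(List<String> jarNames) {
        Map<String, List<String>> versions = new TreeMap<>();
        for (String name : jarNames) {
            Matcher m = LUCENE_JAR.matcher(name.toLowerCase());
            if (m.matches()) {
                versions.computeIfAbsent(m.group(2), v -> new ArrayList<>()).add(m.group(1));
            }
        }
        return versions;
    }

    public static void main(String[] args) {
        // Hypothetical file names; in practice, list the application's lib directory.
        List<String> jars = Arrays.asList(
                "lucene-core-3.3.0.jar", "lucene-analyzers-3.3.0.jar", "lucene-snowball-3.0.1.jar");
        Map<String, List<String>> versions = byVersion(jars);
        System.out.println(versions);
        if (versions.size() > 1) {
            System.out.println("Mixed Lucene versions found: " + versions.keySet());
        }
    }
}
```

File names are only a hint, since a jar can be renamed; the manifest inside each jar is a more reliable source for the version.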

Question: where do your Lucene jars come from?  That is, did you download the binary distribution?  Or are you using Maven/Ivy to download from a Maven repository?  Or … ?

Steve
