Posted to java-user@lucene.apache.org by Karthik N S <ka...@controlnet.co.in> on 2005/08/08 06:06:26 UTC
Reply Split Search Word
Hi Luceners,
Apologies.
As I have already replied, using Analysis I have tried all the Analyzers
(including StandardAnalyzer),
but I am not able to achieve the required COMPLETE WORD split.
My input String would be a lengthy one, as below:
String sKey = "\"" + "Dough Cutting" + "\"" + " " + "Otis Gospodnetic" + " " +
    "\"" + "Erik Hatcher" + "\"" + " " + "Authors of " + "\"" +
    "Lucene In Action" + "\"";
The required split of complete words should return
1) "Dough Cutting"
2) Otis Gospodnetic
3) "Erik Hatcher"
4) Authors of
5) "Lucene In Action"
Please note: words enclosed in "\"" are to be kept together as complete, unsplit units.
I am sure some Analyzer code inside Lucene handles this task.
So how can one achieve this?
with regards
Karthik
-----Original Message-----
From: Mordo, Aviran (EXP N-NANNATEK) [mailto:aviran.mordo@lmco.com]
Sent: Friday, August 05, 2005 7:58 PM
To: java-user@lucene.apache.org
Subject: RE: Split Search Word
The StandardAnalyzer should work just fine with it. It will break the
search string into 5 search terms.
HTH
Aviran
http://www.aviransplace.com
_____
From: Karthik N S [mailto:karthik@controlnet.co.in]
Sent: Friday, August 05, 2005 1:57 AM
To: LUCENE
Subject: Split Search Word
Hi Luceners,
Apologies.
I have a long search string, as given below:
SearchWord = "\"" + "Dough Cutting" + "\"" + " " + "Otis Gospodnetic" + " " +
    "\"" + "Erik Hatcher" + "\"" + " " + "Authors of " + "\"" +
    "Lucene In Action" + "\"";
Prior to searching the index, I need the words to be split:
SearchWord =
1) "\"" + "Dough Cutting" + "\""
2) "Otis Gospodnetic"
3) "\"" + "Erik Hatcher" + "\""
4) "Authors of "
5) "\"" +"Lucene In Action" +"\""
I am sure some Analyzer within Lucene performs this task.
So could somebody please tell me how to do it?
[ I already used the Analysis/Paralysis code to check, but it did not help ]
WITH WARM REGARDS
HAVE A NICE DAY
[ N.S.KARTHIK]
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Reply Split Search Word
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Aug 8, 2005, at 7:44 AM, Karthik N S wrote:
> I would like to rephrase the question slightly.
> Words without double quotes may also be present in the String.
> Also, I have to apply the StopAnalyzer to filter out common English
> words appearing within it.
>
> Would you mind giving me a bit of a source hint for the same?
> [ I am googled out of ideas ]
Well, the Analyzers chapter in Lucene in Action will give you all the
basics you need to write custom pieces of an analyzer :)
A custom Tokenizer is perhaps where you'll want to start,
though there are other possibilities, such as tokenizing quotes
separately and then having a filter buffer tokens when a quote is
encountered and combine multiple tokens into one when within quotes. The only
hints I have would be examples from within Lucene's codebase itself.
You'll need to maintain some state (whether within quotes or not),
but otherwise it should be relatively straightforward.
To provide many more hints would require implementing it myself, I'm
afraid.
Erik
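
The filter alternative described above (tokenize quotes separately, then buffer and combine while within quotes) can be sketched roughly as follows. This is plain Java with no Lucene dependency, so it is not an actual Tokenizer or TokenFilter. The class name QuoteBufferingFilter, its methods, and the tiny stop list are hypothetical and only for illustration; the stop list is not the one Lucene's StopAnalyzer uses. It also assumes stop words should be dropped only outside quotes, which the thread does not confirm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class QuoteBufferingFilter {
    // Hypothetical stop list for illustration only (not Lucene's list).
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "in", "of", "the", "to"));

    /** Pass 1: emit every double quote as its own token, split the rest on whitespace. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        for (String piece : input.replace("\"", " \" ").trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    /** Pass 2: combine tokens between quotes into one phrase; drop stop words outside quotes. */
    static List<String> filter(List<String> tokens) {
        List<String> out = new ArrayList<>();
        List<String> buffer = new ArrayList<>();
        boolean inQuotes = false; // the state to maintain: within quotes or not
        for (String tok : tokens) {
            if (tok.equals("\"")) {
                if (inQuotes && !buffer.isEmpty()) {
                    // Closing quote: emit the buffered words as a single token.
                    out.add(String.join(" ", buffer));
                }
                buffer.clear();
                inQuotes = !inQuotes;
            } else if (inQuotes) {
                buffer.add(tok);
            } else if (!STOP_WORDS.contains(tok.toLowerCase(Locale.ROOT))) {
                out.add(tok);
            }
        }
        // Assumption: words after an unterminated quote are emitted as single tokens, unfiltered.
        out.addAll(buffer);
        return out;
    }

    public static void main(String[] args) {
        String sKey = "\"Dough Cutting\" Otis Gospodnetic \"Erik Hatcher\" Authors of \"Lucene In Action\"";
        for (String token : filter(tokenize(sKey))) {
            System.out.println(token);
        }
    }
}
```

For the string from the original question, this yields the tokens Dough Cutting, Otis, Gospodnetic, Erik Hatcher, Authors, and Lucene In Action: "of" is removed as a stop word, while "In" survives because it sits inside a quoted phrase.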
RE: Reply Split Search Word
Posted by Karthik N S <ka...@controlnet.co.in>.
Hi Erik,
I would like to rephrase the question slightly.
Words without double quotes may also be present in the String.
Also, I have to apply the StopAnalyzer to filter out common English
words appearing within it.
Would you mind giving me a bit of a source hint for the same?
[ I am googled out of ideas ]
with regards
karthik
Re: Reply Split Search Word
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
To have an analyzer split that string into 1-5 as you have listed
will require you to write a custom Analyzer that tokenizes with double
quotes in mind like that.
Erik
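
One way to picture "tokenizing with double quotes in mind" is a single pass over the characters with an in-quotes flag. The sketch below is plain Java, not a Lucene Analyzer, and the class name QuoteSplitter and its split method are hypothetical. It assumes a quoted phrase is kept whole with its quotes, and that each run of unquoted text between phrases is one trimmed piece, as in the 1-5 list from the question.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteSplitter {
    /** Splits input into quoted phrases (quotes kept) and trimmed unquoted runs. */
    static List<String> split(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder buf = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"') {
                if (inQuotes) {
                    // Closing quote: emit the phrase with its quotes restored.
                    parts.add("\"" + buf + "\"");
                } else {
                    // Opening quote: flush the unquoted run collected so far.
                    addIfNotBlank(parts, buf.toString());
                }
                buf.setLength(0);
                inQuotes = !inQuotes;
            } else {
                buf.append(c);
            }
        }
        // Trailing text; an unterminated quote is treated as unquoted text (assumption).
        addIfNotBlank(parts, buf.toString());
        return parts;
    }

    private static void addIfNotBlank(List<String> parts, String s) {
        String trimmed = s.trim();
        if (!trimmed.isEmpty()) {
            parts.add(trimmed);
        }
    }

    public static void main(String[] args) {
        String sKey = "\"" + "Dough Cutting" + "\"" + " " + "Otis Gospodnetic" + " " +
                "\"" + "Erik Hatcher" + "\"" + " " + "Authors of " + "\"" +
                "Lucene In Action" + "\"";
        for (String part : split(sKey)) {
            System.out.println(part);
        }
    }
}
```

Run on the sKey string from the question, this produces the five pieces listed there: "Dough Cutting", Otis Gospodnetic, "Erik Hatcher", Authors of, and "Lucene In Action". Wrapping the same loop in a real Tokenizer would depend on the Lucene version in use.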