Posted to java-user@lucene.apache.org by starz10de <fa...@yahoo.com> on 2008/07/05 20:41:24 UTC

Index different files in different folders in lucene

Hi all,
I am new to Lucene. Is it possible to index different files in different
folders with Lucene?

For example, I have two folders, a and b, each containing several files.

In the Lucene args I wrote:  c:\a\ , c:\b\   but it indexes only the
files in the first folder, a, and does not index any files in folder b.
Is there any way to do that, or must I put all the files in one folder? That
is not a nice way to do it, as I have different types of files and need them
to be separated.
Thanks in advance
-- 
View this message in context: http://www.nabble.com/Index-different-files-in-different-folders-in-lucene-tp18295066p18295066.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: newbie question (for John Griffin) - fixed

Posted by Chris Bamford <ch...@scalix.com>.
Thanks Steve.

Steven A Rowe wrote:
> Hi Chris,
>
> The PhraseQuery class does no parsing; tokenization is expected to happen before you feed anything to it.  So unless you have an index-time analyzer that outputs terms that look like "aaa ddd" -- that is, terms with embedded spaces -- then attempting to use PhraseQuery or any other query type to look for these terms will bring you no joy.  (Of course, this only applies if you are not using a query parser - I believe John's point about enclosing a phrase query in quotes refers to the action Lucene's QueryParser takes when it sees input of this sort.)
>
> The way that it worked for you - adding terms one at a time, with no quotes and no spaces - is the correct usage pattern.
>
> Steve
>
> On 07/15/2008 at 8:20 AM, Chris Bamford wrote:
>   
>> Hi John
>>
>> Thanks for your continued interest in my travails!
>>
>> ==I'm not sure I understand. You want a phrase query so they should be
>> ==passed as a phrase in quotes.
>>
>> Ok... well I must be missing something then  :-(
>> This fails to return any hits for me:
>>
>>         PhraseQuery pq = new  PhraseQuery();
>>         pq.add(new Term("body", "aaa ddd"));
>>
>> while
>>        PhraseQuery pq = new  PhraseQuery();
>>        pq.add(new Term("body", "aaa"));
>>        pq.add(new Term("body", "ddd"));
>>
>> works fine.
>> I have tried with both Lucene 2.0 and 2.3 jars.
>>
>> Please advise!
>>
>> Thanks,
>>
>> -Chris
>>     BTW thanks for the tip about Luke
>>
>>
>> John Griffin wrote:
>>     
>>> Chris,
>>>
>>> -----Original Message-----
>>> From: Chris Bamford [mailto:chris.bamford@scalix.com]
>>> Sent: Thursday, July 10, 2008 9:15 AM
>>> To: java-user@lucene.apache.org
>>> Subject: Re: newbie question (for John Griffin) - fixed
>>>
>>> Hi John,
>>>
>>> Please ignore my earlier questions on this subject, as I have got to
>>> the bottom of it. I was not passing each word in the phrase as a
>>> separate Term to the query;
>>>
>>> ==I'm not sure I understand. You want a phrase query so they should be
>>> ==passed as a phrase in quotes.
>>>
>>>
>>> instead I was passing the whole string (doh!).
>>>
>>> Thanks.
>>>
>>> - Chris
>>>
>>> Chris Bamford wrote:
>>>
>>>       
>>>> Hi John,
>>>>
>>>> Further to my question below, I did some back-to-basics investigation
>>>> of PhraseQueries and found that even basic ones fail for me... I found
>>>> the attached code on the Internet (see
>>>> http://affy.blogspot.com/2003/04/codebit-examples-for-all-of-lucenes.html)
>>>> and this fails too...  Can you explain why?  I would expect the first
>>>> test to deliver 2 hits.
>>>>
>>>> I have tried with Lucene 2.0 and 2.3.2 jars and both fail.
>>>>
>>>> Thanks again,
>>>>
>>>> - Chris
>>>>
>>>>
>>>>
>>>> Chris Bamford wrote:
>>>>
>>>>         
>>>>> Hi John,
>>>>>
>>>>> Just continuing from an earlier question where I asked you how to
>>>>> handle strings like "from:fred flintston*" (sorry I have lost the
>>>>> original email). You advised me to write my own BooleanQuery and add
>>>>> to it Prefix- / Term- / Phrase- Querys as appropriate.  I have done
>>>>> so, but am having trouble with the result - my PhraseQueries just do
>>>>> not get any hits at all  :-( My code looks for quotes - if it finds
>>>>> them, it treats the quoted phrase as a PhraseQuery and sets the slop
>>>>> factor to 0. so,  an input of:
>>>>>
>>>>>    subject:"Good Morning"
>>>>>
>>>>> results in a PhraseQuery (which I add to my BooleanQuery and then
>>>>> dump with toString()) of:
>>>>>
>>>>>    +subject:"good morning"
>>>>>
>>>>> ... which fails. However, if I break it into 2 TermQuerys, it works
>>>>> (but that's not what I want).
>>>>>
>>>>> What am I missing?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> - Chris
>>>>>           


-- 
------------------------------------------------------------------------
*Chris Bamford*
Senior Development Engineer 	<http://www.scalix.com>
------------------------------------------------------------------------
/Email / MSN/ 	chris.bamford@scalix.com
/Tel/ 	+44 (0)1344 381814 	  	/Skype/ 	c.bamford




RE: newbie question (for John Griffin) - fixed

Posted by Steven A Rowe <sa...@syr.edu>.
Hi Chris,

The PhraseQuery class does no parsing; tokenization is expected to happen before you feed anything to it.  So unless you have an index-time analyzer that outputs terms that look like "aaa ddd" -- that is, terms with embedded spaces -- then attempting to use PhraseQuery or any other query type to look for these terms will bring you no joy.  (Of course, this only applies if you are not using a query parser - I believe John's point about enclosing a phrase query in quotes refers to the action Lucene's QueryParser takes when it sees input of this sort.)

The way that it worked for you - adding terms one at a time, with no quotes and no spaces - is the correct usage pattern.

Steve
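
To make the point above concrete, here is a minimal sketch in plain Java. It does not use Lucene at all: PhraseMatchSketch is a hypothetical stand-in for an index whose analyzer lowercases and splits on whitespace, so it only illustrates why a single term containing a space can never match while separately added terms can.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy index; NOT Lucene's API, just the same idea in miniature.
class PhraseMatchSketch {
    // term -> positions at which the term occurs in the indexed text
    private final Map<String, List<Integer>> postings = new HashMap<>();

    // "Index-time analysis" (an assumption): lowercase, then split on whitespace,
    // so no stored term ever contains a space.
    PhraseMatchSketch(String text) {
        String[] tokens = text.toLowerCase().trim().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(tokens[pos], k -> new ArrayList<>()).add(pos);
        }
    }

    // Exact phrase match (slop 0): terms[i] must occur at position start + i.
    // The terms are looked up as given; nothing is parsed or split here.
    boolean matchesPhrase(String... terms) {
        if (terms.length == 0) {
            return false;
        }
        List<Integer> starts = postings.get(terms[0]);
        if (starts == null) {
            return false;
        }
        for (int start : starts) {
            boolean ok = true;
            for (int i = 1; i < terms.length && ok; i++) {
                List<Integer> positions = postings.get(terms[i]);
                ok = positions != null && positions.contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        PhraseMatchSketch index = new PhraseMatchSketch("aaa ddd ccc");
        // One term with an embedded space: no such term was ever indexed.
        System.out.println(index.matchesPhrase("aaa ddd"));    // false
        // Two terms at adjacent positions: this is the phrase match.
        System.out.println(index.matchesPhrase("aaa", "ddd")); // true
    }
}
```

The lookup never splits its input, which mirrors the behaviour described above: whatever does the tokenizing (your own code or a query parser) has to run before the terms are handed over.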

On 07/15/2008 at 8:20 AM, Chris Bamford wrote:
> Hi John
> 
> Thanks for your continued interest in my travails!
> 
> ==I'm not sure I understand. You want a phrase query so they should be
> ==passed as a phrase in quotes.
> 
> Ok... well I must be missing something then  :-(
> This fails to return any hits for me:
> 
>         PhraseQuery pq = new  PhraseQuery();
>         pq.add(new Term("body", "aaa ddd"));
> 
> while
>        PhraseQuery pq = new  PhraseQuery();
>        pq.add(new Term("body", "aaa"));
>        pq.add(new Term("body", "ddd"));
> 
> works fine.
> I have tried with both Lucene 2.0 and 2.3 jars.
> 
> Please advise!
> 
> Thanks,
> 
> -Chris
>     BTW thanks for the tip about Luke
> 
> 
> John Griffin wrote:
> > Chris,
> > 
> > -----Original Message-----
> > From: Chris Bamford [mailto:chris.bamford@scalix.com]
> > Sent: Thursday, July 10, 2008 9:15 AM
> > To: java-user@lucene.apache.org
> > Subject: Re: newbie question (for John Griffin) - fixed
> > 
> > Hi John,
> > 
> > Please ignore my earlier questions on this subject, as I have got to
> > the bottom of it. I was not passing each word in the phrase as a
> > separate Term to the query;
> > 
> > ==I'm not sure I understand. You want a phrase query so they should be
> > ==passed as a phrase in quotes.
> > 
> > 
> > instead I was passing the whole string (doh!).
> > 
> > Thanks.
> > 
> > - Chris
> > 
> > Chris Bamford wrote:
> > 
> > > Hi John,
> > > 
> > > Further to my question below, I did some back-to-basics investigation
> > > of PhraseQueries and found that even basic ones fail for me... I found
> > > the attached code on the Internet (see
> > > http://affy.blogspot.com/2003/04/codebit-examples-for-all-of-lucenes.html)
> > > and this fails too...  Can you explain why?  I would expect the first
> > > test to deliver 2 hits.
> > > 
> > > I have tried with Lucene 2.0 and 2.3.2 jars and both fail.
> > > 
> > > Thanks again,
> > > 
> > > - Chris
> > > 
> > > 
> > > 
> > > Chris Bamford wrote:
> > > 
> > > > Hi John,
> > > > 
> > > > Just continuing from an earlier question where I asked you how to
> > > > handle strings like "from:fred flintston*" (sorry I have lost the
> > > > original email). You advised me to write my own BooleanQuery and add
> > > > to it Prefix- / Term- / Phrase- Querys as appropriate.  I have done
> > > > so, but am having trouble with the result - my PhraseQueries just do
> > > > not get any hits at all  :-( My code looks for quotes - if it finds
> > > > them, it treats the quoted phrase as a PhraseQuery and sets the slop
> > > > factor to 0. so,  an input of:
> > > > 
> > > >    subject:"Good Morning"
> > > > 
> > > > results in a PhraseQuery (which I add to my BooleanQuery and then
> > > > dump with toString()) of:
> > > > 
> > > >    +subject:"good morning"
> > > > 
> > > > ... which fails. However, if I break it into 2 TermQuerys, it works
> > > > (but that's not what I want).
> > > > 
> > > > What am I missing?
> > > > 
> > > > Thanks,
> > > > 
> > > > - Chris





Re: newbie question (for John Griffin) - fixed

Posted by Chris Bamford <ch...@scalix.com>.
Hi John

Thanks for your continued interest in my travails!

==I'm not sure I understand. You want a phrase query so they should be
==passed as a phrase in quotes.

Ok... well I must be missing something then  :-(
This fails to return any hits for me:

        PhraseQuery pq = new  PhraseQuery();       
        pq.add(new Term("body", "aaa ddd"));

while
       PhraseQuery pq = new  PhraseQuery();       
       pq.add(new Term("body", "aaa"));
       pq.add(new Term("body", "ddd"));

works fine.
I have tried with both Lucene 2.0 and 2.3 jars.

Please advise!

Thanks,

-Chris
    BTW thanks for the tip about Luke


John Griffin wrote:
> Chris,
>
> -----Original Message-----
> From: Chris Bamford [mailto:chris.bamford@scalix.com] 
> Sent: Thursday, July 10, 2008 9:15 AM
> To: java-user@lucene.apache.org
> Subject: Re: newbie question (for John Griffin) - fixed
>
> Hi John,
>
> Please ignore my earlier questions on this subject, as I have got to the 
> bottom of it.
> I was not passing each word in the phrase as a separate Term to the 
> query; 
>
> ==I'm not sure I understand. You want a phrase query so they should be
> ==passed as a phrase in quotes.
>
>
> instead I was passing the whole string (doh!).
>
> Thanks.
>
> - Chris
>
> Chris Bamford wrote:
>   
>> Hi John,
>>
>> Further to my question below, I did some back-to-basics investigation 
>> of PhraseQueries and found that even basic ones fail for me...
>> I found the attached code on the Internet (see 
>> http://affy.blogspot.com/2003/04/codebit-examples-for-all-of-lucenes.html)
>>     
>
>   
>> and this fails too...  Can you explain why?  I would expect the first 
>> test to deliver 2 hits.
>>
>> I have tried with Lucene 2.0 and 2.3.2 jars and both fail.
>>
>> Thanks again,
>>
>> - Chris
>>
>>
>>
>> Chris Bamford wrote:
>>     
>>> Hi John,
>>>
>>> Just continuing from an earlier question where I asked you how to 
>>> handle strings like "from:fred flintston*" (sorry I have lost the 
>>> original email).
>>> You advised me to write my own BooleanQuery and add to it Prefix- / 
>>> Term- / Phrase- Querys as appropriate.  I have done so, but am having 
>>> trouble with the result - my PhraseQueries just do not get any hits 
>>> at all  :-(
>>> My code looks for quotes - if it finds them, it treats the quoted 
>>> phrase as a PhraseQuery and sets the slop factor to 0.
>>> so,  an input of:
>>>
>>>    subject:"Good Morning"
>>>
>>> results in a PhraseQuery (which I add to my BooleanQuery and then 
>>> dump with toString()) of:
>>>
>>>    +subject:"good morning"
>>>
>>> ... which fails.
>>> However, if I break it into 2 TermQuerys, it works (but that's not 
>>> what I want).
>>>
>>> What am I missing?
>>>
>>> Thanks,
>>>
>>> - Chris




RE: newbie question (for John Griffin) - fixed

Posted by John Griffin <jg...@thebluezone.net>.
Chris,

-----Original Message-----
From: Chris Bamford [mailto:chris.bamford@scalix.com] 
Sent: Thursday, July 10, 2008 9:15 AM
To: java-user@lucene.apache.org
Subject: Re: newbie question (for John Griffin) - fixed

Hi John,

Please ignore my earlier questions on this subject, as I have got to the 
bottom of it.
I was not passing each word in the phrase as a separate Term to the 
query; 

==I'm not sure I understand. You want a phrase query so they should be
==passed as a phrase in quotes.


instead I was passing the whole string (doh!).

Thanks.

- Chris

Chris Bamford wrote:
> Hi John,
>
> Further to my question below, I did some back-to-basics investigation 
> of PhraseQueries and found that even basic ones fail for me...
> I found the attached code on the Internet (see 
> http://affy.blogspot.com/2003/04/codebit-examples-for-all-of-lucenes.html)

> and this fails too...  Can you explain why?  I would expect the first 
> test to deliver 2 hits.
>
> I have tried with Lucene 2.0 and 2.3.2 jars and both fail.
>
> Thanks again,
>
> - Chris
>
>
>
> Chris Bamford wrote:
>> Hi John,
>>
>> Just continuing from an earlier question where I asked you how to 
>> handle strings like "from:fred flintston*" (sorry I have lost the 
>> original email).
>> You advised me to write my own BooleanQuery and add to it Prefix- / 
>> Term- / Phrase- Querys as appropriate.  I have done so, but am having 
>> trouble with the result - my PhraseQueries just do not get any hits 
>> at all  :-(
>> My code looks for quotes - if it finds them, it treats the quoted 
>> phrase as a PhraseQuery and sets the slop factor to 0.
>> so,  an input of:
>>
>>    subject:"Good Morning"
>>
>> results in a PhraseQuery (which I add to my BooleanQuery and then 
>> dump with toString()) of:
>>
>>    +subject:"good morning"
>>
>> ... which fails.
>> However, if I break it into 2 TermQuerys, it works (but that's not 
>> what I want).
>>
>> What am I missing?
>>
>> Thanks,
>>
>> - Chris




Re: newbie question (for John Griffin) - fixed

Posted by Chris Bamford <ch...@scalix.com>.
Hi John,

Please ignore my earlier questions on this subject, as I have got to the 
bottom of it.
I was not passing each word in the phrase as a separate Term to the 
query; instead I was passing the whole string (doh!).

Thanks.

- Chris
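
For illustration only, the splitting step described above might look like the sketch below. It is plain Java with no Lucene calls; QuotedPhraseSplitter and its methods are hypothetical names, and lowercasing each word is an assumption that the index was built with a lowercasing analyzer. Each returned word would then be added to the phrase query as its own Term.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; it only prepares the pieces, it does not build a query.
class QuotedPhraseSplitter {
    // Returns the field name before the first ':' in input such as subject:"Good Morning".
    static String field(String input) {
        int colon = input.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected field:\"phrase\"");
        }
        return input.substring(0, colon).trim();
    }

    // Returns the words of the quoted phrase, one entry per word.
    static List<String> terms(String input) {
        int colon = input.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected field:\"phrase\"");
        }
        String phrase = input.substring(colon + 1).trim();
        // Drop the surrounding quotes if both are present.
        if (phrase.length() >= 2 && phrase.startsWith("\"") && phrase.endsWith("\"")) {
            phrase = phrase.substring(1, phrase.length() - 1);
        }
        List<String> words = new ArrayList<>();
        for (String word : phrase.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                // Assumption: the index-time analyzer lowercased its terms.
                words.add(word.toLowerCase());
            }
        }
        return words;
    }

    public static void main(String[] args) {
        String input = "subject:\"Good Morning\"";
        System.out.println(field(input)); // subject
        System.out.println(terms(input)); // [good, morning]
    }
}
```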

Chris Bamford wrote:
> Hi John,
>
> Further to my question below, I did some back-to-basics investigation 
> of PhraseQueries and found that even basic ones fail for me...
> I found the attached code on the Internet (see 
> http://affy.blogspot.com/2003/04/codebit-examples-for-all-of-lucenes.html) 
> and this fails too...  Can you explain why?  I would expect the first 
> test to deliver 2 hits.
>
> I have tried with Lucene 2.0 and 2.3.2 jars and both fail.
>
> Thanks again,
>
> - Chris
>
>
>
> Chris Bamford wrote:
>> Hi John,
>>
>> Just continuing from an earlier question where I asked you how to 
>> handle strings like "from:fred flintston*" (sorry I have lost the 
>> original email).
>> You advised me to write my own BooleanQuery and add to it Prefix- / 
>> Term- / Phrase- Querys as appropriate.  I have done so, but am having 
>> trouble with the result - my PhraseQueries just do not get any hits 
>> at all  :-(
>> My code looks for quotes - if it finds them, it treats the quoted 
>> phrase as a PhraseQuery and sets the slop factor to 0.
>> so,  an input of:
>>
>>    subject:"Good Morning"
>>
>> results in a PhraseQuery (which I add to my BooleanQuery and then 
>> dump with toString()) of:
>>
>>    +subject:"good morning"
>>
>> ... which fails.
>> However, if I break it into 2 TermQuerys, it works (but that's not 
>> what I want).
>>
>> What am I missing?
>>
>> Thanks,
>>
>> - Chris




RE: Index different files in different folders in lucene

Posted by John Griffin <jg...@thebluezone.net>.
Starz,

P.S.

Even if you ran the program twice it still would not work. The program will
tell you to delete the 'index' directory before indexing. 

Bottom line is 'refactor the code to do what you want'.

John G.

-----Original Message-----
From: starz10de [mailto:farag_ahmed@yahoo.com] 
Sent: Sunday, July 06, 2008 4:34 AM
To: java-user@lucene.apache.org
Subject: RE: Index different files in different folders in lucene


Hi John,

Is it important to see my code? I thought it was a general question! I use
Lucene to index some files in one folder, but in my case I have different
files (text files) in two different folders, and I thought maybe Lucene could
index both at the same time. I am just using Lucene as it is and I don't have
any special code, just indexing. I need this because I have Arabic and
English files and I don't want to put them in one index. That makes no sense,
as they are not related: when you look for an English string you don't need
to look for it in the Arabic files, and so on. My question is: is it possible
for Lucene to index multiple folders at the same time and put them in
separate indexes?
Thanks

John Griffin-3 wrote:
> 
> Starz,
> 
> How about your code so we can see what you are doing? We're flying blind
> here.
> 
> John G.
> 
> -----Original Message-----
> From: starz10de [mailto:farag_ahmed@yahoo.com] 
> Sent: Saturday, July 05, 2008 12:41 PM
> To: java-user@lucene.apache.org
> Subject: Index different files in different folders in lucene
> 
> 
> Hi all,
> I am new to lucene , is it possible to Index different files in different
> folders in lucene
> 
> for examples , i have two folderes a and b , each contain several files.
> 
> in lucene args i wrote :  c:\a\ , c:\b\   but it does index only the first
> files in folder A  and it doesnt index any files in folder b.  
> is there any way to do that or i must put all files in one folder which is
> not nice way to do as i have different types of files and need them to be
> seperated.
> thanks in advance


RE: Index different files in different folders in lucene

Posted by starz10de <fa...@yahoo.com>.
John,
 Thanks for your kind help. I will try to modify the source code.


John Griffin-3 wrote:
> 
> Starz,
> 
> If you are using the LuceneDemo jar to index your docs its default
> behavior
> is to recursively index all files to an 'index' directory from a
> 'root-directory' you specify. So what you are trying to do won't work
> unless
> you modify the source to do what you want. It would not be that difficult
> to
> do.
> 
> JohnG.
> 
> -----Original Message-----
> From: starz10de [mailto:farag_ahmed@yahoo.com] 
> Sent: Sunday, July 06, 2008 4:34 AM
> To: java-user@lucene.apache.org
> Subject: RE: Index different files in different folders in lucene
> 
> 
> hi  John ,
> 
> Is important to know my code ? I though it is general question! I use
> lucene
> to index some files in one folder, but in my case I have different files
> (text files) in two different folderes and I though may be lucene could
> index both in same time. I am just lucene at it is and I dont have any
> special code. just indexing , I need this because i have an arabic and
> english files and i dont want to index them in one index as it make no
> sense
> as they are not related , so when you like to look for english string you
> dont need to look for it in arabic and so on. My question is it possible
> for
> lucene to index multiple folderes in  same time and put them in several
> indexes?
> thanks
> 
> John Griffin-3 wrote:
>> 
>> Starz,
>> 
>> How about your code so we can see what you are doing? We're flying blind
>> here.
>> 
>> John G.
>> 
>> -----Original Message-----
>> From: starz10de [mailto:farag_ahmed@yahoo.com] 
>> Sent: Saturday, July 05, 2008 12:41 PM
>> To: java-user@lucene.apache.org
>> Subject: Index different files in different folders in lucene
>> 
>> 
>> Hi all,
>> I am new to lucene , is it possible to Index different files in different
>> folders in lucene
>> 
>> for examples , i have two folderes a and b , each contain several files.
>> 
>> in lucene args i wrote :  c:\a\ , c:\b\   but it does index only the
>> first
>> files in folder A  and it doesnt index any files in folder b.  
>> is there any way to do that or i must put all files in one folder which
>> is
>> not nice way to do as i have different types of files and need them to be
>> seperated.
>> thanks in advance

-- 
View this message in context: http://www.nabble.com/Index-different-files-in-different-folders-in-lucene-tp18295066p18305778.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.




RE: Index different files in different folders in lucene

Posted by John Griffin <jg...@thebluezone.net>.
Starz,

If you are using the LuceneDemo jar to index your docs its default behavior
is to recursively index all files to an 'index' directory from a
'root-directory' you specify. So what you are trying to do won't work unless
you modify the source to do what you want. It would not be that difficult to
do.

JohnG.
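
As a rough idea of what such a modification could look like, here is a minimal sketch in plain Java. It is not the demo's actual source: MultiFolderPlan, the "indexes" output location and the one-index-per-folder naming are all assumptions made for illustration, and the Lucene indexing call itself is left out. The sketch only shows looping over several root folders and giving each one its own index directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical planner: decides which files belong to which index.
class MultiFolderPlan {
    // For each source folder, recursively collect its regular files.
    static Map<Path, List<Path>> filesPerFolder(List<Path> folders) throws IOException {
        Map<Path, List<Path>> plan = new LinkedHashMap<>();
        for (Path folder : folders) {
            List<Path> files = new ArrayList<>();
            try (Stream<Path> walk = Files.walk(folder)) {
                walk.filter(Files::isRegularFile).forEach(files::add);
            }
            Collections.sort(files);
            plan.put(folder, files);
        }
        return plan;
    }

    // Assumed naming scheme: files from c:\a go to <indexRoot>/a, from c:\b to <indexRoot>/b.
    static Path indexDirFor(Path indexRoot, Path folder) {
        Path name = folder.getFileName();
        if (name == null) {
            throw new IllegalArgumentException("folder has no name: " + folder);
        }
        return indexRoot.resolve(name.toString());
    }

    public static void main(String[] args) throws IOException {
        List<Path> folders = new ArrayList<>();
        for (String arg : args) {
            folders.add(Paths.get(arg));
        }
        Path indexRoot = Paths.get("indexes"); // hypothetical output location
        for (Map.Entry<Path, List<Path>> entry : filesPerFolder(folders).entrySet()) {
            // A real version would open one index writer per target directory here
            // and add a document for each file in entry.getValue().
            System.out.println(entry.getKey() + " -> " + indexDirFor(indexRoot, entry.getKey())
                    + " (" + entry.getValue().size() + " files)");
        }
    }
}
```

Run with both folders as separate arguments (for example c:\a c:\b, without the comma). Keeping one index per folder also matches the wish to search the Arabic and English files separately.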

-----Original Message-----
From: starz10de [mailto:farag_ahmed@yahoo.com] 
Sent: Sunday, July 06, 2008 4:34 AM
To: java-user@lucene.apache.org
Subject: RE: Index different files in different folders in lucene


Hi John,

Is it important to see my code? I thought it was a general question! I use
Lucene to index some files in one folder, but in my case I have different
files (text files) in two different folders, and I thought maybe Lucene could
index both at the same time. I am just using Lucene as it is and I don't have
any special code, just indexing. I need this because I have Arabic and
English files and I don't want to put them in one index. That makes no sense,
as they are not related: when you look for an English string you don't need
to look for it in the Arabic files, and so on. My question is: is it possible
for Lucene to index multiple folders at the same time and put them in
separate indexes?
Thanks

John Griffin-3 wrote:
> 
> Starz,
> 
> How about your code so we can see what you are doing? We're flying blind
> here.
> 
> John G.
> 
> -----Original Message-----
> From: starz10de [mailto:farag_ahmed@yahoo.com] 
> Sent: Saturday, July 05, 2008 12:41 PM
> To: java-user@lucene.apache.org
> Subject: Index different files in different folders in lucene
> 
> 
> Hi all,
> I am new to lucene , is it possible to Index different files in different
> folders in lucene
> 
> for examples , i have two folderes a and b , each contain several files.
> 
> in lucene args i wrote :  c:\a\ , c:\b\   but it does index only the first
> files in folder A  and it doesnt index any files in folder b.  
> is there any way to do that or i must put all files in one folder which is
> not nice way to do as i have different types of files and need them to be
> seperated.
> thanks in advance


RE: Index different files in different folders in lucene

Posted by starz10de <fa...@yahoo.com>.
Hi John,

Is it important to see my code? I thought it was a general question! I use
Lucene to index some files in one folder, but in my case I have different
files (text files) in two different folders, and I thought maybe Lucene could
index both at the same time. I am just using Lucene as it is and I don't have
any special code, just indexing. I need this because I have Arabic and
English files and I don't want to put them in one index. That makes no sense,
as they are not related: when you look for an English string you don't need
to look for it in the Arabic files, and so on. My question is: is it possible
for Lucene to index multiple folders at the same time and put them in
separate indexes?
Thanks

John Griffin-3 wrote:
> 
> Starz,
> 
> How about your code so we can see what you are doing? We're flying blind
> here.
> 
> John G.
> 
> -----Original Message-----
> From: starz10de [mailto:farag_ahmed@yahoo.com] 
> Sent: Saturday, July 05, 2008 12:41 PM
> To: java-user@lucene.apache.org
> Subject: Index different files in different folders in lucene
> 
> 
> Hi all,
> I am new to Lucene. Is it possible to index different files in different
> folders in Lucene?
> 
> For example, I have two folders, a and b, each containing several files.
> 
> In the Lucene args I wrote:  c:\a\ , c:\b\   but it indexes only the files
> in the first folder (a) and doesn't index any files in folder b.
> Is there any way to do that, or must I put all files in one folder? That is
> not a nice way to do it, as I have different types of files and need them
> to be separated.
> Thanks in advance

-- 
View this message in context: http://www.nabble.com/Index-different-files-in-different-folders-in-lucene-tp18295066p18300833.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.




RE: newbie question (for John Griffin)

Posted by John Griffin <jg...@thebluezone.net>.
Chris,

The code you refer to in the blog is 5 years old! Some of the code is no
longer valid with the newer Lucene jars. I wouldn't use it to test anything.


My suspicion is that your index itself is suspect. Let's see the code you
use to build the index with a small data set that will show what you are
trying to accomplish.

BUT FIRST! Look at your built index with Luke before doing this to make sure
that what you THINK you have in your index is really what you have.

Luke is at http://www.getopt.org/luke/. This is probably THE most important
tool you'll have in your arsenal and is pretty easy to use. You can query
your index with it and see if it responds the way you think it should. You
can enter your subject:"Good Morning" query and see what happens. If Luke
can't find what you're querying for, then your code won't either.

John G.


-----Original Message-----
From: Chris Bamford [mailto:chris.bamford@scalix.com] 
Sent: Thursday, July 10, 2008 5:58 AM
To: java-user@lucene.apache.org
Subject: Re: newbie question (for John Griffin)

Hi John,

Further to my question below, I did some back-to-basics investigation of 
PhraseQueries and found that even basic ones fail for me...
I found the attached code on the Internet (see 
http://affy.blogspot.com/2003/04/codebit-examples-for-all-of-lucenes.html) 
and this fails too...  Can you explain why?  I would expect the first 
test to deliver 2 hits.

I have tried with Lucene 2.0 and 2.3.2 jars and both fail.

Thanks again,

- Chris



Chris Bamford wrote:
> Hi John,
>
> Just continuing from an earlier question where I asked you how to 
> handle strings like "from:fred flintston*" (sorry I have lost the 
> original email).
> You advised me to write my own BooleanQuery and add to it Prefix- / 
> Term- / Phrase- Querys as appropriate.  I have done so, but am having 
> trouble with the result - my PhraseQueries just do not get any hits at 
> all  :-(
> My code looks for quotes - if it finds them, it treats the quoted 
> phrase as a PhraseQuery and sets the slop factor to 0.
> so,  an input of:
>
>    subject:"Good Morning"
>
> results in a PhraseQuery (which I add to my BooleanQuery and then dump 
> with toString()) of:
>
>    +subject:"good morning"
>
> ... which fails.
> However, if I break it into 2 TermQuerys, it works (but that's not 
> what I want).
>
> What am I missing?
>
> Thanks,
>
> - Chris
>


-- 
------------------------------------------------------------------------
*Chris Bamford*
Senior Development Engineer 	<http://www.scalix.com>
------------------------------------------------------------------------
/Email / MSN/ 	chris.bamford@scalix.com
/Tel/ 	+44 (0)1344 381814 	  	/Skype/ 	c.bamford





Re: newbie question (for John Griffin)

Posted by Chris Bamford <ch...@scalix.com>.
Hi John,

Further to my question below, I did some back-to-basics investigation of 
PhraseQueries and found that even basic ones fail for me...
I found the attached code on the Internet (see 
http://affy.blogspot.com/2003/04/codebit-examples-for-all-of-lucenes.html) 
and this fails too...  Can you explain why?  I would expect the first 
test to deliver 2 hits.

I have tried with Lucene 2.0 and 2.3.2 jars and both fail.

Thanks again,

- Chris



Chris Bamford wrote:
> Hi John,
>
> Just continuing from an earlier question where I asked you how to 
> handle strings like "from:fred flintston*" (sorry I have lost the 
> original email).
> You advised me to write my own BooleanQuery and add to it Prefix- / 
> Term- / Phrase- Querys as appropriate.  I have done so, but am having 
> trouble with the result - my PhraseQueries just do not get any hits at 
> all  :-(
> My code looks for quotes - if it finds them, it treats the quoted 
> phrase as a PhraseQuery and sets the slop factor to 0.
> so,  an input of:
>
>    subject:"Good Morning"
>
> results in a PhraseQuery (which I add to my BooleanQuery and then dump 
> with toString()) of:
>
>    +subject:"good morning"
>
> ... which fails.
> However, if I break it into 2 TermQuerys, it works (but that's not 
> what I want).
>
> What am I missing?
>
> Thanks,
>
> - Chris
>


-- 
------------------------------------------------------------------------
*Chris Bamford*
Senior Development Engineer 	<http://www.scalix.com>
------------------------------------------------------------------------
/Email / MSN/ 	chris.bamford@scalix.com
/Tel/ 	+44 (0)1344 381814 	  	/Skype/ 	c.bamford


Re: newbie question

Posted by Chris Bamford <ch...@scalix.com>.
Hi John,

Just continuing from an earlier question where I asked you how to handle 
strings like "from:fred flintston*" (sorry I have lost the original email).
You advised me to write my own BooleanQuery and add to it Prefix- / 
Term- / Phrase- Querys as appropriate.  I have done so, but am having 
trouble with the result - my PhraseQueries just do not get any hits at 
all  :-(
My code looks for quotes: if it finds them, it treats the quoted phrase 
as a PhraseQuery and sets the slop factor to 0.
So, an input of:

    subject:"Good Morning"

results in a PhraseQuery (which I add to my BooleanQuery and then dump 
with toString()) of:

    +subject:"good morning"

... which fails.
However, if I break it into 2 TermQuerys, it works (but that's not what 
I want).

What am I missing?

Thanks,

- Chris
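
Another reply in this thread notes that PhraseQuery does no parsing and that adding terms one at a time is the correct usage. A minimal sketch of that pre-tokenization step, using only the JDK: PhraseTerms and tokenize are hypothetical names, and the lowercasing plus whitespace split is an assumption that only roughly mimics a simple analyzer; the real tokens must match whatever analyzer built the index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PhraseTerms {

    /** Splits a (possibly quoted) phrase into lower-cased, whitespace-separated tokens. */
    static List<String> tokenize(String phrase) {
        List<String> tokens = new ArrayList<String>();
        String stripped = phrase.trim();
        // Drop one pair of surrounding double quotes, if present.
        if (stripped.length() >= 2 && stripped.startsWith("\"") && stripped.endsWith("\"")) {
            stripped = stripped.substring(1, stripped.length() - 1);
        }
        for (String piece : stripped.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("\"Good Morning\"")) {
            // Assumption: with Lucene, each token becomes its own term, e.g.
            // phraseQuery.add(new Term("subject", token));
            // rather than a single term containing the embedded space.
            System.out.println("subject:" + token);
        }
    }
}
```

The point of the sketch is that the phrase is split before anything reaches the query object, so no single term ever contains a space.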



RE: Index different files in different folders in lucene

Posted by John Griffin <jg...@thebluezone.net>.
Starz,

How about your code so we can see what you are doing? We're flying blind
here.

John G.

-----Original Message-----
From: starz10de [mailto:farag_ahmed@yahoo.com] 
Sent: Saturday, July 05, 2008 12:41 PM
To: java-user@lucene.apache.org
Subject: Index different files in different folders in lucene


Hi all,
I am new to Lucene. Is it possible to index different files in different
folders in Lucene?

For example, I have two folders, a and b, each containing several files.

In the Lucene args I wrote:  c:\a\ , c:\b\   but it indexes only the files
in the first folder (a) and doesn't index any files in folder b.
Is there any way to do that, or must I put all files in one folder? That is
not a nice way to do it, as I have different types of files and need them
to be separated.
Thanks in advance