Posted to java-user@lucene.apache.org by Jugal Kolariya <ju...@rancoretech.com> on 2013/08/12 15:47:57 UTC

Creating Indexes when data inside the file is being written.

Hello,

I have a potential use case, and I am not sure whether Lucene will help
me with it.

In my case, I am creating a new file and writing data to that file.

While the file is still being written, I would like to create a Lucene
index for it. Once the index is created, I can then perform operations
on it.

I want to know whether I can index a file while my code is still
writing data to it.

If yes, what about the incremental changes that happen in the file?
Those details won't be captured in the index. Does this effectively mean
that every time I want to search, I will first have to create the index
and then perform the search?

Any guidance on this will be highly appreciated. The Javadoc does not
seem to provide relevant information on this. The Lucene version is 4.4.0.
-- 


        Thanks & regards
        /Jugal Kishore Kolariya
        /


Re: Creating Indexes when data inside the file is being written.

Posted by Ian Lea <ia...@gmail.com>.
I'm not sure what you're getting at.  If you've got one job reading
data, writing to an output file and indexing as you go, it should
work.  If you've got multiple jobs trying to write to the same output
file and lucene index you'll need some external synchronisation.
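One way to read "external synchronisation", sketched under assumptions: the jobs are threads inside a single JVM, and the file write and the index update for a record should happen as one step. `SharedSink` and its in-memory `indexed` list are hypothetical names for illustration; the list only stands in for the index, and a real version would call Lucene's `IndexWriter.addDocument` at that point. Jobs running as separate processes would need a different mechanism and are not covered here.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Output file plus index behind one lock, so several writer threads can share them. */
final class SharedSink implements AutoCloseable {
    private final BufferedWriter out;
    // Hypothetical stand-in for the index; a real version would add a Lucene document here.
    private final List<String> indexed = new ArrayList<>();

    SharedSink(Path outputFile) throws IOException {
        this.out = Files.newBufferedWriter(outputFile, StandardCharsets.UTF_8);
    }

    /** Writes the record to the file and adds it to the index as one atomic step. */
    synchronized void append(String record) throws IOException {
        out.write(record);
        out.newLine();
        indexed.add(record);
    }

    /** Copy of everything indexed so far, in the order it was added. */
    synchronized List<String> indexedSnapshot() {
        return new ArrayList<>(indexed);
    }

    @Override
    public synchronized void close() throws IOException {
        out.close();
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("shared", ".out");
        try (SharedSink sink = new SharedSink(file)) {
            // Each job appends three records through the shared, synchronized sink.
            Runnable job = () -> {
                for (int i = 0; i < 3; i++) {
                    try {
                        sink.append(Thread.currentThread().getName() + " record " + i);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }
            };
            Thread a = new Thread(job, "job-a");
            Thread b = new Thread(job, "job-b");
            a.start();
            b.start();
            a.join();
            b.join();
            System.out.println(sink.indexedSnapshot().size() + " records indexed");
        }
    }
}
```

Because both steps sit inside the same `synchronized` method, the order of lines in the file matches the order in which records reached the index, whichever thread wrote them.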


--
Ian.


On Tue, Aug 13, 2013 at 10:39 AM, Jugal Kolariya
<ju...@rancoretech.com> wrote:
> Probably my last doubt:
>
> The data in my application comes from a stream, after some processing.
>
> This stream is continuously being written to the file.
>
> So, effectively, if I open a Lucene index and index this file, will I
> be able to create the index? Won't I get some kind of exception from
> the file, since I am still writing data to it?
>
> Guidance is highly appreciated.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Creating Indexes when data inside the file is being written.

Posted by Jugal Kolariya <ju...@rancoretech.com>.
Probably my last doubt:

The data in my application comes from a stream, after some processing.

This stream is continuously being written to the file.

So, effectively, if I open a Lucene index and index this file, will I
be able to create the index? Won't I get some kind of exception from
the file, since I am still writing data to it?

Guidance is highly appreciated.

On 13-08-2013 PM 02:01, Ian Lea wrote:
> If I've understood your question correctly, the answer is yes.
> Assuming the input data is coming from another file the flow will be
> along the lines of
>
> .  Open input file for reading
> .  Open output file for writing
> .  Open (or create) lucene index
>
> .  For each input record
> -   write to output file
> -   add to lucene
>
> Then at the end close the files and the index and you're done.
>
>
> --
> Ian.


Re: Creating Indexes when data inside the file is being written.

Posted by Ian Lea <ia...@gmail.com>.
If I've understood your question correctly, the answer is yes.
Assuming the input data is coming from another file the flow will be
along the lines of

.  Open input file for reading
.  Open output file for writing
.  Open (or create) lucene index

.  For each input record
-   write to output file
-   add to lucene

Then at the end close the files and the index and you're done.
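The flow above can be sketched in Java roughly as follows. This is a minimal illustration, not Lucene code: `RecordIndexer` and `WriteAndIndex` are hypothetical names, and the indexer interface only stands in for the index, where a real version would build a document per record and pass it to Lucene's `IndexWriter.addDocument`. It also assumes one record per line.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the index side of the flow. */
interface RecordIndexer {
    void add(String record) throws IOException;
}

final class WriteAndIndex {
    /** Copies each input line to the output file and hands the same line to the indexer. */
    static int run(Path input, Path output, RecordIndexer indexer) throws IOException {
        int count = 0;
        // try-with-resources closes both files at the end, matching the last step of the flow.
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String record;
            while ((record = in.readLine()) != null) {
                out.write(record);   // write to output file
                out.newLine();
                indexer.add(record); // add to the index in the same loop iteration
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path input = Files.createTempFile("records", ".in");
        Path output = Files.createTempFile("records", ".out");
        Files.write(input, Arrays.asList("first record", "second record"), StandardCharsets.UTF_8);

        List<String> indexed = new ArrayList<>(); // in-memory stand-in for the index
        int n = run(input, output, indexed::add);
        System.out.println(n + " records written and indexed");
    }
}
```

The point of the sketch is that the index is fed from the records themselves, in the same loop that writes them, rather than by re-reading the output file while it is still growing.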


--
Ian.


On Tue, Aug 13, 2013 at 9:16 AM, Jugal Kolariya
<ju...@rancoretech.com> wrote:
> That only answers the second part of my question.
>
> My most important question still remains:
>
> "
>
> In my case, I am creating a new file and writing data to that file.
>
> While the file is still being written, I would like to create a Lucene
> index for it. Once the index is created, I can then perform operations
> on it.
>
> I want to know whether I can index a file while my code is still
> writing data to it.
>
> "
>
> Is it possible to index a file while data is still being written to it?
>
> Any guidance will be highly appreciated.


Re: Creating Indexes when data inside the file is being written.

Posted by Jugal Kolariya <ju...@rancoretech.com>.
That only answers the second part of my question.

My most important question still remains:

"

In my case, I am creating a new file and writing data to that file.

While the file is still being written, I would like to create a Lucene
index for it. Once the index is created, I can then perform operations
on it.

I want to know whether I can index a file while my code is still
writing data to it.

"

Is it possible to index a file while data is still being written to it?

Any guidance will be highly appreciated.


On 12-08-2013 PM 10:46, Michael McCandless wrote:
> You'll have to periodically re-index that document if its content is
> constantly changing.
>
> Alternatively, it's possible to index sub-documents so that each new
> "chunk" of content added becomes a new document, and then you join or
> group the results back into a single document ...
>
> Mike McCandless
>
> http://blog.mikemccandless.com


Re: Creating Indexes when data inside the file is being written.

Posted by Michael McCandless <lu...@mikemccandless.com>.
You'll have to periodically re-index that document if its content is
constantly changing.

Alternatively, it's possible to index sub-documents so that each new
"chunk" of content added becomes a new document, and then you join or
group the results back into a single document ...
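The sub-document idea can be illustrated with a small in-memory sketch. Nothing here is Lucene API: `ChunkIndex`, `addChunk` and `search` are hypothetical names, the linear scan stands in for a real query, and the file id plays the role of a field that every chunk document would carry so that hits can be grouped back per file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** In-memory stand-in for an index where each appended chunk is its own document. */
final class ChunkIndex {
    private static final class Chunk {
        final String fileId; // identifies the file the chunk belongs to
        final int seq;       // position of the chunk within that file
        final String text;

        Chunk(String fileId, int seq, String text) {
            this.fileId = fileId;
            this.seq = seq;
            this.text = text;
        }
    }

    private final List<Chunk> chunks = new ArrayList<>();
    private final Map<String, Integer> chunkCounts = new LinkedHashMap<>();

    /** Adds newly written content as a new chunk; earlier chunks are never re-indexed. */
    void addChunk(String fileId, String text) {
        int seq = chunkCounts.merge(fileId, 1, Integer::sum) - 1;
        chunks.add(new Chunk(fileId, seq, text));
    }

    /** Finds chunks containing the word (case-insensitive) and groups their positions by file. */
    Map<String, List<Integer>> search(String term) {
        String wanted = term.toLowerCase(Locale.ROOT);
        Map<String, List<Integer>> hitsByFile = new LinkedHashMap<>();
        for (Chunk c : chunks) {
            List<String> words = Arrays.asList(c.text.toLowerCase(Locale.ROOT).split("\\W+"));
            if (words.contains(wanted)) {
                hitsByFile.computeIfAbsent(c.fileId, k -> new ArrayList<>()).add(c.seq);
            }
        }
        return hitsByFile;
    }

    public static void main(String[] args) {
        ChunkIndex index = new ChunkIndex();
        index.addChunk("a.log", "error starting up");
        index.addChunk("a.log", "all fine");
        index.addChunk("b.log", "Error here");
        index.addChunk("a.log", "another ERROR");
        // One entry per file, listing the chunks of that file that matched.
        System.out.println(index.search("error"));
    }
}
```

Compared with periodically re-indexing the whole file, this keeps the cost of each update proportional to the new chunk only, at the price of a grouping step at search time.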

Mike McCandless

http://blog.mikemccandless.com


On Mon, Aug 12, 2013 at 9:47 AM, Jugal Kolariya
<ju...@rancoretech.com> wrote:
> Hello,
>
> I have a potential use case, and I am not sure whether Lucene will help
> me with it.
>
> In my case, I am creating a new file and writing data to that file.
>
> While the file is still being written, I would like to create a Lucene
> index for it. Once the index is created, I can then perform operations
> on it.
>
> I want to know whether I can index a file while my code is still
> writing data to it.
>
> If yes, what about the incremental changes that happen in the file?
> Those details won't be captured in the index. Does this effectively mean
> that every time I want to search, I will first have to create the index
> and then perform the search?
>
> Any guidance on this will be highly appreciated. The Javadoc does not
> seem to provide relevant information on this. The Lucene version is 4.4.0.
> --
>
>
>        Thanks & regards
>        /Jugal Kishore Kolariya
>        /
>
