Posted to java-user@lucene.apache.org by "Thomas J. Buhr" <vi...@gmail.com> on 2009/03/12 07:17:37 UTC

Dynamic Indexing?

Lucene,

From what I have read on your website, indexing seems like a useful
thing. I'm considering using Lucene in a company project and have a
few research questions.

What I'm considering is using Lucene as a backend data store for a
graphic editor. The usage examples on your website typically involve
indexing data from a file and then searching the index.

What I would like to do is create the data in a text editor in the UI
and store the text data in an index as it is being typed. Later, I
would search for the data, display it as text characters in the text
editor again, and edit it further.

The kind of text I’m creating and editing is unique. Each letter of a
word can have multiple character versions. Below is a sample in which
there are three possible letters for the second letter position of the
word: an “e”, an “a”, or an “i”.

1   2   3   4   5
H   e   l   l   o
    a
    i

I need to store all these characters, for the five positions they
occupy, as string values of one field in an index document. I also
need two more document fields: one storing style data (bold, italic…)
and one storing color data for each of the five positions. All three
fields (text chars, style info, color info) need to be positionally
aligned.
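
To make the layout concrete, here is a minimal sketch of the three aligned rows in plain Java. It is only an in-memory illustration, not Lucene API: the class AlignedRows, its methods, and the style and color values ("bold", "red", ...) are all hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory model (not Lucene API): three parallel rows where
// index i in every row describes letter position i + 1.
final class AlignedRows {
    final List<List<String>> chars = new ArrayList<>(); // letter variants per position
    final List<String> styles = new ArrayList<>();      // assumed values: "plain", "bold", "italic"
    final List<String> colors = new ArrayList<>();      // assumed values: "black", "red"

    // Appends one letter position to all three rows at once so they stay aligned.
    void append(List<String> variants, String style, String color) {
        chars.add(new ArrayList<>(variants));
        styles.add(style);
        colors.add(color);
    }

    // True when every row has exactly one entry per letter position.
    boolean isAligned() {
        return chars.size() == styles.size() && styles.size() == colors.size();
    }

    // Builds the "Hello" sample, with "e", "a" and "i" sharing position 2.
    static AlignedRows helloSample() {
        AlignedRows rows = new AlignedRows();
        rows.append(Arrays.asList("H"), "bold", "black");
        rows.append(Arrays.asList("e", "a", "i"), "italic", "red");
        rows.append(Arrays.asList("l"), "plain", "black");
        rows.append(Arrays.asList("l"), "plain", "black");
        rows.append(Arrays.asList("o"), "plain", "black");
        return rows;
    }
}
```

Each row would correspond to one of the three document fields, and the shared list index plays the role of the letter position.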

Some questions on data formatting and positional selection in my  
scenario:

1 - Is it possible to store multiple chars for one position? How do I
specify the format? Can I use a delimited string list like
x, (x, x, x), x, x, x?

2 - My text editor has a method that tells me which letter position
in a string of words I clicked on with the mouse. Can I then create a
query and select the data for all three fields at that letter
position? In my view of indexing, fields are like rows of values, so
this would be like picking out a complete column across the three
fields (the values intersecting the three rows at the given
position). Is this easy or difficult to do?
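
As a rough illustration of questions 1 and 2, the sketch below parses the delimited format into one list of variants per position and then picks out a "column" for a 1-based letter position. This is plain Java working on in-memory lists, not a Lucene query; PositionalColumns, parseRow and columnAt are hypothetical names, and the parser assumes that letters are never commas or parentheses themselves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not Lucene API) for the delimited row format.
final class PositionalColumns {

    // Parses "H, (e, a, i), l, l, o" into [[H], [e, a, i], [l], [l], [o]].
    static List<List<String>> parseRow(String row) {
        List<List<String>> cells = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // greater than 0 while inside a parenthesized group
        for (char c : row.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == ',' && depth == 0) {
                // A top-level comma closes the current position.
                cells.add(splitVariants(current.toString()));
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        cells.add(splitVariants(current.toString()));
        return cells;
    }

    // Splits the contents of one position into its trimmed variants.
    private static List<String> splitVariants(String cell) {
        List<String> variants = new ArrayList<>();
        for (String v : cell.split(",")) {
            variants.add(v.trim());
        }
        return variants;
    }

    // Returns [letters, style, color] for a 1-based letter position,
    // joining multiple letter variants with "/".
    static List<String> columnAt(List<List<String>> chars, List<String> styles,
                                 List<String> colors, int position) {
        int i = position - 1;
        return Arrays.asList(String.join("/", chars.get(i)), styles.get(i), colors.get(i));
    }
}
```

For the "Hello" sample, columnAt with position 2 returns the letters e/a/i together with whatever style and color are stored at index 1 of the other two rows.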

More questions based on my simple data creation and editing scenario:

3 - If I type a new letter into a given word in the editor, can I
insert its data at the right position in all three fields used to
describe/tag my letters? Do I have to create a whole new document and
re-index it, or is the document flexible enough to allow in-place
dynamic editing? What if I delete a letter?

4 - If I change a letter or its style or color tags, can these edits
be dynamically updated at the specific row and column intersection
(cell)? Or do I need to re-index the whole document again?

5 - What does the term “local alignment” mean in search engine  
parlance? Is this referring to data value positions across fields  
(rows) of data?

6 - Has anyone ever used an index as a “Local History” system for  
undo/redo operations? Would this be feasible?

7 - Is there a standard way to export/write documents out to a file  
like JSON or XML?

8 - The Jackpot plugin for the NetBeans IDE is a useful refactoring
tool. If Lucene were used as the model for Java code or other data,
could a refactoring system like Jackpot be built to operate on data
in a Lucene index? Would the performance of Lucene be good enough to
be used for in-place dynamic editing/indexing?
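
Going back to questions 3 and 4, the edits described there can be sketched on in-memory rows as follows. This is plain Java with hypothetical names (EditableRows, insertAt, deleteAt, setStyle, setColor) and says nothing about how Lucene itself would handle such edits; it only pins down the operations being asked about.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory rows (not Lucene API); positions are 1-based.
final class EditableRows {
    final List<List<String>> chars = new ArrayList<>();
    final List<String> styles = new ArrayList<>();
    final List<String> colors = new ArrayList<>();

    // Question 3: insert a letter so that it becomes the given position,
    // shifting later positions right in all three rows.
    void insertAt(int position, String letter, String style, String color) {
        int i = position - 1;
        List<String> variants = new ArrayList<>();
        variants.add(letter);
        chars.add(i, variants);
        styles.add(i, style);
        colors.add(i, color);
    }

    // Question 3: delete a letter and its tags from all three rows.
    void deleteAt(int position) {
        int i = position - 1;
        chars.remove(i);
        styles.remove(i);
        colors.remove(i);
    }

    // Question 4: change a single cell without touching its neighbours.
    void setStyle(int position, String style) {
        styles.set(position - 1, style);
    }

    void setColor(int position, String color) {
        colors.set(position - 1, color);
    }

    // Renders the text using the first variant stored at each position.
    String text() {
        StringBuilder sb = new StringBuilder();
        for (List<String> variants : chars) {
            sb.append(variants.get(0));
        }
        return sb.toString();
    }
}
```

Because every edit touches the same index in all three rows, the rows stay positionally aligned after each operation.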


Thanks, hope this can work...

Thom