Posted to dev@vxquery.apache.org by Menaka Madushanka <me...@gmail.com> on 2016/06/19 20:38:58 UTC

Updating Lucene Index

Hello,

I created the update index query as follows.

Query Structure:
*update-index(collection_folder, index_folder)*

However, I encountered a problem while implementing it.

In the current Lucene indexing implementation, one document is created for
each XML file.

To retrieve a specific document from the index, each document needs a unique
field, such as the file name or path, that it can be looked up by. (In the
current implementation, only the XML data is stored in a document.)

So I propose adding the file path to each document as a field. With the path
stored in the document, the index can be updated easily using the methodology
I proposed, without storing any other metadata. Currently, the file name,
path, and checksum value are taken as metadata.
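
A minimal sketch of how the file path could act as the unique key during an
update. It is not the VXQuery implementation and does not call Lucene: the
`index` dict stands in for the Lucene index (one document per XML file, keyed
by path), and the `metadata` dict stands in for the separately kept checksum
metadata. The names `update_index`, `checksum`, `index`, and `metadata`, and
the choice of MD5, are assumptions made for illustration only.

```python
import hashlib
from pathlib import Path


def checksum(path: Path) -> str:
    """Return a hex digest of the file's bytes (MD5 is an assumed choice)."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def update_index(collection_folder: Path, index: dict, metadata: dict) -> dict:
    """Synchronise `index` with the XML files under `collection_folder`.

    index:    hypothetical stand-in for the Lucene index, mapping
              file path -> document ({"path": ..., "xml": ...}).
    metadata: hypothetical stand-in for the stored metadata, mapping
              file path -> checksum of the file when it was last indexed.
    Returns the paths that were added, updated, and deleted.
    """
    changes = {"added": [], "updated": [], "deleted": []}
    on_disk = {str(p): p for p in sorted(collection_folder.rglob("*.xml"))}

    for key, path in on_disk.items():
        digest = checksum(path)
        if key not in index:
            changes["added"].append(key)
        elif metadata.get(key) != digest:
            changes["updated"].append(key)
        else:
            continue  # file unchanged since it was last indexed
        # The path field identifies the one document to add or replace.
        index[key] = {"path": key, "xml": path.read_text()}
        metadata[key] = digest

    # Documents whose source file no longer exists are dropped by path.
    for key in [k for k in index if k not in on_disk]:
        del index[key]
        metadata.pop(key, None)
        changes["deleted"].append(key)

    return changes
```

In Lucene, the replace and delete steps would presumably map to
`IndexWriter.updateDocument` and `IndexWriter.deleteDocuments` with a `Term`
on the path field; the sketch only shows the bookkeeping around them.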

I would appreciate any feedback on this.

Thank you very much
Menaka

-- 
*Menaka Madushanka Jayawardena*
Faculty of Engineering, <http://www.pdn.ac.lk/eng>
University of Peradeniya.
LinkedIn <http://lk.linkedin.com/in/menakajayawardena>
TP:- 071 885 1183/ 071 350 5470

Re: Updating Lucene Index

Posted by Menaka Madushanka <me...@gmail.com>.
That field will be for the document as a whole. One unique field per
document.

On 20 June 2016 at 04:47, Steven Jacobs <sj...@ucr.edu> wrote:

> Hi,
> Great progress!
> Would this be part of the data for each item, or is there metadata for the
> document as a whole?
> Steven

Re: Updating Lucene Index

Posted by Steven Jacobs <sj...@ucr.edu>.
Hi,
Great progress!
Would this be part of the data for each item, or is there metadata for the
document as a whole?
Steven