Posted to dev@vxquery.apache.org by Menaka Madushanka <me...@gmail.com> on 2016/07/17 13:17:38 UTC

Modifications for xml metadata file structure.

Hello,

Currently the XML metadata is stored in the same directory as the indexes.
The collection information is also stored in that file.
In update and delete queries, the path to the index is given so that the
metadata file can be located and the task performed.

However, according to the requirements, the metadata file will be stored
separately, and the metadata for all indexes will be kept in that one file.
The structure will be as follows.

<indexes>
    <index collection="path_to_collection_1">
        <file>
            <path>/home/menaka/xml/catalog.xml</path>
            <md5>44AC8A401C32384D9EB00952E1C96685</md5>
            <fileName>catalog.xml</fileName>
            <lastModified>10/07/2016 23:41:13</lastModified>
        </file>
        <file>
        </file>
        .
        .
        .
    </index>
    <index collection="path_to_collection_2">
        <file>
            <path>path_to_collection_2/catalog.xml</path>
            <md5>44AC8A401C32384D9EB00952E1C96685</md5>
            <fileName>catalog.xml</fileName>
            <lastModified>10/07/2016 23:41:13</lastModified>
        </file>
        <file>
        </file>
        .
        .
        .
    </index>

</indexes>

In this format, each index entry records its collection directory. But what
we really need is the index location, because we already have the paths of
the XML documents in the <path> elements.
Since all indexes are saved in a single file and we pass path-to-index as the
parameter of the update and delete index queries, we should be able to
retrieve the metadata of the XML files belonging to the given index.

With the structure above we cannot do that unless we store the metadata file
inside the index folder. So I suggest storing path-to-index instead of
path-to-collection.

The new structure would be,

<indexes>
    <index location="path_to_index_1">
        <file>
            <path>/home/menaka/xml/catalog.xml</path>
            <md5>44AC8A401C32384D9EB00952E1C96685</md5>
            <fileName>catalog.xml</fileName>
            <lastModified>10/07/2016 23:41:13</lastModified>
        </file>
        <file>
        </file>
        .
        .
        .
    </index>
    <index location="path_to_index_2">
        <file>
            <path>path_to_collection_2/catalog.xml</path>
            <md5>44AC8A401C32384D9EB00952E1C96685</md5>
            <fileName>catalog.xml</fileName>
            <lastModified>10/07/2016 23:41:13</lastModified>
        </file>
        <file>
        </file>
        .
        .
        .
    </index>

</indexes>
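
As a rough illustration of the lookup this structure allows, the Python
sketch below finds the <index> element whose location attribute matches the
path given to an update or delete query and returns the stored metadata of
its files. The function name files_for_index and the METADATA sample are made
up for this example; the actual VXQuery code is written in Java and may work
differently.

```python
import xml.etree.ElementTree as ET

# Sample metadata in the proposed structure (values are illustrative only).
METADATA = """
<indexes>
    <index location="path_to_index_1">
        <file>
            <path>/home/menaka/xml/catalog.xml</path>
            <md5>44AC8A401C32384D9EB00952E1C96685</md5>
            <fileName>catalog.xml</fileName>
            <lastModified>10/07/2016 23:41:13</lastModified>
        </file>
    </index>
    <index location="path_to_index_2">
        <file>
            <path>path_to_collection_2/catalog.xml</path>
            <md5>44AC8A401C32384D9EB00952E1C96685</md5>
            <fileName>catalog.xml</fileName>
            <lastModified>10/07/2016 23:41:13</lastModified>
        </file>
    </index>
</indexes>
"""


def files_for_index(metadata_xml, index_path):
    """Return one dict per <file> under the <index> whose location matches."""
    root = ET.fromstring(metadata_xml)
    for index in root.findall("index"):
        if index.get("location") == index_path:
            # Each <file> becomes {"path": ..., "md5": ..., ...}.
            return [{child.tag: child.text for child in entry}
                    for entry in index.findall("file")]
    return None  # no index is registered at that path


print(files_for_index(METADATA, "path_to_index_2"))
```

Because the lookup key is the index path itself, the same call works no
matter where the metadata file is kept.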

Thank you very much
Menaka

-- 
*Menaka Madushanka Jayawardena*
Faculty of Engineering, <http://www.pdn.ac.lk/eng>
University of Peradeniya.
LinkedIn <http://lk.linkedin.com/in/menakajayawardena>
TP:- 071 885 1183/ 071 350 5470

Re: Modifications for xml metadata file structure.

Posted by Preston Carman <pr...@apache.org>.
Great.

On Sun, Jul 17, 2016 at 11:52 AM, Menaka Madushanka
<me...@gmail.com> wrote:
> Hello,
>
> I have already made the modification but forgot to update the thread.
> The new structure is as follows.
>
> <indexes>
>     <index location="/home/menaka/index" collection="/home/menaka/xml">
>         <file>
>             <path>/home/menaka/xml/US000000001.xml</path>
>             <md5>6790FD58A71834EA2D2CE8A24D1869CE</md5>
>             <fileName>US000000001.xml</fileName>
>             <lastModified>22/03/2016, 11:22:25</lastModified>
>         </file>
>         <file>
>             <path>/home/menaka/xml/US000000001 (copy).xml</path>
>             <md5>48EB6806A2B3AAA9871F4ABE683D9BB5</md5>
>             <fileName>US000000001 (copy).xml</fileName>
>             <lastModified>17/07/2016 19:42:43</lastModified>
>         </file>
>     </index>
> </indexes>
>
> This way the process can be executed regardless of where the metadata
> file is located.
>
> Thank you very much
> Menaka

Re: Modifications for xml metadata file structure.

Posted by Menaka Madushanka <me...@gmail.com>.
Hello,

I have already made the modification but forgot to update the thread.
The new structure is as follows.

<indexes>
    <index location="/home/menaka/index" collection="/home/menaka/xml">
        <file>
            <path>/home/menaka/xml/US000000001.xml</path>
            <md5>6790FD58A71834EA2D2CE8A24D1869CE</md5>
            <fileName>US000000001.xml</fileName>
            <lastModified>22/03/2016, 11:22:25</lastModified>
        </file>
        <file>
            <path>/home/menaka/xml/US000000001 (copy).xml</path>
            <md5>48EB6806A2B3AAA9871F4ABE683D9BB5</md5>
            <fileName>US000000001 (copy).xml</fileName>
            <lastModified>17/07/2016 19:42:43</lastModified>
        </file>
    </index>
</indexes>

This way the process can be executed regardless of where the metadata
file is located.
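
To make that concrete, here is a small Python sketch of a delete step that is
given only the path to the index. It removes the matching <index> entry from
the metadata file and reports the collection that entry pointed to. The name
remove_index and the file name metadata.xml are assumptions for this example,
not the names used in the VXQuery code.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def remove_index(metadata_file, index_path):
    """Drop the <index> entry for index_path; return its collection, or None."""
    tree = ET.parse(metadata_file)
    root = tree.getroot()
    for index in root.findall("index"):
        if index.get("location") == index_path:
            collection = index.get("collection")
            root.remove(index)
            tree.write(metadata_file)  # persist the change in place
            return collection
    return None  # nothing registered for that index path


SAMPLE = """<indexes>
    <index location="/home/menaka/index" collection="/home/menaka/xml">
        <file>
            <path>/home/menaka/xml/US000000001.xml</path>
            <md5>6790FD58A71834EA2D2CE8A24D1869CE</md5>
            <fileName>US000000001.xml</fileName>
            <lastModified>22/03/2016, 11:22:25</lastModified>
        </file>
    </index>
</indexes>
"""

# The metadata file sits in a temporary directory unrelated to the index path.
with tempfile.TemporaryDirectory() as tmp:
    metadata_file = os.path.join(tmp, "metadata.xml")
    with open(metadata_file, "w", encoding="utf-8") as out:
        out.write(SAMPLE)
    print(remove_index(metadata_file, "/home/menaka/index"))
    print(len(ET.parse(metadata_file).getroot().findall("index")))
```

Since both paths are attributes of the entry, nothing in the lookup depends on
the directory that holds the metadata file.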

Thank you very much
Menaka



On 18 July 2016 at 00:14, Preston Carman <pr...@apache.org> wrote:

> Why not store both the path to the index and the path to the collection?
> The index element could have two attributes: "collection-path" and
> "index-path".
>
> Also note: while the metadata structure can hold many indexes, this
> does not require all index metadata files to be stored in the same
> location. They still can be spread out and/or in the same location.


Re: Modifications for xml metadata file structure.

Posted by Preston Carman <pr...@apache.org>.
Why not store both the path to the index and the path to the collection?
The index element could have two attributes: "collection-path" and
"index-path".

Also note: while the metadata structure can hold many indexes, this
does not require all index metadata files to be stored in the same
location. They still can be spread out and/or in the same location.
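
A minimal Python sketch of what such an entry could look like when it is
built, using the two attribute names suggested above. The helper names
file_entry and index_entry are hypothetical, and the choices of an uppercase
MD5 hex digest and a local-time dd/MM/yyyy HH:mm:ss timestamp are assumptions
inferred from the sample values in this thread.

```python
import hashlib
import os
import tempfile
import time
import xml.etree.ElementTree as ET


def file_entry(path):
    """Build a <file> element with the four fields shown in this thread."""
    with open(path, "rb") as handle:
        digest = hashlib.md5(handle.read()).hexdigest().upper()
    # Assumption: local time, formatted like "10/07/2016 23:41:13".
    modified = time.strftime("%d/%m/%Y %H:%M:%S",
                             time.localtime(os.path.getmtime(path)))
    entry = ET.Element("file")
    ET.SubElement(entry, "path").text = path
    ET.SubElement(entry, "md5").text = digest
    ET.SubElement(entry, "fileName").text = os.path.basename(path)
    ET.SubElement(entry, "lastModified").text = modified
    return entry


def index_entry(index_path, collection_path):
    """Build an <index> element that carries both paths as attributes."""
    index = ET.Element("index", {"index-path": index_path,
                                 "collection-path": collection_path})
    for name in sorted(os.listdir(collection_path)):
        if name.endswith(".xml"):
            index.append(file_entry(os.path.join(collection_path, name)))
    return index


# Usage with a throwaway collection directory holding one XML document.
with tempfile.TemporaryDirectory() as collection:
    with open(os.path.join(collection, "catalog.xml"), "w", encoding="utf-8") as out:
        out.write("<catalog/>")
    root = ET.Element("indexes")
    root.append(index_entry("/home/menaka/index", collection))
    print(ET.tostring(root, encoding="unicode"))
```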
