Posted to dev@vxquery.apache.org by Steven Jacobs <sj...@ucr.edu> on 2016/08/02 16:18:34 UTC

Centralized Indexing

Hi all,
Menaka has been making great progress with Summer of Code so far, and today
I will be discussing the next task with him. I wanted to share the plan
that Preston and I came up with, in case anyone else had feedback.

The plan is to move indexing to a central location (rather than providing a
new index directory for every index query). We are planning to add this
directory path as a potential parameter to local.xml.

In this directory, we will store two things:
1) A subdirectory for each index that has been created
2) A single XML file that catalogs the existing indexes, with at least two
fields per record:
a) collection path
b) index directory
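
As a rough illustration of the catalog described above, the sketch below
keeps one record per index in a single XML file and looks up the index
directory from a collection path. The file name, the element names
(indexes, index, collection, directory), and the helper functions are all
assumptions made for this example; the thread does not specify a schema,
and this is Python rather than VXQuery code.

```python
import os
import xml.etree.ElementTree as ET

CATALOG_NAME = "catalog.xml"  # hypothetical file name, not specified in the thread


def load_catalog(index_root):
    """Return the catalog's root element, or an empty one if the file is missing."""
    path = os.path.join(index_root, CATALOG_NAME)
    if os.path.exists(path):
        return ET.parse(path).getroot()
    return ET.Element("indexes")


def lookup_index(root, collection_path):
    """Return the index directory for collection_path, or None if no index exists."""
    for record in root.findall("index"):
        if record.findtext("collection") == collection_path:
            return record.findtext("directory")
    return None


def register_index(index_root, collection_path):
    """Record an index for collection_path and return its subdirectory."""
    root = load_catalog(index_root)
    existing = lookup_index(root, collection_path)
    if existing is not None:
        return existing
    # One subdirectory per index, named by position (an arbitrary choice here).
    index_dir = os.path.join(index_root, "index_%d" % len(root))
    os.makedirs(index_dir, exist_ok=True)
    record = ET.SubElement(root, "index")
    ET.SubElement(record, "collection").text = collection_path
    ET.SubElement(record, "directory").text = index_dir
    ET.ElementTree(root).write(os.path.join(index_root, CATALOG_NAME))
    return index_dir


def show_indexes(index_root):
    """List the collection paths that currently have an index."""
    root = load_catalog(index_root)
    return [record.findtext("collection") for record in root.findall("index")]
```

With a catalog like this, the user only supplies the collection path, and
something like show_indexes can answer a "show indexes" request by reading
a single file.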

This provides us with at least the following benefits:
Keeping indexes together
Enabling the "show indexes" query
Simplifying the index queries (now the user will only ever need to specify
the collection path)
Enabling future optimizations, including deciding dynamically when to use an
index

Steven

Re: Centralized Indexing

Posted by Till Westmann <ti...@apache.org>.
It would be really nice to have some before/after example queries and
config files so that everybody can get an idea of what this will look like :)

Cheers,
Till

On 2 Aug 2016, at 9:47, Preston Carman wrote:

> In your benefits section, I think you meant to say that the user will
> not have to specify the index path.
>
> The discussion also covered parallel queries. If a user
> would like a query to use multiple indexes, they can write a
> query similar to the current method of writing a query with multiple
> collection locations. The vertical bar separating local partitions can
> also be used to locate separate local indexes. While the indexing
> structure is stored in a single location, the individual indexes can
> be used for a parallel query.

Re: Centralized Indexing

Posted by Preston Carman <pr...@apache.org>.
In your benefits section, I think you meant to say that the user will
not have to specify the index path.

The discussion also covered parallel queries. If a user
would like a query to use multiple indexes, they can write a
query similar to the current method of writing a query with multiple
collection locations. The vertical bar separating local partitions can
also be used to locate separate local indexes. While the indexing
structure is stored in a single location, the individual indexes can
be used for a parallel query.
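
A minimal sketch of that idea, under stated assumptions: the collection
argument is split on the vertical bar, and each local partition is resolved
to its own index through the central catalog. The dictionary standing in
for the catalog, the paths, and the function name are hypothetical; this is
Python for illustration, not VXQuery's implementation.

```python
def resolve_partition_indexes(collection_arg, catalog):
    """Map each '|'-separated collection partition to its index directory.

    catalog is a plain dict {collection path: index directory}, standing in
    for the central XML catalog. Partitions without an index map to None, so
    a caller could fall back to scanning the collection directly.
    """
    partitions = [p.strip() for p in collection_arg.split("|") if p.strip()]
    return [(p, catalog.get(p)) for p in partitions]


# Hypothetical catalog contents and paths, for illustration only.
catalog = {
    "/data/part1": "/indexes/index_0",
    "/data/part2": "/indexes/index_1",
}

for partition, index_dir in resolve_partition_indexes("/data/part1|/data/part2", catalog):
    print(partition, "->", index_dir)
```

Each (partition, index directory) pair could then be handed to a separate
local partition, which is what would let the indexes be used in parallel
even though the catalog lives in one place.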