You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@phoenix.apache.org by Tao Xiao <xi...@gmail.com> on 2015/05/17 16:25:27 UTC

Can we create an secondary index for two or more columns?

I tried to build secondary index for two columns, *M.C0* and *M.C1*:

      create index my_idx on EXAMPLE (M.C0, M.C1)


Then I tried to query by two indexes, respectively:

    a). select M.C0 from EXAMPLE where M.C0 = 'c0_00000001'
    b). select M.C1 from EXAMPLE where M.C1 = 'c1_00000001'

Query a) returned results in less than 0.3 second, and query b) returned
results in about 13 seconds. It seems that index was built for column *M.C0*,
not *M.C1*.

Can we build secondary index for two or more columns ?

Thanks.

Re: Can we create an secondary index for two or more columns?

Posted by Tao Xiao <xi...@gmail.com>.

Thank you Jesse, that really makes sense.

2015-05-18 11:19 GMT+08:00 Jesse Yates <je...@gmail.com>:

> create index my_idx on EXAMPLE (M.C0, M.C1)
>
>
> This will create an index on both columns _at the same time_. This means
> the row key in the index will be a combination of both columns. Creating an
> index like this is only an advantage if you are commonly querying _both
> columns_ at the same time. For instance, a query like:
>
> Select * from EXAMPLE WHERE m.c0 = "a" AND m.c1 = "b" will leverage the
> index on both columns. However, if you are just querying each column
> separately, then using your solution (b) will be better.
>
> Does that make sense?
>
> -------------------
> Jesse Yates
> @jesse_yates
> jyates.github.com
>
> On Sun, May 17, 2015 at 6:17 PM, Tao Xiao <xi...@gmail.com>
> wrote:
>
>> Thanks to Yuhao,
>>
>> Is the column *M.C1* ignored when I invoke " create index idx on EXAMPLE
>> (M.C0, M.C1) " ?
>>
>> Which of the following two solutions make sense?
>>
>> solution a):
>>      create index idx_1 on EXAMPLE (M.C0, M.C1)
>>      create index idx_2 on EXAMPLE (M.C1, M.C0)
>>
>>
>> solution b):
>>     create index idx_1 on EXAMPLE (M.C0)
>>     create index idx_2 on EXAMPLE (M.C1)
>>
>>
>>
>>
>> 2015-05-17 22:34 GMT+08:00 Yuhao Bi <by...@gmail.com>:
>>
>>> Hi Xiao Tao,
>>>
>>> You can create another secondary index on the same columns.
>>>     create index another_index on EXAMPLE(M.C1, M.C0)
>>>
>>> After doing that, query b) should return faster.
>>>
>>> 2015-05-17 22:25 GMT+08:00 Tao Xiao <xi...@gmail.com>:
>>>
>>>> I tried to build secondary index for two columns, *M.C0* and *M.C1*:
>>>>
>>>>       create index my_idx on EXAMPLE (M.C0, M.C1)
>>>>
>>>>
>>>> Then I tried to query by two indexes, respectively:
>>>>
>>>>     a). select M.C0 from EXAMPLE where M.C0 = 'c0_00000001'
>>>>     b). select M.C1 from EXAMPLE where M.C1 = 'c1_00000001'
>>>>
>>>> Query a) returned results in less than 0.3 second, and query b)
>>>> returned results in about 13 seconds. It seems that index was built for
>>>> column *M.C0*, not *M.C1*.
>>>>
>>>> Can we build secondary index for two or more columns ?
>>>>
>>>> Thanks.
>>>>
>>>
>>>
>>
>

Re: Can we create an secondary index for two or more columns?

Posted by Jesse Yates <je...@gmail.com>.

>
> create index my_idx on EXAMPLE (M.C0, M.C1)


This will create an index on both columns _at the same time_. This means
the row key in the index will be a combination of both columns. Creating an
index like this is only an advantage if you are commonly querying _both
columns_ at the same time. For instance, a query like:

Select * from EXAMPLE WHERE m.c0 = "a" AND m.c1 = "b" will leverage the
index on both columns. However, if you are just querying each column
separately, then using your solution (b) will be better.

Does that make sense?

-------------------
Jesse Yates
@jesse_yates
jyates.github.com

On Sun, May 17, 2015 at 6:17 PM, Tao Xiao <xi...@gmail.com> wrote:

> Thanks to Yuhao,
>
> Is the column *M.C1* ignored when I invoke " create index idx on EXAMPLE
> (M.C0, M.C1) " ?
>
> Which of the following two solutions make sense?
>
> solution a):
>      create index idx_1 on EXAMPLE (M.C0, M.C1)
>      create index idx_2 on EXAMPLE (M.C1, M.C0)
>
>
> solution b):
>     create index idx_1 on EXAMPLE (M.C0)
>     create index idx_2 on EXAMPLE (M.C1)
>
>
>
>
> 2015-05-17 22:34 GMT+08:00 Yuhao Bi <by...@gmail.com>:
>
>> Hi Xiao Tao,
>>
>> You can create another secondary index on the same columns.
>>     create index another_index on EXAMPLE(M.C1, M.C0)
>>
>> After doing that, query b) should return faster.
>>
>> 2015-05-17 22:25 GMT+08:00 Tao Xiao <xi...@gmail.com>:
>>
>>> I tried to build secondary index for two columns, *M.C0* and *M.C1*:
>>>
>>>       create index my_idx on EXAMPLE (M.C0, M.C1)
>>>
>>>
>>> Then I tried to query by two indexes, respectively:
>>>
>>>     a). select M.C0 from EXAMPLE where M.C0 = 'c0_00000001'
>>>     b). select M.C1 from EXAMPLE where M.C1 = 'c1_00000001'
>>>
>>> Query a) returned results in less than 0.3 second, and query b) returned
>>> results in about 13 seconds. It seems that index was built for column
>>> *M.C0*, not *M.C1*.
>>>
>>> Can we build secondary index for two or more columns ?
>>>
>>> Thanks.
>>>
>>
>>
>

Re: Can we create an secondary index for two or more columns?

Posted by Tao Xiao <xi...@gmail.com>.

Thanks to Yuhao,

Is the column *M.C1* ignored when I invoke " create index idx on EXAMPLE
(M.C0, M.C1) " ?

Which of the following two solutions make sense?

solution a):
     create index idx_1 on EXAMPLE (M.C0, M.C1)
     create index idx_2 on EXAMPLE (M.C1, M.C0)


solution b):
    create index idx_1 on EXAMPLE (M.C0)
    create index idx_2 on EXAMPLE (M.C1)




2015-05-17 22:34 GMT+08:00 Yuhao Bi <by...@gmail.com>:

> Hi Xiao Tao,
>
> You can create another secondary index on the same columns.
>     create index another_index on EXAMPLE(M.C1, M.C0)
>
> After doing that, query b) should return faster.
>
> 2015-05-17 22:25 GMT+08:00 Tao Xiao <xi...@gmail.com>:
>
>> I tried to build secondary index for two columns, *M.C0* and *M.C1*:
>>
>>       create index my_idx on EXAMPLE (M.C0, M.C1)
>>
>>
>> Then I tried to query by two indexes, respectively:
>>
>>     a). select M.C0 from EXAMPLE where M.C0 = 'c0_00000001'
>>     b). select M.C1 from EXAMPLE where M.C1 = 'c1_00000001'
>>
>> Query a) returned results in less than 0.3 second, and query b) returned
>> results in about 13 seconds. It seems that index was built for column
>> *M.C0*, not *M.C1*.
>>
>> Can we build secondary index for two or more columns ?
>>
>> Thanks.
>>
>
>

Re: Can we create an secondary index for two or more columns?

Posted by Yuhao Bi <by...@gmail.com>.

Hi Xiao Tao,

You can create another secondary index on the same columns.
    create index another_index on EXAMPLE(M.C1, M.C0)

After doing that, query b) should return faster.

2015-05-17 22:25 GMT+08:00 Tao Xiao <xi...@gmail.com>:

> I tried to build secondary index for two columns, *M.C0* and *M.C1*:
>
>       create index my_idx on EXAMPLE (M.C0, M.C1)
>
>
> Then I tried to query by two indexes, respectively:
>
>     a). select M.C0 from EXAMPLE where M.C0 = 'c0_00000001'
>     b). select M.C1 from EXAMPLE where M.C1 = 'c1_00000001'
>
> Query a) returned results in less than 0.3 second, and query b) returned
> results in about 13 seconds. It seems that index was built for column
> *M.C0*, not *M.C1*.
>
> Can we build secondary index for two or more columns ?
>
> Thanks.
>