Posted to dev@carbondata.apache.org by Zhangshunyu <gi...@git.apache.org> on 2016/07/01 08:30:10 UTC

[GitHub] incubator-carbondata pull request #10: Make inverted index can be configurab...

GitHub user Zhangshunyu opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/10

    Make inverted index configurable

    Use DDL to specify which columns should not have an inverted index. If the user does not list a column, the default is true, meaning an inverted index is built for that column.
    For example:
    
    ```
    CREATE TABLE IF NOT EXISTS index
    (id Int, name String, city String)
    STORED BY 'org.apache.carbondata.format'
    TBLPROPERTIES('NO_INVERTED_INDEX'='name,city')
    ```
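
The default described above (a column gets an inverted index unless it is listed in `NO_INVERTED_INDEX`) can be illustrated with a minimal sketch. The class `InvertedIndexConfig` and its `resolve` method are hypothetical names used only for illustration and are not taken from the pull request; case-insensitive matching of column names is likewise an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class InvertedIndexConfig {

  // Hypothetical helper, not CarbonData's API: maps each column name to
  // whether an inverted index should be built for it.
  public static Map<String, Boolean> resolve(List<String> columns,
      Map<String, String> tblProperties) {
    String raw = tblProperties.get("NO_INVERTED_INDEX");
    Set<String> excluded = new HashSet<>();
    if (raw != null) {
      for (String name : raw.split(",")) {
        // Assumption: names are trimmed and compared case-insensitively.
        String trimmed = name.trim().toLowerCase(Locale.ROOT);
        if (!trimmed.isEmpty()) {
          excluded.add(trimmed);
        }
      }
    }
    // Columns that are not listed keep the default of true.
    Map<String, Boolean> result = new LinkedHashMap<>();
    for (String column : columns) {
      result.put(column, !excluded.contains(column.toLowerCase(Locale.ROOT)));
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> props =
        Collections.singletonMap("NO_INVERTED_INDEX", "name,city");
    System.out.println(resolve(Arrays.asList("id", "name", "city"), props));
    // prints {id=true, name=false, city=false}
  }
}
```

With the table from the example, only `id` keeps its inverted index.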

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Zhangshunyu/incubator-carbondata index71

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/10.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10
    
----
commit 3a639368bf6b012f5ac68a4e59e4b9adfac493a7
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-13T13:55:42Z

    make inverted index configurable using ddl

commit 0877656ede1d344996d65c33c56058b15fcd2d3a
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-13T14:09:20Z

    style

commit c53ebdc73d0c668926e9219657466de027920751
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-13T14:12:52Z

    style

commit 1734a0d06594e4617ca4b24695ad755894aa127b
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-14T02:45:17Z

    fix compact bug

commit 16c475bf2892fca85e6856f9a2c479790c71a559
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-14T07:37:32Z

    str

commit 612cbad4f5ce8de92c1c4625c7e9f7da92a1ee8a
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-14T07:38:13Z

    test case

commit fa4d3fb4a0ac0b449b8a796a0c05d0d22e3fa3bb
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-21T03:20:09Z

    rebase

commit 84d5197f797b499c4a87204605c885d0be5a7490
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-22T02:54:35Z

    modify test case

commit 2794d429571123d40ee62199546aad2c1c91aa98
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-22T02:56:06Z

    modify test case

commit cb67371e7319cbe49a17115fd34e3f12965f541c
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-22T03:04:45Z

    modify test case

commit fa6876cbcad24273bc5513da726b65c327fc9550
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-27T02:09:24Z

    rebase627

commit 067032066413158dabdf30b0d517ecdd755a30bc
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-27T07:55:26Z

    rebase627

commit 3b94e97c09f90ed4178f6283e1fb117a89ec0b2d
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-27T11:18:05Z

    rebase627

commit dd78a09f7daf7878937966ba40712c9f94e875be
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-29T04:01:03Z

    rebase629

commit 4f6edcc66343d0a69d3e456a5848e78f0dcd9050
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-07-01T01:18:26Z

    rebase71

commit e3435c31bb95579b6ed0a7fe9afa1d89e38582e7
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-07-01T08:26:41Z

    rebase new repo

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata pull request #10: [CARBONDATA-29] Make inverted index c...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/10#discussion_r69290989
  
    --- Diff: core/src/main/java/org/carbondata/scan/filter/executer/ExcludeFilterExecuterImpl.java ---
    @@ -166,34 +166,20 @@ private BitSet setFilterdIndexToBitSetWithColumnIndex(
       private BitSet setFilterdIndexToBitSet(FixedLengthDimensionDataChunk dimColumnDataChunk,
           int numerOfRows) {
         BitSet bitSet = new BitSet(numerOfRows);
    -    int startKey = 0;
    -    int last = 0;
    -    bitSet.flip(0, numerOfRows);
    -    int startIndex = 0;
    -    byte[][] filterValues = dimColumnExecuterInfo.getFilterKeys();
    -    for (int k = 0; k < filterValues.length; k++) {
    -      startKey = CarbonUtil
    -          .getFirstIndexUsingBinarySearch(dimColumnDataChunk, startIndex, numerOfRows - 1,
    -              filterValues[k], false);
    -      if (startKey < 0) {
    -        continue;
    -      }
    -      bitSet.flip(startKey);
    -      last = startKey;
    -      for (int j = startKey + 1; j < numerOfRows; j++) {
    -        if (ByteUtil.UnsafeComparer.INSTANCE
    -            .compareTo(dimColumnDataChunk.getCompleteDataChunk(), j * filterValues[k].length,
    -                filterValues[k].length, filterValues[k], 0, filterValues[k].length) == 0) {
    -          bitSet.flip(j);
    -          last++;
    -        } else {
    -          break;
    +    if (dimColumnDataChunk instanceof FixedLengthDimensionDataChunk) {
    +      FixedLengthDimensionDataChunk fixedChunk = (FixedLengthDimensionDataChunk) dimColumnDataChunk;
    --- End diff --
    
    No need for type casting.



[GitHub] incubator-carbondata pull request #10: [CARBONDATA-29] Make inverted index c...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/10#discussion_r69290954
  
    --- Diff: core/src/main/java/org/carbondata/scan/filter/executer/ExcludeFilterExecuterImpl.java ---
    @@ -166,34 +166,20 @@ private BitSet setFilterdIndexToBitSetWithColumnIndex(
       private BitSet setFilterdIndexToBitSet(FixedLengthDimensionDataChunk dimColumnDataChunk,
           int numerOfRows) {
         BitSet bitSet = new BitSet(numerOfRows);
    -    int startKey = 0;
    -    int last = 0;
    -    bitSet.flip(0, numerOfRows);
    -    int startIndex = 0;
    -    byte[][] filterValues = dimColumnExecuterInfo.getFilterKeys();
    -    for (int k = 0; k < filterValues.length; k++) {
    -      startKey = CarbonUtil
    -          .getFirstIndexUsingBinarySearch(dimColumnDataChunk, startIndex, numerOfRows - 1,
    -              filterValues[k], false);
    -      if (startKey < 0) {
    -        continue;
    -      }
    -      bitSet.flip(startKey);
    -      last = startKey;
    -      for (int j = startKey + 1; j < numerOfRows; j++) {
    -        if (ByteUtil.UnsafeComparer.INSTANCE
    -            .compareTo(dimColumnDataChunk.getCompleteDataChunk(), j * filterValues[k].length,
    -                filterValues[k].length, filterValues[k], 0, filterValues[k].length) == 0) {
    -          bitSet.flip(j);
    -          last++;
    -        } else {
    -          break;
    +    if (dimColumnDataChunk instanceof FixedLengthDimensionDataChunk) {
    --- End diff --
    
    No need for the `if` condition.



[GitHub] incubator-carbondata pull request #10: [CARBONDATA-29] Make inverted index c...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/10#discussion_r69291144
  
    --- Diff: core/src/main/java/org/carbondata/scan/filter/executer/ExcludeFilterExecuterImpl.java ---
    @@ -166,34 +166,20 @@ private BitSet setFilterdIndexToBitSetWithColumnIndex(
       private BitSet setFilterdIndexToBitSet(FixedLengthDimensionDataChunk dimColumnDataChunk,
           int numerOfRows) {
         BitSet bitSet = new BitSet(numerOfRows);
    -    int startKey = 0;
    -    int last = 0;
    -    bitSet.flip(0, numerOfRows);
    -    int startIndex = 0;
    -    byte[][] filterValues = dimColumnExecuterInfo.getFilterKeys();
    -    for (int k = 0; k < filterValues.length; k++) {
    -      startKey = CarbonUtil
    -          .getFirstIndexUsingBinarySearch(dimColumnDataChunk, startIndex, numerOfRows - 1,
    -              filterValues[k], false);
    -      if (startKey < 0) {
    -        continue;
    -      }
    -      bitSet.flip(startKey);
    -      last = startKey;
    -      for (int j = startKey + 1; j < numerOfRows; j++) {
    -        if (ByteUtil.UnsafeComparer.INSTANCE
    -            .compareTo(dimColumnDataChunk.getCompleteDataChunk(), j * filterValues[k].length,
    -                filterValues[k].length, filterValues[k], 0, filterValues[k].length) == 0) {
    -          bitSet.flip(j);
    -          last++;
    -        } else {
    -          break;
    +    if (dimColumnDataChunk instanceof FixedLengthDimensionDataChunk) {
    +      FixedLengthDimensionDataChunk fixedChunk = (FixedLengthDimensionDataChunk) dimColumnDataChunk;
    +      int start = 0;
    +      bitSet.flip(0, numerOfRows);
    +      byte[][] filterValues = dimColumnExecuterInfo.getFilterKeys();
    +      for (int k = 0; k < filterValues.length; k++) {
    +        for (int j = start + 1; j < numerOfRows; j++) {
    --- End diff --
    
    I think the logic `j = start + 1` is wrong. The loop is supposed to start from 0, i.e. `j = 0`.
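
One reading of this comment is that, once the binary search is removed, every row has to be compared against each filter value, so the inner loop cannot skip row 0. The following is a minimal sketch of such a full scan, assuming rows are available as plain `byte[]` values. `ExcludeScanSketch` and `excludeRows` are hypothetical names and do not use the real `FixedLengthDimensionDataChunk` API.

```java
import java.util.Arrays;
import java.util.BitSet;

public class ExcludeScanSketch {

  // Hypothetical stand-in for the exclude filter on a column without an
  // inverted index: returns the rows whose value matches none of the
  // filter values.
  public static BitSet excludeRows(byte[][] rows, byte[][] filterValues) {
    BitSet bitSet = new BitSet(rows.length);
    // Start with every row selected, as the diff does with flip(0, n).
    bitSet.flip(0, rows.length);
    for (byte[] filterValue : filterValues) {
      // Scan from row 0: without sorted data there is no start position
      // to resume from.
      for (int j = 0; j < rows.length; j++) {
        if (Arrays.equals(rows[j], filterValue)) {
          // clear() rather than flip(), so a repeated filter value
          // cannot re-select a row.
          bitSet.clear(j);
        }
      }
    }
    return bitSet;
  }

  public static void main(String[] args) {
    byte[][] rows = {{1}, {2}, {1}, {3}};
    byte[][] filters = {{1}};
    System.out.println(excludeRows(rows, filters)); // prints {1, 3}
  }
}
```

Using `clear` instead of `flip` is a choice made for this sketch, not something taken from the pull request.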



[GitHub] incubator-carbondata pull request #10: [CARBONDATA-29] Make inverted index c...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/10#discussion_r69291021
  
    --- Diff: core/src/main/java/org/carbondata/scan/filter/executer/ExcludeFilterExecuterImpl.java ---
    @@ -166,34 +166,20 @@ private BitSet setFilterdIndexToBitSetWithColumnIndex(
       private BitSet setFilterdIndexToBitSet(FixedLengthDimensionDataChunk dimColumnDataChunk,
           int numerOfRows) {
         BitSet bitSet = new BitSet(numerOfRows);
    -    int startKey = 0;
    -    int last = 0;
    -    bitSet.flip(0, numerOfRows);
    -    int startIndex = 0;
    -    byte[][] filterValues = dimColumnExecuterInfo.getFilterKeys();
    -    for (int k = 0; k < filterValues.length; k++) {
    -      startKey = CarbonUtil
    -          .getFirstIndexUsingBinarySearch(dimColumnDataChunk, startIndex, numerOfRows - 1,
    -              filterValues[k], false);
    -      if (startKey < 0) {
    -        continue;
    -      }
    -      bitSet.flip(startKey);
    -      last = startKey;
    -      for (int j = startKey + 1; j < numerOfRows; j++) {
    -        if (ByteUtil.UnsafeComparer.INSTANCE
    -            .compareTo(dimColumnDataChunk.getCompleteDataChunk(), j * filterValues[k].length,
    -                filterValues[k].length, filterValues[k], 0, filterValues[k].length) == 0) {
    -          bitSet.flip(j);
    -          last++;
    -        } else {
    -          break;
    +    if (dimColumnDataChunk instanceof FixedLengthDimensionDataChunk) {
    +      FixedLengthDimensionDataChunk fixedChunk = (FixedLengthDimensionDataChunk) dimColumnDataChunk;
    +      int start = 0;
    --- End diff --
    
    What is the use of `start`?



[GitHub] incubator-carbondata pull request #10: [CARBONDATA-29] Make inverted index c...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/10#discussion_r69289069
  
    --- Diff: core/src/main/java/org/carbondata/core/datastorage/store/columnar/BlockIndexerStorageForNoInvertedIndex.java ---
    @@ -0,0 +1,134 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +package org.carbondata.core.datastorage.store.columnar;
    +
    +import java.util.ArrayList;
    +import java.util.List;
    +
    +import org.carbondata.core.constants.CarbonCommonConstants;
    +import org.carbondata.core.util.ByteUtil;
    +
    +public class BlockIndexerStorageForNoInvertedIndex implements IndexStorage<int[]> {
    --- End diff --
    
    A lot of code is duplicated. Better to make an abstract class and move the common code into it.
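
The suggested refactoring could look roughly like the sketch below: a shared abstract base holds the common state and helpers, and the two variants differ only in how they order rows. All names here (`AbstractBlockStorage`, `SortedStorage`, `UnsortedStorage`, `rowOrder`) are hypothetical and do not mirror the real `IndexStorage` interface or the classes in the pull request.

```java
import java.util.Arrays;

public class IndexStorageSketch {

  // Hypothetical base class holding what both storages would share.
  public abstract static class AbstractBlockStorage {
    protected final byte[][] rows;

    protected AbstractBlockStorage(byte[][] rows) {
      this.rows = rows;
    }

    // Common code: total number of bytes across all rows.
    public int totalSize() {
      int size = 0;
      for (byte[] row : rows) {
        size += row.length;
      }
      return size;
    }

    // The only part that differs between the two variants.
    public abstract int[] rowOrder();
  }

  // With an inverted index: row ids ordered by (unsigned) byte value.
  public static final class SortedStorage extends AbstractBlockStorage {
    public SortedStorage(byte[][] rows) {
      super(rows);
    }

    @Override
    public int[] rowOrder() {
      Integer[] order = new Integer[rows.length];
      for (int i = 0; i < order.length; i++) {
        order[i] = i;
      }
      Arrays.sort(order, (a, b) -> compareUnsigned(rows[a], rows[b]));
      int[] result = new int[order.length];
      for (int i = 0; i < result.length; i++) {
        result[i] = order[i];
      }
      return result;
    }
  }

  // Without an inverted index: rows stay in their original order.
  public static final class UnsortedStorage extends AbstractBlockStorage {
    public UnsortedStorage(byte[][] rows) {
      super(rows);
    }

    @Override
    public int[] rowOrder() {
      int[] result = new int[rows.length];
      for (int i = 0; i < result.length; i++) {
        result[i] = i;
      }
      return result;
    }
  }

  // Lexicographic comparison treating bytes as unsigned values.
  static int compareUnsigned(byte[] left, byte[] right) {
    int n = Math.min(left.length, right.length);
    for (int i = 0; i < n; i++) {
      int diff = (left[i] & 0xFF) - (right[i] & 0xFF);
      if (diff != 0) {
        return diff;
      }
    }
    return left.length - right.length;
  }

  public static void main(String[] args) {
    byte[][] rows = {{3}, {1}, {2}};
    System.out.println(Arrays.toString(new SortedStorage(rows).rowOrder()));   // prints [1, 2, 0]
    System.out.println(Arrays.toString(new UnsortedStorage(rows).rowOrder())); // prints [0, 1, 2]
  }
}
```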

