Posted to issues@carbondata.apache.org by kumarvishal09 <gi...@git.apache.org> on 2018/07/04 10:56:09 UTC
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
GitHub user kumarvishal09 opened a pull request:
https://github.com/apache/carbondata/pull/2447
[WIP]Local dictionary query
Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/kumarvishal09/incubator-carbondata querylocaltemp1
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2447.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2447
----
commit c2949152ea65aa82d9a990e55fd9c793d44b0f77
Author: kumarvishal09 <ku...@...>
Date: 2018-07-02T13:22:02Z
Local dictionary query
----
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6857/
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200376474
--- Diff: core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RangeValueFilterExecuterImpl.java ---
@@ -89,7 +90,10 @@ public RangeValueFilterExecuterImpl(DimColumnResolvedFilterInfo dimColEvaluatorI
isRangeFullyCoverBlock = false;
initDimensionChunkIndexes();
ifDefaultValueMatchesFilter();
-
+ if (isDimensionPresentInCurrentBlock == true) {
--- End diff --
Just use `if (isDimensionPresentInCurrentBlock)`.
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2447
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5647/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5616/
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200359974
--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -936,6 +936,10 @@
*/
public static final String LOCAL_DICTIONARY_THRESHOLD_DEFAULT = "10000";
+ public static final int LOCAL_DICTIONARY_MAX = 100000;
--- End diff --
Add comments
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5647/
---
[GitHub] carbondata issue #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]L...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5723/
---
[GitHub] carbondata issue #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]L...
Posted by kumarvishal09 <gi...@git.apache.org>.
Github user kumarvishal09 commented on the issue:
https://github.com/apache/carbondata/pull/2447
retest sdv please
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6838/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5606/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2447
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5609/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6784/
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200373102
--- Diff: core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java ---
@@ -1839,4 +1844,118 @@ public static void removeInExpressionNodeWithPositionIdColumn(Expression express
}
}
}
+
+ public static byte[][] getEncodedFilterValues(CarbonDictionary dictionary,
+ byte[][] actualFilterValues) {
+ if (null == dictionary) {
+ return actualFilterValues;
+ }
+ KeyGenerator keyGenerator = KeyGeneratorFactory.getKeyGenerator(new int[] { 100000 });
+ List<byte[]> encodedFilters = new ArrayList<>();
+ for (byte[] actualFilter : actualFilterValues) {
+ for (int i = 1; i < dictionary.getDictionaryValues().length; i++) {
+ if (ByteUtil.UnsafeComparer.INSTANCE
+ .compareTo(actualFilter, dictionary.getDictionaryValues()[i])
+ == 0) {
+ try {
+ encodedFilters.add(keyGenerator.generateKey(new int[] { i }));
+ } catch (KeyGenException e) {
+ //do nothing
+ }
+ break;
+ }
+ }
+ }
+ return getSortedEncodedFilters(encodedFilters);
+ }
+
+ private static byte[][] getSortedEncodedFilters(List<byte[]> encodedFilters) {
+ java.util.Comparator<byte[]> filterNoDictValueComaparator = new java.util.Comparator<byte[]>() {
+ @Override public int compare(byte[] filterMember1, byte[] filterMember2) {
+ return ByteUtil.UnsafeComparer.INSTANCE.compareTo(filterMember1, filterMember2);
+ }
+ };
+ Collections.sort(encodedFilters, filterNoDictValueComaparator);
+ return encodedFilters.toArray(new byte[encodedFilters.size()][]);
+ }
+
+ private static BitSet getIncludeDictionaryValues(Expression expression,
+ CarbonDictionary dictionary) throws FilterUnsupportedException {
+ ConditionalExpression conExp = (ConditionalExpression) expression;
+ ColumnExpression columnExpression = conExp.getColumnList().get(0);
+ BitSet includeFilterBitSet = new BitSet();
+ for (int i = 2; i < dictionary.getDictionaryValues().length; i++) {
+ try {
+ RowIntf row = new RowImpl();
+ String stringValue = new String(dictionary.getDictionaryValues()[i],
+ Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET));
+ row.setValues(new Object[] { DataTypeUtil.getDataBasedOnDataType(stringValue,
+ columnExpression.getCarbonColumn().getDataType()) });
+ Boolean rslt = expression.evaluate(row).getBoolean();
+ if (null != rslt) {
+ if (rslt) {
+ includeFilterBitSet.set(i);
+ }
+ }
+ } catch (FilterIllegalMemberException e) {
+ LOGGER.debug(e.getMessage());
+ }
+ }
+ return includeFilterBitSet;
+ }
+
+ public static byte[][] getEncodedFilterValues(BitSet includeDictValues, int dictSize,
+ boolean useExclude) {
+ KeyGenerator keyGenerator = KeyGeneratorFactory
+ .getKeyGenerator(new int[] { CarbonCommonConstants.LOCAL_DICTIONARY_MAX });
+ List<byte[]> encodedFilterValues = new ArrayList<>();
+ int[] dummy = new int[1];
+ if (!useExclude) {
+ try {
+ for (int i = includeDictValues.nextSetBit(0);
+ i >= 0; i = includeDictValues.nextSetBit(i + 1)) {
+ dummy[0] = i;
+ encodedFilterValues.add(keyGenerator.generateKey(dummy));
+ }
+ } catch (KeyGenException e) {
+ // do nothing
+ }
+ return encodedFilterValues.toArray(new byte[encodedFilterValues.size()][]);
+ } else {
+ try {
+ for (int i = 1; i < dictSize; i++) {
+ if (!includeDictValues.get(i)) {
+ dummy[0] = i;
+ encodedFilterValues.add(keyGenerator.generateKey(dummy));
+ }
+ }
+ } catch (KeyGenException e) {
+ // do nothing
--- End diff --
Add log
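A minimal sketch of what logging in these catch blocks could look like, instead of silently swallowing the exception. It uses `java.util.logging` only so the example is self-contained; CarbonData has its own logging service. `KeyEncodingException`, `SurrogateEncoder`, and `encodeAll` are hypothetical stand-ins for `KeyGenException`, `KeyGenerator`, and the loops in `getEncodedFilterValues`, not the project's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for KeyGenException (assumption, not the CarbonData class).
class KeyEncodingException extends Exception {
  KeyEncodingException(String message) {
    super(message);
  }
}

// Hypothetical stand-in for KeyGenerator: turns a surrogate key into bytes.
interface SurrogateEncoder {
  byte[] encode(int surrogate) throws KeyEncodingException;
}

class LoggedEncoding {
  private static final Logger LOGGER = Logger.getLogger(LoggedEncoding.class.getName());

  // Encodes each surrogate; a failure is logged with its cause and that value is skipped.
  static List<byte[]> encodeAll(int[] surrogates, SurrogateEncoder encoder) {
    List<byte[]> encoded = new ArrayList<>();
    for (int surrogate : surrogates) {
      try {
        encoded.add(encoder.encode(surrogate));
      } catch (KeyEncodingException e) {
        LOGGER.log(Level.WARNING, "Could not encode surrogate key " + surrogate, e);
      }
    }
    return encoded;
  }
}
```

Unlike the diff, the `try` here sits inside the loop, so one failing key does not drop the remaining ones. Whether that is the right behavior for filter evaluation is a judgment call for the author.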
---
[GitHub] carbondata issue #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]L...
Posted by kunal642 <gi...@git.apache.org>.
Github user kunal642 commented on the issue:
https://github.com/apache/carbondata/pull/2447
LGTM
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2447
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5629/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5629/
---
[GitHub] carbondata issue #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]L...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6880/
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by kumarvishal09 <gi...@git.apache.org>.
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200427006
--- Diff: core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RangeValueFilterExecuterImpl.java ---
@@ -89,7 +90,10 @@ public RangeValueFilterExecuterImpl(DimColumnResolvedFilterInfo dimColEvaluatorI
isRangeFullyCoverBlock = false;
initDimensionChunkIndexes();
ifDefaultValueMatchesFilter();
-
+ if (isDimensionPresentInCurrentBlock == true) {
--- End diff --
OK, I will change it. I haven't changed this code myself; it may show up in the diff because of formatting :)
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6758/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5620/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by kumarvishal09 <gi...@git.apache.org>.
Github user kumarvishal09 commented on the issue:
https://github.com/apache/carbondata/pull/2447
retest this please
---
[GitHub] carbondata issue #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]L...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2447
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5729/
---
[GitHub] carbondata issue #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]L...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2447
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5723/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by kumarvishal09 <gi...@git.apache.org>.
Github user kumarvishal09 commented on the issue:
https://github.com/apache/carbondata/pull/2447
retest this please
---
[GitHub] carbondata issue #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]L...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2447
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5661/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6853/
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200380100
--- Diff: core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java ---
@@ -1839,4 +1844,118 @@ public static void removeInExpressionNodeWithPositionIdColumn(Expression express
}
}
}
+
+ public static byte[][] getEncodedFilterValues(CarbonDictionary dictionary,
+ byte[][] actualFilterValues) {
+ if (null == dictionary) {
+ return actualFilterValues;
+ }
+ KeyGenerator keyGenerator = KeyGeneratorFactory.getKeyGenerator(new int[] { 100000 });
+ List<byte[]> encodedFilters = new ArrayList<>();
+ for (byte[] actualFilter : actualFilterValues) {
+ for (int i = 1; i < dictionary.getDictionaryValues().length; i++) {
+ if (ByteUtil.UnsafeComparer.INSTANCE
+ .compareTo(actualFilter, dictionary.getDictionaryValues()[i])
+ == 0) {
+ try {
+ encodedFilters.add(keyGenerator.generateKey(new int[] { i }));
+ } catch (KeyGenException e) {
+ //do nothing
+ }
+ break;
+ }
+ }
+ }
+ return getSortedEncodedFilters(encodedFilters);
+ }
+
+ private static byte[][] getSortedEncodedFilters(List<byte[]> encodedFilters) {
+ java.util.Comparator<byte[]> filterNoDictValueComaparator = new java.util.Comparator<byte[]>() {
+ @Override public int compare(byte[] filterMember1, byte[] filterMember2) {
+ return ByteUtil.UnsafeComparer.INSTANCE.compareTo(filterMember1, filterMember2);
+ }
+ };
+ Collections.sort(encodedFilters, filterNoDictValueComaparator);
+ return encodedFilters.toArray(new byte[encodedFilters.size()][]);
+ }
+
+ private static BitSet getIncludeDictionaryValues(Expression expression,
+ CarbonDictionary dictionary) throws FilterUnsupportedException {
+ ConditionalExpression conExp = (ConditionalExpression) expression;
+ ColumnExpression columnExpression = conExp.getColumnList().get(0);
+ BitSet includeFilterBitSet = new BitSet();
+ for (int i = 2; i < dictionary.getDictionaryValues().length; i++) {
+ try {
+ RowIntf row = new RowImpl();
+ String stringValue = new String(dictionary.getDictionaryValues()[i],
+ Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET));
+ row.setValues(new Object[] { DataTypeUtil.getDataBasedOnDataType(stringValue,
+ columnExpression.getCarbonColumn().getDataType()) });
+ Boolean rslt = expression.evaluate(row).getBoolean();
+ if (null != rslt) {
+ if (rslt) {
+ includeFilterBitSet.set(i);
+ }
+ }
+ } catch (FilterIllegalMemberException e) {
+ LOGGER.debug(e.getMessage());
+ }
+ }
+ return includeFilterBitSet;
+ }
+
+ public static byte[][] getEncodedFilterValues(BitSet includeDictValues, int dictSize,
+ boolean useExclude) {
+ KeyGenerator keyGenerator = KeyGeneratorFactory
+ .getKeyGenerator(new int[] { CarbonCommonConstants.LOCAL_DICTIONARY_MAX });
+ List<byte[]> encodedFilterValues = new ArrayList<>();
+ int[] dummy = new int[1];
+ if (!useExclude) {
+ try {
+ for (int i = includeDictValues.nextSetBit(0);
+ i >= 0; i = includeDictValues.nextSetBit(i + 1)) {
+ dummy[0] = i;
+ encodedFilterValues.add(keyGenerator.generateKey(dummy));
+ }
+ } catch (KeyGenException e) {
+ // do nothing
+ }
+ return encodedFilterValues.toArray(new byte[encodedFilterValues.size()][]);
+ } else {
+ try {
+ for (int i = 1; i < dictSize; i++) {
+ if (!includeDictValues.get(i)) {
+ dummy[0] = i;
+ encodedFilterValues.add(keyGenerator.generateKey(dummy));
+ }
+ }
+ } catch (KeyGenException e) {
+ // do nothing
+ }
+ }
+ return getSortedEncodedFilters(encodedFilterValues);
+ }
+
+ public static FilterExecuter getFilterExecutorForLocalDictionary(
+ DimensionRawColumnChunk rawColumnChunk, Expression exp, boolean isNaturalSorted) {
--- End diff --
Pass `CarbonDictionary` directly
---
[GitHub] carbondata issue #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]L...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2447
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5713/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6773/
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200368462
--- Diff: core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java ---
@@ -1839,4 +1844,118 @@ public static void removeInExpressionNodeWithPositionIdColumn(Expression express
}
}
}
+
+ public static byte[][] getEncodedFilterValues(CarbonDictionary dictionary,
+ byte[][] actualFilterValues) {
+ if (null == dictionary) {
+ return actualFilterValues;
+ }
+ KeyGenerator keyGenerator = KeyGeneratorFactory.getKeyGenerator(new int[] { 100000 });
+ List<byte[]> encodedFilters = new ArrayList<>();
+ for (byte[] actualFilter : actualFilterValues) {
+ for (int i = 1; i < dictionary.getDictionaryValues().length; i++) {
+ if (ByteUtil.UnsafeComparer.INSTANCE
+ .compareTo(actualFilter, dictionary.getDictionaryValues()[i])
+ == 0) {
--- End diff --
Format it properly
---
[GitHub] carbondata issue #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]L...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6940/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5638/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:
https://github.com/apache/carbondata/pull/2447
retest sdv please
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6815/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:
https://github.com/apache/carbondata/pull/2447
retest this please
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2447
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5621/
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200368298
--- Diff: core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java ---
@@ -1839,4 +1844,118 @@ public static void removeInExpressionNodeWithPositionIdColumn(Expression express
}
}
}
+
+ public static byte[][] getEncodedFilterValues(CarbonDictionary dictionary,
+ byte[][] actualFilterValues) {
+ if (null == dictionary) {
+ return actualFilterValues;
+ }
+ KeyGenerator keyGenerator = KeyGeneratorFactory.getKeyGenerator(new int[] { 100000 });
+ List<byte[]> encodedFilters = new ArrayList<>();
+ for (byte[] actualFilter : actualFilterValues) {
+ for (int i = 1; i < dictionary.getDictionaryValues().length; i++) {
+ if (ByteUtil.UnsafeComparer.INSTANCE
+ .compareTo(actualFilter, dictionary.getDictionaryValues()[i])
+ == 0) {
+ try {
+ encodedFilters.add(keyGenerator.generateKey(new int[] { i }));
--- End diff --
Use constant size of int[] and use
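The comment above is cut off in the archive. One plausible reading, consistent with the `int[] dummy = new int[1]` buffer used in the second `getEncodedFilterValues` overload of this diff, is to allocate the single-element array once and reuse it instead of creating `new int[] { i }` for every match. A minimal sketch under that assumption follows; `encodeKey` is a hypothetical stand-in for `KeyGenerator.generateKey`, and reuse is only safe if the encoder does not keep a reference to the array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReusedKeyBuffer {
  // Hypothetical stand-in for KeyGenerator.generateKey: packs keys[0] into 4 big-endian bytes.
  static byte[] encodeKey(int[] keys) {
    int k = keys[0];
    return new byte[] {(byte) (k >>> 24), (byte) (k >>> 16), (byte) (k >>> 8), (byte) k};
  }

  // Returns the encoded dictionary index of every filter value found in the dictionary.
  static List<byte[]> encodeMatches(byte[][] dictionary, byte[][] filterValues) {
    List<byte[]> encoded = new ArrayList<>();
    int[] key = new int[1]; // allocated once, overwritten for each match
    for (byte[] filter : filterValues) {
      // Index 0 is skipped, mirroring the loop in the diff.
      for (int i = 1; i < dictionary.length; i++) {
        if (Arrays.equals(filter, dictionary[i])) {
          key[0] = i;
          encoded.add(encodeKey(key));
          break;
        }
      }
    }
    return encoded;
  }
}
```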
---
[GitHub] carbondata issue #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]L...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5656/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2447
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5654/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5580/
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200366803
--- Diff: core/src/main/java/org/apache/carbondata/core/scan/result/vector/CarbonDictionary.java ---
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.core.scan.result.vector;
+
+public interface CarbonDictionary {
+
+ byte[][] getDictionaryValues();
+
+ int getDictionarySize();
+
+ boolean isDictionaryUsed();
--- End diff --
I think setting it here is not relevant; try to set it in the scanned result class.
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200382824
--- Diff: core/src/main/java/org/apache/carbondata/core/scan/result/vector/impl/CarbonDictionaryImpl.java ---
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.core.scan.result.vector.impl;
+
+import org.apache.carbondata.core.scan.result.vector.CarbonDictionary;
+
+public class CarbonDictionaryImpl implements CarbonDictionary {
+
+ private byte[][] dictionary;
+
+ private int actualSize;
+
+ private boolean isDictUsed;
+
+ public CarbonDictionaryImpl(byte[][] dictionary, int actualSize) {
+ this.dictionary = dictionary;
+ this.actualSize = actualSize;
+ }
+
+ @Override public byte[][] getDictionaryValues() {
--- End diff --
Add another method that takes an index and returns the dictionary value.
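A minimal sketch of the accessor this comment asks for, assuming it is named `getDictionaryValue(int index)` (the name is a guess, not taken from the diff). `SimpleDictionary` and `InMemoryDictionary` are hypothetical, simplified versions of `CarbonDictionary` and `CarbonDictionaryImpl`.

```java
// Hypothetical, simplified version of the CarbonDictionary interface.
interface SimpleDictionary {
  byte[][] getDictionaryValues();

  int getDictionarySize();

  // Assumed name for the new accessor: returns the value stored at the given index.
  byte[] getDictionaryValue(int index);
}

class InMemoryDictionary implements SimpleDictionary {
  private final byte[][] dictionary;
  private final int actualSize;

  InMemoryDictionary(byte[][] dictionary, int actualSize) {
    this.dictionary = dictionary;
    this.actualSize = actualSize;
  }

  @Override public byte[][] getDictionaryValues() {
    return dictionary;
  }

  @Override public int getDictionarySize() {
    return actualSize;
  }

  @Override public byte[] getDictionaryValue(int index) {
    return dictionary[index];
  }
}
```

Callers such as the loops in `FilterUtil` could then read `dictionary.getDictionaryValue(i)` instead of fetching the whole array on every iteration.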
---
[GitHub] carbondata issue #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]L...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6894/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2447
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5627/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6819/
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5595/
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200383740
--- Diff: integration/spark2/src/main/java/org/apache/carbondata/spark/vectorreader/CarbonDictionaryWrapper.java ---
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.spark.vectorreader;
+
+import org.apache.carbondata.core.scan.result.vector.CarbonDictionary;
+
+import org.apache.parquet.column.Dictionary;
+import org.apache.parquet.column.Encoding;
+import org.apache.parquet.io.api.Binary;
+
+public class CarbonDictionaryWrapper extends Dictionary {
+
+ private Binary[] binaries;
+
+ public CarbonDictionaryWrapper(Encoding encoding, CarbonDictionary dictionary) {
+ super(encoding);
+ byte[][] rleData = dictionary.getDictionaryValues();
+ if (rleData != null) {
+ binaries = new Binary[rleData.length];
+ binaries[0] = Binary.fromReusedByteArray(new byte[0]);
+ binaries[1] = Binary.fromReusedByteArray(new byte[0]);
--- End diff --
Please remove the above two lines; there is no need to add dummy binaries.
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200372912
--- Diff: core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java ---
@@ -1839,4 +1844,118 @@ public static void removeInExpressionNodeWithPositionIdColumn(Expression express
}
}
}
+
+ public static byte[][] getEncodedFilterValues(CarbonDictionary dictionary,
+ byte[][] actualFilterValues) {
+ if (null == dictionary) {
+ return actualFilterValues;
+ }
+ KeyGenerator keyGenerator = KeyGeneratorFactory.getKeyGenerator(new int[] { 100000 });
+ List<byte[]> encodedFilters = new ArrayList<>();
+ for (byte[] actualFilter : actualFilterValues) {
+ for (int i = 1; i < dictionary.getDictionaryValues().length; i++) {
+ if (ByteUtil.UnsafeComparer.INSTANCE
+ .compareTo(actualFilter, dictionary.getDictionaryValues()[i])
+ == 0) {
+ try {
+ encodedFilters.add(keyGenerator.generateKey(new int[] { i }));
+ } catch (KeyGenException e) {
+ //do nothing
+ }
+ break;
+ }
+ }
+ }
+ return getSortedEncodedFilters(encodedFilters);
+ }
+
+ private static byte[][] getSortedEncodedFilters(List<byte[]> encodedFilters) {
+ java.util.Comparator<byte[]> filterNoDictValueComaparator = new java.util.Comparator<byte[]>() {
+ @Override public int compare(byte[] filterMember1, byte[] filterMember2) {
+ return ByteUtil.UnsafeComparer.INSTANCE.compareTo(filterMember1, filterMember2);
+ }
+ };
+ Collections.sort(encodedFilters, filterNoDictValueComaparator);
+ return encodedFilters.toArray(new byte[encodedFilters.size()][]);
+ }
+
+ private static BitSet getIncludeDictionaryValues(Expression expression,
+ CarbonDictionary dictionary) throws FilterUnsupportedException {
+ ConditionalExpression conExp = (ConditionalExpression) expression;
+ ColumnExpression columnExpression = conExp.getColumnList().get(0);
+ BitSet includeFilterBitSet = new BitSet();
+ for (int i = 2; i < dictionary.getDictionaryValues().length; i++) {
+ try {
+ RowIntf row = new RowImpl();
+ String stringValue = new String(dictionary.getDictionaryValues()[i],
+ Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET));
+ row.setValues(new Object[] { DataTypeUtil.getDataBasedOnDataType(stringValue,
+ columnExpression.getCarbonColumn().getDataType()) });
+ Boolean rslt = expression.evaluate(row).getBoolean();
+ if (null != rslt) {
+ if (rslt) {
+ includeFilterBitSet.set(i);
+ }
+ }
+ } catch (FilterIllegalMemberException e) {
+ LOGGER.debug(e.getMessage());
+ }
+ }
+ return includeFilterBitSet;
+ }
+
+ public static byte[][] getEncodedFilterValues(BitSet includeDictValues, int dictSize,
--- End diff --
Rename this method appropriately, and make it private.
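
For readers following the thread, here is a minimal sketch of what the BitSet overload under review appears to do; its body is quoted in full in a later message. The sketch works on plain integer dictionary indexes rather than generated keys. The class name, the method name, and the firstValid parameter are all hypothetical and are not CarbonData code; firstValid stands in for the hard-coded start index of the exclude loop in the quoted diff.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical illustration only; not part of CarbonData. */
final class SurrogateSelection {
  /**
   * Returns the dictionary indexes a filter should match.
   * With useExclude == false, these are the set bits of include.
   * With useExclude == true, these are the unset bits in [firstValid, dictSize).
   */
  static List<Integer> select(BitSet include, int dictSize, boolean useExclude,
      int firstValid) {
    List<Integer> selected = new ArrayList<>();
    if (!useExclude) {
      // Walk only the set bits, in ascending order.
      for (int i = include.nextSetBit(0); i >= 0; i = include.nextSetBit(i + 1)) {
        selected.add(i);
      }
    } else {
      // Walk every valid index and keep the ones the filter did not mark.
      for (int i = firstValid; i < dictSize; i++) {
        if (!include.get(i)) {
          selected.add(i);
        }
      }
    }
    return selected;
  }
}
```

Both branches produce indexes in ascending order. Whether the encoded byte keys keep that order depends on the key generator, which this sketch does not model.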
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by kumarvishal09 <gi...@git.apache.org>.
Github user kumarvishal09 commented on the issue:
https://github.com/apache/carbondata/pull/2447
retest this please
---
[GitHub] carbondata issue #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]L...
Posted by kumarvishal09 <gi...@git.apache.org>.
Github user kumarvishal09 commented on the issue:
https://github.com/apache/carbondata/pull/2447
retest sdv please
---
[GitHub] carbondata issue #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]L...
Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:
https://github.com/apache/carbondata/pull/2447
retest sdv please
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6809/
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200374220
--- Diff: core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java ---
@@ -1839,4 +1844,118 @@ public static void removeInExpressionNodeWithPositionIdColumn(Expression express
}
}
}
+
+ public static byte[][] getEncodedFilterValues(CarbonDictionary dictionary,
+ byte[][] actualFilterValues) {
+ if (null == dictionary) {
+ return actualFilterValues;
+ }
+ KeyGenerator keyGenerator = KeyGeneratorFactory.getKeyGenerator(new int[] { 100000 });
+ List<byte[]> encodedFilters = new ArrayList<>();
+ for (byte[] actualFilter : actualFilterValues) {
+ for (int i = 1; i < dictionary.getDictionaryValues().length; i++) {
+ if (ByteUtil.UnsafeComparer.INSTANCE
+ .compareTo(actualFilter, dictionary.getDictionaryValues()[i])
+ == 0) {
+ try {
+ encodedFilters.add(keyGenerator.generateKey(new int[] { i }));
+ } catch (KeyGenException e) {
+ //do nothing
+ }
+ break;
+ }
+ }
+ }
+ return getSortedEncodedFilters(encodedFilters);
+ }
+
+ private static byte[][] getSortedEncodedFilters(List<byte[]> encodedFilters) {
+ java.util.Comparator<byte[]> filterNoDictValueComaparator = new java.util.Comparator<byte[]>() {
+ @Override public int compare(byte[] filterMember1, byte[] filterMember2) {
+ return ByteUtil.UnsafeComparer.INSTANCE.compareTo(filterMember1, filterMember2);
+ }
+ };
+ Collections.sort(encodedFilters, filterNoDictValueComaparator);
+ return encodedFilters.toArray(new byte[encodedFilters.size()][]);
+ }
+
+ private static BitSet getIncludeDictionaryValues(Expression expression,
+ CarbonDictionary dictionary) throws FilterUnsupportedException {
+ ConditionalExpression conExp = (ConditionalExpression) expression;
+ ColumnExpression columnExpression = conExp.getColumnList().get(0);
+ BitSet includeFilterBitSet = new BitSet();
+ for (int i = 2; i < dictionary.getDictionaryValues().length; i++) {
+ try {
+ RowIntf row = new RowImpl();
+ String stringValue = new String(dictionary.getDictionaryValues()[i],
+ Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET));
+ row.setValues(new Object[] { DataTypeUtil.getDataBasedOnDataType(stringValue,
+ columnExpression.getCarbonColumn().getDataType()) });
+ Boolean rslt = expression.evaluate(row).getBoolean();
+ if (null != rslt) {
+ if (rslt) {
+ includeFilterBitSet.set(i);
+ }
+ }
+ } catch (FilterIllegalMemberException e) {
+ LOGGER.debug(e.getMessage());
+ }
+ }
+ return includeFilterBitSet;
+ }
+
+ public static byte[][] getEncodedFilterValues(BitSet includeDictValues, int dictSize,
+ boolean useExclude) {
+ KeyGenerator keyGenerator = KeyGeneratorFactory
+ .getKeyGenerator(new int[] { CarbonCommonConstants.LOCAL_DICTIONARY_MAX });
+ List<byte[]> encodedFilterValues = new ArrayList<>();
+ int[] dummy = new int[1];
+ if (!useExclude) {
+ try {
+ for (int i = includeDictValues.nextSetBit(0);
+ i >= 0; i = includeDictValues.nextSetBit(i + 1)) {
+ dummy[0] = i;
+ encodedFilterValues.add(keyGenerator.generateKey(dummy));
+ }
+ } catch (KeyGenException e) {
+ // do nothing
+ }
+ return encodedFilterValues.toArray(new byte[encodedFilterValues.size()][]);
+ } else {
+ try {
+ for (int i = 1; i < dictSize; i++) {
+ if (!includeDictValues.get(i)) {
+ dummy[0] = i;
+ encodedFilterValues.add(keyGenerator.generateKey(dummy));
+ }
+ }
+ } catch (KeyGenException e) {
+ // do nothing
+ }
+ }
+ return getSortedEncodedFilters(encodedFilterValues);
+ }
+
+ public static FilterExecuter getFilterExecutorForLocalDictionary(
--- End diff --
Rename this method appropriately, since it is used only for range filters.
---
[GitHub] carbondata issue #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]L...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5675/
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200363116
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/DimensionRawColumnChunk.java ---
@@ -126,4 +129,12 @@ public void setFileReader(FileReader fileReader) {
public FileReader getFileReader() {
return fileReader;
}
+
+ public CarbonDictionary getLocalDictionary() {
+ return localDictionary;
+ }
+
+ public void setLocalDictionary(CarbonDictionary localDictionary) {
--- End diff --
Remove the setter and keep only the getter. Inside the getter, check for null and uncompress the dictionary on the first read.
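
A minimal sketch of the lazy-getter pattern this comment suggests, assuming the raw chunk can hold whatever it needs to decode the dictionary later. LazyDictionaryChunk and its Supplier-based decoder are hypothetical stand-ins, not the actual DimensionRawColumnChunk API, and byte[][] stands in for CarbonDictionary.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the raw column chunk; not CarbonData code. */
final class LazyDictionaryChunk {
  // Hypothetical decoder that uncompresses the dictionary on demand.
  private final Supplier<byte[][]> decoder;
  // Stays null until the first read.
  private byte[][] localDictionary;

  LazyDictionaryChunk(Supplier<byte[][]> decoder) {
    this.decoder = decoder;
  }

  /**
   * Decodes the dictionary on the first call and returns the cached value
   * afterwards. Not thread-safe. If the decoder itself returns null, it is
   * invoked again on the next call.
   */
  byte[][] getLocalDictionary() {
    if (localDictionary == null) {
      localDictionary = decoder.get();
    }
    return localDictionary;
  }
}
```

With no setter, the getter is the only place where the field is assigned, so callers cannot observe a half-initialised chunk.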
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200364300
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/reader/dimension/v3/CompressedDimensionChunkFileBasedReaderV3.java ---
@@ -289,4 +307,30 @@ private DimensionColumnPage decodeDimensionLegacy(DimensionRawColumnChunk rawCol
}
return columnDataChunk;
}
+
+ private CarbonDictionary getDictionary(LocalDictionaryChunk localDictionaryChunk)
+ throws IOException, MemoryException {
+ if (null != localDictionaryChunk) {
+ List<Encoding> encodings = localDictionaryChunk.getDictionary_meta().getEncoders();
+ List<ByteBuffer> encoderMetas = localDictionaryChunk.getDictionary_meta().getEncoder_meta();
+ ColumnPageDecoder decoder = encodingFactory.createDecoder(encodings, encoderMetas);
+ ColumnPage decode = decoder.decode(localDictionaryChunk.getDictionary_data(), 0,
+ localDictionaryChunk.getDictionary_data().length);
+ BitSet usedDictionary = BitSet.valueOf(CompressorFactory.getInstance().getCompressor()
+ .unCompressByte(localDictionaryChunk.getDictionary_values()));
+ int length = usedDictionary.length();
+ int index = 0;
+ byte[][] dictionary = new byte[length][];
+ for (int i = 0; i < length; i++) {
+ if (usedDictionary.get(i)) {
+ dictionary[i] = decode.getBytes(index++);
+ } else {
+ dictionary[i] = new byte[0];
--- End diff --
Assign null directly instead of a 0-byte array.
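
A minimal sketch of the suggested change, assuming unused dictionary slots may simply be left unassigned. DictionarySlots.expand is a hypothetical helper, and a List<byte[]> stands in for the decoded ColumnPage.

```java
import java.util.BitSet;
import java.util.List;

/** Hypothetical illustration only; not part of CarbonData. */
final class DictionarySlots {
  /**
   * Spreads the decoded values over the slots marked as used.
   * Unused slots keep the default value null, so no empty array is allocated.
   */
  static byte[][] expand(BitSet used, List<byte[]> decoded) {
    // BitSet.length() is the index of the highest set bit plus one.
    byte[][] dictionary = new byte[used.length()][];
    int index = 0;
    for (int i = used.nextSetBit(0); i >= 0; i = used.nextSetBit(i + 1)) {
      dictionary[i] = decoded.get(index++);
    }
    return dictionary;
  }
}
```

One consequence to keep in mind: any code that later compares against dictionary entries would need to handle null slots.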
---
[GitHub] carbondata pull request #2447: [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/carbondata/pull/2447
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:
https://github.com/apache/carbondata/pull/2447
retest this please
---
[GitHub] carbondata issue #2447: [WIP]Local dictionary query
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2447
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6801/
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200367630
--- Diff: core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java ---
@@ -1839,4 +1844,118 @@ public static void removeInExpressionNodeWithPositionIdColumn(Expression express
}
}
}
+
+ public static byte[][] getEncodedFilterValues(CarbonDictionary dictionary,
+ byte[][] actualFilterValues) {
+ if (null == dictionary) {
+ return actualFilterValues;
+ }
+ KeyGenerator keyGenerator = KeyGeneratorFactory.getKeyGenerator(new int[] { 100000 });
--- End diff --
Use a constant instead of the hard-coded 100000.
---
[GitHub] carbondata pull request #2447: [WIP]Local dictionary query
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2447#discussion_r200381066
--- Diff: core/src/main/java/org/apache/carbondata/core/scan/processor/DataBlockIterator.java ---
@@ -246,7 +246,6 @@ public void processNextBatch(CarbonColumnarBatch columnarBatch) {
}
}
-
--- End diff --
Remove this change.
---