You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by sounakr <gi...@git.apache.org> on 2018/05/07 01:28:14 UTC
[GitHub] carbondata pull request #2276: [CARBONDATA-2443][WIP][SDK]Multi level comple...
GitHub user sounakr opened a pull request:
https://github.com/apache/carbondata/pull/2276
[CARBONDATA-2443][WIP][SDK]Multi level complex type support for AVRO based SDK
Multi level complex type support for AVRO based SDK
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sounakr/incubator-carbondata multi_level_complex
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2276.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2276
----
commit 29ac4474d28ccb08c9a5cff52e973da2dbbd5c4b
Author: kumarvishal09 <ku...@...>
Date: 2018-05-03T15:29:18Z
Fixed No dictionary complex type issues
commit f357202cd897e004c41d378d764247008717e182
Author: sounakr <so...@...>
Date: 2018-05-02T15:59:57Z
[CARBONDATA-2430] Fixes Related to Complex Type Support with No Dictionary
a) Reshuffling of fields. Sorting of columns based on SORT -> DIMENSION -> COMPLEX -> MEASURE.
b) Enable Complex Type NoDictionary Creation from Create Table DDL.
c) ARRAY Support in AVRO.
d) Minor Fixes related to AVRO SDK support.
commit e8cbdc5ef0d7634fc60a156fb8049567fdc6e9ec
Author: ajantha-bhat <aj...@...>
Date: 2018-05-04T11:35:51Z
sdk complex type issue fixes and test case additions
commit 1867b600b40ef7e4ef4e8cde28f31dc32865dd07
Author: ajantha-bhat <aj...@...>
Date: 2018-05-04T16:57:26Z
Struct of array test case
commit 5bf8067f1f4c583f14e5af3ef25a794af6b1ff1c
Author: sounakr <so...@...>
Date: 2018-05-07T01:21:54Z
Multi Level Complex type Support For AVRO Based SDK
----
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on the issue:
https://github.com/apache/carbondata/pull/2276
retest this please
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][WIP][SDK]Multi level complex type ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4542/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][WIP][SDK]Multi level complex type ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4531/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2276
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4780/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][WIP][SDK]Multi level complex type ...
Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on the issue:
https://github.com/apache/carbondata/pull/2276
Retest this please
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on the issue:
https://github.com/apache/carbondata/pull/2276
Retest this please
---
[GitHub] carbondata pull request #2276: [CARBONDATA-2443][WIP][SDK]Multi level comple...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2276#discussion_r186338238
--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/steps/InputProcessorStepForPartitionImpl.java ---
@@ -220,32 +241,97 @@ private CarbonRowBatch getBatch() {
CarbonRowBatch carbonRowBatch = new CarbonRowBatch(batchSize);
int count = 0;
while (internalHasNext() && count < batchSize) {
- carbonRowBatch.addRow(new CarbonRow(convertToNoDictionaryToBytes(currentIterator.next())));
+ carbonRowBatch.addRow(
+ new CarbonRow(convertToNoDictionaryToBytes(currentIterator.next(), dataFields)));
count++;
}
rowCounter.getAndAdd(carbonRowBatch.getSize());
return carbonRowBatch;
}
- private Object[] convertToNoDictionaryToBytes(Object[] data) {
+ private Object[] convertToNoDictionaryToBytes(Object[] data, DataField[] dataFields) {
Object[] newData = new Object[data.length];
- for (int i = 0; i < noDictionaryMapping.length; i++) {
- if (noDictionaryMapping[i]) {
+ for (int i = 0; i < data.length; i++) {
+ if (i < noDictionaryMapping.length && noDictionaryMapping[i]) {
newData[i] = DataTypeUtil
.getBytesDataDataTypeForNoDictionaryColumn(data[orderOfData[i]], dataTypes[i]);
} else {
- newData[i] = data[orderOfData[i]];
+ // if this is a complex column then recursively comver the data into Byte Array.
+ if (dataTypes[i].isComplexType()) {
+ ByteArrayOutputStream byteArray = new ByteArrayOutputStream();
+ DataOutputStream dataOutputStream = new DataOutputStream(byteArray);
+ try {
+ getBytesForComplex(data[orderOfData[i]], dataFields[i], dataOutputStream);
+ dataOutputStream.close();
+ newData[i] = byteArray.toByteArray();
+ } catch (Exception e) {
+ throw new CarbonDataLoadingException( "Loading Exception", e);
+ }
+ } else {
+ newData[i] = data[orderOfData[i]];
+ }
}
}
- if (newData.length > noDictionaryMapping.length) {
- for (int i = noDictionaryMapping.length; i < newData.length; i++) {
- newData[i] = data[orderOfData[i]];
+ System.out.println(Arrays.toString(data));
+ return newData;
+ }
+
+ private void getBytesForComplex(Object datum, DataField dataField,
+ DataOutputStream dataOutputStream) throws Exception {
+ getBytesForComplex(datum, dataField.getColumn().getDataType(),
+ ((CarbonDimension) dataField.getColumn()), dataOutputStream);
+ }
+
+ private void getBytesForComplex(Object datum, DataType dataType,
+ CarbonDimension carbonDimensions, DataOutputStream dataOutputStream) throws Exception {
+ if (dataType.getName().equalsIgnoreCase("struct")) {
+ List<CarbonDimension> childStructDimensions = carbonDimensions.getListOfChildDimensions();
+ int size = childStructDimensions.size();
+ Object[] structFields = ((StructObject) datum).getData();
+ dataOutputStream.writeInt(size);
+ // loop through each one of them.
+ for (int i = 0; i < size; i++) {
+ getBytesForComplex(structFields[i], childStructDimensions.get(i).getDataType(),
+ childStructDimensions.get(i), dataOutputStream);
+ }
+ } else if (dataType.getName().equalsIgnoreCase("array")) {
--- End diff --
check `instance of` rather than string comparision
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4549/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5704/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5707/
---
[GitHub] carbondata pull request #2276: [CARBONDATA-2443][WIP][SDK]Multi level comple...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2276#discussion_r186336351
--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/DataLoadProcessBuilder.java ---
@@ -76,6 +78,19 @@ public AbstractDataLoadProcessorStep build(CarbonLoadModel loadModel, String[] s
}
}
+ private AbstractDataLoadProcessorStep buildInternalForAvroSDKLoad(CarbonIterator[] inputIterators,
--- End diff --
THis logic is same as `buildInternalForPartitionLoad` then why special method and handling required? why not use same partition flow?
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on the issue:
https://github.com/apache/carbondata/pull/2276
retest this Please
---
[GitHub] carbondata pull request #2276: [CARBONDATA-2443][WIP][SDK]Multi level comple...
Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2276#discussion_r186388851
--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/DataLoadProcessBuilder.java ---
@@ -76,6 +78,19 @@ public AbstractDataLoadProcessorStep build(CarbonLoadModel loadModel, String[] s
}
}
+ private AbstractDataLoadProcessorStep buildInternalForAvroSDKLoad(CarbonIterator[] inputIterators,
--- End diff --
Both of Partition and AVRO calling the same method now.
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][WIP][SDK]Multi level complex type ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5695/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4554/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5711/
---
[GitHub] carbondata pull request #2276: [CARBONDATA-2443][SDK]Multi level complex typ...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/carbondata/pull/2276
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2276
LGTM
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2276
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4769/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][WIP][SDK]Multi level complex type ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4535/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on the issue:
https://github.com/apache/carbondata/pull/2276
Retest this please
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5715/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][WIP][SDK]Multi level complex type ...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2276
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4753/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][WIP][SDK]Multi level complex type ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5702/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2276
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4773/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2276
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4792/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on the issue:
https://github.com/apache/carbondata/pull/2276
Retest this please
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4546/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2276
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4785/
---
[GitHub] carbondata pull request #2276: [CARBONDATA-2443][WIP][SDK]Multi level comple...
Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2276#discussion_r186388685
--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/steps/InputProcessorStepForPartitionImpl.java ---
@@ -220,32 +241,97 @@ private CarbonRowBatch getBatch() {
CarbonRowBatch carbonRowBatch = new CarbonRowBatch(batchSize);
int count = 0;
while (internalHasNext() && count < batchSize) {
- carbonRowBatch.addRow(new CarbonRow(convertToNoDictionaryToBytes(currentIterator.next())));
+ carbonRowBatch.addRow(
+ new CarbonRow(convertToNoDictionaryToBytes(currentIterator.next(), dataFields)));
count++;
}
rowCounter.getAndAdd(carbonRowBatch.getSize());
return carbonRowBatch;
}
- private Object[] convertToNoDictionaryToBytes(Object[] data) {
+ private Object[] convertToNoDictionaryToBytes(Object[] data, DataField[] dataFields) {
Object[] newData = new Object[data.length];
- for (int i = 0; i < noDictionaryMapping.length; i++) {
- if (noDictionaryMapping[i]) {
+ for (int i = 0; i < data.length; i++) {
+ if (i < noDictionaryMapping.length && noDictionaryMapping[i]) {
newData[i] = DataTypeUtil
.getBytesDataDataTypeForNoDictionaryColumn(data[orderOfData[i]], dataTypes[i]);
} else {
- newData[i] = data[orderOfData[i]];
+ // if this is a complex column then recursively comver the data into Byte Array.
+ if (dataTypes[i].isComplexType()) {
+ ByteArrayOutputStream byteArray = new ByteArrayOutputStream();
+ DataOutputStream dataOutputStream = new DataOutputStream(byteArray);
+ try {
+ getBytesForComplex(data[orderOfData[i]], dataFields[i], dataOutputStream);
+ dataOutputStream.close();
+ newData[i] = byteArray.toByteArray();
+ } catch (Exception e) {
+ throw new CarbonDataLoadingException( "Loading Exception", e);
+ }
+ } else {
+ newData[i] = data[orderOfData[i]];
+ }
}
}
- if (newData.length > noDictionaryMapping.length) {
- for (int i = noDictionaryMapping.length; i < newData.length; i++) {
- newData[i] = data[orderOfData[i]];
+ System.out.println(Arrays.toString(data));
+ return newData;
+ }
+
+ private void getBytesForComplex(Object datum, DataField dataField,
+ DataOutputStream dataOutputStream) throws Exception {
+ getBytesForComplex(datum, dataField.getColumn().getDataType(),
+ ((CarbonDimension) dataField.getColumn()), dataOutputStream);
+ }
+
+ private void getBytesForComplex(Object datum, DataType dataType,
+ CarbonDimension carbonDimensions, DataOutputStream dataOutputStream) throws Exception {
+ if (dataType.getName().equalsIgnoreCase("struct")) {
+ List<CarbonDimension> childStructDimensions = carbonDimensions.getListOfChildDimensions();
+ int size = childStructDimensions.size();
+ Object[] structFields = ((StructObject) datum).getData();
+ dataOutputStream.writeInt(size);
+ // loop through each one of them.
+ for (int i = 0; i < size; i++) {
+ getBytesForComplex(structFields[i], childStructDimensions.get(i).getDataType(),
+ childStructDimensions.get(i), dataOutputStream);
+ }
+ } else if (dataType.getName().equalsIgnoreCase("array")) {
--- End diff --
Done
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][WIP][SDK]Multi level complex type ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5691/
---
[GitHub] carbondata pull request #2276: [CARBONDATA-2443][WIP][SDK]Multi level comple...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2276#discussion_r186336455
--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModel.java ---
@@ -202,6 +202,13 @@
*/
private boolean isPartitionLoad;
+ /**
+ * It directly writes data directly to nosort processor bypassing convert steps.
+ * This data comes from SDK where the input is from AVRO and AVRO Writer already
+ * converts the AVRO data to the Primitive datatypes object.
+ */
+ private boolean isAvroSDKLoad;
--- End diff --
I don't think this is required, use partition flow
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2276
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4779/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][WIP][SDK]Multi level complex type ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5676/
---
[GitHub] carbondata pull request #2276: [CARBONDATA-2443][WIP][SDK]Multi level comple...
Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2276#discussion_r186388739
--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModel.java ---
@@ -202,6 +202,13 @@
*/
private boolean isPartitionLoad;
+ /**
+ * It directly writes data directly to nosort processor bypassing convert steps.
+ * This data comes from SDK where the input is from AVRO and AVRO Writer already
+ * converts the AVRO data to the Primitive datatypes object.
+ */
+ private boolean isAvroSDKLoad;
--- End diff --
Done.
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4566/
---
[GitHub] carbondata pull request #2276: [CARBONDATA-2443][WIP][SDK]Multi level comple...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2276#discussion_r186338913
--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/steps/InputProcessorStepForPartitionImpl.java ---
@@ -220,32 +241,97 @@ private CarbonRowBatch getBatch() {
CarbonRowBatch carbonRowBatch = new CarbonRowBatch(batchSize);
int count = 0;
while (internalHasNext() && count < batchSize) {
- carbonRowBatch.addRow(new CarbonRow(convertToNoDictionaryToBytes(currentIterator.next())));
+ carbonRowBatch.addRow(
+ new CarbonRow(convertToNoDictionaryToBytes(currentIterator.next(), dataFields)));
count++;
}
rowCounter.getAndAdd(carbonRowBatch.getSize());
return carbonRowBatch;
}
- private Object[] convertToNoDictionaryToBytes(Object[] data) {
+ private Object[] convertToNoDictionaryToBytes(Object[] data, DataField[] dataFields) {
Object[] newData = new Object[data.length];
- for (int i = 0; i < noDictionaryMapping.length; i++) {
- if (noDictionaryMapping[i]) {
+ for (int i = 0; i < data.length; i++) {
+ if (i < noDictionaryMapping.length && noDictionaryMapping[i]) {
newData[i] = DataTypeUtil
.getBytesDataDataTypeForNoDictionaryColumn(data[orderOfData[i]], dataTypes[i]);
} else {
- newData[i] = data[orderOfData[i]];
+ // if this is a complex column then recursively comver the data into Byte Array.
+ if (dataTypes[i].isComplexType()) {
+ ByteArrayOutputStream byteArray = new ByteArrayOutputStream();
+ DataOutputStream dataOutputStream = new DataOutputStream(byteArray);
+ try {
+ getBytesForComplex(data[orderOfData[i]], dataFields[i], dataOutputStream);
+ dataOutputStream.close();
+ newData[i] = byteArray.toByteArray();
+ } catch (Exception e) {
+ throw new CarbonDataLoadingException( "Loading Exception", e);
+ }
+ } else {
+ newData[i] = data[orderOfData[i]];
+ }
}
}
- if (newData.length > noDictionaryMapping.length) {
- for (int i = noDictionaryMapping.length; i < newData.length; i++) {
- newData[i] = data[orderOfData[i]];
+ System.out.println(Arrays.toString(data));
+ return newData;
+ }
+
+ private void getBytesForComplex(Object datum, DataField dataField,
+ DataOutputStream dataOutputStream) throws Exception {
+ getBytesForComplex(datum, dataField.getColumn().getDataType(),
+ ((CarbonDimension) dataField.getColumn()), dataOutputStream);
+ }
+
+ private void getBytesForComplex(Object datum, DataType dataType,
--- End diff --
You should use `GenericDataType` to write the data. Don't duplicate code
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5727/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2276
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4766/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5725/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2276
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4767/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][WIP][SDK]Multi level complex type ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2276
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4516/
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][WIP][SDK]Multi level complex type ...
Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on the issue:
https://github.com/apache/carbondata/pull/2276
Retest this please
---
[GitHub] carbondata pull request #2276: [CARBONDATA-2443][SDK]Multi level complex typ...
Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2276#discussion_r186433007
--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/steps/InputProcessorStepForPartitionImpl.java ---
@@ -220,32 +241,97 @@ private CarbonRowBatch getBatch() {
CarbonRowBatch carbonRowBatch = new CarbonRowBatch(batchSize);
int count = 0;
while (internalHasNext() && count < batchSize) {
- carbonRowBatch.addRow(new CarbonRow(convertToNoDictionaryToBytes(currentIterator.next())));
+ carbonRowBatch.addRow(
+ new CarbonRow(convertToNoDictionaryToBytes(currentIterator.next(), dataFields)));
count++;
}
rowCounter.getAndAdd(carbonRowBatch.getSize());
return carbonRowBatch;
}
- private Object[] convertToNoDictionaryToBytes(Object[] data) {
+ private Object[] convertToNoDictionaryToBytes(Object[] data, DataField[] dataFields) {
Object[] newData = new Object[data.length];
- for (int i = 0; i < noDictionaryMapping.length; i++) {
- if (noDictionaryMapping[i]) {
+ for (int i = 0; i < data.length; i++) {
+ if (i < noDictionaryMapping.length && noDictionaryMapping[i]) {
newData[i] = DataTypeUtil
.getBytesDataDataTypeForNoDictionaryColumn(data[orderOfData[i]], dataTypes[i]);
} else {
- newData[i] = data[orderOfData[i]];
+ // if this is a complex column then recursively comver the data into Byte Array.
+ if (dataTypes[i].isComplexType()) {
+ ByteArrayOutputStream byteArray = new ByteArrayOutputStream();
+ DataOutputStream dataOutputStream = new DataOutputStream(byteArray);
+ try {
+ getBytesForComplex(data[orderOfData[i]], dataFields[i], dataOutputStream);
+ dataOutputStream.close();
+ newData[i] = byteArray.toByteArray();
+ } catch (Exception e) {
+ throw new CarbonDataLoadingException( "Loading Exception", e);
+ }
+ } else {
+ newData[i] = data[orderOfData[i]];
+ }
}
}
- if (newData.length > noDictionaryMapping.length) {
- for (int i = noDictionaryMapping.length; i < newData.length; i++) {
- newData[i] = data[orderOfData[i]];
+ System.out.println(Arrays.toString(data));
+ return newData;
+ }
+
+ private void getBytesForComplex(Object datum, DataField dataField,
+ DataOutputStream dataOutputStream) throws Exception {
+ getBytesForComplex(datum, dataField.getColumn().getDataType(),
+ ((CarbonDimension) dataField.getColumn()), dataOutputStream);
+ }
+
+ private void getBytesForComplex(Object datum, DataType dataType,
--- End diff --
Done
---
[GitHub] carbondata issue #2276: [CARBONDATA-2443][SDK]Multi level complex type suppo...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2276
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4790/
---