Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/08/15 12:54:01 UTC
[GitHub] [arrow] subbarao opened a new issue #7968: VectorSchemaRoot reuse when using dictionary encoded field
subbarao opened a new issue #7968:
URL: https://github.com/apache/arrow/issues/7968
The following code example is taken from https://enpiar.com/arrow-site/docs/java/ipc.html:
```
DictionaryProvider.MapDictionaryProvider provider = new DictionaryProvider.MapDictionaryProvider();
// create dictionary and provider
final VarCharVector dictVector = new VarCharVector("dict", allocator);
dictVector.allocateNewSafe();
dictVector.setSafe(0, "aa".getBytes());
dictVector.setSafe(1, "bb".getBytes());
dictVector.setSafe(2, "cc".getBytes());
dictVector.setValueCount(3);
Dictionary dictionary =
new Dictionary(dictVector, new DictionaryEncoding(1L, false, /*indexType=*/null));
provider.put(dictionary);
// create vector and encode it
final VarCharVector vector = new VarCharVector("vector", allocator);
vector.allocateNewSafe();
vector.setSafe(0, "bb".getBytes());
vector.setSafe(1, "bb".getBytes());
vector.setSafe(2, "cc".getBytes());
vector.setSafe(3, "aa".getBytes());
vector.setValueCount(4);
// get the encoded vector
IntVector encodedVector = (IntVector) DictionaryEncoder.encode(vector, dictionary);
// create VectorSchemaRoot
List<Field> fields = Arrays.asList(encodedVector.getField());
List<FieldVector> vectors = Arrays.asList(encodedVector);
VectorSchemaRoot root = new VectorSchemaRoot(fields, vectors);
```
VectorSchemaRoot is effectively immutable, which is a problem if I want to write in batches.
The ```DictionaryEncoder.encode``` API always returns a new vector; it does not operate on an existing vector.
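To make the distinction concrete, here is a minimal plain-Java sketch of encoding into a caller-supplied index buffer that is allocated once and overwritten for each batch, rather than getting a fresh container per call. It deliberately does not use the Arrow API: the class `ReusableDictionaryEncoding` and the helpers `buildDictionary` and `encodeInto` are hypothetical names made up for illustration, and whether the same pattern maps cleanly onto an Arrow index vector held by a VectorSchemaRoot is an assumption, not something confirmed in this thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReusableDictionaryEncoding {

    // Hypothetical helper: map each dictionary entry to its position (0, 1, 2, ...).
    public static Map<String, Integer> buildDictionary(List<String> entries) {
        Map<String, Integer> dictionary = new HashMap<>();
        for (int i = 0; i < entries.size(); i++) {
            dictionary.put(entries.get(i), i);
        }
        return dictionary;
    }

    // Hypothetical helper: write the dictionary index of every value into the
    // existing buffer and return the number of rows written. Nothing is allocated.
    public static int encodeInto(Map<String, Integer> dictionary, List<String> values, int[] indices) {
        if (values.size() > indices.length) {
            throw new IllegalArgumentException("batch does not fit into the index buffer");
        }
        for (int i = 0; i < values.size(); i++) {
            Integer index = dictionary.get(values.get(i));
            if (index == null) {
                throw new IllegalArgumentException("value not in dictionary: " + values.get(i));
            }
            indices[i] = index;
        }
        return values.size();
    }

    public static void main(String[] args) {
        // Same dictionary and first batch as in the example above.
        Map<String, Integer> dictionary = buildDictionary(Arrays.asList("aa", "bb", "cc"));
        int[] indices = new int[4]; // allocated once, reused for every batch

        List<List<String>> batches = Arrays.asList(
                Arrays.asList("bb", "bb", "cc", "aa"),
                Arrays.asList("cc", "aa"));

        for (List<String> batch : batches) {
            int rowCount = encodeInto(dictionary, batch, indices);
            // Only the first rowCount slots are valid for this batch.
            System.out.println(Arrays.toString(Arrays.copyOf(indices, rowCount)));
        }
        // prints [1, 1, 2, 0] and then [2, 0]
    }
}
```

The returned row count plays the role of a per-batch value count: slots beyond it still hold data from the previous batch and have to be ignored.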
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] wesm closed issue #7968: VectorSchemaRoot reuse when using dictionary encoded field
Posted by GitBox <gi...@apache.org>.
wesm closed issue #7968:
URL: https://github.com/apache/arrow/issues/7968
[GitHub] [arrow] wesm commented on issue #7968: VectorSchemaRoot reuse when using dictionary encoded field
Posted by GitBox <gi...@apache.org>.
wesm commented on issue #7968:
URL: https://github.com/apache/arrow/issues/7968#issuecomment-674571250
Can you direct this question to the dev@ or user@ mailing list (or open a JIRA if you think you have found a bug or missing feature)?