Posted to dev@drill.apache.org by paul-rogers <gi...@git.apache.org> on 2017/04/03 23:56:46 UTC

[GitHub] drill pull request #789: DRILL-5356: Refactor Parquet Record Reader

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/789#discussion_r109512919
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/ParquetRecordReader.java ---
    @@ -307,164 +231,49 @@ public FragmentContext getFragmentContext() {
         return fragmentContext;
       }
     
    -  /**
    -   * Returns data type length for a given {@see ColumnDescriptor} and its corresponding
    -   * {@see SchemaElement}. Neither is enough information alone, as the max
    -   * repetition level (indicating if it is an array type) is in the ColumnDescriptor and
    -   * the length of a fixed width field is stored at the schema level.
    -   *
    -   * @return the length if fixed width, else -1
    -   */
    -  private int getDataTypeLength(ColumnDescriptor column, SchemaElement se) {
    -    if (column.getType() != PrimitiveType.PrimitiveTypeName.BINARY) {
    -      if (column.getMaxRepetitionLevel() > 0) {
    -        return -1;
    -      }
    -      if (column.getType() == PrimitiveType.PrimitiveTypeName.FIXED_LEN_BYTE_ARRAY) {
    -        return se.getType_length() * 8;
    -      } else {
    -        return getTypeLengthInBits(column.getType());
    -      }
    -    } else {
    -      return -1;
    -    }
    -  }
    -
    -  @SuppressWarnings({ "resource", "unchecked" })
       @Override
       public void setup(OperatorContext operatorContext, OutputMutator output) throws ExecutionSetupException {
         this.operatorContext = operatorContext;
    -    if (!isStarQuery()) {
    -      columnsFound = new boolean[getColumns().size()];
    -      nullFilledVectors = new ArrayList<>();
    +    if (isStarQuery()) {
    +      schema = new ParquetSchema(fragmentContext.getOptions(), rowGroupIndex);
    +    } else {
    +      schema = new ParquetSchema(fragmentContext.getOptions(), getColumns());
         }
    -    columnStatuses = new ArrayList<>();
    -    List<ColumnDescriptor> columns = footer.getFileMetaData().getSchema().getColumns();
    -    allFieldsFixedLength = true;
    -    ColumnDescriptor column;
    -    ColumnChunkMetaData columnChunkMetaData;
    -    int columnsToScan = 0;
    -    mockRecordsRead = 0;
     
    -    MaterializedField field;
    +//    ParquetMetadataConverter metaConverter = new ParquetMetadataConverter();
    +//    FileMetaData fileMetaData;
     
    --- End diff --
    
    I was being paranoid, but sure, removed the lines.
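
    For readers following the refactoring: the removed `getDataTypeLength()` helper encoded a small decision table (variable-width and repeated columns have no fixed length; `FIXED_LEN_BYTE_ARRAY` stores its length in bytes at the schema level). Here is an illustrative, self-contained sketch of that branching, using a stand-in enum instead of Parquet's `ColumnDescriptor`/`SchemaElement` classes (the enum, method, and class names below are hypothetical, for illustration only):

    ```java
    // Sketch of the removed getDataTypeLength() logic; NOT Drill code.
    // TypeName is a stand-in for Parquet's PrimitiveType.PrimitiveTypeName.
    public class DataTypeLengthSketch {
        enum TypeName { BINARY, FIXED_LEN_BYTE_ARRAY, INT32, INT64 }

        /**
         * Returns the column's length in bits if it is fixed width, else -1.
         * Variable-width (BINARY) and repeated (array) columns report -1.
         */
        static int dataTypeLengthBits(TypeName type, int maxRepetitionLevel,
                                      int typeLengthBytes) {
            if (type == TypeName.BINARY) {
                return -1;                   // variable width
            }
            if (maxRepetitionLevel > 0) {
                return -1;                   // array type: no single fixed length
            }
            if (type == TypeName.FIXED_LEN_BYTE_ARRAY) {
                return typeLengthBytes * 8;  // schema stores bytes; convert to bits
            }
            return bitsFor(type);            // stand-in for getTypeLengthInBits()
        }

        static int bitsFor(TypeName type) {
            switch (type) {
                case INT32: return 32;
                case INT64: return 64;
                default: throw new IllegalArgumentException(type.toString());
            }
        }

        public static void main(String[] args) {
            System.out.println(dataTypeLengthBits(TypeName.BINARY, 0, 0));    // -1
            System.out.println(dataTypeLengthBits(TypeName.INT32, 1, 0));     // -1
            System.out.println(dataTypeLengthBits(TypeName.FIXED_LEN_BYTE_ARRAY, 0, 12)); // 96
            System.out.println(dataTypeLengthBits(TypeName.INT64, 0, 0));     // 64
        }
    }
    ```

    In the refactored code this per-column bookkeeping moves into the new `ParquetSchema` class, which is why the helper disappears from `ParquetRecordReader`.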

