Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/04/04 21:51:00 UTC

[jira] [Commented] (DRILL-5846) Improve Parquet Reader Performance for Flat Data types

    [ https://issues.apache.org/jira/browse/DRILL-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426227#comment-16426227 ] 

ASF GitHub Bot commented on DRILL-5846:
---------------------------------------

Github user vrozov commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1060#discussion_r179293968
  
    --- Diff: exec/java-exec/pom.xml ---
    @@ -836,6 +836,14 @@
             <groupId>org.apache.maven.plugins</groupId>
             <artifactId>maven-surefire-plugin</artifactId>
           </plugin>
    +      <plugin>
    +        <groupId>org.apache.maven.plugins</groupId>
    +        <artifactId>maven-compiler-plugin</artifactId>
    +        <version>2.3.2</version>
    +        <configuration>
    +            <compilerArgument>-XDignore.symbol.file</compilerArgument>
    --- End diff --
    
    Consider using `io.netty.util.internal.PlatformDependent` instead of `sun.misc.Unsafe`. 


> Improve Parquet Reader Performance for Flat Data types 
> -------------------------------------------------------
>
>                 Key: DRILL-5846
>                 URL: https://issues.apache.org/jira/browse/DRILL-5846
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Parquet
>    Affects Versions: 1.11.0
>            Reporter: salim achouche
>            Assignee: salim achouche
>            Priority: Major
>              Labels: performance
>             Fix For: 1.14.0
>
>         Attachments: 2542d447-9837-3924-dd12-f759108461e5.sys.drill, 2542d49b-88ef-38e3-a02b-b441c1295817.sys.drill
>
>
> The Parquet Reader is a key use case for Drill. This JIRA is an attempt to further improve Parquet Reader performance, as several users have reported that Parquet parsing accounts for the lion's share of overall query execution time. It tracks flat data types only, as nested data types might involve functional and processing enhancements (e.g., a nested column can be seen as a document; a user might want to perform operations scoped at the document level that do not need to span all rows). Another JIRA will be created to handle the nested-columns use case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)