Posted to issues@drill.apache.org by "Arina Ielchiieva (Jira)" <ji...@apache.org> on 2019/10/23 15:40:00 UTC
[jira] [Created] (DRILL-7419) Enhance Drill splitting logic for compressed files
Arina Ielchiieva created DRILL-7419:
---------------------------------------
Summary: Enhance Drill splitting logic for compressed files
Key: DRILL-7419
URL: https://issues.apache.org/jira/browse/DRILL-7419
Project: Apache Drill
Issue Type: Improvement
Affects Versions: 1.16.0
Reporter: Arina Ielchiieva
By default, Drill treats all compressed files as non-splittable. Drill uses BlockMapBuilder to split a file into blocks when possible. According to its code, it tries to split the file only if blockSplittable is set to true and the file IS NOT compressed. So even if a format is block-splittable, a file that arrives compressed won't be split.
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/schedule/BlockMapBuilder.java#L115
But some compression codecs are splittable, for example bzip2 (https://i.stack.imgur.com/jpprr.jpg). The codec type should be taken into account when deciding whether a file can be split.
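To make the difference concrete, here is a minimal, self-contained sketch of the two decision rules. It is not Drill's code: the class SplitDecision, its method names, and the extension-based codec lookup are all hypothetical, and the extension sets are illustrative (only bzip2 is named above as splittable). A real change would more likely ask the codec itself, for instance by checking whether the resolved Hadoop codec implements SplittableCompressionCodec.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Drill.
final class SplitDecision {

    // Assumption: a file counts as compressed if it has one of these extensions.
    private static final Set<String> COMPRESSED_EXTENSIONS =
            new HashSet<>(Arrays.asList("gz", "bz2", "deflate"));

    // Assumption: only bzip2 is treated as a splittable codec in this sketch.
    private static final Set<String> SPLITTABLE_CODEC_EXTENSIONS =
            new HashSet<>(Arrays.asList("bz2"));

    // Returns the lower-cased text after the last dot, or "" if there is none.
    static String extension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Rule as described in the issue: split only if the format is
    // block-splittable and the file is not compressed at all.
    static boolean currentRule(boolean blockSplittable, String fileName) {
        boolean compressed = COMPRESSED_EXTENSIONS.contains(extension(fileName));
        return blockSplittable && !compressed;
    }

    // Suggested rule: a compressed file may still be split
    // when its codec is itself splittable.
    static boolean proposedRule(boolean blockSplittable, String fileName) {
        String ext = extension(fileName);
        boolean compressed = COMPRESSED_EXTENSIONS.contains(ext);
        boolean splittableCodec = SPLITTABLE_CODEC_EXTENSIONS.contains(ext);
        return blockSplittable && (!compressed || splittableCodec);
    }
}
```

Under these assumptions, a block-splittable file named data.csv.bz2 is rejected by currentRule but accepted by proposedRule, while data.csv.gz is rejected by both and an uncompressed data.csv is accepted by both.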
--
This message was sent by Atlassian Jira
(v8.3.4#803005)