Posted to dev@drill.apache.org by "Jacques Nadeau (JIRA)" <ji...@apache.org> on 2015/01/05 00:08:34 UTC

[jira] [Resolved] (DRILL-1792) store.parquet.vector_fill_check_threshold is too high

     [ https://issues.apache.org/jira/browse/DRILL-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau resolved DRILL-1792.
-----------------------------------
    Resolution: Invalid

Due to the nature of some Parquet files, you'll need to lower this setting.  To do so, use the ALTER SESSION or ALTER SYSTEM command.  For details on changing the setting, see: https://cwiki.apache.org/confluence/display/DRILL/SQL+Commands+Summary

I don't remember the default offhand, but you can view it by querying: select * from sys.options;

I think setting it to 1 is the most conservative choice.
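
As a rough sketch of the steps described above, the statements below first look up the current value and then lower it. The option name is taken from the error message in the report, and the value 1 is only the conservative guess mentioned above, not a verified recommendation; the filter on the name column assumes sys.options exposes the option name under that column.

```sql
-- Inspect the current value of the option (assumes a `name` column in sys.options).
SELECT * FROM sys.options
WHERE name = 'store.parquet.vector_fill_check_threshold';

-- Lower the threshold for the current session only.
-- The value 1 is an assumption based on the suggestion above.
ALTER SESSION SET `store.parquet.vector_fill_check_threshold` = 1;

-- Alternatively, change it for the whole system rather than one session.
ALTER SYSTEM SET `store.parquet.vector_fill_check_threshold` = 1;
```

After changing the setting, rerun the failing query in the same session. If a session-level change is enough, it is probably the safer choice, since ALTER SYSTEM affects every user of the cluster.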

> store.parquet.vector_fill_check_threshold is too high
> -----------------------------------------------------
>
>                 Key: DRILL-1792
>                 URL: https://issues.apache.org/jira/browse/DRILL-1792
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Client - CLI
>    Affects Versions: 0.6.0
>         Environment: Linux, CentOS 6 latest, MapR 4.0.1
>            Reporter: hy5446
>
> I'm trying out some queries against parquet records. My query should return about 18 rows out of 2M. But:
> 0: jdbc:drill:> select * from dfs.`myfolder` as t where t.foo.bar = `foo bar`;
> /// headers here
> Query failed: Failure while running fragment., The setting for `store.parquet.vector_fill_check_threshold` is too high for your Parquet records. Please set a lower check threshold and retry your query.
> I'm not sure how to proceed - there does not seem to be a lot of documentation about this. What does that variable mean? What value should I set it to? And using what command?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)