Posted to issues@spark.apache.org by "Michael Armbrust (JIRA)" <ji...@apache.org> on 2014/11/07 20:57:33 UTC

[jira] [Resolved] (SPARK-4213) SparkSQL - ParquetFilters - No support for LT, LTE, GT, GTE operators

     [ https://issues.apache.org/jira/browse/SPARK-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Armbrust resolved SPARK-4213.
-------------------------------------
    Resolution: Fixed

Issue resolved by pull request 3083
[https://github.com/apache/spark/pull/3083]

> SparkSQL - ParquetFilters - No support for LT, LTE, GT, GTE operators
> ---------------------------------------------------------------------
>
>                 Key: SPARK-4213
>                 URL: https://issues.apache.org/jira/browse/SPARK-4213
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.2.0
>         Environment: CDH5.2, Hive 0.13.1, Spark 1.2 snapshot (commit hash 76386e1a23c)
>            Reporter: Terry Siu
>            Priority: Blocker
>             Fix For: 1.2.0
>
>
> When I issue an HQL query against a HiveContext and the predicate uses a column of string type with one of the LT, LTE, GT, or GTE operators, I get the following error:
> scala.MatchError: StringType (of class org.apache.spark.sql.catalyst.types.StringType$)
> Looking at the code in org.apache.spark.sql.parquet.ParquetFilters, StringType is absent from the functions that create these comparison filters.
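> To illustrate the failure mode (a minimal, hypothetical sketch, not the actual ParquetFilters source), a pattern match over the Catalyst data types that omits a StringType case throws exactly this scala.MatchError at runtime when a string column reaches it:
>
>   // Hypothetical sketch: the type and function names below are
>   // illustrative only, not Spark's real API.
>   sealed trait DataType
>   case object IntegerType extends DataType
>   case object DoubleType  extends DataType
>   case object StringType  extends DataType
>
>   def makeLessThanFilter(dt: DataType): String = dt match {
>     case IntegerType => "int less-than filter"
>     case DoubleType  => "double less-than filter"
>     // No case for StringType, so makeLessThanFilter(StringType)
>     // throws scala.MatchError(StringType) at runtime.
>   }
>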
> To reproduce, in a Hive 0.13.1 shell, I created the following table (in a specified database):
> create table sparkbug (
>   id int,
>   event string
> ) stored as parquet;
> Insert some sample data:
> insert into table sparkbug select 1, '2011-06-18' from <some table> limit 1;
> insert into table sparkbug select 2, '2012-01-01' from <some table> limit 1;
> Launch a Spark shell and create a HiveContext pointing at the metastore where the table above is located:
> import org.apache.spark.sql._
> import org.apache.spark.sql.hive.HiveContext
> val hc = new HiveContext(sc)
> hc.setConf("spark.sql.shuffle.partitions", "10")
> // Read the Parquet-backed Hive table through Spark SQL's native Parquet
> // code path, which is where ParquetFilters builds the pushed-down filters.
> hc.setConf("spark.sql.hive.convertMetastoreParquet", "true")
> hc.setConf("spark.sql.parquet.compression.codec", "snappy")
> import hc._
> hc.hql("select * from <db>.sparkbug where event >= '2011-12-01'")
> A scala.MatchError will appear in the output.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org