Posted to issues@spark.apache.org by "Michael Armbrust (JIRA)" <ji...@apache.org> on 2014/11/07 20:57:33 UTC
[jira] [Resolved] (SPARK-4213) SparkSQL - ParquetFilters - No support for LT, LTE, GT, GTE operators
[ https://issues.apache.org/jira/browse/SPARK-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Armbrust resolved SPARK-4213.
-------------------------------------
Resolution: Fixed
Issue resolved by pull request 3083
[https://github.com/apache/spark/pull/3083]
> SparkSQL - ParquetFilters - No support for LT, LTE, GT, GTE operators
> ---------------------------------------------------------------------
>
> Key: SPARK-4213
> URL: https://issues.apache.org/jira/browse/SPARK-4213
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.2.0
> Environment: CDH5.2, Hive 0.13.1, Spark 1.2 snapshot (commit hash 76386e1a23c)
> Reporter: Terry Siu
> Priority: Blocker
> Fix For: 1.2.0
>
>
> When I issue a HiveQL query against a HiveContext whose predicate applies one of the LT, LTE, GT, or GTE operators to a column of string type, I get the following error:
> scala.MatchError: StringType (of class org.apache.spark.sql.catalyst.types.StringType$)
> Looking at the code in org.apache.spark.sql.parquet.ParquetFilters, StringType is absent from the corresponding functions for creating these filters.
> To reproduce, in a Hive 0.13.1 shell, I created the following table (in a specified database):
> create table sparkbug (
> id int,
> event string
> ) stored as parquet;
> Insert some sample data:
> insert into table sparkbug select 1, '2011-06-18' from <some table> limit 1;
> insert into table sparkbug select 2, '2012-01-01' from <some table> limit 1;
> Launch a Spark shell and create a HiveContext pointing at the metastore where the table above is located.
> import org.apache.spark.sql._
> import org.apache.spark.sql.SQLContext
> import org.apache.spark.sql.hive.HiveContext
> val hc = new HiveContext(sc)
> hc.setConf("spark.sql.shuffle.partitions", "10")
> hc.setConf("spark.sql.hive.convertMetastoreParquet", "true")
> hc.setConf("spark.sql.parquet.compression.codec", "snappy")
> import hc._
> hc.hql("select * from <db>.sparkbug where event >= '2011-12-01'")
> A scala.MatchError will appear in the output.
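The failure mode described above can be illustrated with a minimal, self-contained Scala sketch. This is not the actual ParquetFilters code; the type tags and filter strings below are hypothetical stand-ins showing how a pattern match that omits a StringType case throws scala.MatchError, and how adding the missing case resolves it.

```scala
// Simplified stand-ins for Catalyst data types (hypothetical, for illustration only).
sealed trait DataType
case object IntegerType extends DataType
case object StringType extends DataType

object FilterDemo {
  // Incomplete match, analogous to the buggy filter-creation path:
  // no case for StringType, so string predicates crash at runtime.
  def makeLtFilterBuggy(t: DataType): String = t match {
    case IntegerType => "intColumn < value"
  }

  // Fixed version: a StringType case is added, so the match is exhaustive
  // for both types used here.
  def makeLtFilterFixed(t: DataType): String = t match {
    case IntegerType => "intColumn < value"
    case StringType  => "binaryColumn < value"
  }

  def main(args: Array[String]): Unit = {
    try {
      makeLtFilterBuggy(StringType)
    } catch {
      // Same exception class as in the bug report.
      case e: scala.MatchError => println(s"buggy path: $e")
    }
    println("fixed path: " + makeLtFilterFixed(StringType))
  }
}
```

Compiling with `-Xfatal-warnings` would surface the non-exhaustive match at build time instead of as a runtime MatchError, which is the general class of defect this issue reports.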
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org