Posted to dev@phoenix.apache.org by "Kalyan (JIRA)" <ji...@apache.org> on 2016/08/10 07:58:20 UTC

[jira] [Commented] (PHOENIX-2336) Queries with small case column-names return empty result-set when working with Spark Datasource Plugin

    [ https://issues.apache.org/jira/browse/PHOENIX-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414873#comment-15414873 ] 

Kalyan commented on PHOENIX-2336:
---------------------------------

The problem is with the Spark SQL parser: if the filter expression contains double quotes, the expression is passed to the next level as-is; without double quotes, it is first run through the SQL parser and then passed to the next level.

I have written a solution that handles this at the Phoenix level. The patch also fixes the bug described above:

https://issues.apache.org/jira/secure/attachment/12821623/phoenix_spark.patch

Please review it.
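
For reference, here is a minimal sketch of the idea (plain Java, with illustrative helper names only; the actual change is in the attached phoenix_spark.patch): wrap the pushed-down attribute in double quotes before building the Phoenix WHERE clause, so a lower-case column name is matched case-sensitively instead of being upper-cased by Phoenix.

// Sketch only: class and method names are illustrative, not the real
// phoenix-spark API. Phoenix upper-cases unquoted identifiers, so the
// attribute is wrapped in double quotes to preserve its original case.
public final class PhoenixFilterSketch {

    // Quote the attribute unless the caller already quoted it.
    static String escapeColumn(String attribute) {
        if (attribute.startsWith("\"") && attribute.endsWith("\"")) {
            return attribute;
        }
        return "\"" + attribute + "\"";
    }

    // Build a WHERE fragment for a simple equality filter, e.g.
    // equalTo("col1", "5.0") produces "col1" = '5.0'
    static String equalTo(String attribute, String value) {
        return escapeColumn(attribute) + " = '" + value + "'";
    }

    public static void main(String[] args) {
        System.out.println(equalTo("col1", "5.0"));   // "col1" = '5.0'
    }
}

With that normalization in place, a filter such as df.filter("col1 = '5.0'") should push down a predicate that resolves against the lower-case column instead of failing with ColumnNotFoundException.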

> Queries with small case column-names return empty result-set when working with Spark Datasource Plugin 
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2336
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2336
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.6.0
>            Reporter: Suhas Nalapure
>            Assignee: Josh Mahonin
>              Labels: verify
>             Fix For: 4.9.0
>
>
> Hi,
> The Spark DataFrame filter operation returns an empty result set when the column name is lower-case. Example below:
> DataFrame df = sqlContext.read().format("org.apache.phoenix.spark").options(params).load();
> df.filter("\"col1\" = '5.0'").show(); 
> Result:
> +---+----+---+---+---+---+
> | ID|col1| c1| d2| d3| d4|
> +---+----+---+---+---+---+
> +---+----+---+---+---+---+
> However, the table actually has rows matching the filter condition. And if the double quotes are removed from around the column name, i.e. df.filter("col1 = '5.0'").show();, a ColumnNotFoundException is thrown:
> Exception in thread "main" java.lang.RuntimeException: org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): Undefined column. columnName=D1
>         at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:125)
>         at org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:80)
>         at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
>         at scala.Option.getOrElse(Option.scala:120)


