Posted to dev@phoenix.apache.org by "Josh Mahonin (JIRA)" <ji...@apache.org> on 2015/10/27 14:35:27 UTC
[jira] [Comment Edited] (PHOENIX-2336) Queries with small case column-names return empty result-set when working with Spark Datasource Plugin
[ https://issues.apache.org/jira/browse/PHOENIX-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976333#comment-14976333 ]
Josh Mahonin edited comment on PHOENIX-2336 at 10/27/15 1:34 PM:
-----------------------------------------------------------------
Hi [~snalapure@dataken.net]
I don't have too many free cycles to diagnose and fix this at the moment. If you would like to help, I would start with a reproducible test scenario in:
https://github.com/apache/phoenix/blob/master/phoenix-spark/src/it/scala/org/apache/phoenix/spark/PhoenixSparkIT.scala
From there, debugging through the stack calls to see where the breakdown occurs would be my next step. I'm not very familiar with the code-path of views over HBase tables, so I don't know why the behaviour would be different from a regular table.
was (Author: jmahonin):
Hi [~snalapure[~snalapure@dataken.net]
I don't have too many free cycles to diagnose and fix this at the moment. If you would like to help, I would start with a reproducible test scenario in:
https://github.com/apache/phoenix/blob/master/phoenix-spark/src/it/scala/org/apache/phoenix/spark/PhoenixSparkIT.scala
From there, debugging through the stack calls to see where the breakdown occurs would be my next step. I'm not very familiar with the code-path of views over HBase tables, so I don't know why the behaviour would be different from a regular table.
> Queries with small case column-names return empty result-set when working with Spark Datasource Plugin
> -------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-2336
> URL: https://issues.apache.org/jira/browse/PHOENIX-2336
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.6.0
> Reporter: Suhas Nalapure
>
> Hi,
> The Spark DataFrame filter operation returns an empty result set when the column name is in lower case. Example below:
> DataFrame df = sqlContext.read().format("org.apache.phoenix.spark").options(params).load();
> df.filter("\"col1\" = '5.0'").show();
> Result:
> +---+----+---+---+---+---+
> | ID|col1| c1| d2| d3| d4|
> +---+----+---+---+---+---+
> +---+----+---+---+---+---+
> However, the table does contain rows matching the filter condition. If the double quotes around the column name are removed, i.e. df.filter("col1 = '5.0'").show();, a ColumnNotFoundException is thrown:
> Exception in thread "main" java.lang.RuntimeException: org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): Undefined column. columnName=D1
> at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:125)
> at org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:80)
> at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
> at scala.Option.getOrElse(Option.scala:120)
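
For readers following the quoted report, here is a minimal sketch of the identifier rule that is probably behind the ColumnNotFoundException in the unquoted case. It assumes the usual Phoenix behaviour that unquoted identifiers are folded to upper case while double-quoted identifiers keep their case. The class and method names (IdentifierCase, normalize) are hypothetical and are not part of Phoenix or the phoenix-spark plugin. It does not explain why the quoted filter returns an empty result set.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not Phoenix code.
// Assumption: unquoted SQL identifiers fold to upper case,
// double-quoted identifiers keep their case exactly.
final class IdentifierCase {
    private IdentifierCase() {}

    static String normalize(String identifier) {
        boolean quoted = identifier.length() >= 2
                && identifier.startsWith("\"")
                && identifier.endsWith("\"");
        if (quoted) {
            // Strip the surrounding quotes and preserve the case.
            return identifier.substring(1, identifier.length() - 1);
        }
        // Unquoted: fold to upper case, independent of the default locale.
        return identifier.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(normalize("col1"));      // unquoted form
        System.out.println(normalize("\"col1\""));  // quoted form
    }
}
```

Under that assumption, an unquoted col1 would be looked up as COL1, which does not match a column that was created as "col1", so only the quoted form can resolve.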
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)