Posted to issues@spark.apache.org by "Stephen Boesch (JIRA)" <ji...@apache.org> on 2014/12/18 06:56:13 UTC

[jira] [Closed] (SPARK-2686) Add Length support to Spark SQL and HQL and Strlen support to SQL

     [ https://issues.apache.org/jira/browse/SPARK-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Boesch closed SPARK-2686.
---------------------------------
    Resolution: Later

Michael Armbrust requested that this be closed while the new UDF structure is being pinned down. A new JIRA will be created for this issue once that information is available.

> Add Length support to Spark SQL and HQL and Strlen support to SQL
> -----------------------------------------------------------------
>
>                 Key: SPARK-2686
>                 URL: https://issues.apache.org/jira/browse/SPARK-2686
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>         Environment: all
>            Reporter: Stephen Boesch
>            Priority: Minor
>              Labels: hql, length, sql
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Syntactic, parsing, and operational support have been added for LEN(GTH) and STRLEN functions.
> Examples:
> SQL:
> import org.apache.spark.sql._
> case class TestData(key: Int, value: String)
> val sqlc = new SQLContext(sc)
> import sqlc._
>   val testData: SchemaRDD = sqlc.sparkContext.parallelize(
>     (1 to 100).map(i => TestData(i, i.toString)))
>   testData.registerAsTable("testData")
> sqlc.sql("select length(key) as key_len from testData order by key_len desc limit 5").collect
> res12: Array[org.apache.spark.sql.Row] = Array([3], [2], [2], [2], [2])
> HQL:
> val hc = new org.apache.spark.sql.hive.HiveContext(sc)
> import hc._
> hc.hql("select length(grp) from simplex").collect
> res14: Array[org.apache.spark.sql.Row] = Array([6], [6], [6], [6])
> As for codebase changes, they have been purposefully made similar to those made on July 17 to add SUBSTR(ING).
> SQLParser, Optimizer, Expression, stringOperations, and HiveQL were the main classes changed. The test suites affected are ConstantFolding and ExpressionEvaluation.
> In addition, some ad hoc testing was done, as shown in the examples.
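
For readers without a Spark shell at hand, the SQL example's result can be reproduced in plain Scala. This is only a sketch, not the implementation from the patch: it assumes that length(key) on an Int column measures the key's string form, which is consistent with the reported output, and the LengthSketch object and topKeyLens name are hypothetical.

```scala
// Plain-Scala sketch (no Spark required) of what the SQL example computes.
// Assumption: length(key) on an Int column measures the key's string form.
case class TestData(key: Int, value: String)

object LengthSketch {
  // Same data as the example: keys 1 to 100, value is the key as a string.
  val testData: Seq[TestData] = (1 to 100).map(i => TestData(i, i.toString))

  // Equivalent of:
  //   select length(key) as key_len from testData order by key_len desc limit 5
  val topKeyLens: Seq[Int] =
    testData.map(_.key.toString.length).sorted(Ordering[Int].reverse).take(5)

  def main(args: Array[String]): Unit =
    println(topKeyLens.mkString("[", ", ", "]")) // prints [3, 2, 2, 2, 2]
}
```

Only the key 100 has three digits, so a single 3 is followed by 2s, matching the Array([3], [2], [2], [2], [2]) shown in the SQL example.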


