Posted to dev@phoenix.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2014/06/08 10:08:01 UTC

[jira] [Commented] (PHOENIX-939) Generalize SELECT expressions for Pig Loader

    [ https://issues.apache.org/jira/browse/PHOENIX-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021150#comment-14021150 ] 

Hudson commented on PHOENIX-939:
--------------------------------

SUCCESS: Integrated in Phoenix | 3.0 | Hadoop1 #109 (See [https://builds.apache.org/job/Phoenix-3.0-hadoop1/109/])
PHOENIX-939 Generalize SELECT expressions for Pig Loader (Ravi) (jtaylor: rev 1f895e204da77df6d927738790116fd09eaf6b88)
* phoenix-pig/src/test/java/org/apache/phoenix/pig/util/SqlQueryToColumnInfoFunctionTest.java
* phoenix-pig/src/main/java/org/apache/phoenix/pig/util/SqlQueryToColumnInfoFunction.java
* phoenix-pig/src/main/java/org/apache/phoenix/pig/PhoenixPigConfiguration.java
* phoenix-pig/src/main/java/org/apache/phoenix/pig/PhoenixHBaseLoader.java
* phoenix-pig/src/main/java/org/apache/phoenix/pig/hadoop/PhoenixInputFormat.java
* phoenix-pig/src/main/java/org/apache/phoenix/pig/util/PhoenixPigSchemaUtil.java
* phoenix-pig/src/main/java/org/apache/phoenix/pig/hadoop/PhoenixRecordReader.java
* phoenix-pig/src/it/java/org/apache/phoenix/pig/PhoenixHBaseLoaderIT.java


> Generalize SELECT expressions for Pig Loader
> --------------------------------------------
>
>                 Key: PHOENIX-939
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-939
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 5.0.0, 3.1, 4.1
>            Reporter: James Taylor
>            Assignee: maghamravikiran
>             Fix For: 5.0.0, 3.1, 4.1
>
>         Attachments: PHOENIX-939.patch
>
>
> The current Pig Loader requires that the query contain only column references in the SELECT expressions. Instead, we should allow any expression as that will provide more general utility. For example, built-in functions, sequence references, etc. could be used then.
> Validation can be done by simply compiling the query, which performs all the validation required. It's OK if the query ends up being compiled twice.
> Pig doesn't know and likely wouldn't care whether the expressions in the SELECT correspond to columns in Phoenix or to general expressions. You can use the ColumnProjector.getName() method to get back the alias of a SELECT expression. If no alias is provided, the String form of the expression is returned. [~prkommireddi] - can you weigh in here? You can use the Phoenix RowProjector to iterate through each ColumnProjector you get back after the compile to get the alias of the SELECT expression (i.e. you'd give this to Pig as the "column" name) plus the data type. Note that if this is problematic, you could likely generate an alias name when one is not present.
> For example:
> {code}
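>     // queryPlan is the plan you get back after compiling the SELECT query
>     RowProjector rowProj = queryPlan.getProjector();
>     for (ColumnProjector colProj : rowProj.getColumnProjectors()) {
>         // Alias of the SELECT expression, or the expression string if no alias was given
>         String columnName = colProj.getName();
>         // Data type to report to Pig for this "column"
>         PDataType dataType = colProj.getExpression().getDataType();
>     }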
> {code}
> If the SELECT expressions are simple column references, this would be exactly the same as is being done now. If the SELECT expressions are more complex expressions, this would work as well.
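
The fallback mentioned in the description (generating an alias when the projector name is not usable as-is) could look roughly like the sketch below. This is only an illustration, not code from the patch: the class PigFieldNames, its methods, and the EXPR_ prefix are hypothetical names, and the identifier rule (a letter followed by letters, digits, or underscores) is an assumption about what Pig accepts as a field name. The input list stands in for the values returned by ColumnProjector.getName().

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; not part of Phoenix or Pig. */
final class PigFieldNames {
    private PigFieldNames() {}

    /**
     * Maps projection names (aliases or raw expression strings) to names
     * that should be safe to hand to Pig as field names.
     */
    static List<String> toFieldNames(List<String> projectionNames) {
        List<String> result = new ArrayList<>();
        Set<String> used = new HashSet<>();
        for (int i = 0; i < projectionNames.size(); i++) {
            String name = projectionNames.get(i);
            // Keep names that already look like plain identifiers;
            // otherwise generate a positional alias (1-based).
            String candidate = isIdentifier(name) ? name : "EXPR_" + (i + 1);
            // Append a numeric suffix until the name is unique in this schema.
            String unique = candidate;
            int suffix = 1;
            while (!used.add(unique)) {
                unique = candidate + "_" + suffix++;
            }
            result.add(unique);
        }
        return result;
    }

    /** Assumed rule: a letter followed by letters, digits, or underscores. */
    static boolean isIdentifier(String s) {
        return s != null && s.matches("[A-Za-z][A-Za-z0-9_]*");
    }
}
```

With this sketch, simple column references and explicit aliases pass through unchanged, so the existing behaviour is preserved; only unaliased expressions such as UPPER(NAME) or repeated names receive a generated name.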



--
This message was sent by Atlassian JIRA
(v6.2#6252)