Posted to issues@hive.apache.org by "Hive QA (JIRA)" <ji...@apache.org> on 2016/09/04 01:05:20 UTC
[jira] [Commented] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15461995#comment-15461995 ]
Hive QA commented on HIVE-14159:
--------------------------------
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12817103/HIVE-14159.4.patch
{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10452 tests executed
*Failed tests:*
{noformat}
TestBeeLineWithArgs - did not produce a TEST-*.xml file
TestHiveCli - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}
Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1103/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1103/console
Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-1103/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12817103 - PreCommit-HIVE-MASTER-Build
> sorting of tuple array using multiple field[s]
> ----------------------------------------------
>
> Key: HIVE-14159
> URL: https://issues.apache.org/jira/browse/HIVE-14159
> Project: Hive
> Issue Type: Improvement
> Components: UDF
> Reporter: Simanchal Das
> Assignee: Simanchal Das
> Labels: patch
> Attachments: HIVE-14159.1.patch, HIVE-14159.2.patch, HIVE-14159.3.patch, HIVE-14159.4.patch
>
>
> Problem Statement:
> When working with complex structured data such as Avro, we often encounter arrays that contain multiple tuples, where each tuple has a struct schema.
> Suppose the struct schema is as follows:
> {noformat}
> {
>   "name": "employee",
>   "type": [{
>     "type": "record",
>     "name": "Employee",
>     "namespace": "com.company.Employee",
>     "fields": [{
>       "name": "empId",
>       "type": "int"
>     }, {
>       "name": "empName",
>       "type": "string"
>     }, {
>       "name": "age",
>       "type": "int"
>     }, {
>       "name": "salary",
>       "type": "double"
>     }]
>   }]
> }
> {noformat}
> When we run a Hive query, the complex array then looks like an array of employee objects.
> {noformat}
> Example:
> //(array<struct<empId,empName,age,salary>>)
> Array[Employee(100,Foo,20,20990),Employee(500,Boo,30,50990),Employee(700,Harry,25,40990),Employee(100,Tom,35,70990)]
> {noformat}
> When implementing day-to-day business use cases, we run into problems such as sorting a tuple array by specific field[s] (empId, name, salary, etc.) in ASC or DESC order.
> Proposal:
> I have developed a UDF 'sort_array_by' which sorts a tuple array by one or more fields in ASC or DESC order, as specified by the user; the default is ascending order.
> {noformat}
> Example:
> 1. SELECT sort_array_by(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Salary","ASC");
> output: array[struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(500,Boo,30,50990),struct(100,Tom,35,70990)]
>
> 2. SELECT sort_array_by(array[struct(100,Foo,20,20990),struct(500,Boo,30,80990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","ASC");
> output: array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]
>
> 3. SELECT sort_array_by(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","Age","ASC");
> output: array[struct(500,Boo,30,50990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]
> {noformat}
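The multi-field ordering described in the proposal can be illustrated outside Hive with a chained comparator. The following is only a minimal sketch in plain Java, not the code from the attached patches and not a Hive GenericUDF. The class name SortArrayBySketch, the Employee class, and the sortArrayBy helper are hypothetical names introduced for illustration. The sketch assumes that field names follow the Avro schema (empId, empName, age, salary), that they are matched case-insensitively, and that a trailing "ASC" or "DESC" argument applies to the whole field list.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative only: not the HIVE-14159 patch and not a Hive GenericUDF.
class SortArrayBySketch {

    // Hypothetical stand-in for one struct<empId,empName,age,salary> element.
    static final class Employee {
        final int empId;
        final String empName;
        final int age;
        final double salary;

        Employee(int empId, String empName, int age, double salary) {
            this.empId = empId;
            this.empName = empName;
            this.age = age;
            this.salary = salary;
        }
    }

    // One comparator per field, keyed by lower-cased field name (assumed case-insensitive lookup).
    private static final Map<String, Comparator<Employee>> FIELDS = Map.of(
            "empid", Comparator.comparingInt((Employee e) -> e.empId),
            "empname", Comparator.comparing((Employee e) -> e.empName),
            "age", Comparator.comparingInt((Employee e) -> e.age),
            "salary", Comparator.comparingDouble((Employee e) -> e.salary));

    // args = one or more field names, optionally followed by "ASC" or "DESC" (default: ascending).
    static List<Employee> sortArrayBy(List<Employee> input, String... args) {
        int fieldCount = args.length;
        boolean descending = false;
        if (fieldCount > 0) {
            String last = args[fieldCount - 1];
            if (last.equalsIgnoreCase("ASC") || last.equalsIgnoreCase("DESC")) {
                descending = last.equalsIgnoreCase("DESC");
                fieldCount--;
            }
        }
        if (fieldCount == 0) {
            throw new IllegalArgumentException("at least one field name is required");
        }
        // Chain the comparators so that later fields only break ties in earlier ones.
        Comparator<Employee> cmp = null;
        for (int i = 0; i < fieldCount; i++) {
            Comparator<Employee> next = FIELDS.get(args[i].toLowerCase(Locale.ROOT));
            if (next == null) {
                throw new IllegalArgumentException("unknown field: " + args[i]);
            }
            cmp = (cmp == null) ? next : cmp.thenComparing(next);
        }
        if (descending) {
            cmp = cmp.reversed(); // assumption: the order applies to the whole field list
        }
        List<Employee> out = new ArrayList<>(input); // leave the input untouched
        out.sort(cmp);
        return out;
    }

    static List<String> names(List<Employee> employees) {
        return employees.stream().map(e -> e.empName).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Employee> staff = List.of(
                new Employee(100, "Foo", 20, 20990),
                new Employee(500, "Boo", 30, 50990),
                new Employee(700, "Harry", 25, 40990),
                new Employee(100, "Tom", 35, 70990));
        System.out.println(names(sortArrayBy(staff, "salary", "ASC")));             // [Foo, Harry, Boo, Tom]
        System.out.println(names(sortArrayBy(staff, "empName", "salary", "DESC"))); // [Tom, Harry, Foo, Boo]
    }
}
```

Chaining with thenComparing means each later field is consulted only when all earlier fields compare equal, which is the behaviour the second example in the description relies on when two "Boo" entries differ only in salary.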
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)