
Posted to common-dev@hadoop.apache.org by "Jeff Hammerbacher (JIRA)" <ji...@apache.org> on 2008/11/10 18:55:44 UTC

[jira] Created: (HADOOP-4625) Add functionality similar to SCOPE's "virtual" views to Hive

Add functionality similar to SCOPE's "virtual" views to Hive
------------------------------------------------------------

                 Key: HADOOP-4625
                 URL: https://issues.apache.org/jira/browse/HADOOP-4625
             Project: Hadoop Core
          Issue Type: New Feature
          Components: contrib/hive
            Reporter: Jeff Hammerbacher


SCOPE has many nice features, and the ability to IMPORT/EXPORT parameterized scripts and store partial queries in named variables is one of them. Section 3.5 of the SCOPE paper has the details, and there are several examples throughout the paper. Perhaps we can choose an alternative delimiter for PARAMETER imports, however (SCOPE uses "@@...@@").
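
To make the delimiter question concrete, here is a rough Python sketch of what binding "@@...@@" parameters into a script template could look like. The function name bind_params, the parameter name logfile, and the regular expression are all made up for illustration; this is not how SCOPE or Hive implements parameter imports.

```python
import re

# Assumed token shape: @@name@@ (hypothetical; chosen to mirror the delimiter
# mentioned above, not taken from any SCOPE or Hive implementation).
PARAM = re.compile(r"@@(\w+)@@")


def bind_params(script, params):
    """Replace each @@name@@ token with its bound value; fail on unbound names."""
    def repl(match):
        name = match.group(1)
        if name not in params:
            raise KeyError("unbound parameter: " + name)
        return params[name]
    return PARAM.sub(repl, script)


# Hypothetical template: the file name is supplied by the importing script.
template = 'e = EXTRACT query FROM "@@logfile@@" USING LogExtractor;'
bound = bind_params(template, {"logfile": "search.log"})
print(bound)
```

Swapping in a different delimiter would only mean changing the PARAM pattern, which is part of why the choice of delimiter seems to be mostly a matter of taste and of avoiding clashes with existing syntax.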

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-4625) Add functionality similar to SCOPE's "virtual" views to Hive

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646313#action_12646313 ] 

Jeff Hammerbacher commented on HADOOP-4625:
-------------------------------------------

Copy of the paper: http://icme2007.org/~jrzhou/pub/Scope.pdf.

First Example:

SELECT Ra, Rb 
FROM R 
WHERE Rb < 100 AND (Ra > 5 OR EXISTS(SELECT * FROM S WHERE Sa < 20 AND Sc = Rc)) 

Here is an equivalent script in SCOPE. 

SQ = SELECT DISTINCT Sc FROM S WHERE Sa < 20; 
M1 = SELECT Ra, Rb, Rc FROM R WHERE Rb < 100; 
M2 = SELECT Ra, Rb, Rc, Sc FROM M1 LEFT OUTER JOIN SQ ON Rc == Sc; 
Q  = SELECT Ra, Rb FROM M2 WHERE Ra > 5 OR Rc != Sc;  
R1 = SELECT A+C AS ac, B.Trim() AS B1 FROM R WHERE StringOccurs(C, "xyz") > 2;
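
The rewrite of the nested query into named steps (SQ, M1, M2, Q) can be sanity-checked on a toy data set. The sketch below uses Python's sqlite3 module with common table expressions standing in for the named variables; the sample rows are invented. The final filter is written as Sc IS NOT NULL rather than the Rc != Sc shown above, because SQLite never evaluates a comparison against NULL to true; that change is an adaptation for this sketch, not a statement about SCOPE's semantics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE R (Ra INTEGER, Rb INTEGER, Rc INTEGER);
CREATE TABLE S (Sa INTEGER, Sc INTEGER);
-- Invented sample rows, chosen to exercise each branch of the predicate.
INSERT INTO R VALUES (10, 50, 1), (1, 50, 2), (1, 50, 3), (10, 200, 1);
INSERT INTO S VALUES (5, 2), (7, 2), (30, 3);
""")

# The nested form with a correlated EXISTS subquery.
nested = """
SELECT Ra, Rb FROM R
WHERE Rb < 100
  AND (Ra > 5 OR EXISTS (SELECT * FROM S WHERE Sa < 20 AND Sc = Rc))
ORDER BY Ra, Rb
"""

# The staged form: each named step becomes a common table expression.
staged = """
WITH SQ AS (SELECT DISTINCT Sc FROM S WHERE Sa < 20),
     M1 AS (SELECT Ra, Rb, Rc FROM R WHERE Rb < 100),
     M2 AS (SELECT Ra, Rb, Rc, Sc FROM M1 LEFT OUTER JOIN SQ ON Rc = Sc)
SELECT Ra, Rb FROM M2
WHERE Ra > 5 OR Sc IS NOT NULL  -- adapted for SQLite NULL handling
ORDER BY Ra, Rb
"""

nested_rows = conn.execute(nested).fetchall()
staged_rows = conn.execute(staged).fetchall()
print(nested_rows == staged_rows, staged_rows)
```

The DISTINCT in SQ matters here: S contains two qualifying rows with Sc = 2, and without DISTINCT the outer join would duplicate the matching row of R.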

Second Example:

e = EXTRACT query FROM "search.log" USING LogExtractor; 
s1 = SELECT query, COUNT(*) as count FROM e GROUP BY query; 
s2 = SELECT query, count FROM s1 WHERE count > 1000; 
s3 = SELECT query, count FROM s2 ORDER BY count DESC; 
OUTPUT s3 TO "qcount.result";
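
For comparison, the named steps in the second example behave much like ordinary SQL views. A minimal sketch using Python's sqlite3 module follows; the table contents are invented, the threshold is lowered from 1000 to 1 so the tiny sample produces output, and the column is called cnt instead of count to keep it distinct from the aggregate's name. The EXTRACT and OUTPUT steps have no counterpart here and are replaced by a plain table and a final query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE e (query TEXT)")
# Invented sample log: three searches for "hadoop", two for "hive", one for "scope".
sample = [("hadoop",)] * 3 + [("hive",)] * 2 + [("scope",)]
conn.executemany("INSERT INTO e VALUES (?)", sample)

# s1 and s2 are stored as views, so nothing is materialized until queried.
conn.executescript("""
CREATE VIEW s1 AS SELECT query, COUNT(*) AS cnt FROM e GROUP BY query;
CREATE VIEW s2 AS SELECT query, cnt FROM s1 WHERE cnt > 1;
""")

# The s3 step: order the filtered counts, most frequent first.
rows = conn.execute("SELECT query, cnt FROM s2 ORDER BY cnt DESC").fetchall()
print(rows)
```

A view like s2 is only a stored definition over s1, which is roughly the behaviour this issue asks for, minus the parameterization.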


