Posted to dev@hive.apache.org by "Namit Jain (JIRA)" <ji...@apache.org> on 2009/11/19 22:49:39 UTC

[jira] Created: (HIVE-946) pass context to custom mapper/reducer

pass context to custom mapper/reducer
-------------------------------------

                 Key: HIVE-946
                 URL: https://issues.apache.org/jira/browse/HIVE-946
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Namit Jain


It might be useful to pass some context to the custom mapper/reducer process.
The requirement is to identify each transform uniquely when there are multiple transforms in the same task.
For that, we can pass the operator ID of the script operator in an environment variable.
If anything else would be useful, please add it to the list.
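
A minimal sketch of how a custom mapper might use such a variable, assuming it is exported as HIVE_SCRIPT_OPERATOR_ID (the name settled on elsewhere in this thread) and that rows arrive one per line on stdin. The helper names and the "unknown" fallback are illustrative only and are not part of Hive.

```python
import os
import sys

# Assumed variable name; this is the default discussed in the thread.
ENV_VAR = "HIVE_SCRIPT_OPERATOR_ID"


def operator_id(environ=os.environ, default="unknown"):
    """Return the script operator ID from the environment, or a fallback."""
    return environ.get(ENV_VAR, default)


def tag_rows(lines, op_id):
    """Prefix every input row with the operator ID, tab-separated."""
    for line in lines:
        yield op_id + "\t" + line.rstrip("\n")


def main():
    """Read rows from stdin and write the tagged rows to stdout."""
    for row in tag_rows(sys.stdin, operator_id()):
        print(row)
```

To use this as a transform script, call main() at the bottom of the file. Two transforms in the same task would then emit rows carrying different prefixes, which is the kind of disambiguation the issue asks for.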

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-946) pass context to custom mapper/reducer

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789059#action_12789059 ] 

Namit Jain commented on HIVE-946:
---------------------------------

+1

Will commit if the tests pass.




[jira] Commented: (HIVE-946) pass context to custom mapper/reducer

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788531#action_12788531 ] 

Namit Jain commented on HIVE-946:
---------------------------------

Looks good, but can you add a few more tests?

1. Don't set the variable hive.script.operator.id.env.var; write a script that actually uses $HIVE_SCRIPT_OPERATOR_ID
   (not in the select clause).

2. 
    <property>
      <name>hive.script.operator.id.env.var</name>
      <value>HIVE_SCRIPT_OPERATOR_ID</value>
      <description>Name of the environment variable within the script's process
      that holds the unique script operator ID.
      </description>
    </property>


Add more description here, something like:

Name of the environment variable that holds the unique script operator ID in the user's transform function
(the custom mapper/reducer that the user has specified in the query).
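
The indirection in this property (a configuration value that names the environment variable, rather than holding the ID itself) can be sketched as follows. This is only an illustration of the mechanism, not Hive's actual implementation: the functions child_env and run_script, the plain dict standing in for the configuration, and the sample IDs are all assumptions.

```python
import os
import subprocess
import sys

# Property name and default value as quoted in the review comment above.
CONF_KEY = "hive.script.operator.id.env.var"
DEFAULT_ENV_VAR = "HIVE_SCRIPT_OPERATOR_ID"


def child_env(conf, op_id, base_env=None):
    """Build the child's environment, exporting op_id under the configured name.

    conf is a plain dict standing in for the job configuration (hypothetical).
    """
    env = dict(os.environ if base_env is None else base_env)
    env[conf.get(CONF_KEY, DEFAULT_ENV_VAR)] = op_id
    return env


def run_script(argv, conf, op_id):
    """Run the user script with the operator ID exported; return its stdout."""
    result = subprocess.run(
        argv,
        env=child_env(conf, op_id),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With an empty configuration the ID is exported under the default name, which corresponds to the "don't set the variable" test case; with the property set, the script has to read whatever name the user chose.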








[jira] Commented: (HIVE-946) pass context to custom mapper/reducer

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783992#action_12783992 ] 

Namit Jain commented on HIVE-946:
---------------------------------

The name of the variable can be:

HiveScriptOperatorIdentifier




[jira] Assigned: (HIVE-946) pass context to custom mapper/reducer

Posted by "Paul Yang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Yang reassigned HIVE-946:
------------------------------

    Assignee: Paul Yang




[jira] Resolved: (HIVE-946) pass context to custom mapper/reducer

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain resolved HIVE-946.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.5.0
     Hadoop Flags: [Reviewed]

Committed. Thanks, Paul.




[jira] Updated: (HIVE-946) pass context to custom mapper/reducer

Posted by "Paul Yang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Yang updated HIVE-946:
---------------------------

    Attachment: HIVE-946.1.patch

I've noticed that environment variables are generally in all caps, so I renamed the variable to HIVE_SCRIPT_OPERATOR_ID.





[jira] Updated: (HIVE-946) pass context to custom mapper/reducer

Posted by "Paul Yang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Yang updated HIVE-946:
---------------------------

    Attachment: HIVE-946.2.patch

* Changed tests to use UNION
* Changed the call that gets the ID from Operator.getIdentifier() to Operator.getOperatorId(), as getIdentifier() does not return unique IDs in tests with multiple stages
* Added a test that does not set hive.script.operator.id.env.var

