Posted to dev@hive.apache.org by "Raghotham Murthy (JIRA)" <ji...@apache.org> on 2008/12/31 21:39:44 UTC

[jira] Created: (HIVE-204) Provide option to run hadoop via TestMiniMR

Provide option to run hadoop via TestMiniMR
-------------------------------------------

                 Key: HIVE-204
                 URL: https://issues.apache.org/jira/browse/HIVE-204
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Testing Infrastructure
            Reporter: Raghotham Murthy


Right now, MapRedTask does an exec on the hadoop command line. This prevents us from stepping through the query execution code. If there were an option to run hadoop via MiniMR, tasks would be created within the same VM, which would allow stepping into the execution code. See src/test/org/apache/hadoop/mapred/TestMiniMRWithDFS.java in the hadoop source code for an example.
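
To make the difference concrete, here is a minimal Java sketch contrasting the two launch styles. It is not Hive code: the names LaunchSketch, execCommand and runInProcess are hypothetical, and the command line is only an illustration of what an exec-based launcher might assemble.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names come from Hive or Hadoop.
class LaunchSketch {

    // Style 1: build a command line for a child process. A debugger attached
    // to the parent JVM cannot step into code that runs in the child.
    static List<String> execCommand(String hadoopBin, String jar, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add(hadoopBin);
        cmd.add("jar");
        cmd.add(jar);
        cmd.add(mainClass);
        return cmd; // a real launcher would hand this to ProcessBuilder
    }

    // Style 2: run the task body directly in the current JVM, so breakpoints
    // inside the task are reachable from a debugger attached to this process.
    static String runInProcess(Supplier<String> task) {
        return task.get();
    }

    public static void main(String[] args) {
        System.out.println(execCommand("hadoop", "job.jar", "ExampleDriver"));
        System.out.println(runInProcess(() -> "ran on thread " + Thread.currentThread().getName()));
    }
}
```

The second style is the one this issue asks for: the task runs inside the JVM that the debugger is already attached to.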

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-204:
----------------------------

    Attachment: hive.204.3.patch



[jira] Commented: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705201#action_12705201 ] 

Zheng Shao commented on HIVE-204:
---------------------------------

+1, but it failed some 0.17.0 tests.

Seems like a classpath problem.

{code}
ant -Dhadoop.version=0.17.0 package test -Dtestcase=TestCliDriver -Dqfile=input16_cc.q

    [junit] java.lang.ClassNotFoundException: org.apache.hadoop.hive.serde2.TestSerDe
    [junit] Continuing ...
    [junit] org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.serde2.TestSerDe
    [junit]     at org.apache.hadoop.hive.ql.exec.MapOperator.initialize(MapOperator.java:153)
    [junit]     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:57)
    [junit]     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
    [junit]     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
    [junit]     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:157)
    [junit] Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.serde2.TestSerDe
    [junit]     at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    [junit]     at java.security.AccessController.doPrivileged(Native Method)
    [junit]     at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    [junit]     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    [junit]     at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
    [junit]     at org.apache.hadoop.hive.ql.exec.MapOperator.initialize(MapOperator.java:108)
    [junit]     ... 4 more
{code}

For this patch, we need to do both:
{code}
ant clean test
ant -Dhadoop.version=0.17.0 clean test
{code}

If the tests pass, I am OK with this being committed. But I am gone for the next 2 weeks, so maybe somebody else can help commit it.




[jira] Commented: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660288#action_12660288 ] 

Raghotham Murthy commented on HIVE-204:
---------------------------------------

Ashish mentioned to me that MapRedTask (local mode, different VM) and ExecDriver (non-local mode, same VM) already exist. This jira is about being able to run in local mode in the same VM, for junit tests only, so that we can step through map and reduce tasks.

This jira is asking for the ability to submit jobs using the MiniMRCluster class. According to javadocs - "This class creates a single-process Map-Reduce cluster for junit testing. One thread is created for each server."




[jira] Updated: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-204:
----------------------------

    Status: Patch Available  (was: Open)

Added another testcase, TestCliMiniMRDriver, which creates a miniMR cluster for each test.
Right now, it is not enabled for all the tests since it takes a long time and is not parallelized yet.
That can be done in a follow-up.

The user can pass a list of comma-separated files to be run to the miniMR cluster. 
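
As an illustration of the comma-separated list, a value like this could be parsed roughly as follows. This is only a sketch: QueryFileList and parse are hypothetical names, the example file names are made up, and the parsing in the actual patch may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the real parsing lives in the patch and may differ.
class QueryFileList {

    // Split a comma-separated value into trimmed, non-empty file names.
    static List<String> parse(String value) {
        List<String> files = new ArrayList<>();
        if (value == null) {
            return files; // nothing requested
        }
        for (String part : value.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                files.add(name);
            }
        }
        return files;
    }

    public static void main(String[] args) {
        // Example file names are made up for illustration.
        System.out.println(parse("a.q, b.q,,c.q"));
    }
}
```

Trimming and skipping empty entries keeps stray spaces or doubled commas from producing bogus file names.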



[jira] Commented: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702892#action_12702892 ] 

Edward Capriolo commented on HIVE-204:
--------------------------------------

How can we mark classes so the testing infrastructure can identify them? Should the classes be named TestMR?



[jira] Updated: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-204:
----------------------------

    Attachment: hive.204.1.patch



[jira] Updated: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-204:
----------------------------

    Attachment: hive.204.5.patch

added the 'svn stat' output by mistake last time



[jira] Updated: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-204:
----------------------------

    Attachment: hive.204.7.patch

incorporated comments



[jira] Updated: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-204:
----------------------------

    Attachment: hive.204.6.patch



[jira] Updated: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghotham Murthy updated HIVE-204:
----------------------------------

    Assignee: Namit Jain



[jira] Updated: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Thusoo updated HIVE-204:
-------------------------------

       Resolution: Fixed
    Fix Version/s: 0.4.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

committed. Thanks Namit!!




[jira] Commented: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707406#action_12707406 ] 

Namit Jain commented on HIVE-204:
---------------------------------

1. You need reflection, since you can't cast dfs to a given class (it is different in 17 and 19). Without casting appropriately, you cannot call the method.
2. Will do.
3. By default, qfile gets all the .1 files in the test directory, and I did not want to change that behavior. I did not want to run miniMR cluster tests for all the files,
    since it takes an awfully long time. Once the tests have been parallelized, I agree clusterQueryFiles can be removed, but we would need it till then.

Will do 2, and upload the new patch
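
The reflection argument in point 1 can be sketched in plain Java. The two nested classes below are hypothetical stand-ins for a class whose type differs between Hadoop versions; only the method name getFileSystem is taken from the review discussion, and the return values are made up.

```java
import java.lang.reflect.Method;

// Hypothetical stand-ins: two unrelated classes that both expose
// getFileSystem(), mimicking a class whose type differs between versions.
class ReflectionSketch {

    public static class DfsV17 {
        public String getFileSystem() { return "fs-from-0.17-style-class"; }
    }

    public static class DfsV19 {
        public String getFileSystem() { return "fs-from-0.19-style-class"; }
    }

    // Look the method up by name at runtime instead of casting to a
    // compile-time type that exists in only one of the versions.
    static Object callGetFileSystem(Object dfs) {
        try {
            Method m = dfs.getClass().getMethod("getFileSystem");
            return m.invoke(dfs);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("getFileSystem() not callable on " + dfs.getClass(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callGetFileSystem(new DfsV17()));
        System.out.println(callGetFileSystem(new DfsV19()));
    }
}
```

The caller never names either concrete class, so the same code compiles regardless of which version is on the classpath.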



[jira] Commented: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710550#action_12710550 ] 

Ashish Thusoo commented on HIVE-204:
------------------------------------

Getting some conflicts in build-common.xml. Can you regenerate the patch?

Also, some review comments are as follows:

1. Instead of minimr.query.files, would it be better to just introduce an option -Dmode=miniMR so that this can be run on any of the files? In order to run it in that mode on the two files, we could just call TestCliDriver with that mode and pass only those files in the query.files variable.
2. In QTestUtil.java I did not quite understand the following code:
  +      // hive.metastore.warehouse.dir needs to be set relative to the jobtracker
  +      String fsName = conf.get("fs.default.name");
  +      assert fsName != null;
  +      conf.set("hive.metastore.warehouse.dir", fsName.concat("/build/ql/test/data/warehouse/"));
  +      
  +      conf.set("mapred.job.tracker", "localhost:" + mr.getJobTrackerPort());
Where do you specify the root of the filesystem?
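
For comment 1, a mode switch of this kind might be read roughly as in the sketch below. This is an assumption-laden illustration, not part of the patch: TestModeSketch, the Mode enum, the fallback to a local mode, and the idea that the ant property reaches the test JVM as a system property named "mode" are all hypothetical.

```java
// Hypothetical sketch of reading a -Dmode=miniMR style switch.
class TestModeSketch {

    enum Mode { LOCAL, MINI_MR }

    // Map the raw property value to a mode. Anything other than "miniMR"
    // (compared case-insensitively), including null, falls back to LOCAL.
    static Mode fromProperty(String value) {
        return "miniMR".equalsIgnoreCase(value) ? Mode.MINI_MR : Mode.LOCAL;
    }

    public static void main(String[] args) {
        // Assumes the value is visible as a JVM system property,
        // e.g. when run as: java -Dmode=miniMR TestModeSketch
        System.out.println(fromProperty(System.getProperty("mode")));
    }
}
```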




[jira] Commented: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660268#action_12660268 ] 

Joydeep Sen Sarma commented on HIVE-204:
----------------------------------------

we do have the option of submitting the job from the same vm (which is what we do in non-local mode). in local mode - we ran into issues with multiple tasks running in same jvm (we generally assume one task per jvm).

is this asking for running tests in non-local mode against a test cluster (miniMR?) (same as hive-117?)



[jira] Commented: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705843#action_12705843 ] 

Ashish Thusoo commented on HIVE-204:
------------------------------------

The latest patch is an svn stat output instead of svn diff output.




[jira] Updated: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo updated HIVE-204:
---------------------------------

    Comment: was deleted

(was: How can we mark classes so the testing infrastructure can identify them? Should the classes be named TestMR?)



[jira] Commented: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710799#action_12710799 ] 

Namit Jain commented on HIVE-204:
---------------------------------

1. I agree with the mode approach instead of another target. But we still need minimr.query.files to run the tests with the build - which tests get picked up by default.
    So, it is just a cleaner way - let us talk about it offline.
2. conf.set("hive.metastore.warehouse.dir", fsName.concat("/build/ql/test/data/warehouse/"));
    Note that a concatenation is being performed above. 
   new MiniMRCluster() sets the configuration variable fs.default.name.
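
Point 2 can be illustrated with a small stand-alone sketch. It uses java.util.Properties as a stand-in for the Hadoop configuration object, and the host and port in the fs.default.name value are made up; in the patch that value is, per the comment above, already set once the mini cluster has been created.

```java
import java.util.Properties;

// Sketch only: java.util.Properties stands in for the Hadoop configuration
// object, and the filesystem URI used in main() is an invented value.
class WarehouseDirSketch {

    static String warehouseDir(Properties conf) {
        String fsName = conf.getProperty("fs.default.name");
        if (fsName == null) {
            throw new IllegalStateException("fs.default.name is not set");
        }
        // The filesystem URI supplies the root; the path is appended to it.
        return fsName.concat("/build/ql/test/data/warehouse/");
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Assumed value, standing in for what the mini cluster would set.
        conf.setProperty("fs.default.name", "hdfs://localhost:12345");
        conf.setProperty("hive.metastore.warehouse.dir", warehouseDir(conf));
        System.out.println(conf.getProperty("hive.metastore.warehouse.dir"));
    }
}
```

So the root of the filesystem is never specified separately; it comes from whatever fs.default.name holds at the time of the concatenation.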






[jira] Commented: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660427#action_12660427 ] 

Joydeep Sen Sarma commented on HIVE-204:
----------------------------------------

ok. i am not sure why exactly - but we had trouble trying to do this (local mode same vm). if the localmode, same vm combination can be made to work - the minimr stuff is not relevant - is it?

also - for the minimr cluster - will it run in a separate jvm? or will it be running in the same jvm (as cli/driver etc.)?



[jira] Commented: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706024#action_12706024 ] 

Ashish Thusoo commented on HIVE-204:
------------------------------------

The following are my comments:

1. In QTestUtil - you probably don't need reflection to call getFileSystem. You should just use reflection for getting the MiniDFSCluster.
2. In build.xml and build-common.xml - can you move the new pathelements from the common-classpath to the test classpath in ql/build.xml? The tests for other
modules, I presume, don't need all these jars.
3. In QTestGenTask.java - what do you need the clusterQueryFiles for? Can't you just use the split extensions that you have made to the queryFile member variable?




[jira] Updated: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-204:
----------------------------

    Attachment: hive.204.2.patch

> Provide option to run hadoop via TestMiniMR
> -------------------------------------------
>
>                 Key: HIVE-204
>                 URL: https://issues.apache.org/jira/browse/HIVE-204
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Testing Infrastructure
>            Reporter: Raghotham Murthy
>            Assignee: Namit Jain
>         Attachments: hive.204.1.patch, hive.204.2.patch



[jira] Commented: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713690#action_12713690 ] 

Ashish Thusoo commented on HIVE-204:
------------------------------------

+1

Looks good to me.

We should file a separate JIRA to enable this by default after we parallelize junit (unless that JIRA is already filed).


> Provide option to run hadoop via TestMiniMR
> -------------------------------------------
>
>                 Key: HIVE-204
>                 URL: https://issues.apache.org/jira/browse/HIVE-204
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Testing Infrastructure
>            Reporter: Raghotham Murthy
>            Assignee: Namit Jain
>         Attachments: hive.204.1.patch, hive.204.2.patch, hive.204.3.patch, hive.204.4.patch, hive.204.5.patch, hive.204.6.patch, hive.204.7.patch



[jira] Updated: (HIVE-204) Provide option to run hadoop via TestMiniMR

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-204:
----------------------------

    Attachment: hive.204.4.patch

Incorporated Zheng's comments.

Hadoop 0.17 does not like file:// in the aux lib path.
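
One way to work around this kind of restriction is to strip the scheme before the path is handed to Hadoop. The sketch below is illustrative only and is not the change made in the patch; stripFileScheme is a hypothetical helper, and it assumes the aux lib path is a comma-separated list of jar locations.

```java
import java.util.ArrayList;
import java.util.List;

public class AuxPathSketch {
    private static final String FILE_SCHEME = "file://";

    /** Hypothetical helper: drops a leading file:// from each comma-separated entry. */
    public static String stripFileScheme(String auxPath) {
        List<String> cleaned = new ArrayList<>();
        for (String entry : auxPath.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty entries such as a trailing comma
            }
            cleaned.add(trimmed.startsWith(FILE_SCHEME)
                    ? trimmed.substring(FILE_SCHEME.length())
                    : trimmed);
        }
        return String.join(",", cleaned);
    }

    public static void main(String[] args) {
        // prints "/tmp/a.jar,/tmp/b.jar"
        System.out.println(stripFileScheme("file:///tmp/a.jar, /tmp/b.jar"));
    }
}
```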

> Provide option to run hadoop via TestMiniMR
> -------------------------------------------
>
>                 Key: HIVE-204
>                 URL: https://issues.apache.org/jira/browse/HIVE-204
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Testing Infrastructure
>            Reporter: Raghotham Murthy
>            Assignee: Namit Jain
>         Attachments: hive.204.1.patch, hive.204.2.patch, hive.204.3.patch, hive.204.4.patch
