Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2009/11/06 20:13:33 UTC

[Hadoop Wiki] Trivial Update of "Hive/HowToContribute" by Ning Zhang

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/HowToContribute" page has been changed by Ning Zhang.
The comment on this change is: move debugging hive info to Developer's guide and add a link in HowToContribute.
http://wiki.apache.org/hadoop/Hive/HowToContribute?action=diff&rev1=11&rev2=12

--------------------------------------------------

    * If the feature is added in contrib
     * Do the steps above, replacing "ql" with "contrib", and "TestCliDriver" with "TestContribCliDriver".
  
- === Debugging Hive code ===
+ === Debugging ===
- Hive code includes both client-side code (the compiler, semantic analyzer, and optimizer of HiveQL) and server-side code (any operator/task implementations). The client-side code runs on your local machine, so you can debug it in Eclipse the same way you would debug any local Java code. The server-side code is distributed and runs on the Hadoop cluster, so debugging it is more complicated. Nonetheless, we can still attach the debugger to a different JVM under unit test (single machine mode). Below are the steps for debugging server-side code.
  
+ Please see [[http://wiki.apache.org/hadoop/Hive/DeveloperGuide#Debugging_Hive_code|Debugging Hive code]] in the Developer Guide.
-  * Compile Hive code with javac.debug=on. From the Hive checkout directory, run: {{{
-     > ant -Djavac.debug=on package
-   }}} If you have already built Hive without javac.debug=on, clean the build first and then run the above command. {{{
-     > ant clean  # not necessary if this is the first compile
-     > ant -Djavac.debug=on package
-   }}}
-  * Run ant test with additional options that tell the Java VM running the Hive server-side code to wait for a debugger to attach. First define some convenient variables for debugging. You can put them in your .bashrc or .cshrc. {{{
-     > export HIVE_DEBUG_PORT=8000
-     > export HIVE_DEBUG="-Xdebug -Xrunjdwp:transport=dt_socket,address=${HIVE_DEBUG_PORT},server=y,suspend=y"
-   }}} In particular HIVE_DEBUG_PORT is the port number that the JVM is listening on and the debugger will attach to. Then run the unit test as follows: {{{
-     > HADOOP_OPTS="$HIVE_DEBUG" ant test -Dtestcase=TestCliDriver -Dqfile=<mytest>.q
-   }}} The unit test will run until it shows: {{{
-      [junit] Listening for transport dt_socket at address: 8000
-   }}}
-  * Now, you can use jdb to attach to port 8000 to debug{{{
-     > jdb -attach 8000
- }}}  or, better, if you are running Eclipse and the Hive projects are already imported, you can debug with Eclipse. Under Eclipse Run -> Debug Configurations, find "Remote Java Application" at the bottom of the left panel. There should be a MapRedTask configuration already. If there is no such configuration, you can create one with the following properties:
-      * Project:  the Hive project that you imported.
-      * Connection Type: Standard (Socket Attach)
-      * Connection Properties:
-        * Host: localhost  
-        * Port: 8000 
-      Then hit the "Debug" button. Eclipse will attach to the JVM listening on port 8000, and the test will continue running to the end. If you set breakpoints in the source code before hitting the "Debug" button, execution will stop at them. The rest is the same as debugging client-side Hive.
  
  === Creating a patch ===
  Check to see what files you have modified with: