Posted to dev@hive.apache.org by Russell Melick <rm...@hmc.edu> on 2010/11/17 08:33:51 UTC

Debugging Hive

Hi all,

Apologies for the long email.  I'm attempting to connect to the Hive
code with a debugger to understand the optimizer.  In this example, I
would like to pause at the beginning of the createIndex function in
DDLTask.  After following the directions on the wiki, I'm stumped as to
why connecting remotely with jdb is not working.  Any help is
appreciated.  I've detailed my process below.

Thanks,
Russell Melick



First, I check out a fresh copy of trunk:
        
> svn checkout http://svn.apache.org/repos/asf/hive/trunk hive

Then, I compile it for debugging, as I found at http://wiki.apache.org/hadoop/Hive/DeveloperGuide#Debugging_Hive_code:

> cd hive
> ant -Djavac.debug=on package

Here are the relevant environment variables before I run Hive:
> echo $HIVE_DEBUG_PORT
> 8000

> echo $HIVE_DEBUG
> -Xdebug -Xrunjdwp:transport=dt_socket,address=8000,server=y,suspend=y

> echo $HADOOP_OPTS
> -Xdebug -Xrunjdwp:transport=dt_socket,address=8000,server=y,suspend=y
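For reference, a minimal sketch of how the variables above could be set in one place. The port and option string are the ones from this message; the JDWP_OPTS helper variable is hypothetical and only there to avoid repeating the string. Note that, as far as I understand it, any JVM started with these options in HADOOP_OPTS will open the port and, because of suspend=y, wait for a debugger.

```shell
# Port taken from the message above.
export HIVE_DEBUG_PORT=8000

# JDWP_OPTS is a hypothetical helper variable (not part of Hive or Hadoop);
# it builds the agent options once so the port appears in only one place.
JDWP_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=${HIVE_DEBUG_PORT},server=y,suspend=y"

export HIVE_DEBUG="$JDWP_OPTS"
export HADOOP_OPTS="$JDWP_OPTS"

# Show what the launched JVMs will receive.
echo "$HADOOP_OPTS"
```

Unsetting HADOOP_OPTS afterwards (unset HADOOP_OPTS) returns the shell to a state where JVMs start without waiting for a debugger.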

I then try to run one of the unit tests:

> ant test -Dtestcase=TestCliDriver -Dqfile=index_compact_1.q

The test then waits for a debugger to attach:
> [junit] Begin query: index_compact_1.q
> [junit] Listening for transport dt_socket at address: 8000

However, when I attach with jdb, the session exits without ever hitting the breakpoint:

> jdb -attach 8000
> Set uncaught java.lang.Throwable
> Set deferred uncaught java.lang.Throwable
> Initializing jdb ...
> > 
> VM Started: No frames on the current call stack
> 
> main[1] stop at DDLTask:317
> Deferring breakpoint DDLTask:317.
> It will be set after the class is loaded.
> main[1] resume
> All threads resumed.
> > 
> The application exited
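One thing that may be worth checking in the session above is the class name given to "stop at". jdb generally expects a fully qualified class name for a deferred breakpoint. The sketch below only prints the command to type at the jdb prompt. The package name is an assumption based on Hive's source layout (it does not appear in this thread), the helper function is hypothetical, and line 317 is simply the line used above.

```shell
# Assumed package for DDLTask; verify against your checkout before relying on it.
HIVE_EXEC_PACKAGE="org.apache.hadoop.hive.ql.exec"

# Hypothetical helper: print a jdb breakpoint command with a fully
# qualified class name.
#   $1 = simple class name, $2 = line number
jdb_stop_at() {
  printf 'stop at %s.%s:%s\n' "$HIVE_EXEC_PACKAGE" "$1" "$2"
}

jdb_stop_at DDLTask 317
```

The printed line can be pasted at the main[1] prompt in place of the unqualified form.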

The unit test does not seem to recognize that it received the connection.  After jdb prints "The application exited", the test reports that it is listening again, but it never confirms the connection.

> [junit] Begin query: index_compact_1.q
> [junit] Listening for transport dt_socket at address: 8000
> [junit] Deleted file:/home/rmelick/hive/build/ql/test/data/warehouse/default__src_src_index__
> [junit] Listening for transport dt_socket at address: 8000
> [junit] Listening for transport dt_socket at address: 8000
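A quick way to see how often a JVM announced that it was waiting for a debugger is to count the "Listening for transport" lines in the captured output. One possible reading of several such lines is that more than one JVM was started with the debug options, although the output alone does not prove that. The helper name below is hypothetical, and the sample lines are copied from the output above.

```shell
# Hypothetical helper: count "waiting for debugger" announcements.
# Reads log text on stdin; grep -c prints the number of matching lines.
count_debug_listeners() {
  grep -c 'Listening for transport dt_socket at address'
}

# Sample lines copied from the test output in this message.
sample_log='[junit] Begin query: index_compact_1.q
[junit] Listening for transport dt_socket at address: 8000
[junit] Deleted file:/home/rmelick/hive/build/ql/test/data/warehouse/default__src_src_index__
[junit] Listening for transport dt_socket at address: 8000
[junit] Listening for transport dt_socket at address: 8000'

printf '%s\n' "$sample_log" | count_debug_listeners
```

With a real run, the same function can be fed the saved ant output instead of the sample text.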

I have also tried starting the Hive CLI, but it never starts after I attach the debugger.

This is the only output I get from the CLI:
> build/dist/bin/hive
> Unable to determine Hadoop version information.
> 'hadoop version' returned:

jdb doesn't seem to be connecting correctly here either:
> jdb -attach 8000
> Set uncaught java.lang.Throwable
> Set deferred uncaught java.lang.Throwable
> Initializing jdb ...
> > 
> VM Started: No frames on the current call stack
> 
> main[1] stop at DDLTask:317
> Deferring breakpoint DDLTask:317.
> It will be set after the class is loaded.
> main[1] resume
> All threads resumed.
> > 
> The application exited

I've also tried connecting with Eclipse, after generating the Eclipse files, importing the project, and creating the MapRedTask configuration.  Eclipse either connects without the client appearing to recognize it, as with jdb, or throws a connection error.

Again, thanks for any help.



RE: Debugging Hive

Posted by Russell Melick <rm...@hmc.edu>.
Thanks Sanjay,

I had a look, but I also got some help doing it with Eclipse, which worked
great.

I've updated the wiki with that process.

http://wiki.apache.org/hadoop/Hive/DeveloperGuide#Debugging_Hive_code

Russell

On Wed, 2010-11-17 at 13:16 +0530, Sanjay Sharma wrote:
> Russell,
> See if this helps: http://indoos.wordpress.com/2010/06/24/hive-remote-debugging/
> 
> 
> You will have to tweak the old scripts for your Hadoop and Hive versions, however.
> 
> Regards,
> Sanjay Sharma
> 
> 
> -----Original Message-----
> From: Russell Melick [mailto:rmelick@hmc.edu]
> Sent: Wednesday, November 17, 2010 1:04 PM
> To: hive-dev@hadoop.apache.org
> Cc: jlym@hmc.edu; mwang@hmc.edu
> Subject: Debugging Hive
> 



RE: Debugging Hive

Posted by Sanjay Sharma <sa...@impetus.co.in>.
Russell,
See if this helps: http://indoos.wordpress.com/2010/06/24/hive-remote-debugging/


You will have to tweak the old scripts for your Hadoop and Hive versions, however.

Regards,
Sanjay Sharma

