Posted to user@hawq.apache.org by Leon Zhang <le...@gmail.com> on 2015/11/23 04:23:33 UTC

hawq 2.0 on YARN

Hi,

   Is there a tutorial on how to deploy the latest HAWQ 2.0-beta on a YARN
cluster?
   I just rebuilt the latest code from git, and after "hawq init cluster",
it seems the segments do not run in YARN containers. Any help would be
appreciated.


Thanks.

Re: hawq 2.0 on YARN

Posted by Leon Zhang <le...@gmail.com>.
Many thanks, Wen Lin, that was very helpful. It works now. :)

On Thu, Nov 26, 2015 at 5:13 PM, Wen Lin <wl...@pivotal.io> wrote:

> Hi, Leon,
>
> First of all, the latest HAWQ uses "hawq_global_rm_type" to select "NONE"
> mode or "YARN" mode (but this is not the cause of the failure below).
>
> The log you attached shows that HAWQ is trying to run in YARN mode and
> attempts to register itself with the Hadoop YARN ResourceManager, but fails.
> (If registration succeeds, the Progress will be 50%, not 0%.)
>
> Please open your yarn-site.xml and check whether the property
> yarn.resourcemanager.system-metrics-publisher.enabled is true or false.
> If it is true, HAWQ will fail to register itself with Hadoop YARN, and the
> progress of HAWQ stays at 0% (expected 50%). A null pointer exception
> appears in the Hadoop YARN log file, just like your exception.
> This is similar to
> http://zh.hortonworks.com/community/forums/topic/error-in-handling-event-type-registered-for-applicationattempt/
>
> If yarn.resourcemanager.system-metrics-publisher.enabled is disabled,
> HAWQ can register itself with YARN successfully. I haven't investigated
> the root cause and don't know why the null pointer exception happens; I
> have only traced it to this property.
> If yarn.resourcemanager.system-metrics-publisher.enabled is not the cause
> in your environment, something else may be causing the null pointer
> exception in YARN.
>
> Thanks!

Re: hawq 2.0 on YARN

Posted by Wen Lin <wl...@pivotal.io>.
Hi, Leon,

First of all, the latest HAWQ uses "hawq_global_rm_type" to select "NONE"
mode or "YARN" mode (but this is not the cause of the failure below).
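
As a rough illustration, the newer setting might look like this in
hawq-site.xml. This is only a sketch: the property name is the one given
above, and the value "yarn" is an assumption that it accepts the same
'none'/'yarn' values as hawq_resourcemanager_server_type.

```xml
<!-- hawq-site.xml fragment (sketch; the value is assumed, not confirmed in this thread) -->
<property>
    <name>hawq_global_rm_type</name>
    <value>yarn</value>
</property>
```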

The log you attached shows that HAWQ is trying to run in YARN mode and
attempts to register itself with the Hadoop YARN ResourceManager, but fails.
(If registration succeeds, the Progress will be 50%, not 0%.)

Please open your yarn-site.xml and check whether the property
yarn.resourcemanager.system-metrics-publisher.enabled is true or false.
If it is true, HAWQ will fail to register itself with Hadoop YARN, and the
progress of HAWQ stays at 0% (expected 50%). A null pointer exception
appears in the Hadoop YARN log file, just like your exception.
This is similar to
http://zh.hortonworks.com/community/forums/topic/error-in-handling-event-type-registered-for-applicationattempt/

If yarn.resourcemanager.system-metrics-publisher.enabled is disabled,
HAWQ can register itself with YARN successfully. I haven't investigated
the root cause and don't know why the null pointer exception happens; I
have only traced it to this property.
If yarn.resourcemanager.system-metrics-publisher.enabled is not the cause
in your environment, something else may be causing the null pointer
exception in YARN.
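
For reference, disabling the property would look roughly like the fragment
below in yarn-site.xml. Treat it as a sketch of the workaround, not a
verified fix: only the property name comes from this thread, and which
yarn-site.xml the ResourceManager actually reads depends on your
installation.

```xml
<!-- yarn-site.xml fragment: turn off the system metrics publisher (workaround sketch) -->
<property>
    <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
    <value>false</value>
</property>
```

The ResourceManager most likely has to be restarted before the change
takes effect. After that, "yarn application -list" should report a
Progress of 50% for the hawq application if registration succeeded.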

Thanks!


On Thu, Nov 26, 2015 at 4:46 PM, Leon Zhang <le...@gmail.com> wrote:

> On Tue, Nov 24, 2015 at 12:48 AM, Daniel Lynch <dl...@pivotal.io> wrote:
>
>> Here is a working config example from my lab, where HAWQ runs on YARN:
>>
>> $GPHOME/etc/hawq-site.xml
>> <?xml version="1.0" encoding="UTF-8"?>
>> <configuration>
>>
>>     <property>
>>         <name>hawq_resourcemanager_query_noresource_timeout</name>
>>         <value>30</value>
>>     </property>
>>
>>     <property>
>>         <name>hawq_master_address_host</name>
>>         <value>node2</value>
>>         <description>The host name of hawq master.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_master_address_port</name>
>>         <value>2020</value>
>>         <description>The port of hawq master.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_segment_address_port</name>
>>         <value>40000</value>
>>         <description>The port of hawq segment.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_dfs_url</name>
>>         <value>node2:8020/hawq_default</value>
>>         <description>URL for accessing HDFS.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_master_directory</name>
>>         <value>/data/master</value>
>>         <description>The directory of hawq master.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_segment_directory</name>
>>         <value>/data/primary</value>
>>         <description>The directory of hawq segment.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_master_temp_directory</name>
>>         <value>/tmp</value>
>>         <description>The temporary directory reserved for hawq master.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_segment_temp_directory</name>
>>         <value>/tmp</value>
>>         <description>The temporary directory reserved for hawq segment.</description>
>>     </property>
>>
>>     <!-- HAWQ resource manager parameters -->
>>     <property>
>>         <name>hawq_resourcemanager_server_type</name>
>>         <value>yarn</value>
>>         <description>The resource manager type to start for allocating resource.
>>                      'none' means hawq resource manager exclusively uses whole
>>                      cluster; 'yarn' means hawq resource manager contacts YARN
>>                      resource manager to negotiate resource.
>>         </description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_resourcemanager_segment_limit_memory_use</name>
>>         <value>64GB</value>
>>         <description>The limit of memory usage in a hawq segment when
>>                      hawq_resourcemanager_server_type is set 'none'.
>>         </description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_resourcemanager_segment_limit_core_use</name>
>>         <value>16</value>
>>         <description>The limit of virtual core usage in a hawq segment when
>>                      hawq_resourcemanager_server_type is set 'none'.
>>         </description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_resourcemanager_yarn_resourcemanager_address</name>
>>         <value>node3:8050</value>
>>         <description>The address of YARN resource manager server.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_resourcemanager_yarn_resourcemanager_scheduler_address</name>
>>         <value>node3:8030</value>
>>         <description>The address of YARN scheduler server.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_resourcemanager_yarn_queue</name>
>>         <value>default</value>
>>         <description>The YARN queue name to register hawq resource manager.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_resourcemanager_yarn_application_name</name>
>>         <value>hawq</value>
>>         <description>The application name to register hawq resource manager in YARN.</description>
>>     </property>
>>
>>     <property>
>>         <name>default_segment_num</name>
>>         <value>16</value>
>>     </property>
>>
>>     <property>
>>         <name>hawq_resourcemanager_query_vsegment_number_per_segment_limit</name>
>>         <value>8</value>
>>     </property>
>> </configuration>
>>
>> Daniel Lynch
>> Mon-Fri 9-5 PST
>> Office: 408 780 4498

Re: hawq 2.0 on YARN

Posted by Leon Zhang <le...@gmail.com>.
Thanks, Daniel.

   After I switched "hawq_resourcemanager_server_type" to "yarn", I can see
the application now:

$ yarn application -list


Application-Id                  Application-Name  Application-Type  User     Queue    State    Final-State  Progress  Tracking-URL
application_1447985660182_0558  hawq              YARN              xiaolin  default  RUNNING  UNDEFINED    0%        url

   But my HAWQ application hangs in the RUNNING state, and the log shows:


2015-11-26 16:40:16,186 INFO  security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:createPassword(307)) - Creating password for appattempt_1447985660182_0620_000001
2015-11-26 16:40:16,187 INFO  attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(762)) - appattempt_1447985660182_0620_000001 State change from LAUNCHED_UNMANAGED_SAVING to LAUNCHED
2015-11-26 16:40:17,193 INFO  ipc.Server (Server.java:saslProcess(1306)) - Auth successful for appattempt_1447985660182_0620_000001 (auth:SIMPLE)
2015-11-26 16:40:17,194 INFO  resourcemanager.ApplicationMasterService (ApplicationMasterService.java:registerApplicationMaster(274)) - AM registration appattempt_1447985660182_0620_000001
2015-11-26 16:40:17,194 INFO  resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(127)) - USER=xiaolin IP=10.10.0.11 OPERATION=Register App Master   TARGET=ApplicationMasterService RESULT=SUCCESS  APPID=application_1447985660182_0620  APPATTEMPTID=appattempt_1447985660182_0620_000001
2015-11-26 16:40:17,194 ERROR resourcemanager.ResourceManager (ResourceManager.java:handle(851)) - Error in handling event type REGISTERED for applicationAttempt application_1447985660182_0620
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.appAttemptRegistered(SystemMetricsPublisher.java:143)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1365)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1341)
        at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:849)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:830)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:266)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
        at java.lang.Thread.run(Thread.java:745)
2015-11-26 16:40:17,195 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1447985660182_0620 State change from ACCEPTED to RUNNING
2015-11-26 16:40:17,196 ERROR attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(757)) - Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: STATUS_UPDATE at LAUNCHED
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:849)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:830)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:266)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
        at java.lang.Thread.run(Thread.java:745)
2015-11-26 16:40:22,197 ERROR attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(757)) - Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: STATUS_UPDATE at LAUNCHED
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:849)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:830)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:266)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
        at java.lang.Thread.run(Thread.java:745)


Any clue for this issue?

Thanks in advance.


On Tue, Nov 24, 2015 at 12:48 AM, Daniel Lynch <dl...@pivotal.io> wrote:

> Here is a working config example from my lab, where HAWQ runs on YARN.
>
>
>
>
> $GPHOME/etc/hawq-site.xml
> <?xml version="1.0" encoding="UTF-8"?>
> <configuration>
>
>     <property>
>         <name>hawq_resourcemanager_query_noresource_timeout</name>
>         <value>30</value>
>     </property>
>
>     <property>
>         <name>hawq_master_address_host</name>
>         <value>node2</value>
>         <description>The host name of hawq master.</description>
>     </property>
>
>     <property>
>         <name>hawq_master_address_port</name>
>         <value>2020</value>
>         <description>The port of hawq master.</description>
>     </property>
>
>     <property>
>         <name>hawq_segment_address_port</name>
>         <value>40000</value>
>         <description>The port of hawq segment.</description>
>     </property>
>
>     <property>
>         <name>hawq_dfs_url</name>
>         <value>node2:8020/hawq_default</value>
>         <description>URL for accessing HDFS.</description>
>     </property>
>
>     <property>
>         <name>hawq_master_directory</name>
>         <value>/data/master</value>
>         <description>The directory of hawq master.</description>
>     </property>
>
>     <property>
>         <name>hawq_segment_directory</name>
>         <value>/data/primary</value>
>         <description>The directory of hawq segment.</description>
>     </property>
>
>     <property>
>         <name>hawq_master_temp_directory</name>
>         <value>/tmp</value>
>         <description>The temporary directory reserved for hawq master.</description>
>     </property>
>
>     <property>
>         <name>hawq_segment_temp_directory</name>
>         <value>/tmp</value>
>         <description>The temporary directory reserved for hawq segment.</description>
>     </property>
>
>     <!-- HAWQ resource manager parameters -->
>     <property>
>         <name>hawq_resourcemanager_server_type</name>
>         <value>yarn</value>
>         <description>The resource manager type to start for allocating resource.
>                      'none' means hawq resource manager exclusively uses whole
>                      cluster; 'yarn' means hawq resource manager contacts YARN
>                      resource manager to negotiate resource.
>         </description>
>     </property>
>
>     <property>
>         <name>hawq_resourcemanager_segment_limit_memory_use</name>
>         <value>64GB</value>
>         <description>The limit of memory usage in a hawq segment when
>                      hawq_resourcemanager_server_type is set 'none'.
>         </description>
>     </property>
>
>     <property>
>         <name>hawq_resourcemanager_segment_limit_core_use</name>
>         <value>16</value>
>         <description>The limit of virtual core usage in a hawq segment when
>                      hawq_resourcemanager_server_type is set 'none'.
>         </description>
>     </property>
>
>     <property>
>         <name>hawq_resourcemanager_yarn_resourcemanager_address</name>
>         <value>node3:8050</value>
>         <description>The address of YARN resource manager server.</description>
>     </property>
>
>     <property>
>         <name>hawq_resourcemanager_yarn_resourcemanager_scheduler_address</name>
>         <value>node3:8030</value>
>         <description>The address of YARN scheduler server.</description>
>     </property>
>
>     <property>
>         <name>hawq_resourcemanager_yarn_queue</name>
>         <value>default</value>
>         <description>The YARN queue name to register hawq resource manager.</description>
>     </property>
>
>     <property>
>         <name>hawq_resourcemanager_yarn_application_name</name>
>         <value>hawq</value>
>         <description>The application name to register hawq resource manager in YARN.</description>
>     </property>
>
>     <property>
>         <name>default_segment_num</name>
>         <value>16</value>
>     </property>
>
>     <property>
>         <name>hawq_resourcemanager_query_vsegment_number_per_segment_limit</name>
>         <value>8</value>
>     </property>
> </configuration>
>
>
>
>
>
> Daniel Lynch
> Mon-Fri 9-5 PST
> Office: 408 780 4498
>
> On Sun, Nov 22, 2015 at 9:23 PM, Leon Zhang <le...@gmail.com> wrote:
>
>> Hi,
>>
>>    Is there any tutorial about how to deploy the latest HAWQ 2.0-beta on a
>> YARN cluster?
>>    I just rebuilt the latest code from git, and after "hawq init
>> cluster", it seems the segments do not work in YARN containers. Any help
>> will be appreciated.
>>
>>
>> Thanks.
>>
>
>

Re: hawq 2.0 on YARN

Posted by Daniel Lynch <dl...@pivotal.io>.
Here is a working config example from my lab, where HAWQ runs on YARN.




$GPHOME/etc/hawq-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>

    <property>
        <name>hawq_resourcemanager_query_noresource_timeout</name>
        <value>30</value>
    </property>

    <property>
        <name>hawq_master_address_host</name>
        <value>node2</value>
        <description>The host name of hawq master.</description>
    </property>

    <property>
        <name>hawq_master_address_port</name>
        <value>2020</value>
        <description>The port of hawq master.</description>
    </property>

    <property>
        <name>hawq_segment_address_port</name>
        <value>40000</value>
        <description>The port of hawq segment.</description>
    </property>

    <property>
        <name>hawq_dfs_url</name>
        <value>node2:8020/hawq_default</value>
        <description>URL for accessing HDFS.</description>
    </property>

    <property>
        <name>hawq_master_directory</name>
        <value>/data/master</value>
        <description>The directory of hawq master.</description>
    </property>

    <property>
        <name>hawq_segment_directory</name>
        <value>/data/primary</value>
        <description>The directory of hawq segment.</description>
    </property>

    <property>
        <name>hawq_master_temp_directory</name>
        <value>/tmp</value>
        <description>The temporary directory reserved for hawq master.</description>
    </property>

    <property>
        <name>hawq_segment_temp_directory</name>
        <value>/tmp</value>
        <description>The temporary directory reserved for hawq segment.</description>
    </property>

    <!-- HAWQ resource manager parameters -->
    <property>
        <name>hawq_resourcemanager_server_type</name>
        <value>yarn</value>
        <description>The resource manager type to start for allocating resource.
                     'none' means hawq resource manager exclusively uses whole
                     cluster; 'yarn' means hawq resource manager contacts YARN
                     resource manager to negotiate resource.
        </description>
    </property>

    <property>
        <name>hawq_resourcemanager_segment_limit_memory_use</name>
        <value>64GB</value>
        <description>The limit of memory usage in a hawq segment when
                     hawq_resourcemanager_server_type is set 'none'.
        </description>
    </property>

    <property>
        <name>hawq_resourcemanager_segment_limit_core_use</name>
        <value>16</value>
        <description>The limit of virtual core usage in a hawq segment when
                     hawq_resourcemanager_server_type is set 'none'.
        </description>
    </property>

    <property>
        <name>hawq_resourcemanager_yarn_resourcemanager_address</name>
        <value>node3:8050</value>
        <description>The address of YARN resource manager server.</description>
    </property>

    <property>
        <name>hawq_resourcemanager_yarn_resourcemanager_scheduler_address</name>
        <value>node3:8030</value>
        <description>The address of YARN scheduler server.</description>
    </property>

    <property>
        <name>hawq_resourcemanager_yarn_queue</name>
        <value>default</value>
        <description>The YARN queue name to register hawq resource manager.</description>
    </property>

    <property>
        <name>hawq_resourcemanager_yarn_application_name</name>
        <value>hawq</value>
        <description>The application name to register hawq resource manager in YARN.</description>
    </property>

    <property>
        <name>default_segment_num</name>
        <value>16</value>
    </property>

    <property>
        <name>hawq_resourcemanager_query_vsegment_number_per_segment_limit</name>
        <value>8</value>
    </property>
</configuration>
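
If it helps, the YARN-related entries in a file like the one above can be sanity-checked with a small script. This is only a rough sketch, not a HAWQ tool: the function names and the checks themselves (server type is 'yarn', the YARN properties are non-empty, the two addresses look like host:port) are my own assumptions about what is worth verifying. The property names are the ones used in this config; as noted elsewhere in this thread, newer builds may use different names (for example hawq_global_rm_type), so adjust the list to match your build.

```python
import xml.etree.ElementTree as ET

# Property names copied from the hawq-site.xml above. Other HAWQ builds may
# name these differently, so treat this list as an assumption.
YARN_PROPS = [
    "hawq_resourcemanager_server_type",
    "hawq_resourcemanager_yarn_resourcemanager_address",
    "hawq_resourcemanager_yarn_resourcemanager_scheduler_address",
    "hawq_resourcemanager_yarn_queue",
    "hawq_resourcemanager_yarn_application_name",
]


def load_properties(xml_text):
    """Return {name: value} for every <property> element in the XML text."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_yarn_settings(props):
    """Return a list of problems found; an empty list means the basics are set.

    These checks are illustrative only and are not HAWQ's own validation.
    """
    problems = []
    if props.get("hawq_resourcemanager_server_type") != "yarn":
        problems.append("hawq_resourcemanager_server_type is not 'yarn'")
    for name in YARN_PROPS[1:]:
        if not props.get(name):
            problems.append(name + " is missing or empty")
    # The two address properties are expected to look like host:port.
    for name in YARN_PROPS[1:3]:
        value = props.get(name, "")
        if value and ":" not in value:
            problems.append(name + " should look like host:port, got " + repr(value))
    return problems


# A trimmed copy of the YARN section of the config above.
SAMPLE = """<configuration>
    <property>
        <name>hawq_resourcemanager_server_type</name>
        <value>yarn</value>
    </property>
    <property>
        <name>hawq_resourcemanager_yarn_resourcemanager_address</name>
        <value>node3:8050</value>
    </property>
    <property>
        <name>hawq_resourcemanager_yarn_resourcemanager_scheduler_address</name>
        <value>node3:8030</value>
    </property>
    <property>
        <name>hawq_resourcemanager_yarn_queue</name>
        <value>default</value>
    </property>
    <property>
        <name>hawq_resourcemanager_yarn_application_name</name>
        <value>hawq</value>
    </property>
</configuration>"""

props = load_properties(SAMPLE)
for problem in check_yarn_settings(props):
    print("WARNING:", problem)
```

To check a real file instead of the inline sample, read $GPHOME/etc/hawq-site.xml in binary mode and pass the bytes to load_properties. Note that this only checks that the settings are present; it does not contact the YARN resource manager.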





Daniel Lynch
Mon-Fri 9-5 PST
Office: 408 780 4498

On Sun, Nov 22, 2015 at 9:23 PM, Leon Zhang <le...@gmail.com> wrote:

> Hi,
>
>    Is there any tutorial about how to deploy the latest HAWQ 2.0-beta on a
> YARN cluster?
>    I just rebuilt the latest code from git, and after "hawq init cluster",
> it seems the segments do not work in YARN containers. Any help will be
> appreciated.
>
>
> Thanks.
>