Posted to dev@eagle.apache.org by qingwen220 <gi...@git.apache.org> on 2016/11/01 04:15:12 UTC

[GitHub] incubator-eagle pull request #591: EAGLE-704: Update spark history config to...

GitHub user qingwen220 opened a pull request:

    https://github.com/apache/incubator-eagle/pull/591

    EAGLE-704: Update spark history config to integrate with the new application framework

    https://issues.apache.org/jira/browse/EAGLE-704

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/qingwen220/incubator-eagle EAGLE-704

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-eagle/pull/591.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #591
    
----
commit 9e4a3f2dbac08d093c6f4d8980199e9e99110f20
Author: Zhao, Qingwen <qi...@apache.org>
Date:   2016-10-31T13:14:54Z

    convert spark history job

commit 9083d659193739e9fa1b5b26bb57f5bb05932bbf
Author: Zhao, Qingwen <qi...@apache.org>
Date:   2016-11-01T03:00:51Z

    update SparkHistoryJobAppConfig

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-eagle issue #591: EAGLE-704: Update spark history config to integr...

Posted by qingwen220 <gi...@git.apache.org>.
GitHub user qingwen220 commented on the issue:

    https://github.com/apache/incubator-eagle/pull/591
  
    @wujinhu updated



[GitHub] incubator-eagle pull request #591: EAGLE-704: Update spark history config to...

Posted by wujinhu <gi...@git.apache.org>.
GitHub user wujinhu commented on a diff in the pull request:

    https://github.com/apache/incubator-eagle/pull/591#discussion_r85876408
  
    --- Diff: eagle-jpm/eagle-jpm-spark-history/src/main/resources/META-INF/providers/org.apache.eagle.jpm.spark.history.SparkHistoryJobAppProvider.xml ---
    @@ -18,179 +18,127 @@
     
     <application>
         <type>SPARK_HISTORY_JOB_APP</type>
    -    <name>Spark History Job Monitoring</name>
    +    <name>Spark History Job Monitor</name>
         <version>0.5.0-incubating</version>
         <appClass>org.apache.eagle.jpm.spark.history.SparkHistoryJobApp</appClass>
         <configuration>
    -        <!-- org.apache.eagle.jpm.spark.history.SparkHistoryJobAppConfig -->
    +        <!-- topology config -->
             <property>
    -            <name>basic.cluster</name>
    -            <displayName>cluster</displayName>
    -            <description>Cluster Name</description>
    -            <value>sandbox</value>
    +            <name>workers</name>
    +            <displayName>topology workers</displayName>
    +            <description>topology workers</description>
    +            <value>1</value>
             </property>
             <property>
    -            <name>basic.dataCenter</name>
    -            <displayName>dataCenter</displayName>
    -            <description>Data Center</description>
    -            <value>sandbox</value>
    +            <name>topology.numOfSpoutExecutors</name>
    +            <displayName>spout executors</displayName>
    +            <description>Parallelism of sparkHistoryJobFetchSpout </description>
    +            <value>1</value>
             </property>
             <property>
    -            <name>basic.jobConf.additional.info</name>
    -            <displayName>jobConf.additional.info</displayName>
    -            <description>Additional info in Job Configs</description>
    -            <value></value>
    +            <name>topology.numOfSpoutTasks</name>
    +            <displayName>spout tasks</displayName>
    +            <description>Tasks Num of sparkHistoryJobFetchSpout </description>
    +            <value>4</value>
    +        </property>
    +        <property>
    +            <name>topology.numOfParseBoltExecutors</name>
    +            <displayName>parser bolt parallelism hint</displayName>
    +            <description>Parallelism of sparkHistoryJobParseBolt </description>
    +            <value>1</value>
    +        </property>
    +        <property>
    +            <name>topology.numOfParserBoltTasks</name>
    +            <displayName>parser bolt tasks</displayName>
    +            <description>Tasks Num of sparkHistoryJobParseBolt</description>
    +            <value>4</value>
    +        </property>
    +        <property>
    +            <name>topology.spoutCrawlInterval</name>
    +            <displayName>spout crawl interval</displayName>
    +            <description>Spout crawl interval (in milliseconds)</description>
    +            <value>10000</value>
             </property>
             <property>
    -            <name>dataSourceConfig.zkQuorum</name>
    -            <displayName>zkQuorum</displayName>
    -            <description>Zookeeper Quorum</description>
    +            <name>topology.message.timeout.secs</name>
    +            <displayName>topology message timeout (secs)</displayName>
    +            <description>default timeout is 30s</description>
    +            <value>300</value>
    +        </property>
    +        <!-- zookeeper config -->
    +        <property>
    +            <name>zkStateConfig.zkQuorum</name>
    +            <displayName>zookeeper quorum list</displayName>
    +            <description>zookeeper to store topology metadata</description>
                 <value>sandbox.hortonworks.com:2181</value>
             </property>
             <property>
    -            <name>dataSourceConfig.zkRoot</name>
    -            <displayName>zkRoot</displayName>
    +            <name>zkStateConfig.zkRoot</name>
    +            <displayName>zookeeper root for topology metadata</displayName>
                 <description>Zookeeper Root</description>
                 <value>/sparkHistoryJob</value>
             </property>
             <property>
    -            <name>dataSourceConfig.zkPort</name>
    -            <displayName>zkPort</displayName>
    -            <description>Zookeeper Port</description>
    -            <value>2181</value>
    -        </property>
    -        <property>
    -            <name>dataSourceConfig.zkSessionTimeoutMs</name>
    -            <displayName>zkSessionTimeoutMs</displayName>
    +            <name>zkStateConfig.zkSessionTimeoutMs</name>
    +            <displayName>zookeeper session timeout (ms)</displayName>
                 <description>Zookeeper session timeoutMs</description>
                 <value>15000</value>
             </property>
             <property>
    -            <name>zookeeperConfig.zkRetryTimes</name>
    -            <displayName>zkRetryTimes</displayName>
    -            <description>zookeeperConfig.zkRetryTimes</description>
    +            <name>zkStateConfig.zkRetryTimes</name>
    +            <displayName>zookeeper connection retry times</displayName>
    +            <description>retry times for zookeeper connection</description>
                 <value>3</value>
             </property>
             <property>
    -            <name>zookeeperConfig.zkRetryInterval</name>
    -            <displayName>zkRetryInterval</displayName>
    -            <description>zookeeperConfig.zkRetryInterval</description>
    +            <name>zkStateConfig.zkRetryInterval</name>
    +            <displayName>zookeeper connection retry interval</displayName>
    +            <description>retry interval for zookeeper connection</description>
                 <value>20000</value>
             </property>
    +
    +        <!-- datasource config -->
             <property>
                 <name>dataSourceConfig.spark.history.server.url</name>
    -            <displayName>spark.history.server.url</displayName>
    -            <description>Spark History Server URL</description>
    +            <displayName>spark history server url</displayName>
    +            <description>spark history server URL</description>
                 <value>http://sandbox.hortonworks.com:18080</value>
    +            <required>true</required>
             </property>
             <property>
                 <name>dataSourceConfig.spark.history.server.username</name>
    -            <displayName>spark.history.server.username</displayName>
    -            <description>Spark History Server Auth Username</description>
    +            <displayName>user name for spark history server</displayName>
    +            <description>spark history server auth username</description>
    --- End diff --
    
    It seems we no longer use the Spark history server; the code that used it has been commented out.



[GitHub] incubator-eagle pull request #591: EAGLE-704: Update spark history config to...

Posted by wujinhu <gi...@git.apache.org>.
GitHub user wujinhu commented on a diff in the pull request:

    https://github.com/apache/incubator-eagle/pull/591#discussion_r85876370
  
    --- Diff: eagle-jpm/eagle-jpm-spark-history/src/main/resources/META-INF/providers/org.apache.eagle.jpm.spark.history.SparkHistoryJobAppProvider.xml ---
    @@ -18,179 +18,127 @@
     
     <application>
         <type>SPARK_HISTORY_JOB_APP</type>
    -    <name>Spark History Job Monitoring</name>
    +    <name>Spark History Job Monitor</name>
         <version>0.5.0-incubating</version>
         <appClass>org.apache.eagle.jpm.spark.history.SparkHistoryJobApp</appClass>
         <configuration>
    -        <!-- org.apache.eagle.jpm.spark.history.SparkHistoryJobAppConfig -->
    +        <!-- topology config -->
             <property>
    -            <name>basic.cluster</name>
    -            <displayName>cluster</displayName>
    -            <description>Cluster Name</description>
    -            <value>sandbox</value>
    +            <name>workers</name>
    +            <displayName>topology workers</displayName>
    +            <description>topology workers</description>
    +            <value>1</value>
             </property>
             <property>
    -            <name>basic.dataCenter</name>
    -            <displayName>dataCenter</displayName>
    -            <description>Data Center</description>
    -            <value>sandbox</value>
    +            <name>topology.numOfSpoutExecutors</name>
    +            <displayName>spout executors</displayName>
    +            <description>Parallelism of sparkHistoryJobFetchSpout </description>
    +            <value>1</value>
             </property>
             <property>
    -            <name>basic.jobConf.additional.info</name>
    -            <displayName>jobConf.additional.info</displayName>
    -            <description>Additional info in Job Configs</description>
    -            <value></value>
    +            <name>topology.numOfSpoutTasks</name>
    +            <displayName>spout tasks</displayName>
    +            <description>Tasks Num of sparkHistoryJobFetchSpout </description>
    +            <value>4</value>
    +        </property>
    +        <property>
    +            <name>topology.numOfParseBoltExecutors</name>
    +            <displayName>parser bolt parallelism hint</displayName>
    +            <description>Parallelism of sparkHistoryJobParseBolt </description>
    +            <value>1</value>
    +        </property>
    +        <property>
    +            <name>topology.numOfParserBoltTasks</name>
    +            <displayName>parser bolt tasks</displayName>
    +            <description>Tasks Num of sparkHistoryJobParseBolt</description>
    +            <value>4</value>
    +        </property>
    +        <property>
    +            <name>topology.spoutCrawlInterval</name>
    +            <displayName>spout crawl interval</displayName>
    +            <description>Spout crawl interval (in milliseconds)</description>
    +            <value>10000</value>
             </property>
             <property>
    -            <name>dataSourceConfig.zkQuorum</name>
    -            <displayName>zkQuorum</displayName>
    -            <description>Zookeeper Quorum</description>
    +            <name>topology.message.timeout.secs</name>
    +            <displayName>topology message timeout (secs)</displayName>
    +            <description>default timeout is 30s</description>
    +            <value>300</value>
    +        </property>
    +        <!-- zookeeper config -->
    +        <property>
    +            <name>zkStateConfig.zkQuorum</name>
    +            <displayName>zookeeper quorum list</displayName>
    +            <description>zookeeper to store topology metadata</description>
                 <value>sandbox.hortonworks.com:2181</value>
             </property>
             <property>
    -            <name>dataSourceConfig.zkRoot</name>
    -            <displayName>zkRoot</displayName>
    +            <name>zkStateConfig.zkRoot</name>
    +            <displayName>zookeeper root for topology metadata</displayName>
    --- End diff --
    
    This can be hard-coded in the source code; users do not need to know the zkRoot details.
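
    For illustration only, a minimal sketch of what this suggestion could look like. The class ZkRootResolver and its method are hypothetical and not taken from the Eagle code base; only the key name zkStateConfig.zkRoot and the value /sparkHistoryJob come from the diff above. The sketch assumes the application config can be viewed as a plain string map.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not from the Eagle code base) that keeps zkRoot out of the user-facing config. */
final class ZkRootResolver {
    /** Key name taken from the provider XML in the diff above. */
    static final String ZK_ROOT_KEY = "zkStateConfig.zkRoot";
    /** Default taken from the value of that property in the diff above. */
    static final String DEFAULT_ZK_ROOT = "/sparkHistoryJob";

    private ZkRootResolver() {
    }

    /** Returns the configured zkRoot if one is present and non-blank, otherwise the hard-coded default. */
    static String resolve(Map<String, String> appConfig) {
        String configured = appConfig.get(ZK_ROOT_KEY);
        if (configured == null || configured.trim().isEmpty()) {
            return DEFAULT_ZK_ROOT;
        }
        return configured.trim();
    }

    public static void main(String[] args) {
        Map<String, String> appConfig = new HashMap<>();
        // No entry in the config: the hard-coded default is used.
        System.out.println(resolve(appConfig));
        // An explicit entry still wins, so advanced users can override it.
        appConfig.put(ZK_ROOT_KEY, "/customRoot");
        System.out.println(resolve(appConfig));
    }
}
```

    With a fallback like this, the zkRoot property could be dropped from the provider XML without removing the ability to override it.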



[GitHub] incubator-eagle pull request #591: EAGLE-704: Update spark history config to...

Posted by wujinhu <gi...@git.apache.org>.
GitHub user wujinhu commented on a diff in the pull request:

    https://github.com/apache/incubator-eagle/pull/591#discussion_r85876323
  
    --- Diff: eagle-jpm/eagle-jpm-spark-history/src/main/resources/META-INF/providers/org.apache.eagle.jpm.spark.history.SparkHistoryJobAppProvider.xml ---
    @@ -18,179 +18,127 @@
     
     <application>
         <type>SPARK_HISTORY_JOB_APP</type>
    -    <name>Spark History Job Monitoring</name>
    +    <name>Spark History Job Monitor</name>
         <version>0.5.0-incubating</version>
         <appClass>org.apache.eagle.jpm.spark.history.SparkHistoryJobApp</appClass>
         <configuration>
    -        <!-- org.apache.eagle.jpm.spark.history.SparkHistoryJobAppConfig -->
    +        <!-- topology config -->
             <property>
    -            <name>basic.cluster</name>
    -            <displayName>cluster</displayName>
    -            <description>Cluster Name</description>
    -            <value>sandbox</value>
    +            <name>workers</name>
    +            <displayName>topology workers</displayName>
    +            <description>topology workers</description>
    +            <value>1</value>
             </property>
             <property>
    -            <name>basic.dataCenter</name>
    -            <displayName>dataCenter</displayName>
    -            <description>Data Center</description>
    -            <value>sandbox</value>
    +            <name>topology.numOfSpoutExecutors</name>
    +            <displayName>spout executors</displayName>
    +            <description>Parallelism of sparkHistoryJobFetchSpout </description>
    +            <value>1</value>
             </property>
             <property>
    -            <name>basic.jobConf.additional.info</name>
    -            <displayName>jobConf.additional.info</displayName>
    -            <description>Additional info in Job Configs</description>
    -            <value></value>
    +            <name>topology.numOfSpoutTasks</name>
    +            <displayName>spout tasks</displayName>
    +            <description>Tasks Num of sparkHistoryJobFetchSpout </description>
    +            <value>4</value>
    +        </property>
    +        <property>
    +            <name>topology.numOfParseBoltExecutors</name>
    +            <displayName>parser bolt parallelism hint</displayName>
    +            <description>Parallelism of sparkHistoryJobParseBolt </description>
    +            <value>1</value>
    +        </property>
    +        <property>
    +            <name>topology.numOfParserBoltTasks</name>
    +            <displayName>parser bolt tasks</displayName>
    +            <description>Tasks Num of sparkHistoryJobParseBolt</description>
    +            <value>4</value>
    +        </property>
    +        <property>
    +            <name>topology.spoutCrawlInterval</name>
    +            <displayName>spout crawl interval</displayName>
    +            <description>Spout crawl interval (in milliseconds)</description>
    +            <value>10000</value>
             </property>
             <property>
    -            <name>dataSourceConfig.zkQuorum</name>
    -            <displayName>zkQuorum</displayName>
    -            <description>Zookeeper Quorum</description>
    +            <name>topology.message.timeout.secs</name>
    +            <displayName>topology message timeout (secs)</displayName>
    +            <description>default timeout is 30s</description>
    +            <value>300</value>
    +        </property>
    +        <!-- zookeeper config -->
    +        <property>
    +            <name>zkStateConfig.zkQuorum</name>
    +            <displayName>zookeeper quorum list</displayName>
    +            <description>zookeeper to store topology metadata</description>
    --- End diff --
    
    The ZooKeeper configuration is already in the Eagle server config, so we can reuse it.
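
    For illustration only, one way such reuse might be sketched. The class ZkConfigInheritance is hypothetical, and the server-side key prefix "zookeeper." is an assumption; the actual Eagle server config may name these settings differently. The app-side names (zkStateConfig.zkQuorum and so on) are the ones in the diff above. Both configs are modelled as plain string maps.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: derive the application's ZooKeeper settings from a server-level config. */
final class ZkConfigInheritance {
    /** Assumed server-side key prefix; the real Eagle server config may use different names. */
    static final String SERVER_PREFIX = "zookeeper.";
    /** App-side prefix, as used in the provider XML in the diff above. */
    static final String APP_PREFIX = "zkStateConfig.";
    /** Settings that the app could inherit instead of declaring again. */
    static final String[] FIELDS = {"zkQuorum", "zkSessionTimeoutMs", "zkRetryTimes", "zkRetryInterval"};

    private ZkConfigInheritance() {
    }

    /** For each field, prefers an app-level value and falls back to the server-level one. */
    static Map<String, String> effectiveZkConfig(Map<String, String> serverConfig,
                                                 Map<String, String> appConfig) {
        Map<String, String> result = new HashMap<>();
        for (String field : FIELDS) {
            String value = appConfig.get(APP_PREFIX + field);
            if (value == null) {
                value = serverConfig.get(SERVER_PREFIX + field);
            }
            if (value != null) {
                result.put(APP_PREFIX + field, value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> serverConfig = new HashMap<>();
        serverConfig.put(SERVER_PREFIX + "zkQuorum", "sandbox.hortonworks.com:2181");
        serverConfig.put(SERVER_PREFIX + "zkSessionTimeoutMs", "15000");

        // The app declares nothing, so both values are inherited from the server config.
        Map<String, String> effective = effectiveZkConfig(serverConfig, new HashMap<>());
        System.out.println(effective.get(APP_PREFIX + "zkQuorum"));
        System.out.println(effective.get(APP_PREFIX + "zkSessionTimeoutMs"));
    }
}
```

    Keeping an app-level override in the lookup is a design choice of this sketch, not something the comment asks for; dropping it would make the server config the single source of these settings.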


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-eagle pull request #591: EAGLE-704: Update spark history config to...

Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:

    https://github.com/apache/incubator-eagle/pull/591

