Posted to issues@hawq.apache.org by wcl14 <gi...@git.apache.org> on 2016/08/01 07:00:47 UTC

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

GitHub user wcl14 opened a pull request:

    https://github.com/apache/incubator-hawq/pull/828

    HAWQ-969. Add getting configuration from HDFS and YARN

    Add support for getting configuration from HDFS and YARN, and for writing XML files in xml_parser.cpp.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/wcl14/incubator-hawq HAWQ-969

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-hawq/pull/828.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #828
    
----
commit f892618a5c3e69f59d8b683352141ed67b478d89
Author: Chunling Wang <wa...@126.com>
Date:   2016-08-01T06:59:17Z

    HAWQ-969. Add getting configuration from HDFS and YARN

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by wengyanqing <gi...@git.apache.org>.
Github user wengyanqing commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r73454355
  
    --- Diff: src/test/feature/lib/hdfs_config.cpp ---
    @@ -0,0 +1,293 @@
    +#include <fstream>
    +#include <string>
    +#include <vector>
    +#include <unordered_set>
    +
    +#include "hdfs_config.h"
    +#include "command.h"
    +#include "psql.h"
    +#include "xml_parser.h"
    +#include "string_util.h"
    +
    +using std::string;
    +
    +namespace hawq {
    +namespace test {
    +
    +string HdfsConfig::getHdfsUser() {
    +  string cmd = "ps aux|grep hdfs.server|grep -v grep";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    return hawq::test::trim(hawq::test::split(lines[lines.size()-1], ' ')[0]);
    +  } 
    +  return "hdfs";
    +}
    +
    +bool HdfsConfig::LoadFromHawqConfigFile() {
    +  const char *env = getenv("GPHOME");
    +  string confPath = env ? env : "";
    +  if (confPath != "") {
    +    confPath.append("/etc/hdfs-client.xml");
    +  } else {
    +    return false;
    +  }
    +
    +  hawqxmlconf.reset(new XmlConfig(confPath));
    +  hawqxmlconf->parse();
    --- End diff --
    
    You should handle the parse failure condition.
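    A minimal sketch of what that could look like, assuming parse() is
    changed to report success or failure (in the quoted header it
    currently returns void):
    
        bool HdfsConfig::LoadFromHawqConfigFile() {
          const char *env = getenv("GPHOME");
          if (env == nullptr) {
            return false;
          }
          std::string confPath(env);
          confPath.append("/etc/hdfs-client.xml");
          hawqxmlconf.reset(new XmlConfig(confPath));
          // Assumption: parse() returns false when the XML is malformed.
          return hawqxmlconf->parse();
        }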


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by xunzhang <gi...@git.apache.org>.
Github user xunzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r72932215
  
    --- Diff: src/test/feature/lib/hdfs_config.h ---
    @@ -0,0 +1,172 @@
    +#ifndef HAWQ_SRC_TEST_FEATURE_LIB_HDFS_CONFIG_H_
    +#define HAWQ_SRC_TEST_FEATURE_LIB_HDFS_CONFIG_H_
    +
    +#include <string>
    +#include <vector>
    +
    +#include "psql.h"
    +#include "sql_util.h"
    +#include "xml_parser.h"
    +
    +namespace hawq {
    +namespace test {
    +
    +/**
    + * HdfsConfig common libray. Get detailed information about HDFS
    + * including checking state of namenodes and datanodes, get parameter value
    + * @author Chunling Wang
    + */
    +class HdfsConfig {
    +  public:
    +    /**
    +     * HdfsConfig constructor
    +     */
    +    HdfsConfig(): psql(HAWQ_DB, HAWQ_HOST, HAWQ_PORT, HAWQ_USER, HAWQ_PASSWORD) {}
    +
    +    /**
    +     * HdfsConfig desstructor
    +     */
    +    ~HdfsConfig() {}
    +
    +    /**
    +     * whether HDFS is in HA mode
    +     * @return true if HDFS is HA
    +     */
    +    bool isHA();
    +
    +    /**
    +     * whether HDFS is kerbos
    +     * @return true if HDFS is kerbos
    +     */
    +    bool isKerbos();
    +
    +    /**
    +     * whether HDFS supports truncate operation
    +     * @return true if HDFS supports truncate operation
    +     */
    +    bool isTruncate();
    +
    +    /**
    +     * get HADOOP working directory
    +     * @return HADOOP working directory
    +     */
    +    std::string getHadoopHome();
    +
    +    /**
    +     * get HDFS active namenode's hostname and port information
    +     * @param activenamenode, active namenode hostname reference which will be set
    +     * @param port, active namenode port reference which will be set
    +     * @return true if getActiveNamenode succeeded
    +     */
    +    bool getActiveNamenode(std::string &activenamenode,
    +                           int &port);
    +
    +    /**
    +     * get HDFS standby namenode's hostname and port information
    +     * @param standbynamenode, standby namenode hostname reference which will be set
    +     * @param port, standby namenode port reference which will be set
    +     * @return true if getStandbyNamenode succeeded
    +     */
    +    bool getStandbyNamenode(std::string &standbynamenode,
    +                            int &port);
    +
    +    /**
    +     * get HDFS namenode(s) information
    +     * @param namenodes, namenodes' hostnames reference which will be set
    +     * @param port, namenodes' ports reference which will be set
    +     */
    +    void getNamenodes(std::vector<std::string> &namenodes,
    +                      std::vector<int> &port);
    +
    +    /**
    +     * get HDFS datanodes information
    +     * @param datanodelist, datanodes' hostnames reference which will be set
    +     * @param port, datanodes' ports reference which will be set
    +     */
    +    void getDatanodelist(std::vector<std::string> &datanodelist,
    +                         std::vector<int> &port);
    +
    +    /**
    +     * get HDFS active datanodes information
    +     * @param activedatanodes, active datanodes' hostnames reference which will be set
    +     * @param port, active datanodes' ports reference which will be set
    +     */
    +    void getActiveDatanodes(std::vector<std::string> &activedatanodes,
    +                            std::vector<int> &port);
    +
    +    /**
    +     * whether HDFS is in safe mode
    +     * @return true if HDFS is in safe node
    +     */
    +    bool isSafemode();
    +
    +    /**
    +     * get parameter value in ./etc/hdfs-client.xml or ./etc/hadoop/hdfs-site.xml according to parameter name
    +     * @param parameterName, used to get parameter value
    +     * @param conftype, get parameter value, 'hdfs' or 'HDFS' from ./etc/hdfs-client.xml, others from ./etc/hadoop/hdfs-site.xml
    +     * @return parameter value
    +     */
    +    std::string getParameterValue(const std::string &parameterName, const std::string &conftype);
    +    
    +    /**
    +     * get parameter value in ./etc/hadoop/hdfs-site.xml according to parameter name
    +     * @param parameterName, used to get parameter value
    +     * @return parameter value
    +     */
    +    std::string getParameterValue(const std::string &parameterName);
    +
    +    /**
    +     * set parameter value in ./etc/hdfs-client.xml or ./etc/hadoop/hdfs-site.xml according to parameter name
    +     * @param parameterName, parameter name which used to set parameter value
    +     * @param parameterValue, parameter value which to be set
    +     * @param conftype, get parameter value, 'hdfs' or 'HDFS' from ./etc/hdfs-client.xml, others from ./etc/hadoop/hdfs-site.xml
    +     * @return true if succeeded
    +     */
    +    bool setParameterValue(const std::string &parameterName, const std::string &parameterValue, const std::string &conftype);
    +
    +    /**
    +     * set parameter value in ./etc/hadoop/hdfs-site.xml according to parameter name
    +     * @param parameterName, parameter name which used to set parameter value
    +     * @param parameterValue, parameter value which to be set
    +     * @return true if succeeded
    +     */
    +    bool setParameterValue(const std::string &parameterName, const std::string &parameterValue);
    +
    +  private:
    +    /**
    +     * @return yarn user
    --- End diff --
    
    TYPO: `return hdfs user`. Actually, I do not suggest adding doxygen-style comments for private interfaces.


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by xunzhang <gi...@git.apache.org>.
Github user xunzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r72937658
  
    --- Diff: src/test/feature/lib/xml_parser.h ---
    @@ -17,10 +17,27 @@ class XmlConfig {
      public:
       explicit XmlConfig(std::string);
       
    +  // read an XML file into a tree
    +  bool open();
    +
    +  // only free the XML document pointer
    +  void closeNotSave();
    +
    +  // save the updated document to disk and free the XML document pointer
    +  void closeAndSave();
    +
       // parse the configuration file
       void parse();
     
       // @param key The key of the configuration item
    +  // @param value The updated value
    +  // @param save whether save the updated document to disk, if save is false, open() and closeAndSave() should be called additionally
    +  // @ return The value of configuration item
    +  bool setString(const std::string &key, const std::string &value, const bool &save);
    --- End diff --
    
    With POD types, there is no point in passing by const reference. For example, here we should declare it as `bool setString(const std::string &key, const std::string &value, bool save);`


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by wengyanqing <gi...@git.apache.org>.
Github user wengyanqing commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r73455316
  
    --- Diff: src/test/feature/lib/hdfs_config.h ---
    @@ -0,0 +1,175 @@
    +#ifndef HAWQ_SRC_TEST_FEATURE_LIB_HDFS_CONFIG_H_
    +#define HAWQ_SRC_TEST_FEATURE_LIB_HDFS_CONFIG_H_
    +
    +#include <string>
    +#include <vector>
    +
    +#include "psql.h"
    +#include "sql_util.h"
    +#include "xml_parser.h"
    +
    +namespace hawq {
    +namespace test {
    +
    +/**
    + * HdfsConfig common libray. Get detailed information about HDFS
    + * including checking state of namenodes and datanodes, get parameter value
    + * @author Chunling Wang
    + */
    +class HdfsConfig {
    --- End diff --
    
    Should HdfsConfig be a singleton?
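    For illustration, a Meyers-style singleton sketch (hypothetical; the
    PR currently leaves HdfsConfig as a plain class):
    
        class HdfsConfig {
          public:
            // C++11 guarantees thread-safe one-time initialization here.
            static HdfsConfig &instance() {
              static HdfsConfig conf;
              return conf;
            }
            HdfsConfig(const HdfsConfig &) = delete;
            HdfsConfig &operator=(const HdfsConfig &) = delete;
          private:
            HdfsConfig() = default;
        };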


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by wengyanqing <gi...@git.apache.org>.
Github user wengyanqing commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r73454974
  
    --- Diff: src/test/feature/lib/hdfs_config.cpp ---
    @@ -0,0 +1,293 @@
    +#include <fstream>
    +#include <string>
    +#include <vector>
    +#include <unordered_set>
    +
    +#include "hdfs_config.h"
    +#include "command.h"
    +#include "psql.h"
    +#include "xml_parser.h"
    +#include "string_util.h"
    +
    +using std::string;
    +
    +namespace hawq {
    +namespace test {
    +
    +string HdfsConfig::getHdfsUser() {
    +  string cmd = "ps aux|grep hdfs.server|grep -v grep";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    return hawq::test::trim(hawq::test::split(lines[lines.size()-1], ' ')[0]);
    +  } 
    +  return "hdfs";
    +}
    +
    +bool HdfsConfig::LoadFromHawqConfigFile() {
    +  const char *env = getenv("GPHOME");
    +  string confPath = env ? env : "";
    +  if (confPath != "") {
    +    confPath.append("/etc/hdfs-client.xml");
    +  } else {
    +    return false;
    +  }
    +
    +  hawqxmlconf.reset(new XmlConfig(confPath));
    +  hawqxmlconf->parse();
    +  return true;
    +}
    +
    +bool HdfsConfig::LoadFromHdfsConfigFile() {
    +  string confPath=getHadoopHome();
    +  if (confPath == "") {
    +    return false;
    +  }
    +  confPath.append("/etc/hadoop/hdfs-site.xml");
    +  hdfsxmlconf.reset(new XmlConfig(confPath));
    +  hdfsxmlconf->parse();
    +  return true;
    +}
    +
    +bool HdfsConfig::isHA() {
    +  const hawq::test::PSQLQueryResult &result = psql.getQueryResult(
    +       "SELECT substring(fselocation from length('hdfs:// ') for (position('/' in substring(fselocation from length('hdfs:// ')))-1)::int) "
    +       "FROM pg_filespace pgfs, pg_filespace_entry pgfse "
    +       "WHERE pgfs.fsname = 'dfs_system' AND pgfse.fsefsoid=pgfs.oid ;");
    +  std::vector<std::vector<string>> table = result.getRows();
    +  if (table.size() > 0) {
    +    int find = table[0][0].find(":");
    +    if (find < 0) {
    +      return true;
    +    } else {
    +      return false;
    +    }
    +  }
    +  return false;
    +}
    +
    +bool HdfsConfig::isConfigKerberos() {
    +  bool ret = LoadFromHawqConfigFile();
    +  if (!ret) {
    +    throw GetHawqHomeException();
    +  }
    +  string authentication = hawqxmlconf->getString("hadoop.security.authentication");
    +  if (authentication == "kerberos") {
    +    return true;
    +  } else {
    +    return false;
    +  }
    +}
    +
    +bool HdfsConfig::isTruncate() {
    +  string cmd = "hadoop fs -truncate";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    string valueLine = lines[0];
    +    int find = valueLine.find("-truncate: Unknown command");
    +    if (find < 0) {
    +      return true;
    +    }
    +  }
    +  return false;
    +}
    +
    +string HdfsConfig::getHadoopHome() {
    +  string cmd = "ps -ef|grep hadoop";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  string hadoopHome = "";
    +  auto lines = hawq::test::split(result, '\n');
    +  for (size_t i=0; i<lines.size()-1; i++) {
    +    string valueLine = lines[i];
    +    int pos = valueLine.find("-Dhadoop.home.dir=");
    +    if (pos >=0 ) {
    +      string valueTmp = valueLine.substr(pos+18); 
    +      int valueEnd = valueTmp.find_first_of(" ");
    +      string value = valueTmp.substr(0, valueEnd);
    +      hadoopHome = hawq::test::trim(value);
    +      return hadoopHome;
    +    }
    +  }
    +  return hadoopHome;
    +}
    +
    +bool HdfsConfig::getActiveNamenode(string &activenamenode,
    +                                   int &port) {
    +    return getHANamenode("active", activenamenode, port);
    +}
    +
    +bool HdfsConfig::getStandbyNamenode(string &standbynamenode,
    +                                    int &port) {
    +    return getHANamenode("standby", standbynamenode, port);
    +}
    +
    +bool HdfsConfig::getHANamenode(const string &namenodetype,
    +                               string &namenode,
    +                               int &port) {
    +  if (!isHA()) {
    +    return false;
    +  }
    +  string namenodeService = "";
    +  string nameServiceValue = hawqxmlconf->getString("dfs.nameservices");
    +  string haNamenodesName = "dfs.ha.namenodes.";
    +  haNamenodesName.append(hawq::test::trim(nameServiceValue));
    +  string haNamenodesValue = hawqxmlconf->getString(haNamenodesName);
    +  auto haNamenodes = hawq::test::split(haNamenodesValue, ',');
    +  for (size_t i = 0; i < haNamenodes.size(); i++) {
    +    string haNamenode = hawq::test::trim(haNamenodes[i]);
    +    string cmd = "sudo -u ";
    +    cmd.append(getHdfsUser());
    +    cmd.append(" hdfs haadmin -getServiceState ");
    +    cmd.append(haNamenode);
    +    Command c(cmd);
    +    string result = c.run().getResultOutput();
    +    auto lines = hawq::test::split(result, '\n');
    +    if (lines.size() >= 1) {
    +      string valueLine = lines[0];
    +      if (valueLine == namenodetype) {
    +        namenodeService = haNamenode;
    +        break;
    +      }
    +    }
    +  }
    +  string rpcAddressName = "dfs.namenode.rpc-address.gphd-cluster.";
    +  rpcAddressName.append(namenodeService);
    +  string rpcAddressValue = hawqxmlconf->getString(rpcAddressName);
    +  auto namenodeInfo = hawq::test::split(rpcAddressValue, ':');
    +  namenode = hawq::test::trim(namenodeInfo[0]);
    +  port = std::stoi(hawq::test::trim(namenodeInfo[1]));
    +  return true;
    +}
    +
    +void HdfsConfig::getNamenodes(std::vector<string> &namenodes,
    +                              std::vector<int> &port) {
    +  string cmd = "sudo -u ";
    +  cmd.append(getHdfsUser());
    +  cmd.append(" hdfs getconf -nnRpcAddresses");
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  for (size_t i = 0; i < lines.size(); i++) {
    +    string valueLine = lines[i];
    +    auto namenodeInfo = hawq::test::split(valueLine, ':');
    +    if (namenodeInfo.size() == 2) {
    +      namenodes.push_back(hawq::test::trim(namenodeInfo[0]));
    +      port.push_back(std::stoi(hawq::test::trim(namenodeInfo[1])));
    +    }
    +  }
    +}
    +
    +void HdfsConfig::getDatanodelist(std::vector<string> &datanodelist,
    +                                 std::vector<int> &port) {
    +  string cmd = "sudo -u ";
    +  cmd.append(getHdfsUser());
    +  cmd.append(" hdfs dfsadmin -report | grep Name");
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  for (size_t i = 0; i < lines.size(); i++) {
    +    string valueLine = lines[i];
    +    auto datanodeInfo = hawq::test::split(valueLine, ':');
    +    if (datanodeInfo.size() == 3) {
    +      int portStart = datanodeInfo[2].find_first_of('(');
    +      int portEnd = datanodeInfo[2].find_first_of(')');
    +      string datanodePort = datanodeInfo[2].substr(0, portStart);
    +      string datanodeHost = datanodeInfo[2].substr(portStart+1, portEnd-portStart-1);
    +      datanodelist.push_back(hawq::test::trim(datanodeHost));
    +      port.push_back(std::stoi(hawq::test::trim(datanodePort)));
    +    }
    +  }
    +}
    +
    +void HdfsConfig::getActiveDatanodes(std::vector<string> &activedatanodes,
    +                                    std::vector<int> &port) {
    +  string cmd = "sudo -u ";
    +  cmd.append(getHdfsUser());
    +  cmd.append(" hdfs dfsadmin -report -live | grep Name");
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  for (size_t i = 0; i < lines.size(); i++) {
    +    string valueLine = lines[i];
    +    auto datanodeInfo = hawq::test::split(valueLine, ':');
    +    if (datanodeInfo.size() == 3) {
    +      int portStart = datanodeInfo[2].find_first_of('(');
    +      int portEnd = datanodeInfo[2].find_first_of(')');
    +      string datanodePort = datanodeInfo[2].substr(0, portStart);
    +      string datanodeHost = datanodeInfo[2].substr(portStart+1, portEnd-portStart-1);
    +      activedatanodes.push_back(hawq::test::trim(datanodeHost));
    +      port.push_back(std::stoi(hawq::test::trim(datanodePort)));
    +    }
    +  }
    +}
    +
    +
    +bool HdfsConfig::isSafemode() {
    +  string cmd = "hadoop fs -mkdir /tmp_hawq_test";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    string valueLine = lines[0];
    +    int find = valueLine.find("Name node is in safe mode.");
    +    if (find >= 0) {
    +      return true;
    +    }
    +  }
    +  cmd = "hadoop fs -rm -r /tmp_hawq_test";
    +  Command c_teardown(cmd);
    +  result = c_teardown.run().getResultOutput();
    +  return false;
    +}
    +
    +string HdfsConfig::getParameterValue(const string &parameterName) {
    +  bool ret = LoadFromHdfsConfigFile();
    +  if (!ret) {
    +    throw GetHadoopHomeException();
    +  }
    +
    +  return hdfsxmlconf->getString(parameterName);
    +}
    +
    +string HdfsConfig::getParameterValue(const string &parameterName,
    +                                     const string &conftype) {
    +  if (conftype == "hdfs" || conftype == "HDFS")
    --- End diff --
    
    Wouldn't it be better to compare `lower(conftype) == "hdfs"`?
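    For example, assuming a lower() helper is available in string_util
    (a later comment in this thread points at `hawq::test::lower`):
    
        if (hawq::test::lower(conftype) == "hdfs") {
          // read from ./etc/hdfs-client.xml
        } else {
          // read from ./etc/hadoop/hdfs-site.xml
        }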


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by xunzhang <gi...@git.apache.org>.
Github user xunzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r72933008
  
    --- Diff: src/test/feature/lib/hdfs_config.h ---
    @@ -0,0 +1,172 @@
    +#ifndef HAWQ_SRC_TEST_FEATURE_LIB_HDFS_CONFIG_H_
    +#define HAWQ_SRC_TEST_FEATURE_LIB_HDFS_CONFIG_H_
    +
    +#include <string>
    +#include <vector>
    +
    +#include "psql.h"
    +#include "sql_util.h"
    +#include "xml_parser.h"
    +
    +namespace hawq {
    +namespace test {
    +
    +/**
    + * HdfsConfig common libray. Get detailed information about HDFS
    + * including checking state of namenodes and datanodes, get parameter value
    + * @author Chunling Wang
    + */
    +class HdfsConfig {
    +  public:
    +    /**
    +     * HdfsConfig constructor
    +     */
    +    HdfsConfig(): psql(HAWQ_DB, HAWQ_HOST, HAWQ_PORT, HAWQ_USER, HAWQ_PASSWORD) {}
    +
    +    /**
    +     * HdfsConfig desstructor
    --- End diff --
    
    ...typo


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by xunzhang <gi...@git.apache.org>.
Github user xunzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r72935290
  
    --- Diff: src/test/feature/lib/hdfs_config.cpp ---
    @@ -0,0 +1,284 @@
    +#include <fstream>
    +#include <string>
    +#include <vector>
    +#include <unordered_set>
    +
    +#include "hdfs_config.h"
    +#include "command.h"
    +#include "psql.h"
    +#include "xml_parser.h"
    +#include "string_util.h"
    +
    +using std::string;
    +
    +namespace hawq {
    +namespace test {
    +
    +string HdfsConfig::getHdfsUser() {
    +  string cmd = "ps aux|grep hdfs.server|grep -v grep";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    return hawq::test::trim(hawq::test::split(lines[lines.size()-1], ' ')[0]);
    +  } 
    +  return "hdfs";
    +}
    +
    +bool HdfsConfig::LoadFromHawqConfigFile() {
    +  const char *env = getenv("GPHOME");
    +  string confPath = env ? env : "";
    +  if (!confPath.empty()) {
    +    confPath.append("/etc/hdfs-client.xml");
    +  } else {
    +    return false;
    +  }
    +
    +  hawqxmlconf.reset(new XmlConfig(confPath));
    +  hawqxmlconf->parse();
    +  return true;
    +}
    +
    +bool HdfsConfig::LoadFromHdfsConfigFile() {
    +  string confPath=getHadoopHome();
    +  if (confPath == "")
    --- End diff --
    
    Unify your coding style: the `if` body should be braced like elsewhere in the file.
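    That is, brace the single-statement `if` body the same way as the
    rest of the file:
    
        if (confPath == "") {
          return false;
        }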


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by xunzhang <gi...@git.apache.org>.
Github user xunzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r73455386
  
    --- Diff: src/test/feature/lib/hdfs_config.cpp ---
    @@ -0,0 +1,293 @@
    +#include <fstream>
    +#include <string>
    +#include <vector>
    +#include <unordered_set>
    +
    +#include "hdfs_config.h"
    +#include "command.h"
    +#include "psql.h"
    +#include "xml_parser.h"
    +#include "string_util.h"
    +
    +using std::string;
    +
    +namespace hawq {
    +namespace test {
    +
    +string HdfsConfig::getHdfsUser() {
    +  string cmd = "ps aux|grep hdfs.server|grep -v grep";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    return hawq::test::trim(hawq::test::split(lines[lines.size()-1], ' ')[0]);
    +  } 
    +  return "hdfs";
    +}
    +
    +bool HdfsConfig::LoadFromHawqConfigFile() {
    +  const char *env = getenv("GPHOME");
    +  string confPath = env ? env : "";
    +  if (confPath != "") {
    +    confPath.append("/etc/hdfs-client.xml");
    +  } else {
    +    return false;
    +  }
    +
    +  hawqxmlconf.reset(new XmlConfig(confPath));
    +  hawqxmlconf->parse();
    +  return true;
    +}
    +
    +bool HdfsConfig::LoadFromHdfsConfigFile() {
    +  string confPath=getHadoopHome();
    +  if (confPath == "") {
    +    return false;
    +  }
    +  confPath.append("/etc/hadoop/hdfs-site.xml");
    +  hdfsxmlconf.reset(new XmlConfig(confPath));
    +  hdfsxmlconf->parse();
    +  return true;
    +}
    +
    +bool HdfsConfig::isHA() {
    +  const hawq::test::PSQLQueryResult &result = psql.getQueryResult(
    +       "SELECT substring(fselocation from length('hdfs:// ') for (position('/' in substring(fselocation from length('hdfs:// ')))-1)::int) "
    +       "FROM pg_filespace pgfs, pg_filespace_entry pgfse "
    +       "WHERE pgfs.fsname = 'dfs_system' AND pgfse.fsefsoid=pgfs.oid ;");
    +  std::vector<std::vector<string>> table = result.getRows();
    +  if (table.size() > 0) {
    +    int find = table[0][0].find(":");
    +    if (find < 0) {
    +      return true;
    +    } else {
    +      return false;
    +    }
    +  }
    +  return false;
    +}
    +
    +bool HdfsConfig::isConfigKerberos() {
    +  bool ret = LoadFromHawqConfigFile();
    +  if (!ret) {
    +    throw GetHawqHomeException();
    +  }
    +  string authentication = hawqxmlconf->getString("hadoop.security.authentication");
    +  if (authentication == "kerberos") {
    +    return true;
    +  } else {
    +    return false;
    +  }
    +}
    +
    +bool HdfsConfig::isTruncate() {
    +  string cmd = "hadoop fs -truncate";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    string valueLine = lines[0];
    +    int find = valueLine.find("-truncate: Unknown command");
    +    if (find < 0) {
    +      return true;
    +    }
    +  }
    +  return false;
    +}
    +
    +string HdfsConfig::getHadoopHome() {
    +  string cmd = "ps -ef|grep hadoop";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  string hadoopHome = "";
    +  auto lines = hawq::test::split(result, '\n');
    +  for (size_t i=0; i<lines.size()-1; i++) {
    +    string valueLine = lines[i];
    +    int pos = valueLine.find("-Dhadoop.home.dir=");
    +    if (pos >=0 ) {
    +      string valueTmp = valueLine.substr(pos+18); 
    +      int valueEnd = valueTmp.find_first_of(" ");
    +      string value = valueTmp.substr(0, valueEnd);
    +      hadoopHome = hawq::test::trim(value);
    +      return hadoopHome;
    +    }
    +  }
    +  return hadoopHome;
    +}
    +
    +bool HdfsConfig::getActiveNamenode(string &activenamenode,
    +                                   int &port) {
    +    return getHANamenode("active", activenamenode, port);
    +}
    +
    +bool HdfsConfig::getStandbyNamenode(string &standbynamenode,
    +                                    int &port) {
    +    return getHANamenode("standby", standbynamenode, port);
    +}
    +
    +bool HdfsConfig::getHANamenode(const string &namenodetype,
    +                               string &namenode,
    +                               int &port) {
    +  if (!isHA()) {
    +    return false;
    +  }
    +  string namenodeService = "";
    +  string nameServiceValue = hawqxmlconf->getString("dfs.nameservices");
    +  string haNamenodesName = "dfs.ha.namenodes.";
    +  haNamenodesName.append(hawq::test::trim(nameServiceValue));
    +  string haNamenodesValue = hawqxmlconf->getString(haNamenodesName);
    +  auto haNamenodes = hawq::test::split(haNamenodesValue, ',');
    +  for (size_t i = 0; i < haNamenodes.size(); i++) {
    +    string haNamenode = hawq::test::trim(haNamenodes[i]);
    +    string cmd = "sudo -u ";
    +    cmd.append(getHdfsUser());
    +    cmd.append(" hdfs haadmin -getServiceState ");
    +    cmd.append(haNamenode);
    +    Command c(cmd);
    +    string result = c.run().getResultOutput();
    +    auto lines = hawq::test::split(result, '\n');
    +    if (lines.size() >= 1) {
    +      string valueLine = lines[0];
    +      if (valueLine == namenodetype) {
    +        namenodeService = haNamenode;
    +        break;
    +      }
    +    }
    +  }
    +  string rpcAddressName = "dfs.namenode.rpc-address.gphd-cluster.";
    +  rpcAddressName.append(namenodeService);
    +  string rpcAddressValue = hawqxmlconf->getString(rpcAddressName);
    +  auto namenodeInfo = hawq::test::split(rpcAddressValue, ':');
    +  namenode = hawq::test::trim(namenodeInfo[0]);
    +  port = std::stoi(hawq::test::trim(namenodeInfo[1]));
    +  return true;
    +}
    +
    +void HdfsConfig::getNamenodes(std::vector<string> &namenodes,
    +                              std::vector<int> &port) {
    +  string cmd = "sudo -u ";
    +  cmd.append(getHdfsUser());
    +  cmd.append(" hdfs getconf -nnRpcAddresses");
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  for (size_t i = 0; i < lines.size(); i++) {
    +    string valueLine = lines[i];
    +    auto namenodeInfo = hawq::test::split(valueLine, ':');
    +    if (namenodeInfo.size() == 2) {
    +      namenodes.push_back(hawq::test::trim(namenodeInfo[0]));
    +      port.push_back(std::stoi(hawq::test::trim(namenodeInfo[1])));
    +    }
    +  }
    +}
    +
    +void HdfsConfig::getDatanodelist(std::vector<string> &datanodelist,
    +                                 std::vector<int> &port) {
    +  string cmd = "sudo -u ";
    +  cmd.append(getHdfsUser());
    +  cmd.append(" hdfs dfsadmin -report | grep Name");
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  for (size_t i = 0; i < lines.size(); i++) {
    +    string valueLine = lines[i];
    +    auto datanodeInfo = hawq::test::split(valueLine, ':');
    +    if (datanodeInfo.size() == 3) {
    +      int portStart = datanodeInfo[2].find_first_of('(');
    +      int portEnd = datanodeInfo[2].find_first_of(')');
    +      string datanodePort = datanodeInfo[2].substr(0, portStart);
    +      string datanodeHost = datanodeInfo[2].substr(portStart+1, portEnd-portStart-1);
    +      datanodelist.push_back(hawq::test::trim(datanodeHost));
    +      port.push_back(std::stoi(hawq::test::trim(datanodePort)));
    +    }
    +  }
    +}
    +
    +void HdfsConfig::getActiveDatanodes(std::vector<string> &activedatanodes,
    +                                    std::vector<int> &port) {
    +  string cmd = "sudo -u ";
    +  cmd.append(getHdfsUser());
    +  cmd.append(" hdfs dfsadmin -report -live | grep Name");
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  for (size_t i = 0; i < lines.size(); i++) {
    +    string valueLine = lines[i];
    +    auto datanodeInfo = hawq::test::split(valueLine, ':');
    +    if (datanodeInfo.size() == 3) {
    +      int portStart = datanodeInfo[2].find_first_of('(');
    +      int portEnd = datanodeInfo[2].find_first_of(')');
    +      string datanodePort = datanodeInfo[2].substr(0, portStart);
    +      string datanodeHost = datanodeInfo[2].substr(portStart+1, portEnd-portStart-1);
    +      activedatanodes.push_back(hawq::test::trim(datanodeHost));
    +      port.push_back(std::stoi(hawq::test::trim(datanodePort)));
    +    }
    +  }
    +}
    +
    +
    +bool HdfsConfig::isSafemode() {
    +  string cmd = "hadoop fs -mkdir /tmp_hawq_test";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    string valueLine = lines[0];
    +    int find = valueLine.find("Name node is in safe mode.");
    +    if (find >= 0) {
    +      return true;
    +    }
    +  }
    +  cmd = "hadoop fs -rm -r /tmp_hawq_test";
    +  Command c_teardown(cmd);
    +  result = c_teardown.run().getResultOutput();
    +  return false;
    +}
    +
    +string HdfsConfig::getParameterValue(const string &parameterName) {
    +  bool ret = LoadFromHdfsConfigFile();
    +  if (!ret) {
    +    throw GetHadoopHomeException();
    +  }
    +
    +  return hdfsxmlconf->getString(parameterName);
    +}
    +
    +string HdfsConfig::getParameterValue(const string &parameterName,
    +                                     const string &conftype) {
    +  if (conftype == "hdfs" || conftype == "HDFS")
    --- End diff --
    
    `hawq::test::lower(string)`


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by xunzhang <gi...@git.apache.org>.
Github user xunzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r72935273
  
    --- Diff: src/test/feature/lib/hdfs_config.cpp ---
    @@ -0,0 +1,284 @@
    +#include <fstream>
    +#include <string>
    +#include <vector>
    +#include <unordered_set>
    +
    +#include "hdfs_config.h"
    +#include "command.h"
    +#include "psql.h"
    +#include "xml_parser.h"
    +#include "string_util.h"
    +
    +using std::string;
    +
    +namespace hawq {
    +namespace test {
    +
    +string HdfsConfig::getHdfsUser() {
    +  string cmd = "ps aux|grep hdfs.server|grep -v grep";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    return hawq::test::trim(hawq::test::split(lines[lines.size()-1], ' ')[0]);
    +  } 
    +  return "hdfs";
    +}
    +
    +bool HdfsConfig::LoadFromHawqConfigFile() {
    +  const char *env = getenv("GPHOME");
    +  string confPath = env ? env : "";
    +  if (!confPath.empty()) {
    +    confPath.append("/etc/hdfs-client.xml");
    +  } else {
    +    return false;
    +  }
    +
    +  hawqxmlconf.reset(new XmlConfig(confPath));
    +  hawqxmlconf->parse();
    +  return true;
    +}
    +
    +bool HdfsConfig::LoadFromHdfsConfigFile() {
    +  string confPath=getHadoopHome();
    +  if (confPath == "")
    +    return false;
    +  confPath.append("/etc/hadoop/hdfs-site.xml");
    +  hdfsxmlconf.reset(new XmlConfig(confPath));
    +  hdfsxmlconf->parse();
    +  return true;
    +}
    +
    +bool HdfsConfig::isHA() {
    +  bool ret = LoadFromHawqConfigFile();
    +  if (!ret) {
    +    return false;
    +  }
    +  string nameservice = hawqxmlconf->getString("dfs.nameservices");
    +  if (nameservice.length() > 0) {
    +    return true;
    +  } else {
    +    return false;
    +  }
    +}
    +
    +bool HdfsConfig::isKerbos() {
    +  bool ret = LoadFromHawqConfigFile();
    +  if (!ret) {
    +    return false;
    +  }
    +  string authentication = hawqxmlconf->getString("hadoop.security.authentication");
    +  if (authentication == "kerberos") {
    +    return true;
    +  } else {
    +    return false;
    +  }
    +}
    +
    +bool HdfsConfig::isTruncate() {
    +  string cmd = "hadoop fs -truncate";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    string valueLine = lines[0];
    +    int find = valueLine.find("-truncate: Unknown command");
    +    if (find < 0)
    --- End diff --
    
    Unify your coding style: brace this `if` body as elsewhere in the file.


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by xunzhang <gi...@git.apache.org>.
Github user xunzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r72934755
  
    --- Diff: src/test/feature/testlib/test_lib.cpp ---
    @@ -56,6 +58,69 @@ TEST_F(TestCommonLib, TestHawqConfig) {
       return;
     }
     
    +TEST_F(TestCommonLib, TestHdfsConfig) {
    +  hawq::test::HdfsConfig hc;
    +  hc.isHA();
    --- End diff --
    
    As I mentioned in **[REVIEW-LINK-1]**, if the implementation loads from a configuration file, you should not just invoke the interface. You should load from a test config file and use an `EXPECT_EQ`-like macro to write your tests.
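    A minimal googletest sketch along those lines, with
    test_hdfs_config.xml as a hypothetical fixture file of known
    contents:
    
        TEST_F(TestCommonLib, TestHdfsConfig) {
          hawq::test::XmlConfig conf("test_hdfs_config.xml");
          conf.parse();
          // Assert against the known fixture value instead of merely
          // invoking the interface.
          EXPECT_EQ("kerberos", conf.getString("hadoop.security.authentication"));
        }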


---

[GitHub] incubator-hawq issue #828: HAWQ-969. Add getting configuration from HDFS and...

Posted by xunzhang <gi...@git.apache.org>.
Github user xunzhang commented on the issue:

    https://github.com/apache/incubator-hawq/pull/828
  
    LGTM


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by xunzhang <gi...@git.apache.org>.
Github user xunzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r72935754
  
    --- Diff: src/test/feature/lib/hdfs_config.cpp ---
    @@ -0,0 +1,284 @@
    +#include <fstream>
    +#include <string>
    +#include <vector>
    +#include <unordered_set>
    +
    +#include "hdfs_config.h"
    +#include "command.h"
    +#include "psql.h"
    +#include "xml_parser.h"
    +#include "string_util.h"
    +
    +using std::string;
    +
    +namespace hawq {
    +namespace test {
    +
    +string HdfsConfig::getHdfsUser() {
    +  string cmd = "ps aux|grep hdfs.server|grep -v grep";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    return hawq::test::trim(hawq::test::split(lines[lines.size()-1], ' ')[0]);
    +  } 
    +  return "hdfs";
    +}
    +
    +bool HdfsConfig::LoadFromHawqConfigFile() {
    +  const char *env = getenv("GPHOME");
    +  string confPath = env ? env : "";
    +  if (!confPath.empty()) {
    +    confPath.append("/etc/hdfs-client.xml");
    +  } else {
    +    return false;
    +  }
    +
    +  hawqxmlconf.reset(new XmlConfig(confPath));
    +  hawqxmlconf->parse();
    +  return true;
    +}
    +
    +bool HdfsConfig::LoadFromHdfsConfigFile() {
    +  string confPath=getHadoopHome();
    +  if (confPath == "")
    +    return false;
    +  confPath.append("/etc/hadoop/hdfs-site.xml");
    +  hdfsxmlconf.reset(new XmlConfig(confPath));
    +  hdfsxmlconf->parse();
    +  return true;
    +}
    +
    +bool HdfsConfig::isHA() {
    +  bool ret = LoadFromHawqConfigFile();
    +  if (!ret) {
    +    return false;
    +  }
    +  string nameservice = hawqxmlconf->getString("dfs.nameservices");
    +  if (nameservice.length() > 0) {
    +    return true;
    +  } else {
    +    return false;
    +  }
    +}
    +
    +bool HdfsConfig::isKerbos() {
    +  bool ret = LoadFromHawqConfigFile();
    +  if (!ret) {
    +    return false;
    --- End diff --
    
    What are the semantics of the interface returning `false`? If it means "without Kerbos", the implementation is incorrect. You should throw an exception or load a default configuration instead.
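    A sketch of the throwing variant (the revised diff elsewhere in this
    thread does something similar with GetHawqHomeException):
    
        bool HdfsConfig::isKerbos() {
          if (!LoadFromHawqConfigFile()) {
            // Fail loudly: a missing config is not the same as "no Kerberos".
            throw GetHawqHomeException();
          }
          return hawqxmlconf->getString("hadoop.security.authentication") == "kerberos";
        }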


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by wcl14 <gi...@git.apache.org>.
Github user wcl14 closed the pull request at:

    https://github.com/apache/incubator-hawq/pull/828


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by xunzhang <gi...@git.apache.org>.
Github user xunzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r72934303
  
    --- Diff: src/test/feature/lib/hdfs_config.cpp ---
    @@ -0,0 +1,284 @@
    +#include <fstream>
    +#include <string>
    +#include <vector>
    +#include <unordered_set>
    +
    +#include "hdfs_config.h"
    +#include "command.h"
    +#include "psql.h"
    +#include "xml_parser.h"
    +#include "string_util.h"
    +
    +using std::string;
    +
    +namespace hawq {
    +namespace test {
    +
    +string HdfsConfig::getHdfsUser() {
    +  string cmd = "ps aux|grep hdfs.server|grep -v grep";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    return hawq::test::trim(hawq::test::split(lines[lines.size()-1], ' ')[0]);
    +  } 
    +  return "hdfs";
    +}
    +
    +bool HdfsConfig::LoadFromHawqConfigFile() {
    +  const char *env = getenv("GPHOME");
    +  string confPath = env ? env : "";
    +  if (!confPath.empty()) {
    +    confPath.append("/etc/hdfs-client.xml");
    +  } else {
    +    return false;
    +  }
    +
    +  hawqxmlconf.reset(new XmlConfig(confPath));
    +  hawqxmlconf->parse();
    +  return true;
    +}
    +
    +bool HdfsConfig::LoadFromHdfsConfigFile() {
    +  string confPath=getHadoopHome();
    +  if (confPath == "")
    +    return false;
    +  confPath.append("/etc/hadoop/hdfs-site.xml");
    +  hdfsxmlconf.reset(new XmlConfig(confPath));
    +  hdfsxmlconf->parse();
    +  return true;
    +}
    +
    +bool HdfsConfig::isHA() {
    --- End diff --
    
    [REVIEW-LINK-1] I think the implementation of these functions is problematic. If a user modified these configuration files without restarting HAWQ, the result could be confusing. A better way would be to get this kind of status from the running HAWQ; is that possible?
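    For what it's worth, the revised diff elsewhere in this thread takes
    roughly this approach for isHA(), deriving the answer from the
    running catalog instead of the on-disk file; condensed:
    
        const hawq::test::PSQLQueryResult &result = psql.getQueryResult(
            "SELECT fselocation FROM pg_filespace pgfs, pg_filespace_entry pgfse "
            "WHERE pgfs.fsname = 'dfs_system' AND pgfse.fsefsoid = pgfs.oid;");
        // No host:port pair in the filespace location indicates HA mode.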


---

[GitHub] incubator-hawq issue #828: HAWQ-969. Add getting configuration from HDFS and...

Posted by wengyanqing <gi...@git.apache.org>.
Github user wengyanqing commented on the issue:

    https://github.com/apache/incubator-hawq/pull/828
  
    LGTM


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by xunzhang <gi...@git.apache.org>.
Github user xunzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r72936411
  
    --- Diff: src/test/feature/lib/hdfs_config.h ---
    @@ -0,0 +1,172 @@
    +#ifndef HAWQ_SRC_TEST_FEATURE_LIB_HDFS_CONFIG_H_
    +#define HAWQ_SRC_TEST_FEATURE_LIB_HDFS_CONFIG_H_
    +
    +#include <string>
    +#include <vector>
    +
    +#include "psql.h"
    +#include "sql_util.h"
    +#include "xml_parser.h"
    +
    +namespace hawq {
    +namespace test {
    +
    +/**
    + * HdfsConfig common libray. Get detailed information about HDFS
    + * including checking state of namenodes and datanodes, get parameter value
    + * @author Chunling Wang
    + */
    +class HdfsConfig {
    +  public:
    +    /**
    +     * HdfsConfig constructor
    +     */
    +    HdfsConfig(): psql(HAWQ_DB, HAWQ_HOST, HAWQ_PORT, HAWQ_USER, HAWQ_PASSWORD) {}
    +
    +    /**
    +     * HdfsConfig desstructor
    +     */
    +    ~HdfsConfig() {}
    +
    +    /**
    +     * whether HDFS is in HA mode
    +     * @return true if HDFS is HA
    +     */
    +    bool isHA();
    +
    +    /**
    +     * whether HDFS is kerbos
    +     * @return true if HDFS is kerbos
    +     */
    +    bool isKerbos();
    +
    +    /**
    +     * whether HDFS supports truncate operation
    +     * @return true if HDFS supports truncate operation
    +     */
    +    bool isTruncate();
    +
    +    /**
    +     * get HADOOP working directory
    +     * @return HADOOP working directory
    +     */
    +    std::string getHadoopHome();
    +
    +    /**
    +     * get HDFS active namenode's hostname and port information
    +     * @param activenamenode, active namenode hostname reference which will be set
    +     * @param port, active namenode port reference which will be set
    +     * @return true if getActiveNamenode succeeded
    +     */
    +    bool getActiveNamenode(std::string &activenamenode,
    +                           int &port);
    +
    +    /**
    +     * get HDFS standby namenode's hostname and port information
    +     * @param standbynamenode, standby namenode hostname reference which will be set
    +     * @param port, standby namenode port reference which will be set
    +     * @return true if getStandbyNamenode succeeded
    +     */
    +    bool getStandbyNamenode(std::string &standbynamenode,
    +                            int &port);
    +
    +    /**
    +     * get HDFS namenode(s) information
    +     * @param namenodes, namenodes' hostnames reference which will be set
    +     * @param port, namenodes' ports reference which will be set
    +     */
    +    void getNamenodes(std::vector<std::string> &namenodes,
    +                      std::vector<int> &port);
    +
    +    /**
    +     * get HDFS datanodes information
    +     * @param datanodelist, datanodes' hostnames reference which will be set
    +     * @param port, datanodes' ports reference which will be set
    +     */
    +    void getDatanodelist(std::vector<std::string> &datanodelist,
    +                         std::vector<int> &port);
    +
    +    /**
    +     * get HDFS active datanodes information
    +     * @param activedatanodes, active datanodes' hostnames reference which will be set
    +     * @param port, active datanodes' ports reference which will be set
    +     */
    +    void getActiveDatanodes(std::vector<std::string> &activedatanodes,
    +                            std::vector<int> &port);
    +
    +    /**
    +     * whether HDFS is in safe mode
    +     * @return true if HDFS is in safe node
    +     */
    +    bool isSafemode();
    +
    +    /**
    +     * get parameter value in ./etc/hdfs-client.xml or ./etc/hadoop/hdfs-site.xml according to parameter name
    +     * @param parameterName, used to get parameter value
    +     * @param conftype, get parameter value, 'hdfs' or 'HDFS' from ./etc/hdfs-client.xml, others from ./etc/hadoop/hdfs-site.xml
    +     * @return parameter value
    +     */
    +    std::string getParameterValue(const std::string &parameterName, const std::string &conftype);
    +    
    +    /**
    +     * get parameter value in ./etc/hadoop/hdfs-site.xml according to parameter name
    +     * @param parameterName, used to get parameter value
    +     * @return parameter value
    +     */
    +    std::string getParameterValue(const std::string &parameterName);
    +
    +    /**
    +     * set parameter value in ./etc/hdfs-client.xml or ./etc/hadoop/hdfs-site.xml according to parameter name
    +     * @param parameterName, parameter name which used to set parameter value
    +     * @param parameterValue, parameter value which to be set
    +     * @param conftype, get parameter value, 'hdfs' or 'HDFS' from ./etc/hdfs-client.xml, others from ./etc/hadoop/hdfs-site.xml
    +     * @return true if succeeded
    +     */
    +    bool setParameterValue(const std::string &parameterName, const std::string &parameterValue, const std::string &conftype);
    +
    +    /**
    +     * set parameter value in ./etc/hadoop/hdfs-site.xml according to parameter name
    +     * @param parameterName, parameter name which used to set parameter value
    +     * @param parameterValue, parameter value which to be set
    +     * @return true if succeeded
    +     */
    +    bool setParameterValue(const std::string &parameterName, const std::string &parameterValue);
    +
    +  private:
    +    /**
    +     * @return yarn user
    +     */
    +    std::string getHdfsUser();
    +    /**
    +     * load key-value parameters in ./etc/hdfs-client.xml
    +     * @return true if succeeded
    +     */
    +    bool LoadFromHawqConfigFile();
    +
    +    /**
    +     * load key-value parameters in ./etc/hadoop/hdfs-site.xml
    +     * @return true if succeeded
    +     */
    +    bool LoadFromHdfsConfigFile();
    +
    +    /**
    +     * get HDFS active or standby namenode information in HA mode according to the namenodetype
    +     * @param namenodetype, used to specify active or standby namenode information
    +     * @param namenode, namenode hostname reference which will be set
    +     * @param port, namenode port reference which will be set
    +     * @return true if getHANamenode succeeded
    +     */
    +    bool getHANamenode(const std::string &namenodetype,
    +                       std::string &namenode,
    +                                   int &port);
    +
    +  private:
    +    std::unique_ptr<XmlConfig> hawqxmlconf;
    +    std::unique_ptr<XmlConfig> hdfsxmlconf;
    +    hawq::test::PSQL psql;
    --- End diff --
    
    You don't need the psql object in this class. Remember to modify the constructor too. The same problem also exists in `YarnConfig`.
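    A sketch of the suggested change (note the later revision keeps psql
    in order to query the catalog from isHA()):
    
        class HdfsConfig {
          public:
            HdfsConfig() {}  // no PSQL connection needed here
          private:
            std::unique_ptr<XmlConfig> hawqxmlconf;
            std::unique_ptr<XmlConfig> hdfsxmlconf;
            // hawq::test::PSQL psql;  // removed per this comment
        };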


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by xunzhang <gi...@git.apache.org>.
Github user xunzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r72936024
  
    --- Diff: src/test/feature/lib/hdfs_config.h ---
    @@ -0,0 +1,172 @@
    +#ifndef HAWQ_SRC_TEST_FEATURE_LIB_HDFS_CONFIG_H_
    +#define HAWQ_SRC_TEST_FEATURE_LIB_HDFS_CONFIG_H_
    +
    +#include <string>
    +#include <vector>
    +
    +#include "psql.h"
    +#include "sql_util.h"
    +#include "xml_parser.h"
    +
    +namespace hawq {
    +namespace test {
    +
    +/**
    + * HdfsConfig common libray. Get detailed information about HDFS
    + * including checking state of namenodes and datanodes, get parameter value
    + * @author Chunling Wang
    + */
    +class HdfsConfig {
    +  public:
    +    /**
    +     * HdfsConfig constructor
    +     */
    +    HdfsConfig(): psql(HAWQ_DB, HAWQ_HOST, HAWQ_PORT, HAWQ_USER, HAWQ_PASSWORD) {}
    +
    +    /**
    +     * HdfsConfig desstructor
    +     */
    +    ~HdfsConfig() {}
    +
    +    /**
    +     * whether HDFS is in HA mode
    +     * @return true if HDFS is HA
    +     */
    +    bool isHA();
    +
    +    /**
    +     * whether HDFS is kerbos
    +     * @return true if HDFS is kerbos
    +     */
    +    bool isKerbos();
    --- End diff --
    
    I am not sure "kerbos" is a short name for "kerberos". I have googled it, and I think you are using an incorrect library name.


---

[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by xunzhang <gi...@git.apache.org>.
Github user xunzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r72936674
  
    --- Diff: src/test/feature/lib/yarn_config.cpp ---
    @@ -0,0 +1,267 @@
    +#include <fstream>
    +#include <string>
    +#include <vector>
    +#include <unordered_set>
    +
    +#include "yarn_config.h"
    +#include "command.h"
    +#include "psql.h"
    +#include "xml_parser.h"
    +#include "string_util.h"
    +
    +using std::string;
    +
    +namespace hawq {
    +namespace test {
    +
    +string YarnConfig::getYarnUser() {
    +  string cmd = "ps aux|grep yarn.server|grep -v grep";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    return hawq::test::trim(hawq::test::split(lines[lines.size()-1], ' ')[0]);
    +  } 
    +  return "yarn";
    +}
    +
    +bool YarnConfig::LoadFromHawqConfigFile() {
    +  const char *env = getenv("GPHOME");
    +  string confPath = env ? env : "";
    +  if (!confPath.empty()) {
    +    confPath.append("/etc/yarn-client.xml");
    +  } else {
    +    return false;
    +  }
    +
    +  hawqxmlconf.reset(new XmlConfig(confPath));
    +  hawqxmlconf->parse();
    +  return true;
    +}
    +
    +bool YarnConfig::LoadFromYarnConfigFile() {
    +  string confPath=getHadoopHome();
    +  if (confPath == "")
    +    return false;
    +  confPath.append("/etc/hadoop/yarn-site.xml");
    +  yarnxmlconf.reset(new XmlConfig(confPath));
    +  yarnxmlconf->parse();
    +  return true;
    +}
    +
    +bool YarnConfig::isHA() {
    +  bool ret = LoadFromHawqConfigFile();
    +  if (!ret) {
    +    return false;
    +  }
    +  string nameservice = hawqxmlconf->getString("yarn.resourcemanager.ha");
    +  if (nameservice.length() > 0) {
    +    return true;
    +  } else {
    +    return false;
    +  }
    +}
    +
    +bool YarnConfig::isKerbos() {
    +  bool ret = LoadFromHawqConfigFile();
    +  if (!ret) {
    +    return false;
    +  }
    +  string authentication = hawqxmlconf->getString("hadoop.security.authentication");
    +  if (authentication == "kerberos") {
    +    return true;
    +  } else {
    +    return false;
    +  }
    +}
    +
    +string YarnConfig::getHadoopHome() {
    +  string cmd = "ps -ef|grep hadoop";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  string hadoopHome = "";
    +  auto lines = hawq::test::split(result, '\n');
    +  for (size_t i=0; i<lines.size()-1; i++) {
    +    string valueLine = lines[i];
    +    int pos = valueLine.find("-Dhadoop.home.dir=");
    +    if (pos >=0 ) {
    +      string valueTmp = valueLine.substr(pos+18); 
    +      int valueEnd = valueTmp.find_first_of(" ");
    +      string value = valueTmp.substr(0, valueEnd);
    +      hadoopHome = hawq::test::trim(value);
    +      return hadoopHome;
    +    }
    +  }
    +  return hadoopHome;
    +}
    +
    +bool YarnConfig::getActiveRM(string &activeRM,
    +                                   int &port) {
    +    return getHARM("active", activeRM, port);
    +}
    +
    +bool YarnConfig::getStandbyRM(string &standbyRM,
    +                                    int &port) {
    +    return getHARM("standby", standbyRM, port);
    +}
    +
    +bool YarnConfig::getHARM(const string &RMtype,
    +                               string &RM,
    +                               int &port) {
    +  if (!isHA())
    +    return false;
    +  string RMService = "";
    +  string haRMValue = "rm1,rm2";
    +  auto haRMs = hawq::test::split(haRMValue, ',');
    +  for (size_t i = 0; i < haRMs.size(); i++) {
    +    string haRM = hawq::test::trim(haRMs[i]);
    +    string cmd ="sudo -u ";
    +    cmd.append(getYarnUser());
    +    cmd.append(" yarn rmadmin -getServiceState ");
    +    cmd.append(haRM);
    +    Command c(cmd);
    +    string result = c.run().getResultOutput();
    +    auto lines = hawq::test::split(result, '\n');
    +    if (lines.size() >= 2) {
    +      string valueLine = lines[1];
    +      if (valueLine == RMtype) {
    +        RMService = haRM;
    +        break;
    +      }
    +    }
    +  }
    +  bool ret = LoadFromYarnConfigFile();
    +  if (!ret) {
    +    return false;
    +  }
    +  string rpcAddressName = "yarn.resourcemanager.address.";
    +  rpcAddressName.append(RMService);
    +  string rpcAddressValue = yarnxmlconf->getString(rpcAddressName);
    +  auto RMInfo = hawq::test::split(rpcAddressValue, ':');
    +  RM = hawq::test::trim(RMInfo[0]);
    +  port = std::stoi(hawq::test::trim(RMInfo[1]));
    +  return true;
    +}
    +
    +bool YarnConfig::getRMList(std::vector<string> &RMList,
    +                           std::vector<int> &ports){
    +  string RM = "";
    +  int port;
    +  if (isHA()) {
    +    getActiveRM(RM, port);
    +    RMList.push_back(RM);
    +    ports.push_back(port);
    +    getStandbyRM(RM, port);
    +    RMList.push_back(RM);
    +    ports.push_back(port);
    +    return true;
    +  }
    +
    +  bool ret = LoadFromYarnConfigFile();
    +  if (!ret) {
    +    return false;
    +  }
    +
    +  string RMAddressName = "yarn.resourcemanager.address";
    +  string RMAddressValue = yarnxmlconf->getString(RMAddressName);
    +  if (RMAddressValue == "")
    +    return false;
    +  auto RMInfo = hawq::test::split(RMAddressValue, ':');
    +  RM = hawq::test::trim(RMInfo[0]);
    +  port = std::stoi(hawq::test::trim(RMInfo[1]));
    +  RMList.push_back(RM);
    +  ports.push_back(port);
    +  return true;
    +}
    +
    +void YarnConfig::getNodeManagers(std::vector<string> &nodemanagers,
    +                                 std::vector<int> &port) {
    +  string cmd = "sudo -u ";
    +  cmd.append(getYarnUser());
    +  cmd.append(" yarn node -list -all");
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  bool begin = false;
    +  for (size_t i=0; i<lines.size(); i++) {
    +    if (!begin) {
    +      if (lines[i].find("Node-Id") != string::npos) {
    +        begin = true;
    +      }
    +    } else {
    +      string values = hawq::test::split(lines[i], '\t')[0];
    +      nodemanagers.push_back(hawq::test::trim(hawq::test::split(values, ':')[0]));
    +      port.push_back(std::stoi(hawq::test::trim(hawq::test::split(values, ':')[1])));
    +    }
    +  }
    +}
    +
    +void YarnConfig::getActiveNodeManagers(std::vector<string> &nodemanagers,
    +                                 std::vector<int> &port) {
    +  string cmd = "sudo -u ";
    +  cmd.append(getYarnUser());
    +  cmd.append(" yarn node -list -states RUNNING");
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  bool begin = false;
    +  for (size_t i=0; i<lines.size(); i++) {
    +    if (!begin) {
    +      if (lines[i].find("Node-Id") != string::npos) {
    +        begin = true;
    +      }
    +    } else {
    +      string values = hawq::test::split(lines[i], '\t')[0];
    +      nodemanagers.push_back(hawq::test::trim(hawq::test::split(values, ':')[0]));
    +      port.push_back(std::stoi(hawq::test::trim(hawq::test::split(values, ':')[1])));
    +    }
    +  }
    +}
    +
    +
    +string YarnConfig::getParameterValue(const string &parameterName) {
    +  bool ret = LoadFromYarnConfigFile();
    +  if (!ret) {
    +    return NULL;
    --- End diff --
    
    The return type is `string`, so `NULL` is not a valid value to return here.
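    
    Constructing a std::string from a null pointer is undefined behavior, so this can crash at runtime. A sketch of a safer fallback (the tail of the function is assumed from the surrounding code, and it assumes an empty value is an acceptable "not found" signal):
    
        string YarnConfig::getParameterValue(const string &parameterName) {
          if (!LoadFromYarnConfigFile()) {
            return "";  // empty string instead of NULL: no UB, easy to test for
          }
          return yarnxmlconf->getString(parameterName);
        }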



[GitHub] incubator-hawq pull request #828: HAWQ-969. Add getting configuration from H...

Posted by wengyanqing <gi...@git.apache.org>.
Github user wengyanqing commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/828#discussion_r73454751
  
    --- Diff: src/test/feature/lib/hdfs_config.cpp ---
    @@ -0,0 +1,293 @@
    +#include <fstream>
    +#include <string>
    +#include <vector>
    +#include <unordered_set>
    +
    +#include "hdfs_config.h"
    +#include "command.h"
    +#include "psql.h"
    +#include "xml_parser.h"
    +#include "string_util.h"
    +
    +using std::string;
    +
    +namespace hawq {
    +namespace test {
    +
    +string HdfsConfig::getHdfsUser() {
    +  string cmd = "ps aux|grep hdfs.server|grep -v grep";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    return hawq::test::trim(hawq::test::split(lines[lines.size()-1], ' ')[0]);
    +  } 
    +  return "hdfs";
    +}
    +
    +bool HdfsConfig::LoadFromHawqConfigFile() {
    +  const char *env = getenv("GPHOME");
    +  string confPath = env ? env : "";
    +  if (confPath != "") {
    +    confPath.append("/etc/hdfs-client.xml");
    +  } else {
    +    return false;
    +  }
    +
    +  hawqxmlconf.reset(new XmlConfig(confPath));
    +  hawqxmlconf->parse();
    +  return true;
    +}
    +
    +bool HdfsConfig::LoadFromHdfsConfigFile() {
    +  string confPath=getHadoopHome();
    +  if (confPath == "") {
    +    return false;
    +  }
    +  confPath.append("/etc/hadoop/hdfs-site.xml");
    +  hdfsxmlconf.reset(new XmlConfig(confPath));
    +  hdfsxmlconf->parse();
    +  return true;
    +}
    +
    +bool HdfsConfig::isHA() {
    +  const hawq::test::PSQLQueryResult &result = psql.getQueryResult(
    +       "SELECT substring(fselocation from length('hdfs:// ') for (position('/' in substring(fselocation from length('hdfs:// ')))-1)::int) "
    +       "FROM pg_filespace pgfs, pg_filespace_entry pgfse "
    +       "WHERE pgfs.fsname = 'dfs_system' AND pgfse.fsefsoid=pgfs.oid ;");
    +  std::vector<std::vector<string>> table = result.getRows();
    +  if (table.size() > 0) {
    +    int find = table[0][0].find(":");
    +    if (find < 0) {
    +      return true;
    +    } else {
    +      return false;
    +    }
    +  }
    +  return false;
    +}
    +
    +bool HdfsConfig::isConfigKerberos() {
    +  bool ret = LoadFromHawqConfigFile();
    +  if (!ret) {
    +    throw GetHawqHomeException();
    +  }
    +  string authentication = hawqxmlconf->getString("hadoop.security.authentication");
    +  if (authentication == "kerberos") {
    +    return true;
    +  } else {
    +    return false;
    +  }
    +}
    +
    +bool HdfsConfig::isTruncate() {
    +  string cmd = "hadoop fs -truncate";
    +  Command c(cmd);
    +  string result = c.run().getResultOutput();
    +  auto lines = hawq::test::split(result, '\n');
    +  if (lines.size() >= 1) {
    +    string valueLine = lines[0];
    +    int find = valueLine.find("-truncate: Unknown command");
    +    if (find < 0) {
    +      return true;
    +    }
    +  }
    +  return false;
    +}
    +
    +string HdfsConfig::getHadoopHome() {
    +  string cmd = "ps -ef|grep hadoop";
    --- End diff --
    
    I see lots of places with the same logic:
    1. run a command
    2. get the command result as lines
    3. process the result lines
    It's better to extract this common logic into a function.
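    
    For example, something like this (a sketch only; the name runCommandAndSplit is just a placeholder):
    
        // hypothetical helper -- runs a shell command and returns its output lines
        static std::vector<string> runCommandAndSplit(const string &cmd) {
          Command c(cmd);
          string result = c.run().getResultOutput();
          return hawq::test::split(result, '\n');
        }
    
    Then getHdfsUser(), getYarnUser(), getHadoopHome() and the node-listing functions would only keep their own line-processing loops.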

