Posted to dev@trafodion.apache.org by "Liu, Ming (Ming)" <mi...@esgyn.cn> on 2015/09/16 11:47:24 UTC

Re: [Questions] sqps process types and usages

Hi, YuanYuan,

I just want to comment on questions 2 and 3. As you mentioned, I don't have an answer to Q1.

idtmsrv is not used in current Trafodion. Trafodion DTM is a peer-to-peer design with no single master: each TM is a peer and does everything locally. So idtmsrv is not really used, and you can ignore it. A single master would be a bottleneck for scalability. Trafodion does not need a global, cluster-wide sequential ID, so it scales very well. Hadoop 1.0 had a single global JobTracker, and Hadoop 2.0 replaced it with YARN. Same idea here: avoid a single master.

Please check two environment variables: DCS_INSTALL_DIR and REST_INSTALL_DIR.
Set them to the locations where DCS and REST are installed on your system before invoking sqstart.
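
A minimal sketch of that check, assuming a POSIX shell. check_install_dirs is a hypothetical helper name, not a Trafodion command; it only confirms that each variable is set and names an existing directory.

```shell
# check_install_dirs: hypothetical helper (not shipped with Trafodion).
# Prints the state of each variable and returns non-zero if either is
# unset or does not name an existing directory.
check_install_dirs() {
  cid_status=0
  for cid_name in DCS_INSTALL_DIR REST_INSTALL_DIR; do
    # Look up the value of the variable whose name is in cid_name.
    eval "cid_dir=\${$cid_name:-}"
    if [ -z "$cid_dir" ]; then
      echo "$cid_name is not set"
      cid_status=1
    elif [ ! -d "$cid_dir" ]; then
      echo "$cid_name points to a missing directory: $cid_dir"
      cid_status=1
    else
      echo "$cid_name=$cid_dir"
    fi
  done
  return "$cid_status"
}

# Example: start the instance only when both checks pass.
# check_install_dirs && sqstart
```

The function only reports; whether to go on to sqstart is left to the caller.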

DCS Server and REST are not monitor processes and are not started via sqshell, so they are probably not visible via sqps. I don't know a proper way to show them. Others may be able to help here; I heard there is a dcscheck or some similar command to show the status of DCS. I am not familiar with REST and DCS.

Thanks,
Ming

-----Original Message-----
From: Nieyuanyuan [mailto:nieyuanyuan@huawei.com]
Sent: September 16, 2015 10:40
To: dev <de...@trafodion.incubator.apache.org>
Cc: Lijian (Q) <ji...@huawei.com>
Subject: [Questions] sqps process types and usages

Hi, Hans, Narendra,

I already asked Liu, Ming this question, but he said he can't answer all of it. The command 'sqps' can be used to list the running processes, e.g.:

[nieyy@redhat-72 local_hadoop]$ sqps
Processing cluster.conf on local host redhat-72
[$Z000H4K] Shell/shell Version 1.0.1 Release 1.1.0 (Build debug [nieyy], date 08Sep15)
[$Z000H4K] %ps
[$Z000H4K] NID,PID(os)  PRI TYPE STATES  NAME        PARENT      PROGRAM
[$Z000H4K] ------------ --- ---- ------- ----------- ----------- ---------------
[$Z000H4K] 000,00041272 000 WDG  ES--A-- $WDG000     NONE        sqwatchdog
[$Z000H4K] 000,00041273 000 PSD  ES--A-- $PSD000     NONE        pstartd
[$Z000H4K] 000,00041292 001 GEN  ES--A-- $TSID0      NONE        idtmsrv
[$Z000H4K] 000,00041306 001 DTM  ES--A-- $TM0        NONE        tm
[$Z000H4K] 000,00042522 001 GEN  ES--A-- $ZSC000     NONE        mxsscp
[$Z000H4K] 000,00042553 001 SSMP ES--A-- $ZSM000     NONE        mxssmp
[$Z000H4K] 000,00044067 001 GEN  ES--A-- $Z0010Z2    NONE        mxosrvr
[$Z000H4K] 000,00044430 001 GEN  ES--A-- $Z00119F    NONE        mxosrvr
[$Z000H4K] 000,00044539 001 GEN  ES--A-- $Z0011CJ    NONE        mxosrvr
[$Z000H4K] 000,00044659 001 GEN  ES--A-- $ZLOBSRV0   NONE        mxlobsrvr
[$Z000H4K] 000,00044337 001 GEN  ES--A-- $Z00116S    NONE        mxosrvr
[$Z000H4K] 000,00050937 001 GEN  ES--A-- $Z0016KC    $Z00119F    tdm_arkcmp
[$Z000H4K] 000,00051321 001 GEN  ES--A-- $Z0016WB    $Z0016KC    tdm_arkcmp
[$Z000H4K] 000,00018926 001 GEN  ES--A-- $Z000FFR    NONE        sqlci
[$Z000H4K] 000,00020985 001 GEN  ES--A-- $Z000H4K    NONE        shell
[$Z000H4K] 001,00041274 000 WDG  ES--A-- $WDG001     NONE        sqwatchdog
[$Z000H4K] 001,00041275 000 PSD  ES--A-- $PSD001     NONE        pstartd
[$Z000H4K] 001,00041816 001 DTM  ES--A-- $TM1        NONE        tm
[$Z000H4K] 001,00042537 001 GEN  ES--A-- $ZSC001     NONE        mxsscp
[$Z000H4K] 001,00042568 001 SSMP ES--A-- $ZSM001     NONE        mxssmp
[$Z000H4K] 001,00044666 001 GEN  ES--A-- $ZLOBSRV1   NONE        mxlobsrvr
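
For a quick summary of a listing like the one above, the rows can be grouped by the PROGRAM column. This is only a sketch: count_sqps_programs is a hypothetical helper, and it assumes the eight-column layout shown here, with NID,PID in the second field and PROGRAM in the eighth.

```shell
# count_sqps_programs: hypothetical helper. Reads sqps output on stdin
# and prints "<count> <program>" for each distinct PROGRAM value.
count_sqps_programs() {
  # Keep only process rows: their second field looks like "000,00041272".
  # Banner, header, and separator lines do not match and are skipped.
  awk '$2 ~ /^[0-9]+,[0-9]+$/ { count[$8]++ }
       END { for (p in count) print count[p], p }' |
    LC_ALL=C sort -k2,2   # sort by program name for stable output
}

# Example:
# sqps | count_sqps_programs
```

On the listing above this would report four mxosrvr processes and two each of tm, sqwatchdog, and pstartd, among others.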

According to the architecture diagram, some process types are easy to understand, like sqlci, tm, and mxosrvr, but:

1.       Do we have a description of how many process types there are, what they are used for, and how they interact with each other? (I can't find many details in the code.)

2.       Why is idtmsrv a temporary solution that should be removed later, as Liu, Ming said? If so, how is a unique global ID generated in the new design?

3.       Which one is the DCS Master? And which is the DCS Server? It looks like my dev env has no such processes. If so, does that mean DCS Master and DCS Server are not needed for a single-node dev env?

Thanks.

Re: Re: [Questions] sqps process types and usages

Posted by "Liu, Ming (Ming)" <mi...@esgyn.cn>.
Hi, yuanyuan,

Yes, idtmsrv is not used in Trafodion now. You can ignore it.
We are continuously developing enhancements to Trafodion, and new versions will be released regularly.


Thanks,
Ming

Nieyuanyuan <ni...@huawei.com> wrote:


Hi, LiuMing,

Thanks for your answer. I checked DCS_INSTALL_DIR and REST_INSTALL_DIR, and both are set.

[nieyy@redhat-72 rest-1.1.0]$ echo $DCS_INSTALL_DIR
/home/nieyy/trafodion_build/incubator-trafodion-stable-1.1/core/sqf/sql/local_hadoop/dcs-1.1.0
[nieyy@redhat-72 rest-1.1.0]$ echo $REST_INSTALL_DIR
/home/nieyy/trafodion_build/incubator-trafodion-stable-1.1/core/sqf/sql/local_hadoop/rest-1.1.0

What do you mean by current Trafodion? The 2.0 version under development? I am using 1.1 so far, and sqps did show idtmsrv, so you mean it's not used even though it's in the process list, right?

And I can't find a dcscheck command in my build environment, only dcsstart and dcsstop. I am using a Hadoop sandbox.

Thanks.

Re: Re: [Questions] sqps process types and usages

Posted by RuoYu Zuo <jo...@gmail.com>.
I don't think we provide dcscheck in Trafodion. These are JVM-hosted
services, and the monitor does not oversee them. The log file tells you
whether DCS started successfully or not.
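
Because these services run in JVMs outside the monitor, one way to look for them is the JDK's jps tool, which lists local JVMs. The sketch below is an assumption, not a Trafodion command: filter_dcs_jvms is a hypothetical helper, and matching on the string "dcs" is a guess about how the DCS main classes are named.

```shell
# filter_dcs_jvms: hypothetical helper. Reads `jps -l` output on stdin
# ("<pid> <main class or jar>") and keeps lines that mention "dcs".
# The pattern is an assumption about DCS class names; adjust as needed.
filter_dcs_jvms() {
  grep -i 'dcs' || true   # no match is not treated as an error
}

# Example (requires a JDK on PATH):
# jps -l | filter_dcs_jvms
```

An empty result only means no JVM matched the pattern; the log file remains the place to confirm whether startup succeeded.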

There is also a web UI for monitoring the status of the DCS services. Once
the DCS services are started, you can browse to http://<IP/host
name>:<port number>. In the latest code, the default port number is
40010. If you need a different port, change the value of
dcs.master.info.port in the dcs-site.xml file under your
$DCS_INSTALL_DIR/conf dir to the port number you want to use, then restart
the DCS services; you will then be able to browse the web UI.
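
A small sketch for finding the port in use, under two assumptions: that dcs-site.xml uses the Hadoop-style <property><name>...</name><value>...</value></property> layout, and that the value is a bare number with no surrounding whitespace. dcs_info_port is a hypothetical helper name.

```shell
# dcs_info_port: hypothetical helper. Prints dcs.master.info.port from
# the given dcs-site.xml, or 40010 (the default mentioned above) when
# the file is unreadable or the property is not found.
dcs_info_port() {
  dip_file="$1"
  dip_port=""
  if [ -r "$dip_file" ]; then
    # Join the file into one line so <name> and <value> may sit on
    # separate lines, then capture the digits inside <value>.
    dip_port=$(tr -d '\n' < "$dip_file" |
      sed -n 's:.*<name>dcs\.master\.info\.port</name>[^<]*<value>\([0-9][0-9]*\)</value>.*:\1:p')
  fi
  echo "${dip_port:-40010}"
}

# Example (localhost is an assumption; use your DCS Master host):
# port=$(dcs_info_port "$DCS_INSTALL_DIR/conf/dcs-site.xml")
# curl -sf "http://localhost:$port/" >/dev/null && echo "DCS web UI is up"
```

If the property is written in some other form, the function falls back to 40010, so it is worth checking the file by hand when the result looks wrong.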



——
Life’s a journey not a destination, don’t just tell what tomorrow brings.

On Thu, Sep 17, 2015 at 10:01 AM, Nieyuanyuan <ni...@huawei.com>
wrote:

> Hi, LiuMing,
>
> Thanks for your answer. I checked DCS_INSTALL_DIR and REST_INSTALL_DIR, and
> both are set.
>
> [nieyy@redhat-72 rest-1.1.0]$ echo $DCS_INSTALL_DIR
> /home/nieyy/trafodion_build/incubator-trafodion-stable-1.1/core/sqf/sql/local_hadoop/dcs-1.1.0
> [nieyy@redhat-72 rest-1.1.0]$ echo $REST_INSTALL_DIR
> /home/nieyy/trafodion_build/incubator-trafodion-stable-1.1/core/sqf/sql/local_hadoop/rest-1.1.0
>
> What do you mean by current Trafodion? The 2.0 version under development?
> I am using 1.1 so far, and sqps did show idtmsrv, so you mean it's not used
> even though it's in the process list, right?
>
> And I can't find a dcscheck command in my build environment, only dcsstart
> and dcsstop. I am using a Hadoop sandbox.
>
> Thanks.

Re: [Questions] sqps process types and usages

Posted by Nieyuanyuan <ni...@huawei.com>.
Hi, LiuMing,

Thanks for your answer. I checked DCS_INSTALL_DIR and REST_INSTALL_DIR, and both are set.

[nieyy@redhat-72 rest-1.1.0]$ echo $DCS_INSTALL_DIR
/home/nieyy/trafodion_build/incubator-trafodion-stable-1.1/core/sqf/sql/local_hadoop/dcs-1.1.0
[nieyy@redhat-72 rest-1.1.0]$ echo $REST_INSTALL_DIR   
/home/nieyy/trafodion_build/incubator-trafodion-stable-1.1/core/sqf/sql/local_hadoop/rest-1.1.0

What do you mean by current Trafodion? The 2.0 version under development? I am using 1.1 so far, and sqps did show idtmsrv, so you mean it's not used even though it's in the process list, right?

And I can't find a dcscheck command in my build environment, only dcsstart and dcsstop. I am using a Hadoop sandbox.

Thanks.
