Posted to issues@trafodion.apache.org by "Arvind Narain (JIRA)" <ji...@apache.org> on 2017/04/12 18:16:41 UTC

[jira] [Reopened] (TRAFODION-1989) after uninstall, and reinstall again, dcscheck report dcs master not up

     [ https://issues.apache.org/jira/browse/TRAFODION-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arvind Narain reopened TRAFODION-1989:
--------------------------------------
      Assignee:     (was: Amanda Moran)

Reopening to fix this issue in the Ambari installer. If the trafodion user ID changes, the jps command stops working, and the check scripts report wrong information as a result.

Bash installer has the following change:

function fixPermissions {

#Change ownership of this file to be owned by traf user and traf group
#Errors with JPS will happen if not modified 
if [[ -d /tmp/hsperfdata_trafodion ]]; then 
   $TRAF_PDSH sudo chown -R $TRAF_USER.trafodion /tmp/hsperfdata_trafodion
fi

}

pyinstaller has the following change (including rm -rf during uninstall):

    # change permission for hsperfdata
    if os.path.exists(TRAF_HSPERFDATA_FILE):
        run_cmd('chown -R %s:%s %s' % (traf_user, traf_group, TRAF_HSPERFDATA_FILE))
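
The "rm -rf during uninstall" step mentioned above is not quoted here. A minimal sketch of what such a cleanup could look like in Python follows; the function name remove_hsperfdata and the default value given to TRAF_HSPERFDATA_FILE are assumptions for illustration, not the pyinstaller's actual code.

```python
import os
import shutil

# Assumed value for illustration; the real installer defines this constant itself.
TRAF_HSPERFDATA_FILE = '/tmp/hsperfdata_trafodion'


def remove_hsperfdata(path=TRAF_HSPERFDATA_FILE):
    """Delete a stale hsperfdata directory.

    Returns True if a directory was removed, False if there was nothing to do.
    Symlinks are skipped so that only a real directory is ever deleted.
    """
    if os.path.isdir(path) and not os.path.islink(path):
        shutil.rmtree(path)
        return True
    return False
```

Removing the directory at uninstall time means the next install starts clean: the JVM recreates the directory under the new user when the first Java process starts.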

===
Problem seen in Ambari env (dcs master is up but shows as down):

[trafodion@rmarton-2 tmp]$ ls -ld hsper*
drwxr-xr-x 2 ambari-qa hadoop    4096 Apr 10 23:36 hsperfdata_ambari-qa
drwxr-xr-x 2 hbase     hadoop    4096 Apr 10 14:56 hsperfdata_hbase
drwxr-xr-x 2 hcat      hadoop    4096 Apr 10 14:45 hsperfdata_hcat
drwxr-xr-x 2 hdfs      hadoop    4096 Apr 10 14:38 hsperfdata_hdfs
drwxr-xr-x 2 hive      hadoop    4096 Apr 10 14:45 hsperfdata_hive
drwxr-xr-x 2 mapred    hadoop    4096 Apr 10 14:44 hsperfdata_mapred
drwxr-xr-x 2 root      root      4096 Apr 10 21:41 hsperfdata_root
drwxr-xr-x 2      1003 trafodion 4096 Apr 10 20:21 hsperfdata_trafodion
drwxr-xr-x 2 yarn      hadoop    4096 Apr 10 14:43 hsperfdata_yarn
drwxr-xr-x 2 zookeeper hadoop    4096 Apr 10 14:49 hsperfdata_zookeeper
[trafodion@rmarton-2 tmp]$ id
uid=512(trafodion) gid=503(trafodion) groups=503(trafodion)
[trafodion@rmarton-2 tmp]$ jps
[trafodion@rmarton-2 tmp]$ id
uid=512(trafodion) gid=503(trafodion) groups=503(trafodion)
[trafodion@rmarton-2 tmp]$ dcscheck

DcsMaster is not started. Please start DCS using 'dcsstart' command...

Process      Configured   Actual   Down
---------    ----------   ------   ----
DcsMaster    2            0        2
DcsServer    2            0        2
mxosrvr      16           16

[trafodion@rmarton-2 tmp]$ cdw
[trafodion@rmarton-2 apache-trafodion_server-2.1.0]$ cd dcs-2.1.0/
[trafodion@rmarton-2 dcs-2.1.0]$ bin/zkcli
-bash: bin/zkcli: No such file or directory
[trafodion@rmarton-2 dcs-2.1.0]$ bin/dcs zkcli
Connecting to rmarton-2.novalocal:2181
Welcome to ZooKeeper!
JLine support is enabled

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: rmarton-2.novalocal:2181(CONNECTED) 0] ls /trafodion/dcs/master
[rmarton-2.novalocal:23400:100:1491861269695]
[zk: rmarton-2.novalocal:2181(CONNECTED) 1] quit
Quitting...
[trafodion@rmarton-2 dcs-2.1.0]$
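
In the session above, hsperfdata_trafodion is owned by uid 1003 while the current trafodion user is uid 512, which is why jps prints nothing. A check script could detect this condition directly instead of trusting an empty jps result. The following is only a sketch: the function name hsperfdata_owner_mismatch is hypothetical and is not part of dcscheck or any installer.

```python
import os


def hsperfdata_owner_mismatch(path, expected_uid):
    """Return True if `path` exists and is owned by a uid other than `expected_uid`.

    A missing path is not a mismatch: the JVM creates the directory on demand.
    """
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return False
    return st.st_uid != expected_uid


if __name__ == '__main__':
    # Path taken from the issue; compare against the user running the check.
    if hsperfdata_owner_mismatch('/tmp/hsperfdata_trafodion', os.getuid()):
        print('hsperfdata directory is owned by another uid; jps output is unreliable')
```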


> after uninstall, and reinstall again, dcscheck report dcs master not up
> -----------------------------------------------------------------------
>
>                 Key: TRAFODION-1989
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1989
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: installer
>            Reporter: liu ming
>
> On a cluster, install a version of Trafodion,
> then use trafodion_uninstaller to remove the installation.
> Later, on the same cluster, install again via trafodion_install.
> After the new installation, dcscheck fails.
> jps does not show the DCS Java processes, even though those processes are up and running.
> This is due to a legacy /tmp/hsperfdata_trafodion directory that belongs to the previous trafodion user. The new user cannot write into this directory, but jps relies on it to work correctly, and dcscheck in turn relies on jps.
> Either the installer or the uninstaller should remove that legacy directory.


