Posted to issues@ozone.apache.org by "Ruhul (Jira)" <ji...@apache.org> on 2021/03/08 11:38:00 UTC

[jira] [Commented] (HDDS-4893) Issues with running on kubernetes cluster

    [ https://issues.apache.org/jira/browse/HDDS-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297287#comment-17297287 ] 

Ruhul commented on HDDS-4893:
-----------------------------

On 3/8/21 10:08 AM,  wrote:
Hi,
 
 Do you have a running OM pod? Can you use the cluster?
 
 For example, with this:
 
 kubectl exec -it scm-0
 ozone sh volume create /vol1
 
------------------------
 
The OM pod does not get created. It stays in the Pending state, as shown below, because it is waiting for csi-node to complete. The csi-node attempts to start multiple times but fails with the error shown in my original issue. I am pulling the images from my private Docker Hub repository; they were created with the following command (built from the latest main, ozone-1.1.0-SNAPSHOT):
 
mvn clean install -f pom.xml -DskipTests -Pdocker-build,docker-push -Ddocker.image=myhub/images:ozone
 
I also tried the 1.0.0 GA release binaries with the default images provided by the scripts instead of my private images, and the same problem occurs there as well.
 
I started the deployment with the following command, run from /home/ubuntu/packages/ozone-1.1.0-SNAPSHOT/kubernetes/examples/ozone:
ubuntu@master ~/p/o/k/e/ozone> kubectl apply -R -f .

 
Let me know if you need any more info.
 
ubuntu@master ~/p/o/k/e/ozone> kubectl get pods
NAME                                        READY   STATUS    RESTARTS   AGE
csi-node-87snq                              2/2     Running   2          2m54s
csi-node-cjfp8                              2/2     Running   2          2m54s
csi-node-l7jh6                              2/2     Running   2          2m54s
csi-provisioner-54dbfd7487-96ctc            2/2     Running   0          2m55s
datanode-0                                  0/1     Pending   0          2m55s
freon-bc8c784d7-9qtq2                       1/1     Running   0          2m55s
om-0                                        0/1     Pending   0          2m55s
ozone-csi-test-webserver-6fd6d7c68b-tvdft   0/1     Pending   0          2m55s
s3g-0                                       0/1     Pending   0          2m54s
scm-0                                       0/1     Pending   0          2m54s
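
For reference, the Pending pods can be pulled out of a listing like the one above with a short filter. This is only a minimal sketch that reads a saved copy of the output; the function name `pending_pods` is illustrative, and the sample input is an excerpt of the listing above. Against a live cluster, `kubectl get pods --field-selector=status.phase=Pending` should give an equivalent list.

```shell
#!/bin/sh
# Print the names of pods whose STATUS column is "Pending".
# Reads `kubectl get pods` style output on stdin and skips the header line.
pending_pods() {
  awk 'NR > 1 && $3 == "Pending" { print $1 }'
}

# Sample input: an excerpt of the pod listing above.
pending_pods <<'EOF'
NAME                                        READY   STATUS    RESTARTS   AGE
csi-node-87snq                              2/2     Running   2          2m54s
datanode-0                                  0/1     Pending   0          2m55s
om-0                                        0/1     Pending   0          2m55s
scm-0                                       0/1     Pending   0          2m54s
EOF
```

Running `kubectl describe pod om-0` (or any other name printed by the filter) shows the events recorded for that pod, which is usually where the reason for a Pending state appears.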
 
 

> Issues with running on kubernetes cluster
> -----------------------------------------
>
>                 Key: HDDS-4893
>                 URL: https://issues.apache.org/jira/browse/HDDS-4893
>             Project: Apache Ozone
>          Issue Type: Test
>          Components: Ozone Manager
>    Affects Versions: 1.0.0, 1.1.0
>            Reporter: Ruhul
>            Priority: Major
>             Fix For: 1.0.0, 1.1.0
>
>
> Getting the following error from csi-node while running in a Kubernetes cluster using the kubernetes/examples/ozone manifests.
> 2021-03-02 20:01:26 WARN OMProxyInfo:48 - OzoneManager address om-0:9862 for serviceID null remains unresolved for node ID null Check your ozone-site.xml file to ensure ozone manager addresses are configured properly.
> 2021-03-02 20:01:29 INFO RetryInvocationHandler:411 - com.google.protobuf.ServiceException: java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "om-0":9862; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost, while invoking $Proxy15.submitRequest over nodeId=null,nodeAddress=om-0:9862 after 1 failover attempts. Trying to failover after sleeping for 4000ms.
> My config-configmap.yaml
> apiVersion: v1
> kind: ConfigMap
> metadata:
>  name: config
> data:
>  OZONE-SITE.XML_hdds.datanode.dir: /data/storage
>  OZONE-SITE.XML_ozone.scm.datanode.id.dir: /data
>  OZONE-SITE.XML_ozone.metadata.dirs: /data/metadata
>  OZONE-SITE.XML_ozone.scm.block.client.address: scm-0.scm
>  OZONE-SITE.XML_ozone.om.address: om-0.om
>  OZONE-SITE.XML_ozone.scm.client.address: scm-0.scm
>  OZONE-SITE.XML_ozone.scm.names: scm-0.scm
>  OZONE-SITE.XML_hdds.scm.safemode.min.datanode: "3"
>  LOG4J.PROPERTIES_log4j.rootLogger: INFO, stdout
>  LOG4J.PROPERTIES_log4j.appender.stdout: org.apache.log4j.ConsoleAppender
>  LOG4J.PROPERTIES_log4j.appender.stdout.layout: org.apache.log4j.PatternLayout
>  LOG4J.PROPERTIES_log4j.appender.stdout.layout.ConversionPattern: '%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n'
>  OZONE-SITE.XML_ozone.csi.s3g.address: http://s3g-0.s3g:9878
>  OZONE-SITE.XML_ozone.csi.socket: /var/lib/csi/csi.sock
>  OZONE-SITE.XML_ozone.csi.owner: hadoop
>  CORE-SITE.XML_fs.ofs.impl: org.apache.hadoop.fs.ozone.RootedOzoneFileSystem
>  CORE-SITE.XML_fs.o3fs.impl: org.apache.hadoop.fs.ozone.OzoneFileSystem
>  CORE-SITE.XML_fs.trash.interval: "1"
>  CORE-SITE.XML_fs.defaultFS: ofs://om-0.om
>  OZONE-SITE.XML_ozone.security.enabled: "false"
>  OZONE-SITE.XML_hdds.scm.safemode.min.datanode: "3"
>  OZONE-SITE.XML_ozone.om.http-address: om-0:9874
>  OZONE-SITE.XML_ozone.replication: "3"
>  no_proxy: om-0.om,scm-0.om,s3g-0.om,recon,kdc,localhost,127.0.0.1
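
The log above reports the unresolved address as `om-0:9862`, while most entries in this ConfigMap use the `om-0.om` form. One thing that can be checked is whether any address entry uses a bare pod name without the service suffix. The following is a minimal sketch, assuming the ConfigMap data is available as plain text; the function name `short_hosts` is hypothetical, and the check is a heuristic rather than a confirmed diagnosis of this issue.

```shell
#!/bin/sh
# List "address" entries whose host part contains no dot,
# i.e. a bare name such as om-0 instead of om-0.om.
short_hosts() {
  awk '
    /address:/ {
      key = $1; sub(/:$/, "", key)       # drop the trailing colon from the key
      host = $2
      sub(/^[a-z0-9]+:\/\//, "", host)   # strip a scheme such as http://
      sub(/:[0-9]+$/, "", host)          # strip a trailing :port
      if (host !~ /\./) print key " -> " host
    }'
}

# Sample input: three entries copied from the ConfigMap above.
short_hosts <<'EOF'
 OZONE-SITE.XML_ozone.om.address: om-0.om
 OZONE-SITE.XML_ozone.csi.s3g.address: http://s3g-0.s3g:9878
 OZONE-SITE.XML_ozone.om.http-address: om-0:9874
EOF
```

With the sample entries, only the `ozone.om.http-address` line is reported, since it is the one that names `om-0` without the `.om` suffix.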


