Posted to dev@ozone.apache.org by "Elek, Marton" <el...@apache.org> on 2021/03/08 09:08:42 UTC

Fwd: Issues with running on Kubernetes cluster

(Forwarding from issues@)

Hi,
I am having issues running Ozone on a Kubernetes cluster using the
kubernetes/examples/ozone references. I tried both the 1.0.0 release
build and the latest build from the main branch. The latest error I am
receiving is the one below, from the csi-node. Am I missing some config
setting in ozone-site.xml?

2021-03-02 20:01:26 WARN  NativeCodeLoader:60 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-03-02 20:01:26 WARN  OMProxyInfo:48 - OzoneManager address om-0:9862 for serviceID null remains unresolved for node ID null Check your ozone-site.xml file to ensure ozone manager addresses are configured properly.
2021-03-02 20:01:29 INFO  RetryInvocationHandler:411 - com.google.protobuf.ServiceException: java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "om-0":9862; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost, while invoking $Proxy15.submitRequest over nodeId=null,nodeAddress=om-0:9862 after 1 failover attempts. Trying to failover after sleeping for 4000ms.


My configmap:

apiVersion: v1
kind: ConfigMap
metadata:
   name: config
data:
   OZONE-SITE.XML_hdds.datanode.dir: /data/storage
   OZONE-SITE.XML_ozone.scm.datanode.id.dir: /data
   OZONE-SITE.XML_ozone.metadata.dirs: /data/metadata
   OZONE-SITE.XML_ozone.scm.block.client.address: scm-0.scm
   OZONE-SITE.XML_ozone.om.address: om-0.om
   OZONE-SITE.XML_ozone.scm.client.address: scm-0.scm
   OZONE-SITE.XML_ozone.scm.names: scm-0.scm
   OZONE-SITE.XML_hdds.scm.safemode.min.datanode: "3"
   LOG4J.PROPERTIES_log4j.rootLogger: INFO, stdout
   LOG4J.PROPERTIES_log4j.appender.stdout: org.apache.log4j.ConsoleAppender
   LOG4J.PROPERTIES_log4j.appender.stdout.layout: org.apache.log4j.PatternLayout
   LOG4J.PROPERTIES_log4j.appender.stdout.layout.ConversionPattern: '%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n'
   OZONE-SITE.XML_ozone.csi.s3g.address: http://s3g-0.s3g:9878
   OZONE-SITE.XML_ozone.csi.socket: /var/lib/csi/csi.sock
   OZONE-SITE.XML_ozone.csi.owner: hadoop

   CORE-SITE.XML_fs.ofs.impl: org.apache.hadoop.fs.ozone.RootedOzoneFileSystem
   CORE-SITE.XML_fs.o3fs.impl: org.apache.hadoop.fs.ozone.OzoneFileSystem
   CORE-SITE.XML_fs.trash.interval: "1"
   CORE-SITE.XML_fs.defaultFS: ofs://om-0.om

   OZONE-SITE.XML_ozone.security.enabled: "false"
   OZONE-SITE.XML_hdds.scm.safemode.min.datanode: "3"
   OZONE-SITE.XML_ozone.om.http-address: om-0:9874
   OZONE-SITE.XML_ozone.replication: "3"
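
A note on the key names above: as far as I understand the Ozone container images, an entry such as OZONE-SITE.XML_ozone.om.address is turned into a property of the named configuration file when the container starts. The following is only a minimal Python sketch of that grouping, assuming the convention is "file name, underscore, property name". The helper name group_by_config_file is hypothetical and is not part of Ozone.

```python
from collections import defaultdict


def group_by_config_file(data):
    """Group 'FILE_property' keys by target file (assumed naming convention)."""
    grouped = defaultdict(dict)
    for key, value in data.items():
        # Split on the first underscore only; the file-name part has none.
        file_name, sep, prop = key.partition("_")
        if not sep or not prop:
            continue  # not in FILE_property form, so skip it
        grouped[file_name.lower()][prop] = value
    return dict(grouped)


# A few entries copied from the configmap above.
sample = {
    "OZONE-SITE.XML_ozone.om.address": "om-0.om",
    "OZONE-SITE.XML_ozone.om.http-address": "om-0:9874",
    "CORE-SITE.XML_fs.defaultFS": "ofs://om-0.om",
    "LOG4J.PROPERTIES_log4j.rootLogger": "INFO, stdout",
}

for file_name, props in group_by_config_file(sample).items():
    print(file_name)
    for prop, value in sorted(props.items()):
        print(f"  {prop} = {value}")
```

Grouped this way, it is easier to see that ozone.om.address uses the host name om-0.om while ozone.om.http-address uses om-0, which may be worth comparing against the om-0:9862 address in the log.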



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ozone.apache.org
For additional commands, e-mail: dev-help@ozone.apache.org


Re: Fwd: Issues with running on Kubernetes cluster

Posted by Ruhul Quddus <rq...@qianalysis.com>.
Hi Marton,

The OM pod does not get created. It stays in the Pending state, as shown below,
because it is waiting for csi-node to complete. The csi-node attempts to
start multiple times, but fails with the error shown in my original issue.
I am pulling the images from my private Docker Hub repository; they were created
using the following command (using the latest main build in ozone-1.1.0-SNAPSHOT):

mvn clean install -f pom.xml -DskipTests -Pdocker-build,docker-push -Ddocker.image=myhub/images:ozone

I also tried the 1.0.0 GA release binaries, without the use of private
images, but using the default images provided by the scripts, and the same
problem occurs there as well.

I started the deployment using the following command, run from
/home/ubuntu/packages/ozone-1.1.0-SNAPSHOT/kubernetes/examples/ozone:
ubuntu@master ~/p/o/k/e/ozone> kubectl apply -R -f .

Let me know if you need any more info.

ubuntu@master ~/p/o/k/e/ozone> kubectl get pods
NAME                                        READY   STATUS    RESTARTS   AGE
csi-node-87snq                              2/2     Running   2          2m54s
csi-node-cjfp8                              2/2     Running   2          2m54s
csi-node-l7jh6                              2/2     Running   2          2m54s
csi-provisioner-54dbfd7487-96ctc            2/2     Running   0          2m55s
datanode-0                                  0/1     Pending   0          2m55s
freon-bc8c784d7-9qtq2                       1/1     Running   0          2m55s
om-0                                        0/1     Pending   0          2m55s
ozone-csi-test-webserver-6fd6d7c68b-tvdft   0/1     Pending   0          2m55s
s3g-0                                       0/1     Pending   0          2m54s
scm-0                                       0/1     Pending   0          2m54s
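
The listing above can also be filtered mechanically to pick out the pods that never left Pending. This is only a small Python sketch, assuming the default kubectl get pods column order (NAME, READY, STATUS, RESTARTS, AGE). The function name pending_pods is made up for illustration, and the sample rows are a subset of the listing above.

```python
def pending_pods(listing):
    """Return the names of pods whose STATUS column is 'Pending'."""
    names = []
    for line in listing.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Assumed column order: NAME READY STATUS RESTARTS AGE
        if len(fields) >= 3 and fields[2] == "Pending":
            names.append(fields[0])
    return names


listing = """
NAME                               READY   STATUS    RESTARTS   AGE
csi-node-87snq                     2/2     Running   2          2m54s
datanode-0                         0/1     Pending   0          2m55s
om-0                               0/1     Pending   0          2m55s
scm-0                              0/1     Pending   0          2m54s
"""

print(pending_pods(listing))
```

For each name it returns, kubectl describe pod <name> lists the events recorded for that pod, which usually say why it has not been scheduled.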

Ruhul

On Mon, Mar 8, 2021 at 4:10 AM Elek, Marton <el...@apache.org> wrote:

>
> Hi,
>
> Do you have a running OM pod? Can you use the cluster?
>
> For example, with this:
>
> kubectl exec -it scm-0 -- ozone sh volume create /vol1
>
>
> Marton
>

Re: Fwd: Issues with running on Kubernetes cluster

Posted by "Elek, Marton" <el...@apache.org>.
Hi,

Do you have a running OM pod? Can you use the cluster?

For example, with this:

kubectl exec -it scm-0 -- ozone sh volume create /vol1


Marton


