Posted to issues@ozone.apache.org by "fapifta (via GitHub)" <gi...@apache.org> on 2023/02/09 20:56:57 UTC

[GitHub] [ozone] fapifta commented on a diff in pull request #4250: HDDS-5966. [HTTPFSGW] Update module doc, and place it in Ozone project docs

fapifta commented on code in PR #4250:
URL: https://github.com/apache/ozone/pull/4250#discussion_r1101836862


##########
hadoop-hdds/docs/content/interface/HttpFS.md:
##########
@@ -0,0 +1,119 @@
+---
+title: HttpFS Gateway
+weight: 4
+menu:
+main:
+parent: "Client Interfaces"
+summary: Ozone HttpFS is a WebHDFS compatible interface implementation, as a separate role it provides an easy integration with Ozone.
+---
+
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+Ozone HttpFS can be used to integrate Ozone with other tools via REST API.
+
+## Introduction
+
+Ozone HttpFS started as forking the HDFS HttpFS endpoint implementation ([HDDS-5448](https://issues.apache.org/jira/browse/HDDS-5448)). It is added a separate role to Ozone, like S3G. 

Review Comment:
   How about this wording: "Ozone HttpFS is forked from the HDFS HttpFS endpoint implementation."?
   Also, in the second sentence here, I believe it should be "added _as_ a separate".



##########
hadoop-hdds/docs/content/interface/HttpFS.md:
##########
@@ -0,0 +1,119 @@
+---
+title: HttpFS Gateway
+weight: 4
+menu:
+main:
+parent: "Client Interfaces"
+summary: Ozone HttpFS is a WebHDFS compatible interface implementation, as a separate role it provides an easy integration with Ozone.
+---
+
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+Ozone HttpFS can be used to integrate Ozone with other tools via REST API.
+
+## Introduction
+
+Ozone HttpFS started as forking the HDFS HttpFS endpoint implementation ([HDDS-5448](https://issues.apache.org/jira/browse/HDDS-5448)). It is added a separate role to Ozone, like S3G. 
+
+HttpFS is a server that provides a REST HTTP gateway supporting File System operations (read and write). And it is interoperable with the **webhdfs** REST HTTP API.
+
+HttpFS can be used to access data in Ozone on a cluster behind of a firewall (the HttpFS server acts as a gateway and is the only system that is allowed to cross the firewall into the cluster).
+
+HttpFS can be used to access data in Ozone using HTTP utilities (such as curl and wget) and HTTP libraries Perl from other languages than Java.
+
+The **webhdfs** client FileSystem implementation can be used to access HttpFS using the Ozone filesystem command line tool (`ozone fs`) as well as from Java applications using the Hadoop FileSystem Java API.
+
+HttpFS has built-in security supporting Hadoop pseudo authentication and HTTP SPNEGO Kerberos and other pluggable authentication mechanisms. It also provides Hadoop proxy user support.

Review Comment:
   Instead of "HTTP SPNEGO Kerberos" I would call it "SPNEGO with Kerberos" or "Kerberos SPNEGO".



##########
hadoop-hdds/docs/content/interface/HttpFS.md:
##########
@@ -0,0 +1,119 @@
+---
+title: HttpFS Gateway
+weight: 4
+menu:
+main:
+parent: "Client Interfaces"
+summary: Ozone HttpFS is a WebHDFS compatible interface implementation, as a separate role it provides an easy integration with Ozone.
+---
+
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+Ozone HttpFS can be used to integrate Ozone with other tools via REST API.
+
+## Introduction
+
+Ozone HttpFS started as forking the HDFS HttpFS endpoint implementation ([HDDS-5448](https://issues.apache.org/jira/browse/HDDS-5448)). It is added a separate role to Ozone, like S3G. 
+
+HttpFS is a server that provides a REST HTTP gateway supporting File System operations (read and write). And it is interoperable with the **webhdfs** REST HTTP API.
+
+HttpFS can be used to access data in Ozone on a cluster behind of a firewall (the HttpFS server acts as a gateway and is the only system that is allowed to cross the firewall into the cluster).
+
+HttpFS can be used to access data in Ozone using HTTP utilities (such as curl and wget) and HTTP libraries Perl from other languages than Java.
+
+The **webhdfs** client FileSystem implementation can be used to access HttpFS using the Ozone filesystem command line tool (`ozone fs`) as well as from Java applications using the Hadoop FileSystem Java API.
+
+HttpFS has built-in security supporting Hadoop pseudo authentication and HTTP SPNEGO Kerberos and other pluggable authentication mechanisms. It also provides Hadoop proxy user support.
+
+
+## Getting started
+
+HttpFS itself is Java Jetty web-application. Ozone HttpFS Gateway is a separated component which provides access to Ozone via REST API. It should be started additional to the regular Ozone components.
+
+You can start a docker based cluster, including the HttpFS gateway from the release package.
+
+Go to the `compose/ozone` directory and start the server:
+
+```bash
+docker-compose up -d --scale datanode=3
+```
+
+You can now see the HttpFS gateway in docker with the name `ozone_httpfs`.
+HttpFS HTTP web-service API calls are HTTP REST calls that map to an Ozone file system operation. For example, using the `curl` Unix command.
+
+E.g. in the docker cluster you can execute commands these:
+
+* `curl -i -X PUT "http://httpfs:14000/webhdfs/v1/vol1?op=MKDIRS&user.name=hdfs"` creates a volume called `vol1`.
+
+
+* `$ curl 'http://httpfs-host:14000/webhdfs/v1/user/foo/README.txt?op=OPEN&user.name=foo'` returns the contents of the `/user/foo/README.txt` key.
+
+
+## Supported operations
+
+These are the WebHDFS REST API operations that are supported/unsupported in Ozone.
+
+### File and Directory Operations

Review Comment:
   For the following tables I would like to suggest using the following terms:
   - supported
   - not implemented in Ozone
   - not implemented in Ozone FileSystem API
   
   For this to be meaningful, we should add the note about the internals I have suggested earlier.
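   
   Just to illustrate, a table using these terms could look roughly like the sketch below (the rows are placeholders, not the real list — OPERATION_X/OPERATION_Y stand in for whatever ends up in each category; MKDIRS and OPEN are taken from the curl examples in this doc):
   
   | Operation | Status |
   |-----------|--------|
   | MKDIRS | supported |
   | OPEN | supported |
   | OPERATION_X | not implemented in Ozone |
   | OPERATION_Y | not implemented in Ozone FileSystem API |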



##########
hadoop-hdds/docs/content/interface/HttpFS.md:
##########
@@ -0,0 +1,119 @@
+---
+title: HttpFS Gateway
+weight: 4
+menu:
+main:
+parent: "Client Interfaces"
+summary: Ozone HttpFS is a WebHDFS compatible interface implementation, as a separate role it provides an easy integration with Ozone.
+---
+
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+Ozone HttpFS can be used to integrate Ozone with other tools via REST API.
+
+## Introduction
+
+Ozone HttpFS started as forking the HDFS HttpFS endpoint implementation ([HDDS-5448](https://issues.apache.org/jira/browse/HDDS-5448)). It is added a separate role to Ozone, like S3G. 
+
+HttpFS is a server that provides a REST HTTP gateway supporting File System operations (read and write). And it is interoperable with the **webhdfs** REST HTTP API.

Review Comment:
   I think we should call HttpFS a service. In the second sentence, I think we do not need the starting "And"; we can simply start with "It is".



##########
hadoop-hdds/docs/content/interface/HttpFS.md:
##########
@@ -0,0 +1,119 @@
+---
+title: HttpFS Gateway
+weight: 4
+menu:
+main:
+parent: "Client Interfaces"
+summary: Ozone HttpFS is a WebHDFS compatible interface implementation, as a separate role it provides an easy integration with Ozone.
+---
+
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+Ozone HttpFS can be used to integrate Ozone with other tools via REST API.
+
+## Introduction
+
+Ozone HttpFS started as forking the HDFS HttpFS endpoint implementation ([HDDS-5448](https://issues.apache.org/jira/browse/HDDS-5448)). It is added a separate role to Ozone, like S3G. 
+
+HttpFS is a server that provides a REST HTTP gateway supporting File System operations (read and write). And it is interoperable with the **webhdfs** REST HTTP API.
+
+HttpFS can be used to access data in Ozone on a cluster behind of a firewall (the HttpFS server acts as a gateway and is the only system that is allowed to cross the firewall into the cluster).
+
+HttpFS can be used to access data in Ozone using HTTP utilities (such as curl and wget) and HTTP libraries Perl from other languages than Java.
+
+The **webhdfs** client FileSystem implementation can be used to access HttpFS using the Ozone filesystem command line tool (`ozone fs`) as well as from Java applications using the Hadoop FileSystem Java API.
+
+HttpFS has built-in security supporting Hadoop pseudo authentication and HTTP SPNEGO Kerberos and other pluggable authentication mechanisms. It also provides Hadoop proxy user support.
+
+
+## Getting started
+
+HttpFS itself is Java Jetty web-application. Ozone HttpFS Gateway is a separated component which provides access to Ozone via REST API. It should be started additional to the regular Ozone components.
+
+You can start a docker based cluster, including the HttpFS gateway from the release package.
+
+Go to the `compose/ozone` directory and start the server:
+
+```bash
+docker-compose up -d --scale datanode=3
+```
+
+You can now see the HttpFS gateway in docker with the name `ozone_httpfs`.

Review Comment:
   What do you think about "You can/should find" as the start of this sentence?
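   
   As a side note for readers of this section, a minimal hedged sketch of how to check for the container could also help — the name filter below simply matches the `ozone_httpfs` name mentioned in the doc; adjust it if your compose project uses a different prefix:
   
   ```bash
   # List running containers and keep only the HttpFS gateway;
   # "httpfs" matches the ozone_httpfs container name from the doc.
   docker ps --filter "name=httpfs"
   ```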



##########
hadoop-hdds/docs/content/interface/HttpFS.md:
##########
@@ -0,0 +1,119 @@
+---
+title: HttpFS Gateway
+weight: 4
+menu:
+main:
+parent: "Client Interfaces"
+summary: Ozone HttpFS is a WebHDFS compatible interface implementation, as a separate role it provides an easy integration with Ozone.
+---
+
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+Ozone HttpFS can be used to integrate Ozone with other tools via REST API.
+
+## Introduction
+
+Ozone HttpFS started as forking the HDFS HttpFS endpoint implementation ([HDDS-5448](https://issues.apache.org/jira/browse/HDDS-5448)). It is added a separate role to Ozone, like S3G. 
+
+HttpFS is a server that provides a REST HTTP gateway supporting File System operations (read and write). And it is interoperable with the **webhdfs** REST HTTP API.
+
+HttpFS can be used to access data in Ozone on a cluster behind of a firewall (the HttpFS server acts as a gateway and is the only system that is allowed to cross the firewall into the cluster).
+
+HttpFS can be used to access data in Ozone using HTTP utilities (such as curl and wget) and HTTP libraries Perl from other languages than Java.
+
+The **webhdfs** client FileSystem implementation can be used to access HttpFS using the Ozone filesystem command line tool (`ozone fs`) as well as from Java applications using the Hadoop FileSystem Java API.
+
+HttpFS has built-in security supporting Hadoop pseudo authentication and HTTP SPNEGO Kerberos and other pluggable authentication mechanisms. It also provides Hadoop proxy user support.
+
+
+## Getting started
+
+HttpFS itself is Java Jetty web-application. Ozone HttpFS Gateway is a separated component which provides access to Ozone via REST API. It should be started additional to the regular Ozone components.

Review Comment:
   In the first sentence here, I would mention that the HttpFS service uses the Ozone FileSystem API implementation internally, like this:
   "HttpFS service itself is a Jetty based web-application that uses the Hadoop FileSystem API to talk to the cluster; it is a separate service which provides access to Ozone via a REST API."
   
   I believe the last sentence should start as: "It should be started additionally to the regular...", or "It should be started in addition to the regular..."
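   
   To make that sentence more concrete for readers, a minimal hedged sketch of accessing the gateway through the FileSystem API from the command line could be added as well — the host, port and the vol1/bucket1 path below are placeholders based on the examples elsewhere in this doc, not tested commands:
   
   ```bash
   # List a path through the HttpFS gateway using the webhdfs client
   # FileSystem implementation; vol1/bucket1 is a hypothetical volume/bucket.
   ozone fs -ls webhdfs://httpfs-host:14000/vol1/bucket1/
   ```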



##########
hadoop-hdds/docs/content/interface/HttpFS.md:
##########
@@ -0,0 +1,119 @@
+---
+title: HttpFS Gateway
+weight: 4
+menu:
+main:
+parent: "Client Interfaces"
+summary: Ozone HttpFS is a WebHDFS compatible interface implementation, as a separate role it provides an easy integration with Ozone.
+---
+
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+Ozone HttpFS can be used to integrate Ozone with other tools via REST API.
+
+## Introduction
+
+Ozone HttpFS started as forking the HDFS HttpFS endpoint implementation ([HDDS-5448](https://issues.apache.org/jira/browse/HDDS-5448)). It is added a separate role to Ozone, like S3G. 
+
+HttpFS is a server that provides a REST HTTP gateway supporting File System operations (read and write). And it is interoperable with the **webhdfs** REST HTTP API.
+
+HttpFS can be used to access data in Ozone on a cluster behind of a firewall (the HttpFS server acts as a gateway and is the only system that is allowed to cross the firewall into the cluster).

Review Comment:
   Instead of "in Ozone on a cluster" I would say "on an Ozone cluster". Here I would also use service instead of server.



##########
hadoop-hdds/docs/content/interface/HttpFS.md:
##########
@@ -0,0 +1,119 @@
+---
+title: HttpFS Gateway
+weight: 4
+menu:
+main:
+parent: "Client Interfaces"
+summary: Ozone HttpFS is a WebHDFS compatible interface implementation, as a separate role it provides an easy integration with Ozone.
+---
+
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+Ozone HttpFS can be used to integrate Ozone with other tools via REST API.
+
+## Introduction
+
+Ozone HttpFS started as forking the HDFS HttpFS endpoint implementation ([HDDS-5448](https://issues.apache.org/jira/browse/HDDS-5448)). It is added a separate role to Ozone, like S3G. 
+
+HttpFS is a server that provides a REST HTTP gateway supporting File System operations (read and write). And it is interoperable with the **webhdfs** REST HTTP API.
+
+HttpFS can be used to access data in Ozone on a cluster behind of a firewall (the HttpFS server acts as a gateway and is the only system that is allowed to cross the firewall into the cluster).
+
+HttpFS can be used to access data in Ozone using HTTP utilities (such as curl and wget) and HTTP libraries Perl from other languages than Java.
+
+The **webhdfs** client FileSystem implementation can be used to access HttpFS using the Ozone filesystem command line tool (`ozone fs`) as well as from Java applications using the Hadoop FileSystem Java API.
+
+HttpFS has built-in security supporting Hadoop pseudo authentication and HTTP SPNEGO Kerberos and other pluggable authentication mechanisms. It also provides Hadoop proxy user support.
+
+
+## Getting started
+
+HttpFS itself is Java Jetty web-application. Ozone HttpFS Gateway is a separated component which provides access to Ozone via REST API. It should be started additional to the regular Ozone components.
+
+You can start a docker based cluster, including the HttpFS gateway from the release package.
+
+Go to the `compose/ozone` directory and start the server:
+
+```bash
+docker-compose up -d --scale datanode=3
+```
+
+You can now see the HttpFS gateway in docker with the name `ozone_httpfs`.
+HttpFS HTTP web-service API calls are HTTP REST calls that map to an Ozone file system operation. For example, using the `curl` Unix command.
+
+E.g. in the docker cluster you can execute commands these:

Review Comment:
   I guess "like" is missing from this part: "commands like these"?
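   
   A couple more examples in the same vein might also help here — purely illustrative, assuming `bucket1` as a hypothetical bucket name and LISTSTATUS being among the supported operations:
   
   ```bash
   # Create a bucket under the vol1 volume created above (illustrative).
   curl -i -X PUT "http://httpfs:14000/webhdfs/v1/vol1/bucket1?op=MKDIRS&user.name=hdfs"
   
   # List the contents of the bucket (assumes LISTSTATUS is supported).
   curl "http://httpfs:14000/webhdfs/v1/vol1/bucket1?op=LISTSTATUS&user.name=hdfs"
   ```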



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

