Posted to common-issues@hadoop.apache.org by "Sammi Chen (JIRA)" <ji...@apache.org> on 2019/01/16 14:44:00 UTC

[jira] [Commented] (HADOOP-15616) Incorporate Tencent Cloud COS File System Implementation

    [ https://issues.apache.org/jira/browse/HADOOP-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16744103#comment-16744103 ] 

Sammi Chen commented on HADOOP-15616:
-------------------------------------

[~yuyang733], thanks for working on this. I went through the 007 patch briefly; here are my findings.
 # Is the last access time of files and directories tracked by COS? Some applications depend on this information.
 # "cosn" is used as the scheme for Tencent COS. Is there a specific reason for adding the "n" to the scheme? Both "cos" and "cosn" appear in config keys; a single unified name is expected.
 # Site content is not provided. Site content teaches users what this feature is and how to use it. The site should look like this: https://hadoop.apache.org/docs/r3.1.1/hadoop-azure-datalake/index.html
 # I suggest putting all static final constants into one file instead of spreading them over several files. For example, move static final fields such as "SCHEME" and "COS_MAX_LISTING_LENGTH" to Constants.java.
 # Regarding "cos-hadoop-plugin-v5.3": is v5.3 the cos_api API version? Can the version be obtained dynamically?
 # The default thread pool sizes for the upload, download and copy actions seem too large; please use reasonable values. One module must not occupy all resources.
 # The upload, download and copy thread pools are shared among all files of one COS filesystem instance, so one big file may starve other files. I suggest using SemaphoredDelegatingExecutor to share resources fairly between files. This is only an improvement suggestion; we can do it as a follow-on.
 # Isn't Long.MAX_VALUE milliseconds too long to wait for the thread pool to close?
   this.boundedCopyThreadPool.awaitTermination(Long.MAX_VALUE, TimeUnit.MILLISECONDS);
 # Wildcard imports (import *) are not a recommended coding style in Hadoop.
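
To illustrate the thread pool points above (pool size and shutdown wait), here is a minimal sketch using only java.util.concurrent. It is not taken from the patch: the class name BoundedPoolShutdown, the cap MAX_POOL_THREADS = 8 and the 30 second timeout are hypothetical values chosen for illustration, and the right numbers for COS would need to be decided separately.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BoundedPoolShutdown {
  // Hypothetical upper bound so that one module cannot take all threads.
  static final int MAX_POOL_THREADS = 8;

  // Pick a pool size between 1 and MAX_POOL_THREADS based on available cores.
  static int defaultPoolSize() {
    int cores = Runtime.getRuntime().availableProcessors();
    return Math.max(1, Math.min(MAX_POOL_THREADS, cores));
  }

  // Stop accepting work, wait a bounded time, then cancel what is left.
  // Returns true only if the pool reached the terminated state.
  static boolean shutdownGracefully(ExecutorService pool, long timeout, TimeUnit unit) {
    pool.shutdown();
    try {
      if (pool.awaitTermination(timeout, unit)) {
        return true;
      }
      // Timed out: interrupt running tasks and wait once more.
      pool.shutdownNow();
      return pool.awaitTermination(timeout, unit);
    } catch (InterruptedException e) {
      pool.shutdownNow();
      // Preserve the interrupt status for the caller.
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(defaultPoolSize());
    pool.submit(() -> System.out.println("copy part done"));
    // Hypothetical timeout; far shorter than Long.MAX_VALUE milliseconds.
    boolean clean = shutdownGracefully(pool, 30, TimeUnit.SECONDS);
    System.out.println("terminated cleanly: " + clean);
  }
}
```

With a bounded wait, close() cannot hang indefinitely on a stuck upload or copy task; the trade-off is that tasks still running after the timeout are interrupted, so the caller has to decide how to report or clean up partial work.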

> Incorporate Tencent Cloud COS File System Implementation
> --------------------------------------------------------
>
>                 Key: HADOOP-15616
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15616
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/cos
>            Reporter: Junping Du
>            Assignee: YangY
>            Priority: Major
>         Attachments: HADOOP-15616.001.patch, HADOOP-15616.002.patch, HADOOP-15616.003.patch, HADOOP-15616.004.patch, HADOOP-15616.005.patch, HADOOP-15616.006.patch, HADOOP-15616.007.patch, Tencent-COS-Integrated.pdf
>
>
> Tencent Cloud is one of the top two cloud vendors in the Chinese market, and its object store COS ([https://intl.cloud.tencent.com/product/cos]) is widely used among China’s cloud users. However, it is currently hard for Hadoop users to access data stored on COS because Hadoop has no native support for it.
> This work aims to integrate Tencent Cloud COS with Hadoop/Spark/Hive, just as was done before for S3, ADL, OSS, etc. With simple configuration, Hadoop applications can read and write data on COS without any code change.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
