Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/11/08 11:46:45 UTC

[GitHub] [dolphinscheduler] Radeity opened a new issue, #12821: [Improvement][Resource Center] Compress before uploading to object storage

Radeity opened a new issue, #12821:
URL: https://github.com/apache/dolphinscheduler/issues/12821

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar feature requirement.
   
   
   ### Description
   
   In the current version, DS supports `HDFS`, `OSS`, and `S3` as the storage layer of the resource center. Nowadays, most storage capacity is provided by cloud storage services, which means OSS and S3 are in wide use. However, the charge depends on the size of the stored objects. I've noticed that in our implementation, we just import the package and use the client SDK provided by the cloud vendor to upload objects, such as:
   ```java
   import com.amazonaws.services.s3.*;
   
   public boolean mkdir(String tenantCode, ...) {
       ...
       s3Client.putObject(putObjectRequest);
       ...
   }
   
   public void vimFile(String tenantCode,...) {
       ...
       S3Object o = s3Client.getObject(BUCKET_NAME, srcFilePath);
       ...
   }
   ```
   I've explored the source code and found that the client SDK ultimately puts the raw object into the request content and uploads it via an HTTP request. HTTP has its own standard for compressing transmitted data with the `gzip` algorithm, which can reduce network I/O. However, the data is decompressed on the receiving side, so the size of the stored object remains unchanged.
   
   Therefore, we can use a better compression algorithm such as `Zstandard` to compress a file or directory before `putObject`, and decompress objects after `getObject`. By introducing this client-side compression step, DS can reduce the size of stored objects, with the saving depending on how well the algorithm compresses the data.
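
The compress-before-upload and decompress-after-download idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the JDK ships no Zstandard codec, so GZIP from `java.util.zip` stands in for it here, and the class and method names (`ResourceCompression`, `compress`, `decompress`) are hypothetical and not part of DolphinScheduler. The S3 calls appear only as comments marking where the helpers would plug in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; GZIP is a stand-in for Zstandard, which would need a third-party library.
final class ResourceCompression {

    // Compress the raw bytes before handing them to the storage client,
    // i.e. the step that would run just before s3Client.putObject(...).
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(raw);
        }
        // The stream is closed here, so the GZIP trailer has been written.
        return buffer.toByteArray();
    }

    // Restore the original bytes after download,
    // i.e. the step that would run on the content returned by s3Client.getObject(...).
    static byte[] decompress(byte[] stored) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(stored))) {
            byte[] chunk = new byte[8192];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Repetitive text compresses well; real resource files will vary.
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            text.append("select * from table_a where dt = '2022-11-08';\n");
        }
        byte[] raw = text.toString().getBytes(StandardCharsets.UTF_8);

        byte[] stored = compress(raw);        // what would be uploaded
        byte[] restored = decompress(stored); // what the caller would read back

        System.out.println("raw bytes:    " + raw.length);
        System.out.println("stored bytes: " + stored.length);
        System.out.println("round trip ok: " + Arrays.equals(raw, restored));
    }
}
```

One design point this sketch leaves open: objects written before such a change would be stored uncompressed, so a real implementation would probably need a marker (for example a key suffix or object metadata) to tell compressed objects from raw ones.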
   
   
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] Radeity commented on issue #12821: [Improvement][Resource Center] Compress before uploading to object storage

Posted by GitBox <gi...@apache.org>.
Radeity commented on issue #12821:
URL: https://github.com/apache/dolphinscheduler/issues/12821#issuecomment-1308277230

   @SbloodyS Hi, you forgot to set the milestone.




[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #12821: [Improvement][Resource Center] Compress before uploading to object storage

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #12821:
URL: https://github.com/apache/dolphinscheduler/issues/12821#issuecomment-1344934949

   This issue has been automatically marked as stale because it has not had any activity for 30 days. It will be closed in the next 7 days if no further activity occurs.




[GitHub] [dolphinscheduler] Radeity commented on issue #12821: [Improvement][Resource Center] Compress before uploading to object storage

Posted by GitBox <gi...@apache.org>.
Radeity commented on issue #12821:
URL: https://github.com/apache/dolphinscheduler/issues/12821#issuecomment-1307084468

   If anyone can confirm that the client-side SDK already supports an effective compression algorithm like the one mentioned above (such as Zstandard), feel free to correct me!




[GitHub] [dolphinscheduler] github-actions[bot] closed issue #12821: [Improvement][Resource Center] Compress before uploading to object storage

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed issue #12821: [Improvement][Resource Center] Compress before uploading to object storage
URL: https://github.com/apache/dolphinscheduler/issues/12821




[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #12821: [Improvement][Resource Center] Compress before uploading to object storage

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #12821:
URL: https://github.com/apache/dolphinscheduler/issues/12821#issuecomment-1307081155

   Thank you for your feedback. We have received your issue; please wait patiently for a reply.
   * To help us understand your request as quickly as possible, please provide detailed information, the version, or pictures.
   * If you haven't received a reply for a long time, you can [join our slack](https://s.apache.org/dolphinscheduler-slack) and send your question to the channel `#troubleshooting`.




[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #12821: [Improvement][Resource Center] Compress before uploading to object storage

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #12821:
URL: https://github.com/apache/dolphinscheduler/issues/12821#issuecomment-1356917973

   This issue has been closed because it has not received a response for too long. You can reopen it if you encounter similar problems in the future.




[GitHub] [dolphinscheduler] SbloodyS commented on issue #12821: [Improvement][Resource Center] Compress before uploading to object storage

Posted by GitBox <gi...@apache.org>.
SbloodyS commented on issue #12821:
URL: https://github.com/apache/dolphinscheduler/issues/12821#issuecomment-1308298249

   > @SbloodyS Hi, you forgot to set the milestone.
   
   Since this is not urgent, please take your time. I will set the milestone after you submit the PR. 👍




[GitHub] [dolphinscheduler] Radeity commented on issue #12821: [Improvement][Resource Center] Compress before uploading to object storage

Posted by GitBox <gi...@apache.org>.
Radeity commented on issue #12821:
URL: https://github.com/apache/dolphinscheduler/issues/12821#issuecomment-1308335347

   > > @SbloodyS Hi, you forgot to set the milestone.
   > 
   > Since this is not urgent, please take your time. I will set the milestone after you submit the PR. 👍
   
   Okay, I'll finish it as soon as I can.

