Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/12/15 08:38:56 UTC

[GitHub] [doris-spark-connector] luolei604 opened a new pull request, #60: [improvement]Add an option to set the partition size of the final write stage

luolei604 opened a new pull request, #60:
URL: https://github.com/apache/doris-spark-connector/pull/60

   # Proposed changes
   
   Add an option to set the partition size of the final write stage
   
   1. We can raise the parallelism of the computation while lowering the parallelism of writes to Doris, which reduces the compaction pressure caused by writes.
   2. After a Spark RDD is filtered, each partition holds only a few records but the number of partitions is still large, so writes happen too frequently and resources are wasted.
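
The second point can be illustrated with a toy model. This is only a sketch in plain Python, not the connector's code: it assumes one write request per non-empty partition, and the helper names `write_batches` and `coalesce` are hypothetical. The merge rule below (adjacent partitions are grouped together) is a simplification and not Spark's actual coalesce algorithm.

```python
from typing import List


def write_batches(partitions: List[List[int]]) -> int:
    """Count write requests, assuming one request per non-empty partition."""
    return sum(1 for part in partitions if part)


def coalesce(partitions: List[List[int]], target: int) -> List[List[int]]:
    """Merge adjacent partitions into at most `target` groups.

    Simplified stand-in for reducing the partition count before the write.
    """
    if target <= 0:
        raise ValueError("target must be a positive integer")
    if target >= len(partitions):
        return [list(part) for part in partitions]
    merged: List[List[int]] = [[] for _ in range(target)]
    for index, part in enumerate(partitions):
        # Map partition `index` to one of `target` contiguous groups.
        merged[index * target // len(partitions)].extend(part)
    return merged


# 200 partitions of 100 records each; the filter keeps 2 records per partition.
filtered = [
    [r for r in range(i * 100, (i + 1) * 100) if r % 50 == 0]
    for i in range(200)
]
compacted = coalesce(filtered, 8)

print(write_batches(filtered), "write requests before")
print(write_batches(compacted), "write requests after")
```

In this model the same 400 records are written either way, but the number of write requests drops from one per small partition to one per merged partition.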
   
   
   
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   3. Has unit tests been added: (Yes/No/No Need)
   4. Has document been added or modified: (Yes/No/No Need)
   5. Does it need to update dependencies: (Yes/No)
   6. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   Before:
   ![image](https://user-images.githubusercontent.com/39718951/207811635-368b8458-e466-445f-9302-23da1223d827.png)
   
   After:
   ![image](https://user-images.githubusercontent.com/39718951/207810643-30499216-b69f-4f57-bd1c-da950f364eb5.png)
   
   ![image](https://user-images.githubusercontent.com/39718951/207810721-b15e386f-8485-4d13-8353-ba62efdf0e7e.png)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris-spark-connector] hf200012 merged pull request #60: [improvement]Add an option to set the partition size of the final write stage

Posted by GitBox <gi...@apache.org>.
hf200012 merged PR #60:
URL: https://github.com/apache/doris-spark-connector/pull/60




[GitHub] [doris-spark-connector] luolei604 commented on pull request #60: [improvement]Add an option to set the partition size of the final write stage

Posted by GitBox <gi...@apache.org>.
luolei604 commented on PR #60:
URL: https://github.com/apache/doris-spark-connector/pull/60#issuecomment-1352742022

   @hf200012 would you mind taking a look?




[GitHub] [doris-spark-connector] lexluo09 commented on pull request #60: [improvement]Add an option to set the partition size of the final write stage

Posted by GitBox <gi...@apache.org>.
lexluo09 commented on PR #60:
URL: https://github.com/apache/doris-spark-connector/pull/60#issuecomment-1358921022

   > LGTM, and please update the docs on the official website to explain that `doris.sink.task.partition.size` is required, and the difference between setting `doris.sink.task.use.repartition` to true or false.
   
   OK, thank you very much for your advice.




[GitHub] [doris-spark-connector] gnehil commented on pull request #60: [improvement]Add an option to set the partition size of the final write stage

Posted by GitBox <gi...@apache.org>.
gnehil commented on PR #60:
URL: https://github.com/apache/doris-spark-connector/pull/60#issuecomment-1357400384

   LGTM, and please update the docs on the official website to explain that `doris.sink.task.partition.size` is required, and the difference between setting `doris.sink.task.use.repartition` to true or false.
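
As a rough sketch of how the two options named above might fit together: the option keys come from this thread, but everything else here is an assumption. In particular, the helper names `doris_sink_options` and `plan_final_stage` are hypothetical, the default of `false` is assumed, and the mapping of `true` to Spark's `repartition` (full shuffle) and `false` to `coalesce` (no shuffle when reducing partitions) is a reading of the option name that this thread does not confirm.

```python
from typing import Dict


def doris_sink_options(partition_size: int, use_repartition: bool = False) -> Dict[str, str]:
    """Build the two sink options as string key/value pairs (hypothetical helper)."""
    if isinstance(partition_size, bool) or not isinstance(partition_size, int):
        raise ValueError("partition_size must be an integer")
    if partition_size <= 0:
        raise ValueError("partition_size must be positive")
    return {
        "doris.sink.task.partition.size": str(partition_size),
        "doris.sink.task.use.repartition": "true" if use_repartition else "false",
    }


def plan_final_stage(options: Dict[str, str]) -> str:
    """Name the Spark operation the write stage would use (assumed mapping)."""
    size = int(options["doris.sink.task.partition.size"])
    # Assumption: "true" selects repartition, anything else selects coalesce.
    if options.get("doris.sink.task.use.repartition", "false") == "true":
        return f"repartition({size})"
    return f"coalesce({size})"


opts = doris_sink_options(8)
print(plan_final_stage(opts))
print(plan_final_stage(doris_sink_options(8, use_repartition=True)))
```

In a real job, a dictionary like `opts` could be handed to a Spark writer with `.options(**opts)`. Under the assumed mapping, the trade-off is that `coalesce` avoids a shuffle but can lower the parallelism of the upstream computation, while `repartition` keeps upstream parallelism at the cost of a shuffle.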

