Posted to commits@doris.apache.org by "morningman (via GitHub)" <gi...@apache.org> on 2023/01/24 04:28:18 UTC

[GitHub] [doris] morningman commented on pull request #15839: [Feature] support segment builder tool

morningman commented on PR #15839:
URL: https://github.com/apache/doris/pull/15839#issuecomment-1401381547

   Some questions and suggestions:
   
   1. `builder_scanner` doesn't seem to be used anywhere. Is only `builder_scanner_memtable` used?
   2. Need to unify the inputs and outputs:
   	
   	1. Inputs:
   	
   		* A header file in JSON and a data file in Parquet (or ORC, or any other supported file format).
   		* In the code, reads from different file systems can go through one unified interface, so there is no need to branch on `isHDFS`.
   	
   	2. Outputs:
   	
   		* A new header file in JSON and the Doris segment data file.
   
   3. The final upload logic can be encapsulated so that it is not limited to HDFS.
   4. I think we can generate a manifest file that lists all output files, so that downstream systems can read it.
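
For points 2 and 3, something along these lines is what I have in mind. This is only a rough sketch: `FileStore`, `LocalStore`, `MemoryStore`, `scheme_of` and `make_store` are hypothetical names, not existing Doris classes, and the in-memory backend just stands in for a remote store such as HDFS. The idea is that the backend is picked once from the URI scheme, and both reading and uploading go through the same interface.

```cpp
#include <fstream>
#include <iterator>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical interface: one read path and one upload path for every backend.
class FileStore {
public:
    virtual ~FileStore() = default;
    virtual std::string read_all(const std::string& path) = 0;
    virtual void upload(const std::string& path, const std::string& data) = 0;
};

// Local-disk backend.
class LocalStore : public FileStore {
public:
    std::string read_all(const std::string& path) override {
        std::ifstream in(path, std::ios::binary);
        if (!in) throw std::runtime_error("cannot open " + path);
        return std::string(std::istreambuf_iterator<char>(in),
                           std::istreambuf_iterator<char>());
    }
    void upload(const std::string& path, const std::string& data) override {
        std::ofstream out(path, std::ios::binary);
        if (!out) throw std::runtime_error("cannot create " + path);
        out << data;
    }
};

// In-memory backend, standing in here for a remote store (HDFS, S3, ...).
class MemoryStore : public FileStore {
public:
    std::string read_all(const std::string& path) override {
        auto it = files_.find(path);
        if (it == files_.end()) throw std::runtime_error("not found: " + path);
        return it->second;
    }
    void upload(const std::string& path, const std::string& data) override {
        files_[path] = data;
    }

private:
    std::map<std::string, std::string> files_;
};

// Return the scheme of a URI, or "file" when the path has no scheme.
std::string scheme_of(const std::string& uri) {
    const auto pos = uri.find("://");
    return pos == std::string::npos ? "file" : uri.substr(0, pos);
}

// Pick the backend once, here, instead of checking isHDFS at each call site.
std::unique_ptr<FileStore> make_store(const std::string& uri) {
    const std::string scheme = scheme_of(uri);
    if (scheme == "file") return std::make_unique<LocalStore>();
    if (scheme == "mem") return std::make_unique<MemoryStore>();
    throw std::invalid_argument("unsupported scheme: " + scheme);
}
```

With this shape, supporting another file system means adding one more `FileStore` subclass and one more branch in `make_store`; the builder code that reads the inputs and uploads the outputs stays unchanged.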

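For point 4, the manifest could be as simple as one JSON document listing every output file. Again only a sketch under assumptions: the `OutputFile` fields, the `"files"` key, and the `build_manifest` and `json_escape` helpers are all made up for illustration; the real layout should be whatever the downstream system needs.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one output file; the field names are illustrative.
struct OutputFile {
    std::string path;    // location of the file after upload
    std::string type;    // e.g. "header" or "segment"
    std::uint64_t size;  // size in bytes
};

// Escape the characters that must not appear raw inside a JSON string.
std::string json_escape(const std::string& in) {
    static const char* hex = "0123456789abcdef";
    std::ostringstream out;
    for (unsigned char c : in) {
        switch (c) {
        case '"':  out << "\\\""; break;
        case '\\': out << "\\\\"; break;
        case '\n': out << "\\n"; break;
        case '\r': out << "\\r"; break;
        case '\t': out << "\\t"; break;
        default:
            if (c < 0x20) {
                // Remaining control characters are written as \u00XX.
                out << "\\u00" << hex[c >> 4] << hex[c & 0xf];
            } else {
                out << c;
            }
        }
    }
    return out.str();
}

// Serialize the output file list as a single compact JSON object.
std::string build_manifest(const std::vector<OutputFile>& files) {
    std::ostringstream out;
    out << "{\"files\":[";
    for (std::size_t i = 0; i < files.size(); ++i) {
        if (i > 0) out << ",";
        out << "{\"path\":\"" << json_escape(files[i].path) << "\","
            << "\"type\":\"" << json_escape(files[i].type) << "\","
            << "\"size\":" << files[i].size << "}";
    }
    out << "]}";
    return out.str();
}
```

The builder would write this string as the last output file, after all segment files and the new header have been uploaded, so a downstream reader that sees the manifest can assume the files it lists are complete.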

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: commits-help@doris.apache.org