Posted to dev@carbondata.apache.org by GitBox <gi...@apache.org> on 2021/11/25 14:02:42 UTC

[GitHub] [carbondata] bieremayi opened a new pull request #4239: Supplementary information for add segment syntax .

bieremayi opened a new pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239


    ### Why is this PR needed?

    Supplementary information for the add segment syntax.

    ### What changes were proposed in this PR?

    1. Add the add segment option (partition).
    2. Link segment-management-on-carbondata.md to addsegment-guide.md.

    ### Does this PR introduce any user interface change?
    - No

    ### Is any new testcase added?
    - No
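
    For illustration, the three options could be combined in one statement roughly as sketched below. The carbon table name `default.log_carbon_par` is hypothetical. The path and partition columns are borrowed from the Hive table example in the diff under review. Whether `path` should point at the table root or at a partition directory is an assumption that this thread does not confirm.

```sql
-- Sketch only: default.log_carbon_par is a hypothetical carbon table whose
-- schema is assumed to match the Hive table default.log_parquet_par.
ALTER TABLE default.log_carbon_par ADD SEGMENT OPTIONS (
  'path' = 'hdfs://bieremayi/user/hive/warehouse/log_parquet_par',
  'format' = 'parquet',
  -- Partition columns in "name:type" form, as the PARTITION option describes.
  'partition' = 'day:int,hour:int,type:int'
);
```

    As the guide text in the diff notes, add segment infers the schema from the data and throws an exception if it does not match the carbon table.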
   
       
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] chenliang613 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
chenliang613 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-985950253


   @bieremayi thanks for your pull request, please revert some typo issues.





[GitHub] [carbondata] Indhumathi27 commented on pull request #4239: Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-984304973


   @bieremayi please raise a JIRA ticket and add the same to your PR





[GitHub] [carbondata] chenliang613 commented on a change in pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
chenliang613 commented on a change in pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#discussion_r762373056



##########
File path: docs/addsegment-guide.md
##########
@@ -27,16 +27,144 @@ Heterogeneous format segments aims to solve this problem by avoiding data conver
 ### Add segment with path and format
 Users can add the existing data as a segment to the carbon table provided the schema of the data
  and the carbon table should be the same. 
+ 
+ Syntax
+ 
+   ```
+   ALTER TABLE [db_name.]table_name ADD SEGMENT OPTIONS(property_name=property_value, ...)
+   ```
+
+**Supported properties:**
+
+| Property                                                     | Description                                                  |
+| ------------------------------------------------------------ | ------------------------------------------------------------ |
+| [PATH](#path)           | User external old table path         |
+| [FORMAT](#format)       | User external old table file format             |
+| [PARTITION](#partition) | Extract partition info for partition table , should be form of "a:int, b:string"             |
+
+
+-
+  You can use the following options to add segment:
+
+  - ##### PATH: 
+    User old table path.
+    
+    ``` 
+    OPTIONS('PATH'='hdfs://usr/oldtable')
+    ```
+
+  - ##### FORMAT:
+   User old table file format. eg : parquet, orc
+
+    ```
+    OPTIONS('FORMAT'='parquet')
+    ```
+  - ##### PARTITION:
+   Extract partition info for partition table , should be form of "a:int, b:string"
+
+    ```
+    OPTIONS('PARTITION'='a:int, b:string')
+    ```
+  
 
-```
-alter table table_name add segment options ('path'= 'hdfs://usr/oldtable','format'='parquet')
-```
 In the above command user can add the existing data to the carbon table as a new segment and also
  can provide the data format.
 
 During add segment, it will infer the schema from data and validates the schema against the carbon table. 
 If the schema doesn’t match it throws an exception.
 
+**Example:**
+
+Exist old hive partition table , stored as orc or parquet file format:
+
+
+```sql
+CREATE TABLE default.log_parquet_par (
+	id BIGINT,
+	event_time BIGINT,
+	ip STRING
+)PARTITIONED BY (                              
+	day INT,                                    
+	hour INT,                                   
+	type INT                                    
+)                                              
+STORED AS parquet
+LOCATION 'hdfs://bieremayi/user/hive/warehouse/log_parquet_par';
+```
+
+Parquet File Location : 
+
+```
+25.1 K  75.2 K  /user/hive/warehouse/log_parquet_par/day=20211123/hour=12/type=0

Review comment:
       please remove these info : 25.1 K 75.2 K, same comments for other parts in this PR







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997203684


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6168/
   





[GitHub] [carbondata] brijoobopanna commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
brijoobopanna commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-992347579


   retest this please
   





[GitHub] [carbondata] Indhumathi27 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997994255


   LGTM





[GitHub] [carbondata] bieremayi commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
bieremayi commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-984544760


   > @bieremayi please raise a JIRA ticket and add the same to your PR
   
   I've already done it.





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-981618382


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4398/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-986014290


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6143/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997801285









[GitHub] [carbondata] chenliang613 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
chenliang613 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997973492


   LGTM





[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4239: Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#discussion_r760773600



##########
File path: docs/segment-management-on-carbondata.md
##########
@@ -207,4 +208,4 @@ concept which helps to maintain consistency of data and easy transaction managem
     spark.sql("select count(empno) from carbon.input.segments.db.carbontable_Multi_Thread").show();
      }
    }
-  ```
+  ```

Review comment:
       please revert this change







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-981607132


   Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/532/
   





[GitHub] [carbondata] chenliang613 commented on a change in pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
chenliang613 commented on a change in pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#discussion_r762373098



##########
File path: docs/addsegment-guide.md
##########
@@ -27,16 +27,144 @@ Heterogeneous format segments aims to solve this problem by avoiding data conver
 ### Add segment with path and format
 Users can add the existing data as a segment to the carbon table provided the schema of the data
  and the carbon table should be the same. 
+ 
+ Syntax
+ 
+   ```
+   ALTER TABLE [db_name.]table_name ADD SEGMENT OPTIONS(property_name=property_value, ...)
+   ```
+
+**Supported properties:**
+
+| Property                                                     | Description                                                  |
+| ------------------------------------------------------------ | ------------------------------------------------------------ |
+| [PATH](#path)           | User external old table path         |
+| [FORMAT](#format)       | User external old table file format             |
+| [PARTITION](#partition) | Extract partition info for partition table , should be form of "a:int, b:string"             |
+
+
+-
+  You can use the following options to add segment:
+
+  - ##### PATH: 
+    User old table path.
+    
+    ``` 
+    OPTIONS('PATH'='hdfs://usr/oldtable')
+    ```
+
+  - ##### FORMAT:
+   User old table file format. eg : parquet, orc
+
+    ```
+    OPTIONS('FORMAT'='parquet')
+    ```
+  - ##### PARTITION:
+   Extract partition info for partition table , should be form of "a:int, b:string"
+
+    ```
+    OPTIONS('PARTITION'='a:int, b:string')
+    ```
+  
 
-```
-alter table table_name add segment options ('path'= 'hdfs://usr/oldtable','format'='parquet')
-```
 In the above command user can add the existing data to the carbon table as a new segment and also
  can provide the data format.
 
 During add segment, it will infer the schema from data and validates the schema against the carbon table. 
 If the schema doesn’t match it throws an exception.
 
+**Example:**
+
+Exist old hive partition table , stored as orc or parquet file format:
+
+
+```sql
+CREATE TABLE default.log_parquet_par (
+	id BIGINT,
+	event_time BIGINT,
+	ip STRING
+)PARTITIONED BY (                              
+	day INT,                                    
+	hour INT,                                   
+	type INT                                    
+)                                              
+STORED AS parquet
+LOCATION 'hdfs://bieremayi/user/hive/warehouse/log_parquet_par';
+```
+
+Parquet File Location : 
+
+```
+25.1 K  75.2 K  /user/hive/warehouse/log_parquet_par/day=20211123/hour=12/type=0
+8.7 K   26.2 K  /user/hive/warehouse/log_parquet_par/day=20211123/hour=12/type=1

Review comment:
       please remove : 8.7 K 26.2 K







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-986008865


   Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/534/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-992482259


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4420/
   





[GitHub] [carbondata] asfgit closed pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239


   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-984645969


   Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/533/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-987858746


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4401/
   





[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#discussion_r768320746



##########
File path: docs/addsegment-guide.md
##########
@@ -27,16 +27,144 @@ Heterogeneous format segments aims to solve this problem by avoiding data conver
 ### Add segment with path and format
 Users can add the existing data as a segment to the carbon table provided the schema of the data
  and the carbon table should be the same. 
+ 
+ Syntax
+ 
+   ```
+   ALTER TABLE [db_name.]table_name ADD SEGMENT OPTIONS(property_name=property_value, ...)
+   ```
+
+**Supported properties:**
+
+| Property                                                     | Description                                                  |
+| ------------------------------------------------------------ | ------------------------------------------------------------ |
+| [PATH](#path)           | User external old table path         |
+| [FORMAT](#format)       | User external old table file format             |
+| [PARTITION](#partition) | Partition info for partition table , should be form of "a:int, b:string"             |
+
+
+-
+  You can use the following options to add segment:
+
+  - ##### PATH: 
+    User old table path.
+    
+    ``` 
+    OPTIONS('PATH'='hdfs://usr/oldtable')
+    ```
+
+  - ##### FORMAT:
+   User old table file format. eg : parquet, orc

Review comment:
       please format this line.
   ```suggestion
       User old table file format. eg : parquet, orc
   ```

##########
File path: docs/addsegment-guide.md
##########
@@ -27,16 +27,144 @@ Heterogeneous format segments aims to solve this problem by avoiding data conver
 ### Add segment with path and format
 Users can add the existing data as a segment to the carbon table provided the schema of the data
  and the carbon table should be the same. 
+ 
+ Syntax
+ 
+   ```
+   ALTER TABLE [db_name.]table_name ADD SEGMENT OPTIONS(property_name=property_value, ...)
+   ```
+
+**Supported properties:**
+
+| Property                                                     | Description                                                  |
+| ------------------------------------------------------------ | ------------------------------------------------------------ |
+| [PATH](#path)           | User external old table path         |
+| [FORMAT](#format)       | User external old table file format             |
+| [PARTITION](#partition) | Partition info for partition table , should be form of "a:int, b:string"             |
+
+
+-
+  You can use the following options to add segment:
+
+  - ##### PATH: 
+    User old table path.
+    
+    ``` 
+    OPTIONS('PATH'='hdfs://usr/oldtable')
+    ```
+
+  - ##### FORMAT:
+   User old table file format. eg : parquet, orc
+
+    ```
+    OPTIONS('FORMAT'='parquet')
+    ```
+  - ##### PARTITION:
+   Partition info for partition table , should be form of "a:int, b:string"

Review comment:
       ```suggestion
       Partition info for partition table , should be form of "a:int, b:string"
   ```







[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4239: Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#discussion_r760774669



##########
File path: docs/faq.md
##########
@@ -213,6 +214,11 @@ spark.speculation is a group of configuration, that can monitor trailing tasks a
 
 spark.blacklist.enabled, avoid reduction of available executors due to blacklist mechanism.
 
+## How to manage mix file format in carbondata table
+
+[Heterogeneous format segments in carbondata](./addsegment-guide.md)

Review comment:
       ```suggestion
   Refer [Heterogeneous format segments in carbondata](./addsegment-guide.md)
   ```

##########
File path: docs/faq.md
##########
@@ -29,6 +29,7 @@
 * [Why different time zone result for select query output when query SDK writer output?](#why-different-time-zone-result-for-select-query-output-when-query-sdk-writer-output)
 * [How to check LRU cache memory footprint?](#how-to-check-lru-cache-memory-footprint)
 * [How to deal with the trailing task in query?](#How-to-deal-with-the-trailing-task-in-query)
+* [How to manage mix file format in carbondata table?](#How-to-manage-mix-file-format-in-carbondata-table)

Review comment:
       ```suggestion
   * [How to manage mixed file format in carbondata table?](#How-to-manage-mix-file-format-in-carbondata-table)
   ```







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-984655517


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6142/
   








[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997205438


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4425/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-987845972


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6144/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-984599488


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4399/
   





[GitHub] [carbondata] bieremayi commented on a change in pull request #4239: Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
bieremayi commented on a change in pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#discussion_r760997361



##########
File path: docs/segment-management-on-carbondata.md
##########
@@ -207,4 +208,4 @@ concept which helps to maintain consistency of data and easy transaction managem
     spark.sql("select count(empno) from carbon.input.segments.db.carbontable_Multi_Thread").show();
      }
    }
-  ```
+  ```

Review comment:
       ok!







[GitHub] [carbondata] chenliang613 commented on pull request #4239: Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
chenliang613 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-981504408


   add to whitelist





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-986003785


   Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4400/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-992467232


   Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/554/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-992468631


   Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6163/
   





[GitHub] [carbondata] bieremayi commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
bieremayi commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997199647


   retest this please





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997818335


   Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/565/
   





[GitHub] [carbondata] Indhumathi27 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997674455


   retest this please





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997203341


   Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/559/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-981603944


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6141/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-987842322


   Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/535/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: Supplementary information for add segment syntax .

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-979240803


   Can one of the admins verify this patch?

