Posted to dev@carbondata.apache.org by GitBox <gi...@apache.org> on 2021/11/25 14:02:42 UTC
[GitHub] [carbondata] bieremayi opened a new pull request #4239: Supplementary information for add segment syntax .
bieremayi opened a new pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239
### Why is this PR needed?
Supplementary information for add segment syntax .
### What changes were proposed in this PR?
1. Document the PARTITION option of the add segment syntax.
2. Link addsegment-guide.md from segment-management-on-carbondata.md.
### Does this PR introduce any user interface change?
- No
### Is any new testcase added?
- No
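For illustration, the PARTITION option this PR documents can be combined with PATH and FORMAT in a single statement. The following is a minimal sketch, assuming the `ALTER TABLE ... ADD SEGMENT OPTIONS(...)` syntax quoted in the review diffs in this thread. The helper name `add_segment_sql` and the target table `default.log_carbon_par` are hypothetical, and whether PATH should point at the table root or at a single partition directory is not stated here.

```python
def add_segment_sql(table, path, fmt, partition=None):
    """Build an ALTER TABLE ... ADD SEGMENT statement (hypothetical helper).

    Values are interpolated as-is; quotes inside values are not escaped.
    """
    options = {"path": path, "format": fmt}
    if partition:
        # PARTITION takes comma-separated "name:type" pairs, e.g. "day:int,hour:int".
        options["partition"] = ",".join(f"{name}:{dtype}" for name, dtype in partition)
    rendered = ", ".join(f"'{key}'='{value}'" for key, value in options.items())
    return f"ALTER TABLE {table} ADD SEGMENT OPTIONS({rendered})"


statement = add_segment_sql(
    "default.log_carbon_par",  # hypothetical carbon table name
    "hdfs://bieremayi/user/hive/warehouse/log_parquet_par",
    "parquet",
    partition=[("day", "int"), ("hour", "int"), ("type", "int")],
)
print(statement)
```

The resulting string could then be passed to `spark.sql(...)` in a CarbonData-enabled Spark session.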
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] chenliang613 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
chenliang613 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-985950253
@bieremayi thanks for your pull request, please revert some typo issues.
[GitHub] [carbondata] Indhumathi27 commented on pull request #4239: Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-984304973
@bieremayi please raise a JIRA ticket and add the same to your PR
[GitHub] [carbondata] chenliang613 commented on a change in pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
chenliang613 commented on a change in pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#discussion_r762373056
##########
File path: docs/addsegment-guide.md
##########
@@ -27,16 +27,144 @@ Heterogeneous format segments aims to solve this problem by avoiding data conver
### Add segment with path and format
Users can add the existing data as a segment to the carbon table provided the schema of the data
and the carbon table should be the same.
+
+ Syntax
+
+ ```
+ ALTER TABLE [db_name.]table_name ADD SEGMENT OPTIONS(property_name=property_value, ...)
+ ```
+
+**Supported properties:**
+
+| Property | Description |
+| ------------------------------------------------------------ | ------------------------------------------------------------ |
+| [PATH](#path) | User external old table path |
+| [FORMAT](#format) | User external old table file format |
+| [PARTITION](#partition) | Extract partition info for partition table , should be form of "a:int, b:string" |
+
+
+-
+ You can use the following options to add segment:
+
+ - ##### PATH:
+ User old table path.
+
+ ```
+ OPTIONS('PATH'='hdfs://usr/oldtable')
+ ```
+
+ - ##### FORMAT:
+ User old table file format. eg : parquet, orc
+
+ ```
+ OPTIONS('FORMAT'='parquet')
+ ```
+ - ##### PARTITION:
+ Extract partition info for partition table , should be form of "a:int, b:string"
+
+ ```
+ OPTIONS('PARTITION'='a:int, b:string')
+ ```
+
-```
-alter table table_name add segment options ('path'= 'hdfs://usr/oldtable','format'='parquet')
-```
In the above command user can add the existing data to the carbon table as a new segment and also
can provide the data format.
During add segment, it will infer the schema from data and validates the schema against the carbon table.
If the schema doesn’t match it throws an exception.
+**Example:**
+
+Exist old hive partition table , stored as orc or parquet file format:
+
+
+```sql
+CREATE TABLE default.log_parquet_par (
+ id BIGINT,
+ event_time BIGINT,
+ ip STRING
+)PARTITIONED BY (
+ day INT,
+ hour INT,
+ type INT
+)
+STORED AS parquet
+LOCATION 'hdfs://bieremayi/user/hive/warehouse/log_parquet_par';
+```
+
+Parquet File Location :
+
+```
+25.1 K 75.2 K /user/hive/warehouse/log_parquet_par/day=20211123/hour=12/type=0
Review comment:
Please remove this info: 25.1 K 75.2 K. The same comment applies to the other parts of this PR.
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997203684
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6168/
[GitHub] [carbondata] brijoobopanna commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
brijoobopanna commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-992347579
retest this please
[GitHub] [carbondata] Indhumathi27 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997994255
LGTM
[GitHub] [carbondata] bieremayi commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
bieremayi commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-984544760
>
> @bieremayi please raise a JIRA ticket and add the same to your PR
I've already done it.
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-981618382
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4398/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-986014290
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6143/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997801285
[GitHub] [carbondata] chenliang613 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
chenliang613 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997973492
LGTM
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4239: Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#discussion_r760773600
##########
File path: docs/segment-management-on-carbondata.md
##########
@@ -207,4 +208,4 @@ concept which helps to maintain consistency of data and easy transaction managem
spark.sql("select count(empno) from carbon.input.segments.db.carbontable_Multi_Thread").show();
}
}
- ```
+ ```
Review comment:
please revert this change
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-981607132
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/532/
[GitHub] [carbondata] chenliang613 commented on a change in pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
chenliang613 commented on a change in pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#discussion_r762373098
##########
File path: docs/addsegment-guide.md
##########
@@ -27,16 +27,144 @@ Heterogeneous format segments aims to solve this problem by avoiding data conver
### Add segment with path and format
Users can add the existing data as a segment to the carbon table provided the schema of the data
and the carbon table should be the same.
+
+ Syntax
+
+ ```
+ ALTER TABLE [db_name.]table_name ADD SEGMENT OPTIONS(property_name=property_value, ...)
+ ```
+
+**Supported properties:**
+
+| Property | Description |
+| ------------------------------------------------------------ | ------------------------------------------------------------ |
+| [PATH](#path) | User external old table path |
+| [FORMAT](#format) | User external old table file format |
+| [PARTITION](#partition) | Extract partition info for partition table , should be form of "a:int, b:string" |
+
+
+-
+ You can use the following options to add segment:
+
+ - ##### PATH:
+ User old table path.
+
+ ```
+ OPTIONS('PATH'='hdfs://usr/oldtable')
+ ```
+
+ - ##### FORMAT:
+ User old table file format. eg : parquet, orc
+
+ ```
+ OPTIONS('FORMAT'='parquet')
+ ```
+ - ##### PARTITION:
+ Extract partition info for partition table , should be form of "a:int, b:string"
+
+ ```
+ OPTIONS('PARTITION'='a:int, b:string')
+ ```
+
-```
-alter table table_name add segment options ('path'= 'hdfs://usr/oldtable','format'='parquet')
-```
In the above command user can add the existing data to the carbon table as a new segment and also
can provide the data format.
During add segment, it will infer the schema from data and validates the schema against the carbon table.
If the schema doesn’t match it throws an exception.
+**Example:**
+
+Exist old hive partition table , stored as orc or parquet file format:
+
+
+```sql
+CREATE TABLE default.log_parquet_par (
+ id BIGINT,
+ event_time BIGINT,
+ ip STRING
+)PARTITIONED BY (
+ day INT,
+ hour INT,
+ type INT
+)
+STORED AS parquet
+LOCATION 'hdfs://bieremayi/user/hive/warehouse/log_parquet_par';
+```
+
+Parquet File Location :
+
+```
+25.1 K 75.2 K /user/hive/warehouse/log_parquet_par/day=20211123/hour=12/type=0
+8.7 K 26.2 K /user/hive/warehouse/log_parquet_par/day=20211123/hour=12/type=1
Review comment:
Please remove this info: 8.7 K 26.2 K
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-986008865
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/534/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-992482259
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4420/
[GitHub] [carbondata] asfgit closed pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-984645969
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/533/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-987858746
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4401/
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#discussion_r768320746
##########
File path: docs/addsegment-guide.md
##########
@@ -27,16 +27,144 @@ Heterogeneous format segments aims to solve this problem by avoiding data conver
### Add segment with path and format
Users can add the existing data as a segment to the carbon table provided the schema of the data
and the carbon table should be the same.
+
+ Syntax
+
+ ```
+ ALTER TABLE [db_name.]table_name ADD SEGMENT OPTIONS(property_name=property_value, ...)
+ ```
+
+**Supported properties:**
+
+| Property | Description |
+| ------------------------------------------------------------ | ------------------------------------------------------------ |
+| [PATH](#path) | User external old table path |
+| [FORMAT](#format) | User external old table file format |
+| [PARTITION](#partition) | Partition info for partition table , should be form of "a:int, b:string" |
+
+
+-
+ You can use the following options to add segment:
+
+ - ##### PATH:
+ User old table path.
+
+ ```
+ OPTIONS('PATH'='hdfs://usr/oldtable')
+ ```
+
+ - ##### FORMAT:
+ User old table file format. eg : parquet, orc
Review comment:
please format this line.
```suggestion
User old table file format. eg : parquet, orc
```
##########
File path: docs/addsegment-guide.md
##########
@@ -27,16 +27,144 @@ Heterogeneous format segments aims to solve this problem by avoiding data conver
### Add segment with path and format
Users can add the existing data as a segment to the carbon table provided the schema of the data
and the carbon table should be the same.
+
+ Syntax
+
+ ```
+ ALTER TABLE [db_name.]table_name ADD SEGMENT OPTIONS(property_name=property_value, ...)
+ ```
+
+**Supported properties:**
+
+| Property | Description |
+| ------------------------------------------------------------ | ------------------------------------------------------------ |
+| [PATH](#path) | User external old table path |
+| [FORMAT](#format) | User external old table file format |
+| [PARTITION](#partition) | Partition info for partition table , should be form of "a:int, b:string" |
+
+
+-
+ You can use the following options to add segment:
+
+ - ##### PATH:
+ User old table path.
+
+ ```
+ OPTIONS('PATH'='hdfs://usr/oldtable')
+ ```
+
+ - ##### FORMAT:
+ User old table file format. eg : parquet, orc
+
+ ```
+ OPTIONS('FORMAT'='parquet')
+ ```
+ - ##### PARTITION:
+ Partition info for partition table , should be form of "a:int, b:string"
Review comment:
```suggestion
Partition info for partition table , should be form of "a:int, b:string"
```
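To make the "a:int, b:string" form concrete, here is a minimal sketch of how such a partition string breaks down into column name and type pairs. This is an illustration only and is not CarbonData's own parsing code; the function name `parse_partition_spec` is hypothetical, and lower-casing the type name is an assumption made for the example.

```python
def parse_partition_spec(spec):
    """Split a spec such as "a:int, b:string" into (name, type) pairs."""
    columns = []
    for item in spec.split(","):
        # Each item is expected to look like "name:type"; spaces are ignored.
        name, sep, dtype = item.strip().partition(":")
        name, dtype = name.strip(), dtype.strip()
        if not sep or not name or not dtype:
            raise ValueError(f"expected 'name:type', got {item!r}")
        columns.append((name, dtype.lower()))
    return columns


print(parse_partition_spec("a:int, b:string"))
```

A spec without a colon, such as "a", raises ValueError in this sketch rather than being silently accepted.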
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4239: Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#discussion_r760774669
##########
File path: docs/faq.md
##########
@@ -213,6 +214,11 @@ spark.speculation is a group of configuration, that can monitor trailing tasks a
spark.blacklist.enabled, avoid reduction of available executors due to blacklist mechanism.
+## How to manage mix file format in carbondata table
+
+[Heterogeneous format segments in carbondata](./addsegment-guide.md)
Review comment:
```suggestion
Refer [Heterogeneous format segments in carbondata](./addsegment-guide.md)
```
##########
File path: docs/faq.md
##########
@@ -29,6 +29,7 @@
* [Why different time zone result for select query output when query SDK writer output?](#why-different-time-zone-result-for-select-query-output-when-query-sdk-writer-output)
* [How to check LRU cache memory footprint?](#how-to-check-lru-cache-memory-footprint)
* [How to deal with the trailing task in query?](#How-to-deal-with-the-trailing-task-in-query)
+* [How to manage mix file format in carbondata table?](#How-to-manage-mix-file-format-in-carbondata-table)
Review comment:
```suggestion
* [How to manage mixed file format in carbondata table?](#How-to-manage-mix-file-format-in-carbondata-table)
```
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-984655517
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6142/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997205438
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4425/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-987845972
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6144/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-984599488
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4399/
[GitHub] [carbondata] bieremayi commented on a change in pull request #4239: Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
bieremayi commented on a change in pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#discussion_r760997361
##########
File path: docs/segment-management-on-carbondata.md
##########
@@ -207,4 +208,4 @@ concept which helps to maintain consistency of data and easy transaction managem
spark.sql("select count(empno) from carbon.input.segments.db.carbontable_Multi_Thread").show();
}
}
- ```
+ ```
Review comment:
ok!
[GitHub] [carbondata] chenliang613 commented on pull request #4239: Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
chenliang613 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-981504408
add to whitelist
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-986003785
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4400/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-992467232
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/554/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-992468631
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6163/
[GitHub] [carbondata] bieremayi commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
bieremayi commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997199647
retest this please
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997818335
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/565/
[GitHub] [carbondata] Indhumathi27 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997674455
retest this please
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-997203341
Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/559/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-981603944
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6141/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: [CARBONDATA-4315] Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-987842322
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/535/
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4239: Supplementary information for add segment syntax .
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4239:
URL: https://github.com/apache/carbondata/pull/4239#issuecomment-979240803
Can one of the admins verify this patch?