Posted to commits@carbondata.apache.org by in...@apache.org on 2020/10/13 05:47:47 UTC
[carbondata] branch master updated: [CARBONDATA-4010] Doc changes for long strings.
This is an automated email from the ASF dual-hosted git repository.
indhumuthumurugesh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git
The following commit(s) were added to refs/heads/master by this push:
new 9dfbd91 [CARBONDATA-4010] Doc changes for long strings.
9dfbd91 is described below
commit 9dfbd9122e2e1f94dbd351ed6d6f2e039c3818cf
Author: Nihal ojha <ni...@gmail.com>
AuthorDate: Fri Sep 25 14:26:07 2020 +0530
[CARBONDATA-4010] Doc changes for long strings.
Why is this PR needed?
Added documentation for the handling of long strings (length greater than 32000) as bad records, and for set/unset of longStringColumns.
What changes were proposed in this PR?
Added documentation for the handling of long strings (length greater than 32000) as bad records, and for set/unset of longStringColumns.
Does this PR introduce any user interface change?
No
Is any new testcase added?
No
This closes #3959
Co-authored-by: Karan980 <ka...@gmail.com>
---
docs/ddl-of-carbondata.md | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/docs/ddl-of-carbondata.md b/docs/ddl-of-carbondata.md
index ca9a321..56d7e4e 100644
--- a/docs/ddl-of-carbondata.md
+++ b/docs/ddl-of-carbondata.md
@@ -426,7 +426,8 @@ CarbonData DDL statements are documented here,which includes:
- ##### String longer than 32000 characters
In common scenarios, the length of string is less than 32000,
- so carbondata stores the length of content using Short to reduce memory and space consumption.
+ so carbondata stores the length of content using Short to reduce memory and space consumption,
+ and it handles strings which have length greater than 32000 as a bad record. Refer [bad record handling](https://github.com/apache/carbondata/blob/master/docs/dml-of-carbondata.md#bad-records-handling) section for better understanding.
To support string longer than 32000 characters, carbondata introduces a table property called `LONG_STRING_COLUMNS`.
For these columns, carbondata internally stores the length of content using Integer.
@@ -812,7 +813,19 @@ Users can specify which columns to include and exclude for local dictionary gene
```
ALTER TABLE tablename UNSET TBLPROPERTIES('SORT_SCOPE')
```
+ - ##### Long String Columns
+ Example to SET Long String Columns:
+ ```
+ ALTER TABLE tablename SET TBLPROPERTIES('LONG_STRING_COLUMNS'='column1')
+ ```
+ **NOTE:** Only string columns can be set to long string columns. Cannot set sort columns to long string columns.
+ Example to UNSET Long String Columns:
+ ```
+ ALTER TABLE tablename UNSET TBLPROPERTIES('LONG_STRING_COLUMNS')
+ ```
+ **NOTE:** On unset, long string columns are set to their original datatypes.
+
- ##### SORT COLUMNS
Example to SET SORT COLUMNS:
```