Posted to issues@carbondata.apache.org by NamanRastogi <gi...@git.apache.org> on 2019/01/10 14:21:30 UTC
[GitHub] carbondata pull request #3064: [WIP] Updated DOC for No-Sort Compaction and ...
GitHub user NamanRastogi opened a pull request:
https://github.com/apache/carbondata/pull/3064
[WIP] Updated DOC for No-Sort Compaction and a few Fixes
1. Updated Doc
2. Checking SORT_SCOPE in session property CARBON.TABLE.LOAD.SORT.SCOPE in CarbonTable.getSortScope()
3. Throw error when an invalid command is executed through SET Command.
4. Other Minor Fixes
- [x] Any interfaces changed? -> NO
- [x] Any backward compatibility impacted? -> NO
- [x] Document update required? -> NO
- [x] Testing done -> Yes
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/NamanRastogi/carbondata nosort_compaction_imporv
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/3064.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3064
----
commit 78828ad9c508e2c35e9c9f6f17f81a874c7410c7
Author: namanrastogi <na...@...>
Date: 2019-01-10T09:10:23Z
Updated DOC for No-Sort Compaction
----
---
[GitHub] carbondata issue #3064: [CARBONDATA-3243] Updated DOC for No-Sort Compaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10513/
---
[GitHub] carbondata issue #3064: [CARBONDATA-3243] Updated DOC for No-Sort Compaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10525/
---
[GitHub] carbondata issue #3064: [CARBONDATA-3243] Updated DOC for No-Sort Compaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2264/
---
[GitHub] carbondata issue #3064: [CARBONDATA-3243] Updated DOC for No-Sort Compaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2486/
---
[GitHub] carbondata issue #3064: [CARBONDATA-3243] Updated DOC for No-Sort Compaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2474/
---
[GitHub] carbondata issue #3064: [WIP] Updated DOC for No-Sort Compaction and a few F...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2472/
---
[GitHub] carbondata issue #3064: [CARBONDATA-3243] Updated DOC for No-Sort Compaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2494/
---
[GitHub] carbondata issue #3064: [CARBONDATA-3243] Updated DOC for No-Sort Compaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10512/
---
[GitHub] carbondata issue #3064: [CARBONDATA-3243] Updated DOC for No-Sort Compaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10533/
---
[GitHub] carbondata pull request #3064: [CARBONDATA-3243] Updated DOC for No-Sort Com...
Posted by qiuchenjian <gi...@git.apache.org>.
Github user qiuchenjian commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/3064#discussion_r246975368
--- Diff: docs/dml-of-carbondata.md ---
@@ -106,6 +107,13 @@ CarbonData DML statements are documented here,which includes:
OPTIONS('FILEHEADER'='column1,column2')
```
+ - ##### SORT_SCOPE:
+ Sort Scope to be used for the current load. This overrides the Sort Scope of Table.
--- End diff --
```suggestion
Sort Scope is used for the current load. This overrides the Sort Scope of Table.
```
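For readers following the wording debate, the behaviour the sentence describes can be sketched as below. This is only an illustration of "the load option overrides the table's Sort Scope". The object name, the method `effectiveSortScope` and its parameters are hypothetical and are not taken from the CarbonData code base.

```scala
object SortScopeOverride {
  /** Hypothetical resolver: a SORT_SCOPE given for the current load wins,
    * otherwise the Sort Scope stored on the table applies. */
  def effectiveSortScope(loadOptions: Map[String, String], tableSortScope: String): String =
    loadOptions
      // Match the option name case-insensitively and normalise the value.
      .collectFirst { case (k, v) if k.equalsIgnoreCase("SORT_SCOPE") => v.toUpperCase }
      .getOrElse(tableSortScope)
}
```

With this sketch, a load that passes 'SORT_SCOPE'='NO_SORT' on a table created with LOCAL_SORT is loaded as NO_SORT, while a load without the option keeps LOCAL_SORT.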
---
[GitHub] carbondata pull request #3064: [CARBONDATA-3243] Updated DOC for No-Sort Com...
Posted by qiuchenjian <gi...@git.apache.org>.
Github user qiuchenjian commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/3064#discussion_r246975324
--- Diff: docs/dml-of-carbondata.md ---
@@ -49,6 +49,7 @@ CarbonData DML statements are documented here,which includes:
| [COMMENTCHAR](#commentchar) | Character used to comment the rows in the input csv file. Those rows will be skipped from processing |
| [HEADER](#header) | Whether the input csv files have header row |
| [FILEHEADER](#fileheader) | If header is not present in the input csv, what is the column names to be used for data read from input csv |
+| [SORT_SCOPE](#sort_scope) | Sort Scope to be used for current load. |
--- End diff --
```suggestion
| [SORT_SCOPE](#sort_scope) | Sort Scope is used for current load. |
```
---
[GitHub] carbondata issue #3064: [CARBONDATA-3243] Updated DOC for No-Sort Compaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2473/
---
[GitHub] carbondata issue #3064: [WIP] Updated DOC for No-Sort Compaction and a few F...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10511/
---
[GitHub] carbondata issue #3064: [CARBONDATA-3243] Updated DOC for No-Sort Compaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2255/
---
[GitHub] carbondata issue #3064: [CARBONDATA-3243] Updated DOC for No-Sort Compaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2254/
---
[GitHub] carbondata pull request #3064: [CARBONDATA-3243] Updated DOC for No-Sort Com...
Posted by NamanRastogi <gi...@git.apache.org>.
Github user NamanRastogi commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/3064#discussion_r247010601
--- Diff: docs/configuration-parameters.md ---
@@ -208,6 +208,7 @@ RESET
| carbon.options.date.format | Specifies the data format of the date columns in the data being loaded |
| carbon.options.timestamp.format | Specifies the timestamp format of the time stamp columns in the data being loaded |
| carbon.options.sort.scope | Specifies how the current data load should be sorted with. **NOTE:** Refer to [Data Loading Configuration](#data-loading-configuration)#carbon.sort.scope for detailed information. |
+| carbon.table.load.sort.scope | Overrides the SORT_SCOPE provides in CREATE TABLE. |
--- End diff --
"provides" changed to "provided". This was a spelling mistake.
"Overrides" is correct. No Change.
---
[GitHub] carbondata issue #3064: [CARBONDATA-3243] Updated DOC for No-Sort Compaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2263/
---
[GitHub] carbondata pull request #3064: [CARBONDATA-3243] Updated DOC for No-Sort Com...
Posted by qiuchenjian <gi...@git.apache.org>.
Github user qiuchenjian commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/3064#discussion_r246975024
--- Diff: docs/configuration-parameters.md ---
@@ -208,6 +208,7 @@ RESET
| carbon.options.date.format | Specifies the data format of the date columns in the data being loaded |
| carbon.options.timestamp.format | Specifies the timestamp format of the time stamp columns in the data being loaded |
| carbon.options.sort.scope | Specifies how the current data load should be sorted with. **NOTE:** Refer to [Data Loading Configuration](#data-loading-configuration)#carbon.sort.scope for detailed information. |
+| carbon.table.load.sort.scope | Overrides the SORT_SCOPE provides in CREATE TABLE. |
--- End diff --
```suggestion
| carbon.table.load.sort.scope | Override the SORT_SCOPE provided in CREATE TABLE. |
```
---
[GitHub] carbondata pull request #3064: [CARBONDATA-3243] Updated DOC for No-Sort Com...
Posted by NamanRastogi <gi...@git.apache.org>.
Github user NamanRastogi commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/3064#discussion_r247097474
--- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala ---
@@ -1201,6 +1202,17 @@ abstract class CarbonDDLSqlParser extends AbstractCarbonSparkSQLParser {
}
}
+ // Validate SORT_SCOPE
+ if(options.exists(_._1.equalsIgnoreCase("SORT_SCOPE"))) {
+ val optionValue: String = options.get("sort_scope").get.head._2
+ if (!CarbonUtil.isValidSortOption(optionValue)) {
+ throw new InvalidConfigurationException(
+ s"Passing invalid SORT_SCOPE '$optionValue', valid SORT_SCOPE are 'NO_SORT'," +
+ s" 'BATCH_SORT', 'LOCAL_SORT' and 'GLOBAL_SORT' ")
+ }
+
+ }
+
--- End diff --
Done.
---
[GitHub] carbondata pull request #3064: [CARBONDATA-3243] Updated DOC for No-Sort Com...
Posted by shardul-cr7 <gi...@git.apache.org>.
Github user shardul-cr7 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/3064#discussion_r247096965
--- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala ---
@@ -1201,6 +1202,17 @@ abstract class CarbonDDLSqlParser extends AbstractCarbonSparkSQLParser {
}
}
+ // Validate SORT_SCOPE
+ if(options.exists(_._1.equalsIgnoreCase("SORT_SCOPE"))) {
+ val optionValue: String = options.get("sort_scope").get.head._2
+ if (!CarbonUtil.isValidSortOption(optionValue)) {
+ throw new InvalidConfigurationException(
+ s"Passing invalid SORT_SCOPE '$optionValue', valid SORT_SCOPE are 'NO_SORT'," +
+ s" 'BATCH_SORT', 'LOCAL_SORT' and 'GLOBAL_SORT' ")
+ }
+
+ }
+
--- End diff --
Remove empty lines and properly format the code.
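As a point of comparison, here is a compact, self-contained sketch of the same check. It is not the code that was merged. `SortScopeCheck`, `ValidSortScopes` and `validate` are hypothetical names, the options are modelled as a plain `Map[String, String]` rather than the parser's own structure, and treating values case-insensitively is an assumption of this sketch. The four valid values are the ones listed in the error message in the diff above.

```scala
object SortScopeCheck {
  // Values named in the error message quoted in the diff above.
  val ValidSortScopes: Set[String] =
    Set("NO_SORT", "BATCH_SORT", "LOCAL_SORT", "GLOBAL_SORT")

  /** Hypothetical helper: Left(message) for an invalid SORT_SCOPE,
    * Right(options) when the option is absent or valid. */
  def validate(options: Map[String, String]): Either[String, Map[String, String]] =
    // Find the key case-insensitively so the existence check and the read agree.
    options.collectFirst { case (k, v) if k.equalsIgnoreCase("SORT_SCOPE") => v } match {
      case Some(value) if !ValidSortScopes.contains(value.trim.toUpperCase) =>
        Left(s"Passing invalid SORT_SCOPE '$value', valid SORT_SCOPE are 'NO_SORT', " +
          "'BATCH_SORT', 'LOCAL_SORT' and 'GLOBAL_SORT'")
      case _ => Right(options)
    }
}
```

Doing the lookup and the read in one `collectFirst` avoids testing for the key with `equalsIgnoreCase` and then fetching it with a fixed lower-case name.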
---
[GitHub] carbondata pull request #3064: [CARBONDATA-3243] Updated DOC for No-Sort Com...
Posted by NamanRastogi <gi...@git.apache.org>.
Github user NamanRastogi commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/3064#discussion_r247011216
--- Diff: docs/dml-of-carbondata.md ---
@@ -106,6 +107,13 @@ CarbonData DML statements are documented here,which includes:
OPTIONS('FILEHEADER'='column1,column2')
```
+ - ##### SORT_SCOPE:
+ Sort Scope to be used for the current load. This overrides the Sort Scope of Table.
--- End diff --
"to be" is correct. No change.
---
[GitHub] carbondata issue #3064: [WIP] Updated DOC for No-Sort Compaction and a few F...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2253/
---
[GitHub] carbondata pull request #3064: [WIP] Updated DOC for No-Sort Compaction and ...
Posted by kunal642 <gi...@git.apache.org>.
Github user kunal642 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/3064#discussion_r246777849
--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/execution/command/CarbonHiveCommands.scala ---
@@ -127,6 +127,9 @@ object CarbonSetCommand {
else if (isCarbonProperty) {
sessionParams.addProperty(key, value)
}
+ else {
--- End diff --
Remove this. If a Spark property is set, it should not be validated by Carbon.
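A minimal sketch of the behaviour requested here: Carbon stores the keys it owns and leaves every other key alone instead of raising an error. All names below are hypothetical stand-ins. In particular, recognising Carbon properties by a "carbon." prefix is an assumption of this sketch, not how `CarbonSetCommand` decides it.

```scala
import scala.collection.mutable

object SetCommandSketch {
  // Hypothetical stand-in for the session parameter store.
  val sessionParams: mutable.Map[String, String] = mutable.Map.empty

  // Assumption for this sketch only: Carbon keys are recognised by prefix.
  def isCarbonProperty(key: String): Boolean =
    key.toLowerCase.startsWith("carbon.")

  /** Returns true when Carbon handled the key, false when it is left to Spark. */
  def set(key: String, value: String): Boolean =
    if (isCarbonProperty(key)) {
      sessionParams.put(key, value)
      true
    } else {
      // No error branch: non-Carbon keys are not validated by Carbon.
      false
    }
}
```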
---
[GitHub] carbondata pull request #3064: [CARBONDATA-3243] Updated DOC for No-Sort Com...
Posted by NamanRastogi <gi...@git.apache.org>.
Github user NamanRastogi commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/3064#discussion_r247010700
--- Diff: docs/dml-of-carbondata.md ---
@@ -49,6 +49,7 @@ CarbonData DML statements are documented here,which includes:
| [COMMENTCHAR](#commentchar) | Character used to comment the rows in the input csv file. Those rows will be skipped from processing |
| [HEADER](#header) | Whether the input csv files have header row |
| [FILEHEADER](#fileheader) | If header is not present in the input csv, what is the column names to be used for data read from input csv |
+| [SORT_SCOPE](#sort_scope) | Sort Scope to be used for current load. |
--- End diff --
"to be" is correct. No change.
---
[GitHub] carbondata issue #3064: [CARBONDATA-3243] Updated DOC for No-Sort Compaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2274/
---
[GitHub] carbondata issue #3064: [CARBONDATA-3243] Updated DOC for No-Sort Compaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/3064
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2267/
---
[GitHub] carbondata pull request #3064: [CARBONDATA-3243] Updated DOC for No-Sort Com...
Posted by NamanRastogi <gi...@git.apache.org>.
Github user NamanRastogi commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/3064#discussion_r246780308
--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/execution/command/CarbonHiveCommands.scala ---
@@ -127,6 +127,9 @@ object CarbonSetCommand {
else if (isCarbonProperty) {
sessionParams.addProperty(key, value)
}
+ else {
--- End diff --
Done.
---