Posted to issues@carbondata.apache.org by ajantha-bhat <gi...@git.apache.org> on 2018/04/20 11:10:02 UTC
[GitHub] carbondata pull request #2198: [CARBONDATA-2369] Add a document for Non Tran...
GitHub user ajantha-bhat opened a pull request:
https://github.com/apache/carbondata/pull/2198
[CARBONDATA-2369] Add a document for Non Transactional table with SDK writer guide
As per PR#2131 [CARBONDATA-2313]
Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:
- [ ] Any interfaces changed? No
- [ ] Any backward compatibility impacted? No
- [ ] Document update required? yes, updated
- [ ] Testing done -- NA
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ajantha-bhat/carbondata master_doc
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2198.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2198
----
commit 4506e44f75f7723a9e1b18c110a9b68bdbe0582d
Author: ajantha-bhat <aj...@...>
Date: 2018-04-20T11:06:37Z
[CARBONDATA-2369] Add a document for Non Transactional table with SDK writer guide
----
---
[GitHub] carbondata pull request #2198: [CARBONDATA-2369] Add a document for Non Tran...
Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2198#discussion_r183789400
--- Diff: docs/sdk-writer-guide.md ---
@@ -0,0 +1,172 @@
+# SDK Writer Guide
+In the carbon jars package, there exist a carbondata-store-sdk-x.x.x-SNAPSHOT.jar.
+This SDK writer, writes carbondata file and carbonindex file at a given path.
+External client can make use of this writer to convert other format data or live data to create carbondata and index files.
+These SDK writer output contains just a carbondata and carbonindex files. No metadata folder will be present.
+
+## Quick example
+
+```scala
+ import java.io.IOException;
+
+ import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+ import org.apache.carbondata.core.metadata.datatype.DataTypes;
+ import org.apache.carbondata.sdk.file.CarbonWriter;
+ import org.apache.carbondata.sdk.file.CarbonWriterBuilder;
+ import org.apache.carbondata.sdk.file.Field;
+ import org.apache.carbondata.sdk.file.Schema;
+
+ public class TestSdk {
+
+ public static void main(String[] args) throws IOException, InvalidLoadOptionException {
+ testSdkWriter();
+ }
+
+ public static void testSdkWriter() throws IOException, InvalidLoadOptionException {
+ String path ="/home/root1/Documents/ab/temp";
+
+ Field[] fields =new Field[2];
+ fields[0] = new Field("name", DataTypes.STRING);
+ fields[1] = new Field("age", DataTypes.INT);
+
+ Schema schema =new Schema(fields);
+
+ CarbonWriterBuilder builder = CarbonWriter.builder()
+ .withSchema(schema)
+ .outputPath(path);
+
+ CarbonWriter writer = builder.buildWriterForCSVInput();
+
+ int rows = 5;
+ for (int i = 0; i < rows; i++) {
+ writer.write(new String[]{"robot" + (i % 10), String.valueOf(i)});
+ }
+ writer.close();
+ }
+ }
+```
+
+## Datatypes Mapping
+Each of SQL data types are mapped into data types of SDK. Following are the mapping:
+
+| SQL DataTypes | Mapped SDK DataTypes |
+|---------------|----------------------|
+| BOOLEAN | DataTypes.BOOLEAN |
+| SMALLINT | DataTypes.SHORT |
+| INTEGER | DataTypes.INT |
+| BIGINT | DataTypes.LONG |
+| DOUBLE | DataTypes.DOUBLE |
+| VARCHAR | DataTypes.STRING |
+| DATE | DataTypes.DATE |
+| TIMESTAMP | DataTypes.TIMESTAMP |
+| STRING | DataTypes.STRING |
+| DECIMAL | DataTypes.createDecimalType(precision, scale) |
+
+
+## API List
+```
--- End diff --
Add these methods under class CarbonWriterBuilder
---
[GitHub] carbondata pull request #2198: [CARBONDATA-2369] Add a document for Non Tran...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/carbondata/pull/2198
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on the issue:
https://github.com/apache/carbondata/pull/2198
retest this please
---
[GitHub] carbondata pull request #2198: [CARBONDATA-2369] Add a document for Non Tran...
Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2198#discussion_r183791062
--- Diff: docs/sdk-writer-guide.md ---
@@ -0,0 +1,172 @@
+# SDK Writer Guide
+In the carbon jars package, there exist a carbondata-store-sdk-x.x.x-SNAPSHOT.jar.
+This SDK writer, writes carbondata file and carbonindex file at a given path.
+External client can make use of this writer to convert other format data or live data to create carbondata and index files.
+These SDK writer output contains just a carbondata and carbonindex files. No metadata folder will be present.
+
+## Quick example
+
+```scala
+ import java.io.IOException;
+
+ import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+ import org.apache.carbondata.core.metadata.datatype.DataTypes;
+ import org.apache.carbondata.sdk.file.CarbonWriter;
+ import org.apache.carbondata.sdk.file.CarbonWriterBuilder;
+ import org.apache.carbondata.sdk.file.Field;
+ import org.apache.carbondata.sdk.file.Schema;
+
+ public class TestSdk {
+
+ public static void main(String[] args) throws IOException, InvalidLoadOptionException {
+ testSdkWriter();
+ }
+
+ public static void testSdkWriter() throws IOException, InvalidLoadOptionException {
+ String path ="/home/root1/Documents/ab/temp";
+
+ Field[] fields =new Field[2];
+ fields[0] = new Field("name", DataTypes.STRING);
+ fields[1] = new Field("age", DataTypes.INT);
+
+ Schema schema =new Schema(fields);
+
+ CarbonWriterBuilder builder = CarbonWriter.builder()
+ .withSchema(schema)
+ .outputPath(path);
+
+ CarbonWriter writer = builder.buildWriterForCSVInput();
+
+ int rows = 5;
+ for (int i = 0; i < rows; i++) {
+ writer.write(new String[]{"robot" + (i % 10), String.valueOf(i)});
+ }
+ writer.close();
+ }
+ }
+```
+
+## Datatypes Mapping
+Each of SQL data types are mapped into data types of SDK. Following are the mapping:
+
+| SQL DataTypes | Mapped SDK DataTypes |
+|---------------|----------------------|
+| BOOLEAN | DataTypes.BOOLEAN |
+| SMALLINT | DataTypes.SHORT |
+| INTEGER | DataTypes.INT |
+| BIGINT | DataTypes.LONG |
+| DOUBLE | DataTypes.DOUBLE |
+| VARCHAR | DataTypes.STRING |
+| DATE | DataTypes.DATE |
+| TIMESTAMP | DataTypes.TIMESTAMP |
+| STRING | DataTypes.STRING |
+| DECIMAL | DataTypes.createDecimalType(precision, scale) |
+
+
+## API List
+```
+/**
+* prepares the builder with the schema provided
+* @param schema is instance of Schema
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder withSchema(Schema schema);
--- End diff --
1. Add details of the methods in class CarbonWriter
2. Add method buildWriterForCSVInput, buildWriterForAvroInput
3. Add example, writing CSV record and writing Avro record.
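As a rough illustration of the CSV case requested above, the sketch below turns one comma-separated line into the `String[]` row that the quick example passes to `writer.write(...)`. The class `CsvRowSketch` and the helper `toRow` are hypothetical names, not part of the SDK. The split is naive (no quoting or escaping), and the `CarbonWriter` call is shown only in a comment because it needs the SDK jar on the classpath.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the CarbonData SDK.
final class CsvRowSketch {

  // Splits one CSV line into fields, one per schema column.
  // The -1 limit keeps trailing empty fields; quoted commas are NOT handled.
  static String[] toRow(String csvLine, int expectedColumns) {
    String[] row = csvLine.split(",", -1);
    if (row.length != expectedColumns) {
      throw new IllegalArgumentException(
          "expected " + expectedColumns + " fields but found " + row.length);
    }
    for (int i = 0; i < row.length; i++) {
      row[i] = row[i].trim();
    }
    return row;
  }

  public static void main(String[] args) {
    // Two columns, matching the quick example's schema (name, age).
    String[] row = toRow("robot0, 0", 2);
    System.out.println(Arrays.toString(row));
    // With the SDK on the classpath, the row would then be handed to the
    // writer built in the quick example: writer.write(row);
  }
}
```

Checking the field count before writing is a design choice of this sketch, so that a malformed line fails early instead of producing a misaligned row.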
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2198
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5239/
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2198
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4065/
---
[GitHub] carbondata pull request #2198: [CARBONDATA-2369] Add a document for Non Tran...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2198#discussion_r183615513
--- Diff: docs/data-management-on-carbondata.md ---
@@ -174,6 +174,50 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
```
+## CREATE EXTERNAL TABLE
+ This function allows user to create external table by specifying location.
+ ```
+ CREATE EXTERNAL TABLE [IF NOT EXISTS] [db_name.]table_name
+ STORED BY 'carbondata' LOCATION '$FilesPath'
+ ```
+
+### Create external table on managed table data location.
+ Managed table data location provided will have both FACT and Metadata folder.
+ This data can be generated by creating a normal carbon table and use this path as $FilesPath in the above syntax.
+
+ **Example:**
+ ```
+ sql("CREATE TABLE origin(key INT, value STRING) STORED BY 'carbondata'")
+ sql("INSERT INTO origin select 100,'spark'")
+ sql("INSERT INTO origin select 200,'hive'")
+ // creates a table in $storeLocation/origin
+
+ sql(s"""
+ |CREATE EXTERNAL TABLE source
+ |STORED BY 'carbondata'
+ |LOCATION '$storeLocation/origin'
+ """.stripMargin)
+ checkAnswer(sql("SELECT count(*) from source"), sql("SELECT count(*) from origin"))
+ ```
+
+### Create external table on Non-Transactional table data location.
--- End diff --
There > there
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2198
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5354/
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on the issue:
https://github.com/apache/carbondata/pull/2198
LGTM
---
[GitHub] carbondata pull request #2198: [CARBONDATA-2369] Add a document for Non Tran...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2198#discussion_r183031656
--- Diff: docs/sdk-writer-guide.md ---
@@ -0,0 +1,140 @@
+# SDK Writer Guide
+In the carbon jars package, there exist a carbondata-store-sdk-x.x.x-SNAPSHOT.jar.
+This SDK writer, writes carbondata file and carbonindex file at a given path.
+External client can make use of this writer to convert other format data or live data to create carbondata and index files.
+These SDK writer output contains just a carbondata and carbonindex files. No metadata folder will be present.
+
+## Quick example
+
+```scala
+ import java.io.IOException;
+
+ import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+ import org.apache.carbondata.core.metadata.datatype.DataTypes;
+ import org.apache.carbondata.sdk.file.CarbonWriter;
+ import org.apache.carbondata.sdk.file.CarbonWriterBuilder;
+ import org.apache.carbondata.sdk.file.Field;
+ import org.apache.carbondata.sdk.file.Schema;
+
+ public class TestSdk {
+
+ public static void main(String[] args) throws IOException, InvalidLoadOptionException {
+ testSdkWriter();
+ }
+
+ public static void testSdkWriter() throws IOException, InvalidLoadOptionException {
+ String path ="/home/root1/Documents/ab/temp";
+
+ Field[] fields =new Field[2];
+ fields[0] = new Field("name", DataTypes.STRING);
+ fields[1] = new Field("age", DataTypes.INT);
+
+ Schema schema =new Schema(fields);
+
+ CarbonWriterBuilder builder = CarbonWriter.builder()
+ .withSchema(schema)
+ .outputPath(path);
+
+ CarbonWriter writer = builder.buildWriterForCSVInput();
+
+ int rows = 5;
+ for (int i = 0; i < rows; i++) {
+ writer.write(new String[]{"robot" + (i % 10), String.valueOf(i)});
+ }
+ writer.close();
+ }
+ }
+```
+
+## Datatypes Mapping
+Each of SQL data types are mapped into data types of SDK. Following are the mapping:
+| SQL DataTypes | Mapped SDK DataTypes |
--- End diff --
The table formatting has an issue; please check.
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2198
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5380/
---
[GitHub] carbondata pull request #2198: [CARBONDATA-2369] Add a document for Non Tran...
Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2198#discussion_r183625844
--- Diff: docs/data-management-on-carbondata.md ---
@@ -174,6 +174,50 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
```
+## CREATE EXTERNAL TABLE
+ This function allows user to create external table by specifying location.
+ ```
+ CREATE EXTERNAL TABLE [IF NOT EXISTS] [db_name.]table_name
+ STORED BY 'carbondata' LOCATION '$FilesPath'
+ ```
+
+### Create external table on managed table data location.
+ Managed table data location provided will have both FACT and Metadata folder.
+ This data can be generated by creating a normal carbon table and use this path as $FilesPath in the above syntax.
+
+ **Example:**
+ ```
+ sql("CREATE TABLE origin(key INT, value STRING) STORED BY 'carbondata'")
+ sql("INSERT INTO origin select 100,'spark'")
+ sql("INSERT INTO origin select 200,'hive'")
+ // creates a table in $storeLocation/origin
+
+ sql(s"""
+ |CREATE EXTERNAL TABLE source
+ |STORED BY 'carbondata'
+ |LOCATION '$storeLocation/origin'
+ """.stripMargin)
+ checkAnswer(sql("SELECT count(*) from source"), sql("SELECT count(*) from origin"))
+ ```
+
+### Create external table on Non-Transactional table data location.
--- End diff --
done. Modified.
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2198
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4506/
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2198
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4190/
---
[GitHub] carbondata pull request #2198: [CARBONDATA-2369] Add a document for Non Tran...
Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2198#discussion_r183793026
--- Diff: docs/sdk-writer-guide.md ---
@@ -0,0 +1,172 @@
+# SDK Writer Guide
+In the carbon jars package, there exist a carbondata-store-sdk-x.x.x-SNAPSHOT.jar.
+This SDK writer, writes carbondata file and carbonindex file at a given path.
+External client can make use of this writer to convert other format data or live data to create carbondata and index files.
+These SDK writer output contains just a carbondata and carbonindex files. No metadata folder will be present.
+
+## Quick example
+
+```scala
+ import java.io.IOException;
+
+ import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+ import org.apache.carbondata.core.metadata.datatype.DataTypes;
+ import org.apache.carbondata.sdk.file.CarbonWriter;
+ import org.apache.carbondata.sdk.file.CarbonWriterBuilder;
+ import org.apache.carbondata.sdk.file.Field;
+ import org.apache.carbondata.sdk.file.Schema;
+
+ public class TestSdk {
+
+ public static void main(String[] args) throws IOException, InvalidLoadOptionException {
+ testSdkWriter();
+ }
+
+ public static void testSdkWriter() throws IOException, InvalidLoadOptionException {
+ String path ="/home/root1/Documents/ab/temp";
+
+ Field[] fields =new Field[2];
+ fields[0] = new Field("name", DataTypes.STRING);
+ fields[1] = new Field("age", DataTypes.INT);
+
+ Schema schema =new Schema(fields);
+
+ CarbonWriterBuilder builder = CarbonWriter.builder()
+ .withSchema(schema)
+ .outputPath(path);
+
+ CarbonWriter writer = builder.buildWriterForCSVInput();
+
+ int rows = 5;
+ for (int i = 0; i < rows; i++) {
+ writer.write(new String[]{"robot" + (i % 10), String.valueOf(i)});
+ }
+ writer.close();
+ }
+ }
+```
+
+## Datatypes Mapping
+Each of SQL data types are mapped into data types of SDK. Following are the mapping:
+
+| SQL DataTypes | Mapped SDK DataTypes |
+|---------------|----------------------|
+| BOOLEAN | DataTypes.BOOLEAN |
+| SMALLINT | DataTypes.SHORT |
+| INTEGER | DataTypes.INT |
+| BIGINT | DataTypes.LONG |
+| DOUBLE | DataTypes.DOUBLE |
+| VARCHAR | DataTypes.STRING |
+| DATE | DataTypes.DATE |
+| TIMESTAMP | DataTypes.TIMESTAMP |
+| STRING | DataTypes.STRING |
+| DECIMAL | DataTypes.createDecimalType(precision, scale) |
+
+
+## API List
+```
+/**
+* prepares the builder with the schema provided
+* @param schema is instance of Schema
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder withSchema(Schema schema);
+```
+
+```
+/**
+* Sets the output path of the writer builder
+* @param path is the absolute path where output files are written
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder outputPath(String path);
+```
+
+```
+/**
+* If set false, writes the carbondata and carbonindex files in a flat folder structure
+* @param isTransactionalTable is a boolean value if set to false then writes
+* the carbondata and carbonindex files in a flat folder structure
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder isTransactionalTable(boolean isTransactionalTable);
+```
+
+```
+/**
+* to set the timestamp in the carbondata and carbonindex index files
+* @param UUID is a timestamp to be used in the carbondata
+* and carbonindex index files
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder uniqueIdentifier(long UUID);
+```
+
+```
+/**
+* To set the carbondata file size in MB between 1MB-2048MB
+* @param blockSize is size in MB between 1MB to 2048 MB
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder withBlockSize(int blockSize);
+```
+
+```
+/**
+* To set the blocklet size of carbondata file
+* @param blockletSize is blocklet size in MB
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder withBlockletSize(int blockletSize);
+```
+
+```
+/**
+* sets the list of columns that needs to be in sorted order
+* @param sortColumns is a string array of columns that needs to be sorted.
+* If it is null, all dimensions are selected for sorting
+* If it is empty array, no columns are sorted
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder sortBy(String[] sortColumns);
+```
+
+```
+/**
+* If set, creates a schema file in metadata folder.
--- End diff --
What is the default value? What is the effect of setting isTransactionalTable(true/false)?
---
[GitHub] carbondata pull request #2198: [CARBONDATA-2369] Add a document for Non Tran...
Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2198#discussion_r183035777
--- Diff: docs/sdk-writer-guide.md ---
@@ -0,0 +1,140 @@
+# SDK Writer Guide
+In the carbon jars package, there exist a carbondata-store-sdk-x.x.x-SNAPSHOT.jar.
+This SDK writer, writes carbondata file and carbonindex file at a given path.
+External client can make use of this writer to convert other format data or live data to create carbondata and index files.
+These SDK writer output contains just a carbondata and carbonindex files. No metadata folder will be present.
+
+## Quick example
+
+```scala
+ import java.io.IOException;
+
+ import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+ import org.apache.carbondata.core.metadata.datatype.DataTypes;
+ import org.apache.carbondata.sdk.file.CarbonWriter;
+ import org.apache.carbondata.sdk.file.CarbonWriterBuilder;
+ import org.apache.carbondata.sdk.file.Field;
+ import org.apache.carbondata.sdk.file.Schema;
+
+ public class TestSdk {
+
+ public static void main(String[] args) throws IOException, InvalidLoadOptionException {
+ testSdkWriter();
+ }
+
+ public static void testSdkWriter() throws IOException, InvalidLoadOptionException {
+ String path ="/home/root1/Documents/ab/temp";
+
+ Field[] fields =new Field[2];
+ fields[0] = new Field("name", DataTypes.STRING);
+ fields[1] = new Field("age", DataTypes.INT);
+
+ Schema schema =new Schema(fields);
+
+ CarbonWriterBuilder builder = CarbonWriter.builder()
+ .withSchema(schema)
+ .outputPath(path);
+
+ CarbonWriter writer = builder.buildWriterForCSVInput();
+
+ int rows = 5;
+ for (int i = 0; i < rows; i++) {
+ writer.write(new String[]{"robot" + (i % 10), String.valueOf(i)});
+ }
+ writer.close();
+ }
+ }
+```
+
+## Datatypes Mapping
+Each of SQL data types are mapped into data types of SDK. Following are the mapping:
+| SQL DataTypes | Mapped SDK DataTypes |
--- End diff --
ok. Fixed it.
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on the issue:
https://github.com/apache/carbondata/pull/2198
LGTM
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2198
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4226/
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2198
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4532/
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2198
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5369/
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2198
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4213/
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2198
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4501/
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2198
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5393/
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2198
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4060/
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on the issue:
https://github.com/apache/carbondata/pull/2198
LGTM
---
[GitHub] carbondata pull request #2198: [CARBONDATA-2369] Add a document for Non Tran...
Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2198#discussion_r183792089
--- Diff: docs/sdk-writer-guide.md ---
@@ -0,0 +1,172 @@
+# SDK Writer Guide
+In the carbon jars package, there exist a carbondata-store-sdk-x.x.x-SNAPSHOT.jar.
+This SDK writer, writes carbondata file and carbonindex file at a given path.
+External client can make use of this writer to convert other format data or live data to create carbondata and index files.
+These SDK writer output contains just a carbondata and carbonindex files. No metadata folder will be present.
+
+## Quick example
+
+```scala
+ import java.io.IOException;
+
+ import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+ import org.apache.carbondata.core.metadata.datatype.DataTypes;
+ import org.apache.carbondata.sdk.file.CarbonWriter;
+ import org.apache.carbondata.sdk.file.CarbonWriterBuilder;
+ import org.apache.carbondata.sdk.file.Field;
+ import org.apache.carbondata.sdk.file.Schema;
+
+ public class TestSdk {
+
+ public static void main(String[] args) throws IOException, InvalidLoadOptionException {
+ testSdkWriter();
+ }
+
+ public static void testSdkWriter() throws IOException, InvalidLoadOptionException {
+ String path ="/home/root1/Documents/ab/temp";
+
+ Field[] fields =new Field[2];
+ fields[0] = new Field("name", DataTypes.STRING);
+ fields[1] = new Field("age", DataTypes.INT);
+
+ Schema schema =new Schema(fields);
+
+ CarbonWriterBuilder builder = CarbonWriter.builder()
+ .withSchema(schema)
+ .outputPath(path);
+
+ CarbonWriter writer = builder.buildWriterForCSVInput();
+
+ int rows = 5;
+ for (int i = 0; i < rows; i++) {
+ writer.write(new String[]{"robot" + (i % 10), String.valueOf(i)});
+ }
+ writer.close();
+ }
+ }
+```
+
+## Datatypes Mapping
+Each of SQL data types are mapped into data types of SDK. Following are the mapping:
+
+| SQL DataTypes | Mapped SDK DataTypes |
+|---------------|----------------------|
+| BOOLEAN | DataTypes.BOOLEAN |
+| SMALLINT | DataTypes.SHORT |
+| INTEGER | DataTypes.INT |
+| BIGINT | DataTypes.LONG |
+| DOUBLE | DataTypes.DOUBLE |
+| VARCHAR | DataTypes.STRING |
+| DATE | DataTypes.DATE |
+| TIMESTAMP | DataTypes.TIMESTAMP |
+| STRING | DataTypes.STRING |
+| DECIMAL | DataTypes.createDecimalType(precision, scale) |
+
+
+## API List
+```
+/**
+* prepares the builder with the schema provided
+* @param schema is instance of Schema
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder withSchema(Schema schema);
+```
+
+```
+/**
+* Sets the output path of the writer builder
+* @param path is the absolute path where output files are written
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder outputPath(String path);
+```
+
+```
+/**
+* If set false, writes the carbondata and carbonindex files in a flat folder structure
--- End diff --
What is the behaviour when set to true? What are the default values for all optional parameters?
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2198
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4518/
---
[GitHub] carbondata pull request #2198: [CARBONDATA-2369] Add a document for Non Tran...
Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2198#discussion_r183793655
--- Diff: docs/sdk-writer-guide.md ---
@@ -0,0 +1,172 @@
+# SDK Writer Guide
+In the carbon jars package, there exist a carbondata-store-sdk-x.x.x-SNAPSHOT.jar.
+This SDK writer, writes carbondata file and carbonindex file at a given path.
+External client can make use of this writer to convert other format data or live data to create carbondata and index files.
+These SDK writer output contains just a carbondata and carbonindex files. No metadata folder will be present.
+
+## Quick example
+
+```scala
+ import java.io.IOException;
+
+ import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+ import org.apache.carbondata.core.metadata.datatype.DataTypes;
+ import org.apache.carbondata.sdk.file.CarbonWriter;
+ import org.apache.carbondata.sdk.file.CarbonWriterBuilder;
+ import org.apache.carbondata.sdk.file.Field;
+ import org.apache.carbondata.sdk.file.Schema;
+
+ public class TestSdk {
+
+ public static void main(String[] args) throws IOException, InvalidLoadOptionException {
+ testSdkWriter();
+ }
+
+ public static void testSdkWriter() throws IOException, InvalidLoadOptionException {
+ String path ="/home/root1/Documents/ab/temp";
+
+ Field[] fields =new Field[2];
+ fields[0] = new Field("name", DataTypes.STRING);
+ fields[1] = new Field("age", DataTypes.INT);
+
+ Schema schema =new Schema(fields);
+
+ CarbonWriterBuilder builder = CarbonWriter.builder()
+ .withSchema(schema)
+ .outputPath(path);
+
+ CarbonWriter writer = builder.buildWriterForCSVInput();
+
+ int rows = 5;
+ for (int i = 0; i < rows; i++) {
+ writer.write(new String[]{"robot" + (i % 10), String.valueOf(i)});
+ }
+ writer.close();
+ }
+ }
+```
+
+## Datatypes Mapping
+Each of SQL data types are mapped into data types of SDK. Following are the mapping:
+
+| SQL DataTypes | Mapped SDK DataTypes |
+|---------------|----------------------|
+| BOOLEAN | DataTypes.BOOLEAN |
+| SMALLINT | DataTypes.SHORT |
+| INTEGER | DataTypes.INT |
+| BIGINT | DataTypes.LONG |
+| DOUBLE | DataTypes.DOUBLE |
+| VARCHAR | DataTypes.STRING |
+| DATE | DataTypes.DATE |
+| TIMESTAMP | DataTypes.TIMESTAMP |
+| STRING | DataTypes.STRING |
+| DECIMAL | DataTypes.createDecimalType(precision, scale) |
+
+
+## API List
+```
+/**
+* prepares the builder with the schema provided
+* @param schema is instance of Schema
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder withSchema(Schema schema);
+```
+
+```
+/**
+* Sets the output path of the writer builder
+* @param path is the absolute path where output files are written
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder outputPath(String path);
+```
+
+```
+/**
+* If set false, writes the carbondata and carbonindex files in a flat folder structure
+* @param isTransactionalTable is a boolean value if set to false then writes
+* the carbondata and carbonindex files in a flat folder structure
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder isTransactionalTable(boolean isTransactionalTable);
+```
+
+```
+/**
+* to set the timestamp in the carbondata and carbonindex index files
+* @param UUID is a timestamp to be used in the carbondata
+* and carbonindex index files
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder uniqueIdentifier(long UUID);
+```
+
+```
+/**
+* To set the carbondata file size in MB between 1MB-2048MB
+* @param blockSize is size in MB between 1MB to 2048 MB
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder withBlockSize(int blockSize);
+```
+
+```
+/**
+* To set the blocklet size of carbondata file
+* @param blockletSize is blocklet size in MB
+* @return updated CarbonWriterBuilder
+*/
+public CarbonWriterBuilder withBlockletSize(int blockletSize);
+```
+
+```
+/**
+* sets the list of columns that needs to be in sorted order
+* @param sortColumns is a string array of columns that needs to be sorted.
+* If it is null, all dimensions are selected for sorting
--- End diff --
What is the default value if not set? The default value should be mentioned for all APIs.
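The null and empty-array cases in the quoted `sortBy` Javadoc can be sketched as below. This only models the documented behaviour: `SortColumnsSketch`, `resolve`, and the `allDimensions` parameter are hypothetical names, and the quoted diff does not say how the SDK decides which columns are dimensions or what happens when `sortBy` is never called.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the documented sortBy semantics; not SDK code.
final class SortColumnsSketch {

  // Returns the columns that would be sorted for a given sortBy argument.
  static List<String> resolve(String[] sortColumns, List<String> allDimensions) {
    if (sortColumns == null) {
      // Documented: null selects all dimensions for sorting.
      return allDimensions;
    }
    if (sortColumns.length == 0) {
      // Documented: an empty array means no columns are sorted.
      return Collections.emptyList();
    }
    // Otherwise exactly the listed columns are sorted.
    return Arrays.asList(sortColumns);
  }

  public static void main(String[] args) {
    // Assumption for illustration: "name" is the only dimension column.
    List<String> dims = Arrays.asList("name");
    System.out.println(resolve(null, dims));
    System.out.println(resolve(new String[0], dims));
    System.out.println(resolve(new String[] {"name"}, dims));
  }
}
```

Spelling out the three cases side by side makes it easier to see which one the guide still needs to name as the default.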
---
[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2198
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4195/
---