Posted to issues@carbondata.apache.org by xubo245 <gi...@git.apache.org> on 2018/04/25 07:12:56 UTC

[GitHub] carbondata pull request #2226: [CARBONDATA-2384] SDK support write data into...

GitHub user xubo245 opened a pull request:

    https://github.com/apache/carbondata/pull/2226

    [CARBONDATA-2384] SDK support write data into S3

    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
    
     - [ ] Testing done
            Please provide details on 
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata CARBONDATA-2384-SDKS3

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2226.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2226
    
----
commit 5dc8d9312c1266ab643f8f6834c404a77c41451a
Author: xubo245 <60...@...>
Date:   2018-04-25T07:09:44Z

    [CARBONDATA-2384] SDK support write data into S3

----


---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    retest this please


---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    @jackylk @sounakr CI passed. Please check it again.


---

[GitHub] carbondata pull request #2226: [CARBONDATA-2384] SDK support write/read data...

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2226#discussion_r184360833
  
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/sdk/SDkWriteS3Example.scala ---
    @@ -0,0 +1,124 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.carbondata.sdk
    +
    +import org.slf4j.{Logger, LoggerFactory}
    +
    +import org.apache.carbondata.core.metadata.datatype.DataTypes
    +import org.apache.carbondata.sdk.file.{CarbonReader, CarbonWriter, Field, Schema}
    +
    +/**
    + * Generate data and write data to S3 by SDK, no spark
    + */
    +object SDKWriteS3Example {
    +
    +  // scalastyle:off println
    +  /**
    +   * This example demonstrate usage of
    +   *
    +   * @param args require three parameters "Access-key" "Secret-key"
    +   *             "s3-endpoint", other is optional
    +   */
    +  def main(args: Array[String]) {
    +    val logger: Logger = LoggerFactory.getLogger(this.getClass)
    +    if (args.length < 2 || args.length > 6) {
    +      logger.error("Usage: java CarbonS3Example: <access-key> <secret-key>" +
    +        "<s3-endpoint> [table-path-on-s3] [number-of-rows] [persistSchema]")
    +      System.exit(0)
    +    }
    +
    +    val path = if (args.length > 3) {
    +      args(3)
    +    } else {
    +      "s3a://sdk/WriterOutput"
    +    }
    +
    +    val num = if (args.length > 4) {
    +      Integer.parseInt(args(4))
    +    } else {
    +      3
    +    }
    +
    +    val persistSchema = if (args.length > 5) {
    +      if (args(5).equalsIgnoreCase("true")) {
    +        true
    +      } else {
    +        false
    +      }
    +    } else {
    +      true
    +    }
    +
    +    // getCanonicalPath gives path with \, so code expects /.
    +    val writerPath = path.replace("\\", "/");
    +
    +    val fields: Array[Field] = new Array[Field](3)
    +    fields(0) = new Field("name", DataTypes.STRING)
    +    fields(1) = new Field("age", DataTypes.INT)
    +    fields(2) = new Field("height", DataTypes.DOUBLE)
    +
    +    try {
    +      val builder = CarbonWriter.builder()
    +        .withSchema(new Schema(fields))
    +        .outputPath(writerPath)
    +        .isTransactionalTable(true)
    --- End diff --
    
    @sounakr I changed the code so that writing a Non Transactional table can be configured. But for now CarbonReader doesn't support reading a Non Transactional table. I will raise another PR to support this function.
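
A minimal sketch of how the example's optional arguments could carry a transactional flag. It assumes a hypothetical `ExampleArgs` holder and a hypothetical seventh positional argument `[transactional]`; neither appears in the PR. It checks for at least three arguments because the usage text in the diff lists three required ones.

```scala
// Hypothetical holder for the example's settings; not part of the CarbonData SDK.
final case class ExampleArgs(
    accessKey: String,
    secretKey: String,
    endpoint: String,
    path: String,
    numRows: Int,
    persistSchema: Boolean,
    transactional: Boolean)

object ExampleArgs {
  val Usage: String = "Usage: <access-key> <secret-key> <s3-endpoint> " +
    "[table-path-on-s3] [number-of-rows] [persistSchema] [transactional]"

  /** Returns Right(settings), or Left(usage) when the argument count is wrong. */
  def parse(args: Array[String]): Either[String, ExampleArgs] = {
    // Three arguments are required, so fewer than three is an error.
    if (args.length < 3 || args.length > 7) {
      Left(Usage)
    } else {
      // Optional positional argument with a fallback value.
      def opt(i: Int, default: String): String =
        if (args.length > i) args(i) else default

      Right(ExampleArgs(
        accessKey = args(0),
        secretKey = args(1),
        endpoint = args(2),
        // Normalise backslashes to forward slashes, as the quoted diff does.
        path = opt(3, "s3a://sdk/WriterOutput").replace("\\", "/"),
        // toInt throws NumberFormatException on non-numeric input.
        numRows = opt(4, "3").toInt,
        persistSchema = opt(5, "true").equalsIgnoreCase("true"),
        // Assumed extra flag: defaults to a transactional table.
        transactional = opt(6, "true").equalsIgnoreCase("true")))
    }
  }
}
```

The parsed `transactional` value could then be passed to `isTransactionalTable(...)` in place of the hard-coded `true`.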


---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4241/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5446/



---

[GitHub] carbondata pull request #2226: [CARBONDATA-2384] SDK support write/read data...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2226


---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5386/



---

[GitHub] carbondata pull request #2226: [CARBONDATA-2384] SDK support write/read data...

Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2226#discussion_r184033067
  
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/sdk/SDkWriteS3Example.scala ---
    --- End diff --
    
    Please write test cases for Non Transactional table in reader and writer.


---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4568/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4284/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5385/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4522/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4218/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4219/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4525/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4521/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4546/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4262/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write data into S3

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5383/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4523/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    LGTM


---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write data into S3

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4215/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4572/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5408/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write data into S3

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5382/



---

[GitHub] carbondata issue #2226: [CARBONDATA-2384] SDK support write/read data into/f...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2226
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5429/



---

[GitHub] carbondata pull request #2226: [CARBONDATA-2384] SDK support write/read data...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2226#discussion_r184036532
  
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/sdk/SDkWriteS3Example.scala ---
    --- End diff --
    
    @sounakr @xubo245
    What is the meaning of `Transactional table` in carbondata?


---

[GitHub] carbondata pull request #2226: [CARBONDATA-2384] SDK support write/read data...

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2226#discussion_r184361451
  
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/sdk/SDkWriteS3Example.scala ---
    --- End diff --
    
    If isTransactionalTable is set to false, the writer writes the carbondata and carbonindex files in a flat folder structure. Please check https://github.com/apache/carbondata/blob/master/docs/sdk-writer-guide.md. @xuchuanyin
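
As a rough illustration, a builder configured for the flat layout might look like the following. It uses only the builder calls quoted in this thread's diff (`withSchema`, `outputPath`, `isTransactionalTable`). The object and method names are hypothetical, it needs the CarbonData SDK on the classpath, and the SDK API may differ between versions.

```scala
import org.apache.carbondata.core.metadata.datatype.DataTypes
import org.apache.carbondata.sdk.file.{CarbonWriter, Field, Schema}

// Hypothetical helper; only the chained builder calls come from the quoted diff.
object NonTransactionalBuilderSketch {

  // Same three-column schema as the example under review.
  private val fields: Array[Field] = Array(
    new Field("name", DataTypes.STRING),
    new Field("age", DataTypes.INT),
    new Field("height", DataTypes.DOUBLE))

  // Returns a builder set up for a Non Transactional table, that is,
  // carbondata and carbonindex files written in a flat folder structure.
  def flatLayoutBuilder(outputPath: String) =
    CarbonWriter.builder()
      .withSchema(new Schema(fields))
      .outputPath(outputPath)
      .isTransactionalTable(false)
}
```

Building the writer and writing rows are left out on purpose, since those calls are not shown in the quoted diff.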


---