Posted to issues@carbondata.apache.org by zzcclp <gi...@git.apache.org> on 2018/01/26 16:07:22 UTC

[GitHub] carbondata pull request #1867: [CARBONDATA-2055]Support integrating Stream t...

GitHub user zzcclp opened a pull request:

    https://github.com/apache/carbondata/pull/1867

    [CARBONDATA-2055]Support integrating Stream table with Spark Streaming.

    Currently CarbonData only supports integration with Spark Structured Streaming,
    which requires Kafka version >= 0.10.
    However, many users still run Spark Streaming with Kafka 0.8,
    and upgrading Kafka is too costly for them.
    So CarbonData needs to integrate with Spark Streaming too.
    
    Please see the discussion in mailing list:
    http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Should-CarbonData-need-to-integrate-with-Spark-Streaming-too-td35341.html
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zzcclp/carbondata CARBONDATA-2055

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1867.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1867
    
----
commit 7531bfebb61844b25717d3c2b2af5bcd28a2f497
Author: Zhang Zhichao <44...@...>
Date:   2018-01-26T16:03:19Z

    [CARBONDATA-2055]Support integrating Stream table with Spark Streaming.
    
    Currently CarbonData only supports integration with Spark Structured Streaming,
    which requires Kafka version >= 0.10.
    However, many users still run Spark Streaming with Kafka 0.8,
    and upgrading Kafka is too costly for them.
    So CarbonData needs to integrate with Spark Streaming too.
    
    Please see the discussion in mailing list:
    
    http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Should-CarbonData-need-to-integrate-with-Spark-Streaming-too-td35341.html

----


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3653/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1926/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1957/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1925/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    retest sdv please


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    retest sdv please.


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3516/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3157/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1983/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2333/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2363/



---

[GitHub] carbondata pull request #1867: [CARBONDATA-2055][Streaming][WIP]Support inte...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1867#discussion_r170856731
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSparkStreamFactory.scala ---
    @@ -0,0 +1,58 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql
    +
    +import org.apache.commons.lang3.StringUtils
    +
    +import org.apache.carbondata.streaming.CarbonStreamException
    +import org.apache.carbondata.streaming.CarbonStreamSparkStreaming
    +import org.apache.carbondata.streaming.CarbonStreamSparkStreamingWriter
    +
    +/**
    + * Create CarbonStreamSparkStreamingWriter for stream table
    + * when integrate with Spark Streaming
    + */
    +object CarbonSparkStreamFactory {
    +
    +  def getStreamSparkStreamWriter(
    +    dbNameStr: String,
    +    tableName: String): CarbonStreamSparkStreamingWriter =
    +    synchronized {
    +    val dbName = if (StringUtils.isEmpty(dbNameStr)) "default" else dbNameStr
    +    val key = dbName + "." + tableName
    +    if (CarbonStreamSparkStreaming.getTableMap.containsKey(key)) {
    +      CarbonStreamSparkStreaming.getTableMap.get(key)
    +    } else {
    +      if (StringUtils.isEmpty(tableName) || tableName.contains(" ")) {
    +        throw new CarbonStreamException("Table creation failed. " +
    +                                        "Table name must not be blank or " +
    +                                        "cannot contain blank space")
    +      }
    +      val carbonTable = CarbonEnv.getCarbonTable(Some(dbName),
    +        tableName)(SparkSession.builder().getOrCreate())
    --- End diff --
    
    SparkSession is built twice, at line 47 and line 53; build it once and reuse it.
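
The suggestion in this comment (resolve the session once and share it between both call sites) can be sketched roughly as follows. This is only an illustration and not the code from the pull request: `SessionProvider` is a hypothetical stand-in for `SparkSession.builder().getOrCreate()` so that the snippet runs without Spark, and `lookupTable` and `createWriter` are hypothetical placeholders for whatever the two call sites at lines 47 and 53 actually do.

```scala
// Hypothetical stand-in for SparkSession.builder().getOrCreate().
// It counts invocations so the effect of reusing the session is visible.
object SessionProvider {
  final class Session
  private var calls = 0
  private lazy val shared = new Session
  def getOrCreate(): Session = { calls += 1; shared }
  def callCount: Int = calls
}

object ReuseSessionSketch {
  import SessionProvider.Session

  // Hypothetical placeholders for the two call sites that each need a session.
  def lookupTable(name: String)(session: Session): String = s"table:$name"
  def createWriter(table: String)(session: Session): String = s"writer-for-$table"

  def buildWriter(name: String): String = {
    // Resolve the session once and pass the same value to both calls.
    val session = SessionProvider.getOrCreate()
    val table = lookupTable(name)(session)
    createWriter(table)(session)
  }
}
```

With a real `SparkSession`, `getOrCreate()` returns an existing session when one is available, so the change is mostly about readability and avoiding a repeated lookup.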


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2035/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    retest this please


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming]Support integrating Stre...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3985/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3225/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3160/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3155/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    retest sdv please


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3129/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3193/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3131/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    retest sdv please


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3156/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3401/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    retest this please


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3158/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3128/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming]Support integrating Stre...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    @chenliang613 @jackylk @QiangCai  please help to review, thanks.


---

[GitHub] carbondata pull request #1867: [CARBONDATA-2055][Streaming][WIP]Support inte...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1867#discussion_r164652953
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonStreamSparkStreamFactory.scala ---
    @@ -0,0 +1,58 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql
    +
    +import org.apache.commons.lang3.StringUtils
    +
    +import org.apache.carbondata.streaming.CarbonStreamException
    +import org.apache.carbondata.streaming.CarbonStreamSparkStreaming
    +import org.apache.carbondata.streaming.CarbonStreamSparkStreamingWriter
    +
    +/**
    + * Create CarbonStreamSparkStreamingWriter for stream table
    + * when integrate with Spark Streaming
    + */
    +object CarbonStreamSparkStreamFactory {
    --- End diff --
    
    How about using "CarbonSparkStreamingFactory" as name?


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3490/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming]Support integrating Stre...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2740/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3588/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    retest sdv please


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3159/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming]Support integrating Stre...

Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    merged, thanks for your contribution.


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming]Support integrating Stre...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2764/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming]Support integrating Stre...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3741/



---

[GitHub] carbondata pull request #1867: [CARBONDATA-2055][Streaming][WIP]Support inte...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1867#discussion_r170854883
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSparkStreamFactory.scala ---
    @@ -0,0 +1,58 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql
    +
    +import org.apache.commons.lang3.StringUtils
    +
    +import org.apache.carbondata.streaming.CarbonStreamException
    +import org.apache.carbondata.streaming.CarbonStreamSparkStreaming
    +import org.apache.carbondata.streaming.CarbonStreamSparkStreamingWriter
    +
    +/**
    + * Create CarbonStreamSparkStreamingWriter for stream table
    --- End diff --
    
    change to `Create [[CarbonStreamSparkStreamingWriter]] for stream table`
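
For context, double square brackets in a Scaladoc comment turn the enclosed name into a link to that entity in the generated API documentation. A minimal sketch of the requested comment style is shown below; the empty class and the `WriterFactorySketch` object are hypothetical stand-ins so that the snippet compiles on its own, and they are not the definitions from the pull request.

```scala
/** Hypothetical stand-in for the writer class named in the review comment. */
class CarbonStreamSparkStreamingWriter

/**
 * Create [[CarbonStreamSparkStreamingWriter]] for stream table
 * when integrating with Spark Streaming
 */
object WriterFactorySketch {
  // Returns a new instance of the linked type; illustrative only.
  def newWriter(): CarbonStreamSparkStreamingWriter =
    new CarbonStreamSparkStreamingWriter
}
```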


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3452/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3491/



---

[GitHub] carbondata pull request #1867: [CARBONDATA-2055][Streaming][WIP]Support inte...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1867#discussion_r164690569
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonStreamSparkStreamFactory.scala ---
    @@ -0,0 +1,58 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql
    +
    +import org.apache.commons.lang3.StringUtils
    +
    +import org.apache.carbondata.streaming.CarbonStreamException
    +import org.apache.carbondata.streaming.CarbonStreamSparkStreaming
    +import org.apache.carbondata.streaming.CarbonStreamSparkStreamingWriter
    +
    +/**
    + * Create CarbonStreamSparkStreamingWriter for stream table
    + * when integrate with Spark Streaming
    + */
    +object CarbonStreamSparkStreamFactory {
    --- End diff --
    
    OK, I will change it.


---

[GitHub] carbondata pull request #1867: [CARBONDATA-2055][Streaming][WIP]Support inte...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1867#discussion_r170958463
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSparkStreamFactory.scala ---
    @@ -0,0 +1,58 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql
    +
    +import org.apache.commons.lang3.StringUtils
    +
    +import org.apache.carbondata.streaming.CarbonStreamException
    +import org.apache.carbondata.streaming.CarbonStreamSparkStreaming
    +import org.apache.carbondata.streaming.CarbonStreamSparkStreamingWriter
    +
    +/**
    + * Create CarbonStreamSparkStreamingWriter for stream table
    --- End diff --
    
    Done


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1923/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    This is phase 1; it focuses on reusing the code from the Structured Streaming integration.
    The next phase will refactor the code to make it easier to use.
    
    @jackylk @QiangCai @chenliang613 please review, thanks.


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3127/



---

[GitHub] carbondata pull request #1867: [CARBONDATA-2055][Streaming][WIP]Support inte...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1867#discussion_r164651907
  
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonStreamSparkStreamingExample.scala ---
    @@ -0,0 +1,245 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.examples
    +
    +import java.io.{File, PrintWriter}
    +import java.net.ServerSocket
    +
    +import org.apache.hadoop.conf.Configuration
    +import org.apache.spark.rdd.RDD
    +import org.apache.spark.sql.CarbonEnv
    +import org.apache.spark.sql.CarbonStreamSparkStreamFactory
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.sql.SaveMode
    +import org.apache.spark.sql.SparkSession
    +import org.apache.spark.streaming.{Seconds, StreamingContext, Time}
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.util.CarbonProperties
    +import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath}
    +import org.apache.carbondata.streaming.CarbonSparkStreamingListener
    +import org.apache.carbondata.streaming.parser.CarbonStreamParser
    +
    +/**
    + * This example introduces how to use Spark Streaming to write data
    + * to CarbonData stream table.
    + */
    +// scalastyle:off println
    +
    +case class StreamData(id: Int, name: String, city: String, salary: Float)
    --- End diff --
    
    Can you explain the difference between "CarbonStreamSparkStreamingExample.scala" and "CarbonBatchSparkStreamingExample", and whether the two examples can be merged into one?


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3160/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming]Support integrating Stre...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    LGTM


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    retest sdv please


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1959/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3601/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3271/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3469/



---

[GitHub] carbondata pull request #1867: [CARBONDATA-2055][Streaming][WIP]Support inte...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1867#discussion_r170958659
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSparkStreamFactory.scala ---
    @@ -0,0 +1,58 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql
    +
    +import org.apache.commons.lang3.StringUtils
    +
    +import org.apache.carbondata.streaming.CarbonStreamException
    +import org.apache.carbondata.streaming.CarbonStreamSparkStreaming
    +import org.apache.carbondata.streaming.CarbonStreamSparkStreamingWriter
    +
    +/**
    + * Create CarbonStreamSparkStreamingWriter for stream table
    + * when integrate with Spark Streaming
    + */
    +object CarbonSparkStreamFactory {
    +
    +  def getStreamSparkStreamWriter(
    +    dbNameStr: String,
    +    tableName: String): CarbonStreamSparkStreamingWriter =
    +    synchronized {
    +    val dbName = if (StringUtils.isEmpty(dbNameStr)) "default" else dbNameStr
    +    val key = dbName + "." + tableName
    +    if (CarbonStreamSparkStreaming.getTableMap.containsKey(key)) {
    +      CarbonStreamSparkStreaming.getTableMap.get(key)
    +    } else {
    +      if (StringUtils.isEmpty(tableName) || tableName.contains(" ")) {
    +        throw new CarbonStreamException("Table creation failed. " +
    +                                        "Table name must not be blank or " +
    +                                        "cannot contain blank space")
    +      }
    +      val carbonTable = CarbonEnv.getCarbonTable(Some(dbName),
    +        tableName)(SparkSession.builder().getOrCreate())
    --- End diff --
    
    Done: the SparkSession is now passed in as a parameter from the application.
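
For readers following the thread, here is a minimal sketch of what passing the session in from the application could look like, as opposed to calling `SparkSession.builder().getOrCreate()` inside the factory as in the quoted diff. `Session`, `StreamWriter` and `WriterFactory` are hypothetical stand-ins so the snippet compiles without Spark or CarbonData on the classpath; they are not the names or signatures used in the merged change.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.function.{Function => JFunction}

// Hypothetical stand-ins for SparkSession and CarbonStreamSparkStreamingWriter.
final case class Session(appName: String)
final class StreamWriter(val session: Session, val dbName: String, val tableName: String)

object WriterFactory {
  // One writer per "db.table" key, shared across callers.
  private val writers = new ConcurrentHashMap[String, StreamWriter]()

  // The application supplies the session; the factory never builds one itself.
  def getWriter(session: Session, dbNameStr: String, tableName: String): StreamWriter = {
    if (tableName == null || tableName.isEmpty || tableName.contains(" ")) {
      throw new IllegalArgumentException("Table name must not be blank or contain spaces")
    }
    val dbName = if (dbNameStr == null || dbNameStr.isEmpty) "default" else dbNameStr
    val key = dbName + "." + tableName
    // computeIfAbsent creates the writer at most once per key.
    writers.computeIfAbsent(key, new JFunction[String, StreamWriter] {
      override def apply(k: String): StreamWriter = new StreamWriter(session, dbName, tableName)
    })
  }
}
```

Unlike the quoted diff, this sketch validates the table name before the cache lookup and relies on `ConcurrentHashMap.computeIfAbsent` rather than a `synchronized` block; both are illustrative choices, not a description of the PR.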


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming]Support integrating Stre...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3728/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1922/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    retest sdv please


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    LGTM


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3218/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2350/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3130/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3191/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming]Support integrating Stre...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4008/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    retest sdv please


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3155/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    retest this please


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3191/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1921/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2163/



---

[GitHub] carbondata pull request #1867: [CARBONDATA-2055][Streaming][WIP]Support inte...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1867#discussion_r164689949
  
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonStreamSparkStreamingExample.scala ---
    @@ -0,0 +1,245 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.examples
    +
    +import java.io.{File, PrintWriter}
    +import java.net.ServerSocket
    +
    +import org.apache.hadoop.conf.Configuration
    +import org.apache.spark.rdd.RDD
    +import org.apache.spark.sql.CarbonEnv
    +import org.apache.spark.sql.CarbonStreamSparkStreamFactory
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.sql.SaveMode
    +import org.apache.spark.sql.SparkSession
    +import org.apache.spark.streaming.{Seconds, StreamingContext, Time}
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.util.CarbonProperties
    +import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath}
    +import org.apache.carbondata.streaming.CarbonSparkStreamingListener
    +import org.apache.carbondata.streaming.parser.CarbonStreamParser
    +
    +/**
    + * This example introduces how to use Spark Streaming to write data
    + * to CarbonData stream table.
    + */
    +// scalastyle:off println
    +
    +case class StreamData(id: Int, name: String, city: String, salary: Float)
    --- End diff --
    
    "CarbonStreamSparkStreamingExample.scala" shows how to use Spark Streaming to write data to a stream table, and "CarbonBatchSparkStreamingExample" shows how to use Spark Streaming to write data to a batch-loading table, which auto-compacts segments. The two differ in configuration and process flow, so I think keeping two separate examples is clearer.


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3438/



---

[GitHub] carbondata pull request #1867: [CARBONDATA-2055][Streaming][WIP]Support inte...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1867#discussion_r170958496
  
    --- Diff: streaming/src/main/scala/org/apache/carbondata/streaming/CarbonStreamSparkStreaming.scala ---
    @@ -0,0 +1,187 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.streaming
    +
    +import java.util
    +
    +import scala.collection.JavaConverters._
    +
    +import org.apache.hadoop.conf.Configuration
    +import org.apache.spark.sql.DataFrame
    +import org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink
    +import org.apache.spark.sql.execution.streaming.Sink
    +import org.apache.spark.sql.SaveMode
    +import org.apache.spark.sql.SparkSession
    +import org.apache.spark.streaming.Time
    +
    +import org.apache.carbondata.common.logging.LogServiceFactory
    +import org.apache.carbondata.core.locks.{CarbonLockFactory, ICarbonLock, LockUsage}
    +import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    +
    +class CarbonStreamSparkStreamingWriter {
    +
    +  private val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
    +
    +  private var isInitialize: Boolean = false
    +
    +  private var lock: ICarbonLock = null
    +  private var carbonTable: CarbonTable = null
    +  private var configuration: Configuration = null
    +  private var carbonAppendableStreamSink: Sink = null
    +  private val sparkSession: SparkSession = SparkSession.builder().getOrCreate()
    +
    +  def this(carbonTable: CarbonTable, configuration: Configuration) {
    +    this()
    +    this.carbonTable = carbonTable
    +    this.configuration = configuration
    +    this.option("dbName", carbonTable.getDatabaseName)
    +    this.option("tableName", carbonTable.getTableName)
    +  }
    +
    +  /**
    +   * Acquired the lock for stream table
    +   */
    +  def lockStreamTable(): Unit = {
    +    lock = CarbonLockFactory.getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier,
    +      LockUsage.STREAMING_LOCK)
    +    if (lock.lockWithRetries()) {
    +      LOGGER.info("Acquired the lock for stream table: " +
    +                  carbonTable.getDatabaseName + "." +
    +                  carbonTable.getTableName)
    +    } else {
    +      LOGGER.error("Not able to acquire the lock for stream table:" +
    +                   carbonTable.getDatabaseName + "." + carbonTable.getTableName)
    +      throw new InterruptedException(
    +        "Not able to acquire the lock for stream table: " + carbonTable.getDatabaseName + "." +
    +        carbonTable.getTableName)
    +    }
    +  }
    +
    +  /**
    +   * unlock for stream table
    +   */
    +  def unLockStreamTable(): Unit = {
    +    if (null != lock) {
    +      lock.unlock()
    +      LOGGER.info("unlock for stream table: " +
    +                  carbonTable.getDatabaseName + "." +
    +                  carbonTable.getTableName)
    +    }
    +  }
    +
    +  def initialize(): Unit = {
    +    carbonAppendableStreamSink = StreamSinkFactory.createStreamTableSink(
    +      sparkSession,
    +      configuration,
    +      carbonTable,
    +      extraOptions.toMap).asInstanceOf[CarbonAppendableStreamSink]
    +
    +    lockStreamTable()
    +
    +    isInitialize = true
    +  }
    +
    +  def writeStreamData(dataFrame: DataFrame, time: Time): Unit = {
    +    if (!isInitialize) {
    +      initialize()
    +    }
    +    carbonAppendableStreamSink.addBatch(time.milliseconds, dataFrame)
    +  }
    +
    +  private val extraOptions = new scala.collection.mutable.HashMap[String, String]
    +  private var mode: SaveMode = SaveMode.ErrorIfExists
    +
    +  /**
    +   * Specifies the behavior when data or table already exists. Options include:
    +   *   - `SaveMode.Overwrite`: overwrite the existing data.
    +   *   - `SaveMode.Append`: append the data.
    +   *   - `SaveMode.Ignore`: ignore the operation (i.e. no-op).
    +   *   - `SaveMode.ErrorIfExists`: default option, throw an exception at runtime.
    +   */
    +  def mode(saveMode: SaveMode): CarbonStreamSparkStreamingWriter = {
    +    if (mode == SaveMode.ErrorIfExists) {
    +      mode = saveMode
    +    }
    +    this
    +  }
    +
    +  /**
    +   * Specifies the behavior when data or table already exists. Options include:
    +   *   - `SaveMode.Overwrite`: overwrite the existing data.
    --- End diff --
    
    Done


---

[GitHub] carbondata pull request #1867: [CARBONDATA-2055][Streaming]Support integrati...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/1867


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3476/



---

[GitHub] carbondata pull request #1867: [CARBONDATA-2055][Streaming][WIP]Support inte...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1867#discussion_r170856116
  
    --- Diff: streaming/src/main/scala/org/apache/carbondata/streaming/CarbonStreamSparkStreaming.scala ---
    @@ -0,0 +1,187 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.streaming
    +
    +import java.util
    +
    +import scala.collection.JavaConverters._
    +
    +import org.apache.hadoop.conf.Configuration
    +import org.apache.spark.sql.DataFrame
    +import org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink
    +import org.apache.spark.sql.execution.streaming.Sink
    +import org.apache.spark.sql.SaveMode
    +import org.apache.spark.sql.SparkSession
    +import org.apache.spark.streaming.Time
    +
    +import org.apache.carbondata.common.logging.LogServiceFactory
    +import org.apache.carbondata.core.locks.{CarbonLockFactory, ICarbonLock, LockUsage}
    +import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    +
    +class CarbonStreamSparkStreamingWriter {
    +
    +  private val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
    +
    +  private var isInitialize: Boolean = false
    +
    +  private var lock: ICarbonLock = null
    +  private var carbonTable: CarbonTable = null
    +  private var configuration: Configuration = null
    +  private var carbonAppendableStreamSink: Sink = null
    +  private val sparkSession: SparkSession = SparkSession.builder().getOrCreate()
    +
    +  def this(carbonTable: CarbonTable, configuration: Configuration) {
    +    this()
    +    this.carbonTable = carbonTable
    +    this.configuration = configuration
    +    this.option("dbName", carbonTable.getDatabaseName)
    +    this.option("tableName", carbonTable.getTableName)
    +  }
    +
    +  /**
    +   * Acquired the lock for stream table
    +   */
    +  def lockStreamTable(): Unit = {
    +    lock = CarbonLockFactory.getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier,
    +      LockUsage.STREAMING_LOCK)
    +    if (lock.lockWithRetries()) {
    +      LOGGER.info("Acquired the lock for stream table: " +
    +                  carbonTable.getDatabaseName + "." +
    +                  carbonTable.getTableName)
    +    } else {
    +      LOGGER.error("Not able to acquire the lock for stream table:" +
    +                   carbonTable.getDatabaseName + "." + carbonTable.getTableName)
    +      throw new InterruptedException(
    +        "Not able to acquire the lock for stream table: " + carbonTable.getDatabaseName + "." +
    +        carbonTable.getTableName)
    +    }
    +  }
    +
    +  /**
    +   * unlock for stream table
    +   */
    +  def unLockStreamTable(): Unit = {
    +    if (null != lock) {
    +      lock.unlock()
    +      LOGGER.info("unlock for stream table: " +
    +                  carbonTable.getDatabaseName + "." +
    +                  carbonTable.getTableName)
    +    }
    +  }
    +
    +  def initialize(): Unit = {
    +    carbonAppendableStreamSink = StreamSinkFactory.createStreamTableSink(
    +      sparkSession,
    +      configuration,
    +      carbonTable,
    +      extraOptions.toMap).asInstanceOf[CarbonAppendableStreamSink]
    +
    +    lockStreamTable()
    +
    +    isInitialize = true
    +  }
    +
    +  def writeStreamData(dataFrame: DataFrame, time: Time): Unit = {
    +    if (!isInitialize) {
    +      initialize()
    +    }
    +    carbonAppendableStreamSink.addBatch(time.milliseconds, dataFrame)
    +  }
    +
    +  private val extraOptions = new scala.collection.mutable.HashMap[String, String]
    +  private var mode: SaveMode = SaveMode.ErrorIfExists
    +
    +  /**
    +   * Specifies the behavior when data or table already exists. Options include:
    +   *   - `SaveMode.Overwrite`: overwrite the existing data.
    +   *   - `SaveMode.Append`: append the data.
    +   *   - `SaveMode.Ignore`: ignore the operation (i.e. no-op).
    +   *   - `SaveMode.ErrorIfExists`: default option, throw an exception at runtime.
    +   */
    +  def mode(saveMode: SaveMode): CarbonStreamSparkStreamingWriter = {
    +    if (mode == SaveMode.ErrorIfExists) {
    +      mode = saveMode
    +    }
    +    this
    +  }
    +
    +  /**
    +   * Specifies the behavior when data or table already exists. Options include:
    +   *   - `SaveMode.Overwrite`: overwrite the existing data.
    --- End diff --
    
    The saveMode parameter is a String here, so please change the description accordingly.
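
As an illustration of the String-based overload this comment refers to, the sketch below maps a save-mode string to a mode value. It assumes the same spellings that Spark's `DataFrameWriter.mode(String)` accepts ("overwrite", "append", "ignore", "error" or "default"); the `Mode` trait and `ModeParser` object are hypothetical stand-ins for `org.apache.spark.sql.SaveMode` and the writer's method, not the code in the PR.

```scala
import java.util.Locale

// Hypothetical stand-in for org.apache.spark.sql.SaveMode.
sealed trait Mode
case object Overwrite extends Mode
case object Append extends Mode
case object Ignore extends Mode
case object ErrorIfExists extends Mode

object ModeParser {
  // Case-insensitive mapping from the string form to a mode value.
  def parse(saveMode: String): Mode = saveMode.toLowerCase(Locale.ROOT) match {
    case "overwrite" => Overwrite
    case "append" => Append
    case "ignore" => Ignore
    case "error" | "default" => ErrorIfExists
    case other =>
      throw new IllegalArgumentException(
        s"Unknown save mode: $other. Accepted modes are 'overwrite', 'append', 'ignore', 'error'.")
  }
}
```

With a String parameter, the scaladoc would list these string values rather than the `SaveMode.*` constants, which appears to be the change the reviewer is asking for.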


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3414/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming]Support integrating Stre...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2729/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming]Support integrating Stre...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3974/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3481/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3571/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3285/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3157/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Can anyone help review this PR?


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3415/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3363/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2387/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3627/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2414/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2279/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    retest sdv please


---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3178/



---

[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1867
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1924/



---