
Posted to dev@carbondata.apache.org by dylan <dy...@163.com> on 2017/09/12 08:48:16 UTC

Fw:carbonthriftserver can not be load many times

-------- Forwarding messages --------
From: "dylan" <dy...@163.com>
Date: 2017-09-12 16:25:56
To: user <us...@carbondata.apache.org>
Subject: carbonthriftserver can not be load many times

Hello,
     When I use CarbonData, I follow these steps:
        1. Create a table and load data.
        2. Through CarbonThriftServer, run "select * from table limit 1" (this works).
        3. Update the table.
        4. Through CarbonThriftServer, run "select * from table limit 1" again (this fails). The error is:


       I know that CarbonThriftServer uses a B-tree to cache the carbonindex.
       When I update the table the index changes, but CarbonThriftServer does not know that it has changed,
       so I have to restart CarbonThriftServer every time. Have you run into this problem?
       Is this a design flaw, or is there a better way to solve this problem? Thanks!

Re: Fw:carbonthriftserver can not be load many times

Posted by dylan <dy...@163.com>.

Hello ravipesala,
    Thanks for your reply.
    I am using CarbonData 1.1.0 and Spark 1.6.0.
    I reproduced the problem again by following the official quick-start guide:
    1. Create a table
    cc.sql("create table IF NOT EXISTS carbondb.test_table(id string, name String, city String, age int) stored by 'carbondata'")

    2. Load data into the table
    cc.sql("load data inpath 'hdfs://nameservice1/user/zz/sample.csv' into table carbondb.test_table")

    3. Start CarbonThriftServer
    /home/zz/spark-1.6.0-bin-hadoop2.6/bin/spark-submit --master local[*] \
      --driver-java-options="-Dcarbon.properties.filepath=/home/zz/spark-1.6.0-bin-hadoop2.6/conf/carbon.properties" \
      --executor-memory 4G --driver-memory 2g \
      --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
      --conf "spark.sql.shuffle.partitions=3" --conf spark.speculation=true \
      --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
      /home/zz/spark-1.6.0-bin-hadoop2.6/carbonlib/carbondata_2.10-1.1.0-shade-hadoop2.2.0.jar \
      hdfs://nameservice1/user/zz/rp_carbon_store

    4. Connect to CarbonThriftServer using Beeline
    <http://chuantu.biz/t6/47/1505284993x2890202558.jpg>

    5. Drop the table
    cc.sql("drop table carbondb.test_table")

    6. Recreate the table and load data
    cc.sql("create table IF NOT EXISTS carbondb.test_table(id string, name String, city String, age int) stored by 'carbondata'")
    cc.sql("load data inpath 'hdfs://nameservice1/user/zz/sample.csv' into table carbondb.test_table")

    7. Select data using Beeline
    <http://chuantu.biz/t6/47/1505284937x1034817476.jpg>
    As the error above shows, the cache is not updated.


Could you please take a look? Thank you!



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: Fw:carbonthriftserver can not be load many times

Posted by Ravindra Pesala <ra...@gmail.com>.

Hi,

I think you are using a MySQL Hive metastore and have the thrift server and
spark-shell connected at the same time, so two drivers are accessing the
carbon store simultaneously and changing its metadata. It seems there are
some refresh issues in Carbon in this case. Please raise a JIRA ticket and
we will look into it.
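
To illustrate the kind of refresh issue that two drivers sharing one store can run into, here is a minimal, self-contained Scala sketch. It is only a toy model under stated assumptions, not CarbonData's actual cache code: `Store`, `Driver`, `checkVersion` and the version counter are all hypothetical names invented for this example. One driver keeps serving whatever it cached first, while the other compares a store-side version before trusting its cache.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the shared store: table name -> (version, rows).
// The version increases on every write, like a last-modified timestamp.
object Store {
  private val tables = mutable.Map.empty[String, (Long, Seq[String])]
  private var clock = 0L

  def write(table: String, rows: Seq[String]): Unit = {
    clock += 1
    tables(table) = (clock, rows)
  }
  def version(table: String): Long = tables(table)._1
  def read(table: String): Seq[String] = tables(table)._2
}

// Hypothetical driver-side cache. With checkVersion = false it never notices
// writes made through another driver; with checkVersion = true it reloads
// whenever the store's version differs from the cached one.
class Driver(checkVersion: Boolean) {
  private val cache = mutable.Map.empty[String, (Long, Seq[String])]

  def select(table: String): Seq[String] = {
    val stale = cache.get(table).exists { case (v, _) =>
      checkVersion && v != Store.version(table)
    }
    if (!cache.contains(table) || stale) {
      cache(table) = (Store.version(table), Store.read(table))
    }
    cache(table)._2
  }
}

object StaleCacheDemo extends App {
  val naive = new Driver(checkVersion = false)
  val refreshing = new Driver(checkVersion = true)

  Store.write("test_table", Seq("a"))
  naive.select("test_table")      // both drivers now hold the first version
  refreshing.select("test_table")

  // Stands in for another driver dropping, recreating and reloading the table.
  Store.write("test_table", Seq("b"))

  println(naive.select("test_table"))      // still the old rows
  println(refreshing.select("test_table")) // the new rows
}
```

In this toy model, restarting the naive driver corresponds to creating a new `Driver`, which is why a restart appears to fix the problem.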

Regards,
Ravindra.

On 15 September 2017 at 14:27, dylan <dy...@163.com> wrote:

> Hi Ravi,
>    In my case, steps 1, 2, 5 and 6 were run in one spark-shell session, and
> steps 4 and 7 in one Beeline session.
>
>  Following your suggestion, I tested this case on the current master branch.
>
>  With Beeline there is no longer a "Btree load failed" message, but the table
> shows no data: every value is NULL.
> In spark-shell the query works fine.
> 0: jdbc:hive2://localhost:10000> select * from  carbondb.test_table;
> +-------+-------+-------+-------+--+
> |  id   | name  | city  |  age  |
> +-------+-------+-------+-------+--+
> | NULL  | NULL  | NULL  | NULL  |
> | NULL  | NULL  | NULL  | NULL  |
> | NULL  | NULL  | NULL  | NULL  |
> +-------+-------+-------+-------+--+
>
> Thanks!
>



-- 
Thanks & Regards,
Ravi

Re: Fw:carbonthriftserver can not be load many times

Posted by dylan <dy...@163.com>.

Hi Ravi,
   In my case, steps 1, 2, 5 and 6 were run in one spark-shell session, and
steps 4 and 7 in one Beeline session.

 Following your suggestion, I tested this case on the current master branch.

 With Beeline there is no longer a "Btree load failed" message, but the table
shows no data: every value is NULL.
In spark-shell the query works fine.
0: jdbc:hive2://localhost:10000> select * from  carbondb.test_table;
+-------+-------+-------+-------+--+
|  id   | name  | city  |  age  |
+-------+-------+-------+-------+--+
| NULL  | NULL  | NULL  | NULL  |
| NULL  | NULL  | NULL  | NULL  |
| NULL  | NULL  | NULL  | NULL  |
+-------+-------+-------+-------+--+


thanks!






Re: Fw:carbonthriftserver can not be load many times

Posted by Ravindra Pesala <ra...@gmail.com>.

Hi,

I am a little confused here.

Were steps 1 and 2 done through one Beeline session, and steps 3, 4 and 5
from another Beeline session?

Also, can you try this on the current master branch and check whether the
same issue exists?


Regards,
Ravindra.

On 13 September 2017 at 15:14, dylan <dy...@163.com> wrote:

> Hello ravipesala,
>     Thanks for your reply.
>     I am using CarbonData 1.1.0 and Spark 1.6.0.
>     I reproduced the problem again by following the official quick-start guide:
>
>     1. Create a table
>     cc.sql("create table IF NOT EXISTS carbondb.test_table(id string, name String, city String, age int) stored by 'carbondata'")
>
>     2. Load data into the table
>     cc.sql("load data inpath 'hdfs://nameservice1/user/zz/sample.csv' into table carbondb.test_table")
>
>     3. Start CarbonThriftServer
>     /home/zz/spark-1.6.0-bin-hadoop2.6/bin/spark-submit --master local[*] \
>       --driver-java-options="-Dcarbon.properties.filepath=/home/zz/spark-1.6.0-bin-hadoop2.6/conf/carbon.properties" \
>       --executor-memory 4G --driver-memory 2g \
>       --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
>       --conf "spark.sql.shuffle.partitions=3" --conf spark.speculation=true \
>       --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
>       /home/zz/spark-1.6.0-bin-hadoop2.6/carbonlib/carbondata_2.10-1.1.0-shade-hadoop2.2.0.jar \
>       hdfs://nameservice1/user/zz/rp_carbon_store
>
>     4. Connect to CarbonThriftServer using Beeline
>     <http://chuantu.biz/t6/47/1505284993x2890202558.jpg>
>
>     5. Drop the table
>     cc.sql("drop table carbondb.test_table")
>
>     6. Recreate the table and load data
>     cc.sql("create table IF NOT EXISTS carbondb.test_table(id string, name String, city String, age int) stored by 'carbondata'")
>     cc.sql("load data inpath 'hdfs://nameservice1/user/zz/sample.csv' into table carbondb.test_table")
>
>     7. Select data using Beeline
>     <http://chuantu.biz/t6/47/1505284937x1034817476.jpg>
>     As the error above shows, the cache is not updated.
>
>     One last question: if I skip step 5 and simply load the data again,
> queries work, but the new data is appended to the existing data rather than
> replacing it. Is this by design, or is it a bug?
>
> Could you please take a look? Thank you!
>



-- 
Thanks & Regards,
Ravi

Re: Fw:carbonthriftserver can not be load many times

Posted by dylan <dy...@163.com>.

Hello ravipesala,
    Thanks for your reply.
    I am using CarbonData 1.1.0 and Spark 1.6.0.
    I reproduced the problem again by following the official quick-start guide:
    1. Create a table
    cc.sql("create table IF NOT EXISTS carbondb.test_table(id string, name String, city String, age int) stored by 'carbondata'")

    2. Load data into the table
    cc.sql("load data inpath 'hdfs://nameservice1/user/zz/sample.csv' into table carbondb.test_table")

    3. Start CarbonThriftServer
    /home/zz/spark-1.6.0-bin-hadoop2.6/bin/spark-submit --master local[*] \
      --driver-java-options="-Dcarbon.properties.filepath=/home/zz/spark-1.6.0-bin-hadoop2.6/conf/carbon.properties" \
      --executor-memory 4G --driver-memory 2g \
      --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
      --conf "spark.sql.shuffle.partitions=3" --conf spark.speculation=true \
      --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
      /home/zz/spark-1.6.0-bin-hadoop2.6/carbonlib/carbondata_2.10-1.1.0-shade-hadoop2.2.0.jar \
      hdfs://nameservice1/user/zz/rp_carbon_store

    4. Connect to CarbonThriftServer using Beeline
    <http://chuantu.biz/t6/47/1505284993x2890202558.jpg>

    5. Drop the table
    cc.sql("drop table carbondb.test_table")

    6. Recreate the table and load data
    cc.sql("create table IF NOT EXISTS carbondb.test_table(id string, name String, city String, age int) stored by 'carbondata'")
    cc.sql("load data inpath 'hdfs://nameservice1/user/zz/sample.csv' into table carbondb.test_table")

    7. Select data using Beeline
    <http://chuantu.biz/t6/47/1505284937x1034817476.jpg>
    As the error above shows, the cache is not updated.

    One last question: if I skip step 5 and simply load the data again,
queries work, but the new data is appended to the existing data rather than
replacing it. Is this by design, or is it a bug?


Could you please take a look? Thank you!





Re: Fw:carbonthriftserver can not be load many times

Posted by Ravindra Pesala <ra...@gmail.com>.

Hi,

This is not the intended behavior of CarbonData; it must be a bug. Normally,
after an update the cache is refreshed for the next query.
Please provide the following information:
1. The CarbonData and Spark versions you are using.
2. A test case to reproduce this issue.

Regards,
Ravindra.

On 12 September 2017 at 14:18, dylan <dy...@163.com> wrote:

> -------- Forwarding messages --------
> From: "dylan" <dy...@163.com>
> Date: 2017-09-12 16:25:56
> To: user <us...@carbondata.apache.org>
> Subject: carbonthriftserver can not be load many times
>
> Hello,
>      When I use CarbonData, I follow these steps:
>         1. Create a table and load data.
>         2. Through CarbonThriftServer, run "select * from table limit 1"
> (this works).
>         3. Update the table.
>         4. Through CarbonThriftServer, run "select * from table limit 1"
> again (this fails). The error is:
>
>        I know that CarbonThriftServer uses a B-tree to cache the carbonindex.
>        When I update the table the index changes, but CarbonThriftServer
> does not know that it has changed,
>        so I have to restart CarbonThriftServer every time. Have you run into
> this problem?
>        Is this a design flaw, or is there a better way to solve this
> problem? Thanks!
>



-- 
Thanks & Regards,
Ravi