Posted to dev@carbondata.apache.org by Sea <26...@qq.com> on 2017/10/07 11:56:54 UTC
Re: [DISCUSSION] Support Database Location Configuration while Creating Database
Hi, Shahid:
I think you misunderstood my meaning. The databaseLocation you mentioned is like carbon.storeLocation, not like databaseLocation in Hive.
Your databaseLocation is like `hive.metastore.warehouse.dir`.
The default behavior in Spark (Hive):
If we do not specify a database location:
databaseLocation = hive.metastore.warehouse.dir (or spark.sql.warehouse.dir) + "/" + databaseName + ".db"
So databaseLocation is unique.
If we do not specify a table location:
tableLocation = databaseLocation + "/" + tableName
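
A minimal sketch of the default Hive/Spark resolution described above, written in Python purely for illustration. The function names and the example warehouse path are hypothetical and not part of Hive, Spark or Carbon.

```python
import posixpath


def default_database_location(warehouse_dir, database_name):
    """Default location of a database created without a LOCATION clause.

    warehouse_dir stands for hive.metastore.warehouse.dir
    (or spark.sql.warehouse.dir); the ".db" suffix keeps it unique
    per database.
    """
    return posixpath.join(warehouse_dir, database_name + ".db")


def default_table_location(database_location, table_name):
    """Default location of a table created without a LOCATION clause."""
    return posixpath.join(database_location, table_name)


# Hypothetical warehouse directory, used only for this example.
db_location = default_database_location("/user/hive/warehouse", "sales")
print(default_table_location(db_location, "orders"))
```

Because the database name is already part of databaseLocation, the table path does not need to repeat it.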
------------------ Original ------------------
From: "mohdshahidkhan1987";<mo...@gmail.com>;
Date: Fri, Oct 6, 2017 08:40 PM
To: "dev"<de...@carbondata.apache.org>;
Subject: Re: [DISCUSSION] Support Database Location Configuration while Creating Database
Hi Sea,
1. CREATE DATABASE with a location is supported by Spark (Hive) only; Carbon
will not have its own implementation of CREATE DATABASE. It is mentioned here
just for reference regarding the location attribute.
2. Why does Carbon want to keep `tablePath = databaseLocation + "/" +
database_Name + "/" + tableName`?
There is a problem if we keep the tablePath the same as Hive. For
CarbonFileMetaStore, Carbon creates
the schema file at <TablePath>/Metadata/schema.
If Carbon skips adding the databaseName, then two tables with the same name
from two different databases pointing to the same database location will
cause problems during table creation, load and query.
Even in Hive, if two tables with the same name are created in different
databases that point to the same location, then when either table is
queried, the content of both tables is shown.
3. What does `Carbon.update.sync.folder` mean?
This configures the directory for modifiedTime.mdt.
Earlier, the directory path for modifiedTime.mdt was fixed to
carbon.storeLocation, but what if the user decides to remove the name
service of carbon.storeLocation?
This is required for a federation cluster, where multiple name services
will be available. If the name service that holds the directory for
modifiedTime.mdt is removed, the directory can be changed to another
location.
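
The collision described in point 2 can be sketched as follows. This is an illustration in Python, not Carbon code: the helper names and the shared location `/data/shared` are hypothetical, and only the `<TablePath>/Metadata/schema` layout comes from the text above.

```python
import posixpath


def schema_file(table_path):
    # Layout taken from the discussion: <TablePath>/Metadata/schema
    return posixpath.join(table_path, "Metadata", "schema")


def table_path_without_db_name(database_location, table_name):
    # Hive-style path: the database name is not part of the table path.
    return posixpath.join(database_location, table_name)


def table_path_with_db_name(database_location, database_name, table_name):
    # Path with the database name added between location and table name.
    return posixpath.join(database_location, database_name, table_name)


shared = "/data/shared"  # hypothetical location used by both db1 and db2

# db1.t1 and db2.t1 resolve to the same schema file without the db name.
print(schema_file(table_path_without_db_name(shared, "t1")))

# With the database name included, the two schema files are distinct.
print(schema_file(table_path_with_db_name(shared, "db1", "t1")))
print(schema_file(table_path_with_db_name(shared, "db2", "t1")))
```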
Regards,
Shahid
--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
Re: [DISCUSSION] Support Database Location Configuration while Creating Database
Posted by Mohammad Shahid Khan <mo...@gmail.com>.
Hi Dev,
Please find the updated design for "*Support Database Location
Configuration while Creating Database*".
Changes:
Carbon will follow the same approach as Hive.
The table path should be formed from database location or fixed Carbon
store location and table name as given below.
*There will be three possible scenarios:*
I. Table path for databases defined with the location attribute.
tablePath = databaseLocation + "/" + tableName
II. Table path for databases defined without the location attribute.
tablePath = carbon.storeLocation + "/" + database_Name + ".db" + "/" + tableName
III. New table path for the default database.
tablePath = carbon.storeLocation + "/" + tableName
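
The three scenarios can be sketched as a single resolution function. This is only an illustration in Python under stated assumptions: `resolve_table_path`, its parameters and the example paths are hypothetical, and the default database is assumed to be named "default".

```python
import posixpath

DEFAULT_DATABASE = "default"  # assumed name of the default database


def resolve_table_path(store_location, database_name, table_name,
                       database_location=None):
    """Return the table path for the three scenarios listed above.

    store_location stands for carbon.storeLocation; database_location
    is the LOCATION given at CREATE DATABASE time, or None if absent.
    """
    if database_location is not None:
        # Scenario I: database defined with the location attribute.
        return posixpath.join(database_location, table_name)
    if database_name == DEFAULT_DATABASE:
        # Scenario III: default database, no ".db" directory.
        return posixpath.join(store_location, table_name)
    # Scenario II: database defined without the location attribute.
    return posixpath.join(store_location, database_name + ".db", table_name)


store = "/carbon/store"  # hypothetical carbon.storeLocation
print(resolve_table_path(store, "sales", "orders", "/data/sales_db"))
print(resolve_table_path(store, "sales", "orders"))
print(resolve_table_path(store, "default", "orders"))
```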
Regards,
Shahid