You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Oleksandr Polishchuk (JIRA)" <ji...@apache.org> on 2019/02/21 08:56:00 UTC
[jira] [Created] (HIVE-21302) datanucleus.schema.autoCreateAll=true
will not work properly as schematool
Oleksandr Polishchuk created HIVE-21302:
-------------------------------------------
Summary: datanucleus.schema.autoCreateAll=true will not work properly as schematool
Key: HIVE-21302
URL: https://issues.apache.org/jira/browse/HIVE-21302
Project: Hive
Issue Type: Bug
Affects Versions: 2.3.4, 2.1.0
Reporter: Oleksandr Polishchuk
The bug was found while working with configured environment: {{Apache Hadoop cluster}} and {{Apache Hive}} (the same issue on both versions 2.1 and 2.3).
Configuration of working environment was performed in the following steps:
# Installed Apache Hadoop cluster
# Installed over cluster - Hive component
# Configured properties in {{hive-site.xml}}
{code:xml}
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=test_db;create=true</value>
<description>the URL of the Derby database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.apache.derby.jdbc.EmbeddedDriver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/usr/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
<description>creates necessary schema on a startup if one doesnt exist.</description>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
{code}
# Installed {{[IJ|http://db.apache.org/derby/papers/DerbyTut/index.html]}} utility to work with embedded DB, according to steps in the guide.
# Launched services: {{hiveserver2}} and {{metastore}}
**
{code:java}
hive --service hiveserver2
{code}
** {code:java}
hive --service metastore
{code}
# The next step was to start the utility in the next steps:
**
{code:java}
$ cd $DERBY_INSTALL/bin
{code}
**
{code:java}
$ . setEmbeddedCP
{code}
**
{code:java}
$ echo $CLASSPATH /opt/Apache/db-derby-10.14.2.0-bin/lib/derby.jar:/opt/Apache/db-derby-10.14.2.0-bin/lib/derbytools.jar:
{code}
** Start up {{ij}} with this command: {{./ij}}
{code:java}
connect 'jdbc:derby:/opt/hadoop/hive/metastore_db';
{code}
But after performing that command appears the next error.
{code:java}
ij> connect 'jdbc:derby:/opt/hadoop/hive/metastore_db';
ERROR XJ040: Failed to start database '/opt/hadoop/hive/metastore_db' with class loader sun.misc.Launcher$AppClassLoader@5e2de80c, see the next exception for details.
ERROR XSDB6: Another instance of Derby may have already booted the database /opt/hadoop/hive/metastore_db.
{code}
To resolve this error need to stop all hive services, like hiveserver2 and metastore. Whereupon again enter the command for connect to DB connect {{'jdbc:derby:/opt/hadoop/hive/metastore_db';}} that was successful.
# As far as metastore service was launched, he generated all tables lists in the [package.jdo|https://github.com/apache/hive/blob/branch-2.3/metastore/src/model/package.jdo] file, which we could see if perform the next command in {{ij}} utility: {{ij(CONNECTION1)> show tables;}}
{code:java}
ij> connect 'jdbc:derby:/opt/hadoop/hive/metastore_db';
ij(CONNECTION1)> show tables;
TABLE_SCHEM |TABLE_NAME |REMARKS
------------------------------------------------------------------------
SYS |SYSALIASES |
SYS |SYSCHECKS |
SYS |SYSCOLPERMS |
SYS |SYSCOLUMNS |
SYS |SYSCONGLOMERATES |
SYS |SYSCONSTRAINTS |
SYS |SYSDEPENDS |
SYS |SYSFILES |
SYS |SYSFOREIGNKEYS |
SYS |SYSKEYS |
SYS |SYSPERMS |
SYS |SYSROLES |
SYS |SYSROUTINEPERMS |
SYS |SYSSCHEMAS |
SYS |SYSSEQUENCES |
SYS |SYSSTATEMENTS |
SYS |SYSSTATISTICS |
SYS |SYSTABLEPERMS |
SYS |SYSTABLES |
SYS |SYSTRIGGERS |
SYS |SYSUSERS |
SYS |SYSVIEWS |
SYSIBM |SYSDUMMY1 |
APP |BUCKETING_COLS |
APP |CDS |
APP |COLUMNS_V2 |
APP |DATABASE_PARAMS |
APP |DBS |
APP |GLOBAL_PRIVS |
APP |PARTITIONS |
APP |PARTITION_KEYS |
APP |PARTITION_KEY_VALS |
APP |PARTITION_PARAMS |
APP |PART_COL_STATS |
APP |ROLES |
APP |SDS |
APP |SD_PARAMS |
APP |SEQUENCE_TABLE |
APP |SERDES |
APP |SERDE_PARAMS |
APP |SKEWED_COL_NAMES |
APP |SKEWED_COL_VALUE_LOC_MAP |
APP |SKEWED_STRING_LIST |
APP |SKEWED_STRING_LIST_VALUES |
APP |SKEWED_VALUES |
APP |SORT_COLS |
APP |TABLE_PARAMS |
APP |TAB_COL_STATS |
APP |TBLS |
APP |VERSION |
50 rows selected
{code}
# But if instead of start/stop services use the next command: {{/opt/hadoop/hive/bin/schematool -dbType derby -initSchema}}.
The result will be fundamentally different because the this command involve {{hive-schema-2.1.0.mysql.sql}} script that generate the following tables:
{code:java}
ij(CONNECTION1)> show tables;
TABLE_SCHEM |TABLE_NAME |REMARKS
------------------------------------------------------------------------
SYS |SYSALIASES |
SYS |SYSCHECKS |
SYS |SYSCOLPERMS |
SYS |SYSCOLUMNS |
SYS |SYSCONGLOMERATES |
SYS |SYSCONSTRAINTS |
SYS |SYSDEPENDS |
SYS |SYSFILES |
SYS |SYSFOREIGNKEYS |
SYS |SYSKEYS |
SYS |SYSPERMS |
SYS |SYSROLES |
SYS |SYSROUTINEPERMS |
SYS |SYSSCHEMAS |
SYS |SYSSEQUENCES |
SYS |SYSSTATEMENTS |
SYS |SYSSTATISTICS |
SYS |SYSTABLEPERMS |
SYS |SYSTABLES |
SYS |SYSTRIGGERS |
SYS |SYSUSERS |
SYS |SYSVIEWS |
SYSIBM |SYSDUMMY1 |
APP |AUX_TABLE |
APP |BUCKETING_COLS |
APP |CDS |
APP |COLUMNS |
APP |COLUMNS_V2 |
APP |COMPACTION_QUEUE |
APP |COMPLETED_COMPACTIONS |
APP |COMPLETED_TXN_COMPONENTS |
APP |DATABASE_PARAMS |
APP |DBS |
APP |DB_PRIVS |
APP |DELEGATION_TOKENS |
APP |FUNCS |
APP |FUNC_RU |
APP |GLOBAL_PRIVS |
APP |HIVE_LOCKS |
APP |IDXS |
APP |INDEX_PARAMS |
APP |KEY_CONSTRAINTS |
APP |MASTER_KEYS |
APP |NEXT_COMPACTION_QUEUE_ID |
APP |NEXT_LOCK_ID |
APP |NEXT_TXN_ID |
APP |NOTIFICATION_LOG |
APP |NOTIFICATION_SEQUENCE |
APP |NUCLEUS_TABLES |
APP |PARTITIONS |
APP |PARTITION_EVENTS |
APP |PARTITION_KEYS |
APP |PARTITION_KEY_VALS |
APP |PARTITION_PARAMS |
APP |PART_COL_PRIVS |
APP |PART_COL_STATS |
APP |PART_PRIVS |
APP |ROLES |
APP |ROLE_MAP |
APP |SDS |
APP |SD_PARAMS |
APP |SEQUENCE_TABLE |
APP |SERDES |
APP |SERDE_PARAMS |
APP |SKEWED_COL_NAMES |
APP |SKEWED_COL_VALUE_LOC_MAP |
APP |SKEWED_STRING_LIST |
APP |SKEWED_STRING_LIST_VALUES |
APP |SKEWED_VALUES |
APP |SORT_COLS |
APP |TABLE_PARAMS |
APP |TAB_COL_STATS |
APP |TBLS |
APP |TBL_COL_PRIVS |
APP |TBL_PRIVS |
APP |TXNS |
APP |TXN_COMPONENTS |
APP |TYPES |
APP |TYPE_FIELDS |
APP |VERSION |
APP |WRITE_SET |
81 rows selected
{code}
As a result we could see that property that was used in the {{hive-site.xml}} doesn’t work properly and we should avoid using her. That is we must use only {{schematool}}.
{code:java}
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
<description>creates necessary schema on a startup if one doesnt exist.</description>
</property>
{code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)