Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/10/22 06:16:19 UTC
[GitHub] [hudi] mutoulbj opened a new issue #3845: [SUPPORT]`if not exists` doesn't work on create table in spark-sql
mutoulbj opened a new issue #3845:
URL: https://github.com/apache/hudi/issues/3845
**Describe the problem you faced**
When `create table if not exists` is used to create a Hudi table, an error is raised if the table already exists.
**To Reproduce**
Steps to reproduce the behavior:
1. Run `create table if not exists ttt (id bigint, name string) using hudi options (type='cow', primaryKey='id');`. The `ttt` table is created successfully.
2. Rerun the same statement. It fails with: `Error in query: Specified schema in create table statement is not equal to the table schema.You should not specify the schema for an exist table: `dev`.`ttt` ;`
**Expected behavior**
No exception should be raised when the table already exists.
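The expected semantics can be sketched in a few lines of Python. This is a toy in-memory catalog, not Hudi or Spark code; `Catalog`, `create_table` and `TableAlreadyExists` are hypothetical names used only for illustration. The point is that when `if_not_exists` is set and the table is already present, the call returns without comparing schemas.

```python
from dataclasses import dataclass, field


class TableAlreadyExists(Exception):
    """Raised when a table exists and IF NOT EXISTS was not given."""


@dataclass
class Catalog:
    # Maps table name -> list of (column name, column type) pairs.
    tables: dict = field(default_factory=dict)

    def create_table(self, name, schema, if_not_exists=False):
        """Create a table; return True if created, False if skipped."""
        if name in self.tables:
            if if_not_exists:
                # Existing table: do nothing and do not validate the schema.
                return False
            raise TableAlreadyExists(name)
        self.tables[name] = list(schema)
        return True


catalog = Catalog()
schema = [("id", "bigint"), ("name", "string")]
first = catalog.create_table("ttt", schema, if_not_exists=True)
second = catalog.create_table("ttt", schema, if_not_exists=True)
print(first, second)
```

Under this model the second call is a no-op, which is the behavior the issue asks for.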
**Environment Description**
* Hudi version : 0.9.0
* Spark version : 3.0.3
* Hive version : 3.1.2
* Hadoop version : 3.2.2
* Storage (HDFS/S3/GCS..) : HDFS
* Running on Docker? (yes/no) : no
**Additional context**
It seems Hudi does not ignore the Hudi meta columns when it compares the schema in the statement with the schema of the existing table.
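To illustrate the suspicion: the `create_table` entry in the log shows the stored table schema with five `_hoodie_*` meta columns ahead of `id` and `name`, while the statement declares only `id` and `name`. Below is a minimal Python sketch of a comparison that ignores those meta columns. It is not Hudi's actual code; the helper names `strip_meta_columns` and `schemas_match` are hypothetical, and the meta column names are copied from the log.

```python
# Meta column names as they appear in the create_table log entry.
HOODIE_META_COLUMNS = {
    "_hoodie_commit_time",
    "_hoodie_commit_seqno",
    "_hoodie_record_key",
    "_hoodie_partition_path",
    "_hoodie_file_name",
}


def strip_meta_columns(schema):
    """Return the schema without Hudi meta columns, preserving order."""
    return [(name, typ) for name, typ in schema if name not in HOODIE_META_COLUMNS]


def schemas_match(declared, stored):
    """Compare a declared schema with a stored one, ignoring meta columns."""
    return strip_meta_columns(declared) == strip_meta_columns(stored)


# Schema from the CREATE TABLE statement.
declared = [("id", "bigint"), ("name", "string")]

# Schema as stored in the metastore: meta columns first, then user columns.
stored = [
    ("_hoodie_commit_time", "string"),
    ("_hoodie_commit_seqno", "string"),
    ("_hoodie_record_key", "string"),
    ("_hoodie_partition_path", "string"),
    ("_hoodie_file_name", "string"),
] + declared

naive_equal = declared == stored
meta_aware_equal = schemas_match(declared, stored)
print(naive_equal, meta_aware_equal)
```

A naive equality check treats the two schemas as different, while the meta-aware check treats them as equal. That difference would explain the reported error if the comparison is done on the full stored schema.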
**Stacktrace**
```
> create table if not exists ttt (id bigint, name string) using hudi options (type='cow', primaryKey='id');
1211789 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_database: dev
1211789 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=get_database: dev
1211792 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_database: dev
1211792 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=get_database: dev
1211804 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_database: dev
1211804 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=get_database: dev
1211807 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_database: dev
1211807 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=get_database: dev
1211821 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_table : db=dev tbl=ttt
1211821 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=get_table : db=dev tbl=ttt
1211823 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_database: dev
1211823 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=get_database: dev
1211826 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_database: dev
1211826 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=get_database: dev
1211857 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_database: dev
1211857 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=get_database: dev
1211859 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_table : db=dev tbl=ttt
1211859 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=get_table : db=dev tbl=ttt
1211881 [main] INFO org.apache.spark.sql.hive.HiveUtils - Initializing HiveMetastoreConnection version 2.3.7 using Spark classes.
1211881 [main] INFO org.apache.spark.sql.hive.client.HiveClientImpl - Warehouse location for Hive client (version 2.3.7) is /user/hive/warehouse
1211902 [main] INFO org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController - Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=1e7f3c52-32fc-48fc-990c-76c55678273d, clientType=HIVECLI]
1211903 [main] WARN org.apache.hadoop.hive.ql.session.SessionState - METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
1211903 [main] INFO hive.metastore - Mestastore configuration hive.metastore.filter.hook changed from org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
1211904 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: Cleaning up thread local RawStore...
1211904 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=Cleaning up thread local RawStore...
1211904 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: Done cleaning up thread local RawStore
1211905 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=Done cleaning up thread local RawStore
1211929 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: create_table: Table(tableName:ttt, dbName:dev, owner:root, createTime:1634883038, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:_hoodie_commit_time, type:string, comment:null), FieldSchema(name:_hoodie_commit_seqno, type:string, comment:null), FieldSchema(name:_hoodie_record_key, type:string, comment:null), FieldSchema(name:_hoodie_partition_path, type:string, comment:null), FieldSchema(name:_hoodie_file_name, type:string, comment:null), FieldSchema(name:id, type:bigint, comment:null), FieldSchema(name:name, type:string, comment:null)], location:hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt, inputFormat:org.apache.hudi.hadoop.HoodieParquetInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe, parameters:{path=hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt, serialization.format=1, type=cow, primaryKey=id}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{spark.sql.sources.schema.part.0={"type":"struct","fields":[{"name":"_hoodie_commit_time","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_commit_seqno","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_record_key","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_partition_path","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_file_name","type":"string","nullable":true,"metadata":{}},{"name":"id","type":"long","nullable":true,"metadata":{}},{"name":"name","type":"string","nullable":true,"metadata":{}}]}, spark.sql.sources.schema.numParts=1, spark.sql.sources.provider=hudi, spark.sql.create.version=3.0.3}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:{root=[PrivilegeGrantInfo(privilege:INSERT, createTime:-1, grantor:root, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:SELECT, createTime:-1, grantor:root, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:UPDATE, createTime:-1, grantor:root, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:DELETE, createTime:-1, grantor:root, grantorType:USER, grantOption:true)]}, groupPrivileges:null, rolePrivileges:null))
1211929 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=create_table: Table(tableName:ttt, dbName:dev, owner:root, createTime:1634883038, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:_hoodie_commit_time, type:string, comment:null), FieldSchema(name:_hoodie_commit_seqno, type:string, comment:null), FieldSchema(name:_hoodie_record_key, type:string, comment:null), FieldSchema(name:_hoodie_partition_path, type:string, comment:null), FieldSchema(name:_hoodie_file_name, type:string, comment:null), FieldSchema(name:id, type:bigint, comment:null), FieldSchema(name:name, type:string, comment:null)], location:hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt, inputFormat:org.apache.hudi.hadoop.HoodieParquetInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe, parameters:{path=hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt, serialization.format=1, type=cow, primaryKey=id}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{spark.sql.sources.schema.part.0={"type":"struct","fields":[{"name":"_hoodie_commit_time","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_commit_seqno","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_record_key","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_partition_path","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_file_name","type":"string","nullable":true,"metadata":{}},{"name":"id","type":"long","nullable":true,"metadata":{}},{"name":"name","type":"string","nullable":true,"metadata":{}}]}, spark.sql.sources.schema.numParts=1, spark.sql.sources.provider=hudi, spark.sql.create.version=3.0.3}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:{root=[PrivilegeGrantInfo(privilege:INSERT, createTime:-1, grantor:root, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:SELECT, createTime:-1, grantor:root, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:UPDATE, createTime:-1, grantor:root, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:DELETE, createTime:-1, grantor:root, grantorType:USER, grantOption:true)]}, groupPrivileges:null, rolePrivileges:null))
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.wm.default.pool.size does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.task.scheduler.preempt.independent does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.output.format.arrow does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.tez.llap.min.reducer.per.executor does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.arrow.root.allocator.limit does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.vectorized.use.checked.expressions does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.tez.dynamic.semijoin.reduction.for.mapjoin does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.vectorized.complex.types.enabled does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.wm.worker.threads does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.repl.partitions.dump.parallelism does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.uri.selection does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.strict.checks.no.partition.filter does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.tez.dynamic.semijoin.reduction.for.dpp.factor does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.stats.filter.in.min.ratio does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.client.cache.initial.capacity does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.stats.ndv.estimate.percent does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.webui.cors.allowed.methods does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.optimize.joinreducededuplication does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.client.cache.enabled does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.stats.fetch.bitvector does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.disable.unsafe.external.table.operations does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.materializedview.rewriting.incremental does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.materializedviews.registry.impl does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.event.db.notification.api.auth does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.exec.orc.delta.streaming.optimizations.enabled does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.stats.ndv.algo does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.spark.job.max.tasks does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.msck.repair.batch.max.retries does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.prewarm.spark.timeout does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.optimize.update.table.properties.from.serde.list does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.plugin.client.num.threads does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.test.bucketcodec.version does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.query.reexecution.enabled does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.materializedview.rewriting.time.window does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.query.reexecution.stats.cache.batch.size does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.webui.cors.allowed.headers does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.join.inner.residual does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.active.passive.ha.enable does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.io.trace.always.dump does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.query.reexecution.stats.persist.scope does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.mm.allow.originals does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.internal.ss.authz.settings.applied.marker does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.compactor.compact.insert.only does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.txn.xlock.iow does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.spark.rsc.conf.list does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.stats.jdbc.timeout does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.cache.defaultfs.only.native.fileid does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.spark.optimize.shuffle.serde does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.testing.remove.logs does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.distcp.privileged.doAs does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.strict.checks.orderby.no.limit does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.client.cache.expiry.time does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.io.allocator.defrag.headroom does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.notification.event.consumers does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.vectorized.input.format.supports.enabled does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.client.cache.max.capacity does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.repl.dumpdir.clean.freq does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.spark.use.ts.stats.for.mapjoin does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.repl.dump.include.acid.tables does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.webui.use.pam does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.query.reexecution.max.count does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.io.share.object.pools does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.optimize.update.table.properties.from.serde does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.service.metrics.codahale.reporter.classes does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.tez.session.events.print.summary does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.io.vrb.queue.limit.base does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.mm.avoid.s3.globstatus does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.repl.replica.functions.root.dir does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.query.results.cache.max.entry.lifetime does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.limit.connections.per.user does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.thrift.http.compression.enabled does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.vectorized.execution.ptf.enabled does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.optimize.shared.work.extended does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.vectorized.row.identifier.enabled does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.query.reexecution.always.collect.operator.stats does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.repl.dumpdir.ttl does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.local.time.zone does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.tez.wm.am.registry.timeout does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.active.passive.ha.registry.namespace does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.create.as.insert.only does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.mapjoin.memory.oversubscribe.factor does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.arrow.batch.size does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.notification.sequence.lock.retry.sleep.interval does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.repl.approx.max.load.tasks does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.query.results.cache.enabled does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.legacy.schema.for.all.serdes does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.tez.dag.status.check.interval does not exist
1211954 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.druid.bitmap.type does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.spark.dynamic.partition.pruning.map.join.only does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.memory.oversubscription.max.executors.per.query does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.io.trace.size does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.plugin.rpc.num.handlers does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.wm.allow.any.pool.via.jdbc does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.vectorized.groupby.complex.types.enabled does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.avro.timestamp.skip.conversion does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.query.results.cache.nontransactional.tables.enabled does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.stats.correlated.multi.key.joins does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.db.type does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.streaming.auto.flush.check.interval.size does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.zookeeper.connection.timeout does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.query.reexecution.strategies does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.limit.connections.per.user.ipaddress does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.mapjoin.memory.monitor.check.interval does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.optimize.shared.work does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.stats.estimate does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.io.allocator.discard.method does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.tez.cartesian-product.enabled does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.notification.sequence.lock.max.retries does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.heap.memory.monitor.usage.threshold does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.privilege.synchronizer.interval does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.vectorized.adaptor.suppress.evaluate.exceptions does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.materializedview.rebuild.incremental does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.query.results.cache.max.entry.size does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.spark.stage.max.tasks does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.testing.short.logs does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.streaming.auto.flush.enabled does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.spark.explain.user does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.operation.log.cleanup.delay does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.repl.dump.metadata.only does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.optimize.countdistinct does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.auto.convert.join.shuffle.max.size does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.plugin.acl does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.schema.info.class does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.tez.queue.access.check does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.external.splits.temp.table.storage.format does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.io.row.wrapper.enabled does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.constraint.notnull.enforce does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.cli.print.escape.crlf does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.trigger.validation.interval does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.webui.cors.allowed.origins does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.limit.connections.per.ipaddress does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.external.splits.order.by.force.single.split does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.client.cache.stats.enabled does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.notification.event.poll.interval does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.transactional.concatenate.noblock does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.materializedview.rewriting.strategy does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.vectorized.if.expr.mode does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.exim.test.mode does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.query.results.cache.directory does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.query.results.cache.wait.for.pending.results does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.remove.orderby.in.subquery does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.tez.bmj.use.subcache does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.io.vrb.queue.limit.min does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.wm.pool.metrics does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.repl.add.raw.reserved.namespace does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.resource.use.hdfs.location does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.stats.num.nulls.estimate.percent does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.io.acid does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.zk.sm.session.timeout does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.vectorized.ptf.max.memory.buffering.batch.count does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.task.scheduler.am.registry does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.druid.overlord.address.default does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.optimize.remove.sq_count_check does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.webui.enable.cors does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.stats.retries.wait does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.vectorized.row.serde.inputformat.excludes does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.query.reexecution.stats.cache.size does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.combine.equivalent.work.optimization does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.lock.query.string.max.length does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.llap.io.track.cache.usage does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.use.orc.codec.pool does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.query.results.cache.max.size does not exist
1211955 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.repl.bootstrap.dump.open.txn.timeout does not exist
1211955 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
1211956 [main] INFO org.apache.hadoop.hive.metastore.ObjectStore - ObjectStore, initialize called
1211961 [main] INFO org.apache.hadoop.hive.metastore.MetaStoreDirectSql - Using direct SQL, underlying DB is MYSQL
1211961 [main] INFO org.apache.hadoop.hive.metastore.ObjectStore - Initialized ObjectStore
1211964 [main] WARN org.apache.hadoop.hive.metastore.HiveMetaStore - Location: hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt specified for non-external table:ttt
1211965 [main] INFO org.apache.hadoop.hive.common.FileUtils - Creating directory if it doesn't exist: hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt
1212097 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_database: dev
1212097 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=get_database: dev
1212099 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_database: dev
1212099 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=get_database: dev
1212103 [main] INFO org.apache.spark.sql.hudi.command.CreateHoodieTableCommand - Init hoodie.properties for ttt
1212116 [main] INFO org.apache.hudi.common.table.HoodieTableMetaClient - Initializing hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt as hoodie table hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt
1212271 [main] INFO org.apache.hudi.common.table.HoodieTableMetaClient - Loading HoodieTableMetaClient from hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt
1212272 [main] INFO org.apache.hudi.common.table.HoodieTableConfig - Loading table properties from hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt/.hoodie/hoodie.properties
1212275 [main] INFO org.apache.hudi.common.table.HoodieTableMetaClient - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt
1212275 [main] INFO org.apache.hudi.common.table.HoodieTableMetaClient - Finished initializing Table of type COPY_ON_WRITE from hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt
Time taken: 0.499 seconds
1212281 [main] INFO org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver - Time taken: 0.499 seconds
spark-sql>
>
> create table if not exists ttt (id bigint, name string) using hudi options (type='cow', primaryKey='id');
1272842 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_database: dev
1272842 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=get_database: dev
1272845 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_database: dev
1272845 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=root ip=unknown-ip-addr cmd=get_database: dev
1272850 [main] INFO org.apache.hudi.common.table.HoodieTableMetaClient - Loading HoodieTableMetaClient from hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt
1272852 [main] INFO org.apache.hudi.common.table.HoodieTableConfig - Loading table properties from hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt/.hoodie/hoodie.properties
1272855 [main] INFO org.apache.hudi.common.table.HoodieTableMetaClient - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt
1272856 [main] INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline - Loaded instants upto : Optional.empty
Error in query: Specified schema in create table statement is not equal to the table schema.You should not specify the schema for an exist table: `dev`.`ttt` ;
spark-sql>
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] BenjMaq commented on issue #3845: [SUPPORT]`if not exists` doesn't work on create table in spark-sql
Posted by GitBox <gi...@apache.org>.
BenjMaq commented on issue #3845:
URL: https://github.com/apache/hudi/issues/3845#issuecomment-949640297
I am facing the same issue
[GitHub] [hudi] xushiyan commented on issue #3845: [SUPPORT]`if not exists` doesn't work on create table in spark-sql
Posted by GitBox <gi...@apache.org>.
xushiyan commented on issue #3845:
URL: https://github.com/apache/hudi/issues/3845#issuecomment-950442738
@mutoulbj @BenjMaq Thanks for raising this! It does make sense to print a message indicating that the table exists instead of erroring. I'm filing a JIRA; please feel free to take it if you're interested!
https://issues.apache.org/jira/browse/HUDI-2611
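For illustration only, here is a minimal Python sketch of the behavior requested above. It is not Hudi's implementation: the dict-based `catalog`, the helper names, and the returned messages are all hypothetical. It assumes, as the reporter suspects, that the stored schema differs from the declared one only by Hudi's metadata columns, so the comparison drops them first, and `if not exists` returns a message rather than raising.

```python
# Metadata columns that Hudi adds to every table's stored schema.
HOODIE_META_COLUMNS = (
    "_hoodie_commit_time",
    "_hoodie_commit_seqno",
    "_hoodie_record_key",
    "_hoodie_partition_path",
    "_hoodie_file_name",
)


def strip_meta_columns(schema):
    """Return the schema without Hudi metadata columns (hypothetical helper)."""
    return [(name, dtype) for name, dtype in schema if name not in HOODIE_META_COLUMNS]


def schemas_match(stored_schema, declared_schema):
    """Compare schemas while ignoring Hudi metadata columns (hypothetical helper)."""
    return strip_meta_columns(stored_schema) == strip_meta_columns(declared_schema)


def create_table_if_not_exists(catalog, name, declared_schema):
    """Toy stand-in for `create table if not exists`.

    `catalog` is a plain dict used only for this sketch, not a real metastore.
    An existing table yields a message instead of an exception.
    """
    if name in catalog:
        return f"Table {name} already exists, skipping creation"
    # Mimic the stored schema: metadata columns first, then the user's columns.
    meta = [(col, "string") for col in HOODIE_META_COLUMNS]
    catalog[name] = meta + list(declared_schema)
    return f"Created table {name}"


catalog = {}
declared = [("id", "bigint"), ("name", "string")]
first = create_table_if_not_exists(catalog, "ttt", declared)
second = create_table_if_not_exists(catalog, "ttt", declared)
print(first)
print(second)
```

In this sketch the second call returns a message and leaves the catalog untouched, which mirrors the expected behavior described in the issue.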
[GitHub] [hudi] xushiyan closed issue #3845: [SUPPORT]`if not exists` doesn't work on create table in spark-sql
Posted by GitBox <gi...@apache.org>.
xushiyan closed issue #3845:
URL: https://github.com/apache/hudi/issues/3845