Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/01/28 09:44:00 UTC
[GitHub] [iceberg] dilipbiswal commented on issue #2175: MERGE INTO statement cannot run with Spark SQL

dilipbiswal commented on issue #2175:
URL: https://github.com/apache/iceberg/issues/2175#issuecomment-768930127


   Hello,
   
   Here is the grammar for the matched clause and the matched action.
   
   matchedClause
       : WHEN MATCHED (AND matchedCond=booleanExpression)? THEN matchedAction
       ;
   
   matchedAction
       : DELETE
       | UPDATE SET ASTERISK
       | UPDATE SET assignmentList
       ;
   
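   It looks like you are missing the "UPDATE" keyword. Per the grammar, the
   matched action must start with DELETE or UPDATE SET, so "then set ..."
   should be "then update set ...".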
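   
   For reference, here is a sketch of the statement from your report with the
   keyword added. It assumes the same spark-shell session (where `spark` is
   predefined) and the tables local.db.table1 and local.db.table1_update from
   the report. The name mergeSql is only illustrative, and I have not run this
   against your tables.

   ```scala
   // Assumes the spark-shell session from the report: `spark` is predefined,
   // and the Iceberg extensions and the `local` catalog are configured.
   val mergeSql =
     """merge into local.db.table1 t
       |using (select * from local.db.table1_update) s
       |on t.data = s.data
       |when matched then update set t.id = t.id + s.id
       |when not matched then insert *
       |""".stripMargin

   // "update set" matches the UPDATE SET assignmentList alternative above,
   // so the parser should no longer stop at 'set'.
   spark.sql(mergeSql)
   ```

   If this parses, the extension is loaded and the original error was only
   the missing keyword.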
   
   Regards
   -- Dilip
   On Thu, Jan 28, 2021 at 1:28 AM wjxiz <no...@github.com> wrote:
   
   > I'm trying to test the merge into statement in Spark SQL with Iceberg
   > extension:
   >
   > $SPARK_HOME/bin/spark-shell --master $MASTER \
   > --driver-memory ${DRIVE_MEMORY}G \
   > --executor-memory ${EXECUTOR_MEMORY}G \
   > --executor-cores $EXECUTOR_CORES \
   > --num-executors $NUM_EXECUTOR \
   > --conf spark.sql.catalogImplementation=hive \
   > --conf spark.task.cpus=1 \
   > --conf spark.locality.wait=0 \
   > --conf spark.yarn.maxAppAttempts=1 \
   > --conf spark.sql.shuffle.partitions=24 \
   > --conf spark.sql.files.maxPartitionBytes=128m \
   > --conf spark.sql.warehouse.dir=$OUT \
   > --conf spark.task.resource.gpu.amount=0.08 \
   > --conf spark.executor.resource.gpu.amount=1 \
   > --packages org.apache.iceberg:iceberg-spark3-runtime:0.11.0 \
   > --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
   > --conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \
   > --conf spark.sql.catalog.spark_catalog.type=hive \
   > --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
   > --conf spark.sql.catalog.local.type=hadoop \
   > --conf spark.sql.catalog.local.warehouse=$PWD/warehouse
   >
   >
   > Here's my test:
   >
   > scala> spark.sql("""
   >      | merge into local.db.table1 t
   >      | using ( select * from local.db.table1_update) s
   >      | on t.data = s.data
   >      | when matched then set t.id = t.id + s.id
   >      | when not matched then insert *
   >      | """)
   > org.apache.spark.sql.catalyst.parser.ParseException:
   > mismatched input 'set' expecting {'DELETE', 'UPDATE'}(line 5, pos 18)
   >
   > == SQL ==
   >
   > merge into local.db.table1 t
   > using ( select * from local.db.table1_update) s
   > on t.data = s.data
   > when matched then set t.id = t.id + s.id
   > ------------------^^^
   > when not matched then insert *
   >
   >   at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:266)
   >   at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:133)
   >   at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
   >   at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:81)
   >   at org.apache.spark.sql.catalyst.parser.extensions.IcebergSparkSqlExtensionsParser.parsePlan(IcebergSparkSqlExtensionsParser.scala:100)
   >   at org.apache.spark.sql.SparkSession.$anonfun$sql$2(SparkSession.scala:605)
   >   at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
   >   at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:605)
   >   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
   >   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:602)
   >   ... 53 elided
   >
   >
   > for the environment:
   >
   > scala> spark.sparkContext.listJars
   > res0: Seq[String] = Vector(spark://10.19.183.124:38945/jars/org.apache.iceberg_iceberg-spark3-runtime-0.11.0.jar)
   >
   > not 100% sure if the extension has been loaded correctly because it seems
   > to be a parser issue that it doesn't recognize "set".
   >
   

