Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/01/19 20:42:25 UTC

[GitHub] [hudi] LucassLin opened a new issue #4642: [SUPPORT] Hudi Merge Into

LucassLin opened a new issue #4642:
URL: https://github.com/apache/hudi/issues/4642


   Hi Team,
   
   I followed https://hudi.apache.org/docs/quick-start-guide#mergeinto to do a partial update on a table, but I am getting the following error:
   
   22/01/18 23:22:40 ERROR org.apache.spark.deploy.yarn.ApplicationMaster: User class threw exception: java.lang.AssertionError: assertion failed: No plan for MergeIntoTable (((col1#4217 = col1#258) && (col2#4230 = col2#305)) && ((col3#4241 = col3#316) && (col4#4287 = col4#1441))), [updateaction(None, assignment(col5#4242, col5#317), assignment(col6#4244, col6#319))], [insertaction(None)]
   
   Hudi version 0.10.0
   Spark version 2.4.7
   Following is the code I have:
   
   val historicalDF = spark.read.format("org.apache.hudi").load(basePath)
   historicalDF.createOrReplaceTempView("historical_data")
   incrementalDF.createOrReplaceTempView("incremental_data")
   val sqlPartialUpdate =
         s"""
          | merge into historical_data as target
          | using (
          |   select * from incremental_data
          | ) source
          | on  target.col1 = source.col1
          | and target.col2 = source.col2
          | and target.col3 = source.col3
          | and target.col4 = source.col4
          | when matched then
          |   update set target.col5 = source.col5, target.col6 = source.col6
          | when not matched then insert *
          """.stripMargin
   spark.sql(sqlPartialUpdate)
   I would really appreciate it if anyone could help with this issue, or point me in the right direction in case I've missed anything.
   Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] dongkelun edited a comment on issue #4642: [SUPPORT] Hudi Merge Into

Posted by GitBox <gi...@apache.org>.
dongkelun edited a comment on issue #4642:
URL: https://github.com/apache/hudi/issues/4642#issuecomment-1017039375


   Instead of using a temporary table, try changing the target table to an entity table. The current version does not appear to support a temporary table as the target. And the target table must be a Hudi table.





[GitHub] [hudi] LucassLin commented on issue #4642: [SUPPORT] Hudi Merge Into

Posted by GitBox <gi...@apache.org>.
LucassLin commented on issue #4642:
URL: https://github.com/apache/hudi/issues/4642#issuecomment-1017119391


   > ```scala
   > historicalDF.write.format("hudi").saveAsTable("tableName")
   > ```
   > 
   > Sorry, I didn't see this one just now. This works from Hudi version 0.9.0 onward, since Hudi Spark SQL is available there.
   > 
   > You can also configure Hive sync, but earlier versions have bugs; see PR [3745](https://github.com/apache/hudi/pull/3745) for details.
   > 
   > Of course, you can also create a Hive table yourself. As long as its properties are fully consistent, this is essentially the same as Hive sync.
   
   Thanks for the replies. I tried using the Hudi create table SQL command but got:
   ```
   Exception = MetaException(message:Got exception: java.io.IOException Error accessing gs://*
   ```
   I also tried saveAsTable, but there seems to be an issue with the Hive config, which causes:
   ```
   Exception = Invalid host name: local host is:
   ```
   I will try to resolve these once I get back to work and see if the entity table would solve the mergeInto issue. Thanks again for your help.





[GitHub] [hudi] dongkelun edited a comment on issue #4642: [SUPPORT] Hudi Merge Into

Posted by GitBox <gi...@apache.org>.
dongkelun edited a comment on issue #4642:
URL: https://github.com/apache/hudi/issues/4642#issuecomment-1017039375


   Instead of using a temporary table, try changing the target table to an entity table. The current version does not appear to support a temporary table as the target.





[GitHub] [hudi] LucassLin commented on issue #4642: [SUPPORT] Hudi Merge Into

Posted by GitBox <gi...@apache.org>.
LucassLin commented on issue #4642:
URL: https://github.com/apache/hudi/issues/4642#issuecomment-1030922663


   > @LucassLin : Do you have any updates on your end?
   
   After fixing the access issue, https://hudi.apache.org/docs/quick-start-guide/#mergeinto works well for us.
   Thanks team!





[GitHub] [hudi] nsivabalan closed issue #4642: [SUPPORT] Hudi Merge Into

Posted by GitBox <gi...@apache.org>.
nsivabalan closed issue #4642:
URL: https://github.com/apache/hudi/issues/4642


   





[GitHub] [hudi] nsivabalan commented on issue #4642: [SUPPORT] Hudi Merge Into

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4642:
URL: https://github.com/apache/hudi/issues/4642#issuecomment-1017018173


   @YannByron @dongkelun : Can you folks assist here, please?





[GitHub] [hudi] dongkelun commented on issue #4642: [SUPPORT] Hudi Merge Into

Posted by GitBox <gi...@apache.org>.
dongkelun commented on issue #4642:
URL: https://github.com/apache/hudi/issues/4642#issuecomment-1017108838


   ```scala
   historicalDF.write.format("hudi").saveAsTable("tableName")
   ```
   Sorry, I didn't see this one just now. This works from Hudi version 0.9.0 onward, since Hudi Spark SQL is available there.
   
   You can also configure Hive sync, but earlier versions have bugs; see PR [3745](https://github.com/apache/hudi/pull/3745) for details.
   
   Of course, you can also create a Hive table yourself. As long as its properties are fully consistent, this is essentially the same as Hive sync.





[GitHub] [hudi] dongkelun edited a comment on issue #4642: [SUPPORT] Hudi Merge Into

Posted by GitBox <gi...@apache.org>.
dongkelun edited a comment on issue #4642:
URL: https://github.com/apache/hudi/issues/4642#issuecomment-1017091320


   @LucassLin 
   From the official documentation, [https://hudi.apache.org/docs/quick-start-guide/](https://hudi.apache.org/docs/quick-start-guide/), under Create Table, SparkSQL:
   ```sql
   -- create a mor non-partitioned table with preCombineField provided
   create table hudi_mor_tbl (
     id int,
     name string,
     price double,
     ts bigint
   ) using hudi
   tblproperties (
     type = 'mor',
     primaryKey = 'id',
     preCombineField = 'ts'
   );
   ```
   The documentation is slightly wrong here: `type = 'cow'` should be `type = 'mor'`. This parameter only controls the table type.
   





[GitHub] [hudi] dongkelun commented on issue #4642: [SUPPORT] Hudi Merge Into

Posted by GitBox <gi...@apache.org>.
dongkelun commented on issue #4642:
URL: https://github.com/apache/hudi/issues/4642#issuecomment-1017039375


   Try using an entity table as the target instead of a temporary table. At present, the target table cannot be a temporary table.





[GitHub] [hudi] nsivabalan commented on issue #4642: [SUPPORT] Hudi Merge Into

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4642:
URL: https://github.com/apache/hudi/issues/4642#issuecomment-1030879448


   @LucassLin : Do you have any updates on your end?





[GitHub] [hudi] LucassLin commented on issue #4642: [SUPPORT] Hudi Merge Into

Posted by GitBox <gi...@apache.org>.
LucassLin commented on issue #4642:
URL: https://github.com/apache/hudi/issues/4642#issuecomment-1017089251


   > Instead of using a temporary table, try changing the target table to an entity table. The current version does not appear to support a temporary table as the target. And the target table must be a Hudi table.
   
   Thanks for the reply. By entity table, do you mean something like
   ```
   historicalDF.write.saveAsTable("tableName")
   ```
   Can you also elaborate on "And the target table must be a Hudi table"? How do I ensure the table I write is a Hudi table? Is there a specific API for creating this entity table as a Hudi table?
   
   Thanks!





[GitHub] [hudi] dongkelun commented on issue #4642: [SUPPORT] Hudi Merge Into

Posted by GitBox <gi...@apache.org>.
dongkelun commented on issue #4642:
URL: https://github.com/apache/hudi/issues/4642#issuecomment-1017091320


   @LucassLin 
   From the official documentation, [https://hudi.apache.org/docs/quick-start-guide/](https://hudi.apache.org/docs/quick-start-guide/), under Create Table, SparkSQL:
   ```sql
   -- create a mor non-partitioned table without preCombineField provided
   create table hudi_mor_tbl (
     id int,
     name string,
     price double,
     ts bigint
   ) using hudi
   tblproperties (
     type = 'cow',
     primaryKey = 'id',
     preCombineField = 'ts'
   );
   ```





[GitHub] [hudi] nsivabalan commented on issue #4642: [SUPPORT] Hudi Merge Into

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4642:
URL: https://github.com/apache/hudi/issues/4642#issuecomment-1032005281


   Thanks @dongkelun for the assistance.
   Thanks for updating us, @LucassLin.

