Posted to commits@hudi.apache.org by "qingyuan18 (via GitHub)" <gi...@apache.org> on 2023/04/05 04:37:35 UTC

[GitHub] [hudi] qingyuan18 opened a new issue, #8382: [SUPPORT] hudi 0.12 spark batch ingestion throw out archive format validation error

qingyuan18 opened a new issue, #8382:
URL: https://github.com/apache/hudi/issues/8382

   **Describe the problem you faced**
   With Hudi 0.12 and Spark 3.2, writeToHudiByNoPartition throws the following exception:
   ![image](https://user-images.githubusercontent.com/35717759/229981673-5ee72ffd-e9cf-43d3-b073-924cc7b7226e.png)
   
   It seems the job does not recognize Hudi's archive metadata Avro format.
   
    **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Read the data source into a Spark DataFrame.
   2. Configure the Hudi write parameters as follows:
   ![image](https://user-images.githubusercontent.com/35717759/229981894-7b2bbd4d-0710-462f-a679-cf9774c6bc45.png)
   
   3. Run the Spark app with the writeHudi function:
   writeToHudiByPartition(
         df2,
         sinkTable,
         sink_alliances_table_key,
         sink_alliances_distinct_field,
         "date_part",
         hiveDB,
         save_path)
   
   
   4. After 30 commits, which triggers the archive process, the write throws the exception shown above.
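
Editorial sketch (not from the original report): the "30 commits" threshold in step 4 matches Hudi's default for `hoodie.keep.max.commits`. The option map below lists the archival-related write configs with what are, to my understanding, their 0.12.x defaults. The helper `archival_would_trigger` is a hypothetical, simplified model of when archival starts, not Hudi's own code, and the reporter's actual settings are only visible in the screenshot above.

```python
# Archival-related Hudi write options. The keys are real Hudi config names;
# the values are assumed to be the 0.12.x defaults, not the reporter's settings.
hudi_archival_options = {
    "hoodie.keep.min.commits": "20",          # instants left in the active timeline after archival
    "hoodie.keep.max.commits": "30",          # archival starts once the active timeline exceeds this
    "hoodie.cleaner.commits.retained": "10",  # commits retained by the cleaner
}


def archival_would_trigger(active_commits: int, options: dict) -> bool:
    """Hypothetical simplified model: archival starts once the number of
    active commits exceeds hoodie.keep.max.commits."""
    return active_commits > int(options["hoodie.keep.max.commits"])


if __name__ == "__main__":
    for n in (10, 30, 31):
        print(n, archival_would_trigger(n, hudi_archival_options))
```

Under this model a freshly created table would not touch the archive path until the active timeline grows past the configured maximum, which would be consistent with the error only appearing after roughly 30 commits.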
   
   **Environment Description**
   
   * Hudi version : 0.12.0
   
   * Spark version : 3.2.1
   
   * Hive version : 3.2.1
   
   * Hadoop version :
   
   * Storage (HDFS/S3/GCS..) : s3
   
   * Running on Docker? (yes/no) : no
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on issue #8382: [SUPPORT] hudi 0.12 spark batch ingestion throw out archive format validation error

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8382:
URL: https://github.com/apache/hudi/issues/8382#issuecomment-1496962270

   Did you also clean the .hoodie/archive folder?




[GitHub] [hudi] danny0405 commented on issue #8382: [SUPPORT] hudi 0.12 spark batch ingestion throw out archive format validation error

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8382:
URL: https://github.com/apache/hudi/issues/8382#issuecomment-1498447334

   It looks like a version compatibility issue: in older Hudi versions, the archived entry does not have the `operationType` field.
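
Editorial sketch (not part of the original comment): the compatibility problem described here follows the general Avro schema-resolution rule that a field present in the reader schema but absent from the writer schema must carry a default, otherwise reading fails. The pure-Python check below illustrates that rule only. The schema names and the `commitTime` field are hypothetical stand-ins, not Hudi's actual archived-entry schema, and `missing_fields_without_default` is an illustrative helper, not an Avro or Hudi API.

```python
def missing_fields_without_default(writer_schema: dict, reader_schema: dict) -> list:
    """Return reader fields that the writer schema lacks and that have no default.
    Per Avro schema resolution, any such field makes old data unreadable."""
    writer_names = {f["name"] for f in writer_schema["fields"]}
    return [
        f["name"]
        for f in reader_schema["fields"]
        if f["name"] not in writer_names and "default" not in f
    ]


# Hypothetical schema of an entry written by an older version: no operationType.
old_entry = {
    "type": "record",
    "name": "ArchivedEntry",
    "fields": [{"name": "commitTime", "type": "string"}],
}

# Hypothetical newer reader schema that requires operationType (no default).
new_entry_strict = {
    "type": "record",
    "name": "ArchivedEntry",
    "fields": [
        {"name": "commitTime", "type": "string"},
        {"name": "operationType", "type": "string"},
    ],
}

# Same reader schema, but operationType is nullable with a default.
new_entry_lenient = {
    "type": "record",
    "name": "ArchivedEntry",
    "fields": [
        {"name": "commitTime", "type": "string"},
        {"name": "operationType", "type": ["null", "string"], "default": None},
    ],
}

if __name__ == "__main__":
    print(missing_fields_without_default(old_entry, new_entry_strict))
    print(missing_fields_without_default(old_entry, new_entry_lenient))
```

The strict reader schema reports `operationType` as unresolvable against the old entry, while the lenient one reports nothing, which is why mixing archive files or jars from different versions can surface as a format validation error.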




[GitHub] [hudi] danny0405 commented on issue #8382: [SUPPORT] hudi 0.12 spark batch ingestion throw out archive format validation error

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8382:
URL: https://github.com/apache/hudi/issues/8382#issuecomment-1496952221

   Did you ever try to write to a legacy table? It seems to be a version compatibility issue.




[GitHub] [hudi] qingyuan18 commented on issue #8382: [SUPPORT] hudi 0.12 spark batch ingestion throw out archive format validation error

Posted by "qingyuan18 (via GitHub)" <gi...@apache.org>.
qingyuan18 commented on issue #8382:
URL: https://github.com/apache/hudi/issues/8382#issuecomment-1497007079

   yes, indeed
   
   
   




[GitHub] [hudi] ad1happy2go commented on issue #8382: [SUPPORT] hudi 0.12 spark batch ingestion throw out archive format validation error

Posted by "ad1happy2go (via GitHub)" <gi...@apache.org>.
ad1happy2go commented on issue #8382:
URL: https://github.com/apache/hudi/issues/8382#issuecomment-1569712600

   @qingyuan18 Were you able to resolve this issue? If so, could you please share the resolution?




[GitHub] [hudi] qingyuan18 commented on issue #8382: [SUPPORT] hudi 0.12 spark batch ingestion throw out archive format validation error

Posted by "qingyuan18 (via GitHub)" <gi...@apache.org>.
qingyuan18 commented on issue #8382:
URL: https://github.com/apache/hudi/issues/8382#issuecomment-1496957557

   No, I have cleaned up the table data and re-run the job.
   The error still reproduces.
   
   
   

