Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/02/04 07:11:37 UTC

[GitHub] [hudi] Gatsby-Lee opened a new issue #4748: [SUPPORT] Question: Does "SINGLE_WRITER" also utilize if hoodie.write.lock.* is provided?

Gatsby-Lee opened a new issue #4748:
URL: https://github.com/apache/hudi/issues/4748


   Hi, 
   
   I am posting my question here because I don't know where else this type of question belongs.
   I searched online for an answer but couldn't find one.
   Please let me know where I should post this type of question next time.
   
   My question is about Hudi writes.
   
   **Context**
   
   - Hudi version: 0.9 ( running in AWS Glue 3 )
   - using Hudi Metadata table
   ```
   "hoodie.metadata.enable": "true"
   ```
   - using SINGLE_WRITER
   ```
   "hoodie.cleaner.policy.failed.writes": "EAGER"
   "hoodie.write.concurrency.mode": "SINGLE_WRITER"
   ```
   - using zookeeper as lock provider
   ```
   "hoodie.write.lock.provider": "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider",
   "hoodie.write.lock.zookeeper.port": port,
   "hoodie.write.lock.zookeeper.url": zookeeper_url,
   "hoodie.write.lock.zookeeper.base_path": "/hudi_write_lock",
   "hoodie.write.lock.zookeeper.lock_key": table_name,
   ```
   
   I set up ZooKeeper as the lock provider so that I can use the Hudi Metadata table with async Hudi table services.
   https://hudi.apache.org/docs/metadata/
   
   **Question:**
   Even if hoodie.write.concurrency.mode=SINGLE_WRITER, do the Hudi table services (cleaner, compaction, clustering) and the Hudi writer use the provided ZooKeeper lock?
   
   I keep seeing a FileNotFoundException when Hudi tries to access files listed in the Hudi Metadata table.
   Therefore, I am investigating what causes the Hudi Metadata table to become outdated.
   
   Thank you
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] nsivabalan commented on issue #4748: [SUPPORT] Question: Does "SINGLE_WRITER" also utilize if hoodie.write.lock.* is provided?

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4748:
URL: https://github.com/apache/hudi/issues/4748#issuecomment-1030873277


   You have to set hoodie.write.concurrency.mode=optimistic_concurrency_control. More info can be found [here](https://hudi.apache.org/docs/0.8.0/concurrency_control/).
   
   If you set it to single writer, lock providers are not effective.
   
   So maybe it's an issue with the metadata table. Without the metadata table enabled, does your pipeline run without any issues?
   Also, hudi-cli in 0.10.1 or latest master has a command, "metadata validate-files", that you can try out. It may report file mismatches between the file-system-based listing and the metadata-based listing.
   If you wish to rebuild the metadata table, you can delete it from hudi-cli, and your next write will bootstrap it again.
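
The advice above can be sketched in Python, the language of the Glue job config quoted in the issue. This is a minimal sketch, not the project's own tooling: the helper names `build_occ_options` and `lock_settings_ignored` are hypothetical, the key names are the ones quoted in this thread, and the `LAZY` failed-writes policy is an assumption taken from the Hudi concurrency-control docs rather than from this thread.

```python
# Sketch only: helper names are hypothetical; config keys are the ones quoted
# in this thread. "LAZY" is assumed from the Hudi concurrency-control docs.

LOCK_PREFIX = "hoodie.write.lock."
MODE_KEY = "hoodie.write.concurrency.mode"


def build_occ_options(table_name, zookeeper_url, port):
    """Return writer options under which the ZooKeeper lock provider is used."""
    return {
        "hoodie.metadata.enable": "true",
        # Lock providers only take effect in this mode, per the reply above.
        MODE_KEY: "optimistic_concurrency_control",
        "hoodie.cleaner.policy.failed.writes": "LAZY",
        "hoodie.write.lock.provider": (
            "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider"
        ),
        "hoodie.write.lock.zookeeper.url": zookeeper_url,
        "hoodie.write.lock.zookeeper.port": str(port),
        "hoodie.write.lock.zookeeper.base_path": "/hudi_write_lock",
        "hoodie.write.lock.zookeeper.lock_key": table_name,
    }


def lock_settings_ignored(options):
    """True when lock keys are present but the mode is single writer."""
    has_lock_keys = any(key.startswith(LOCK_PREFIX) for key in options)
    # Assumption: a missing mode key means Hudi's default, single writer.
    mode = options.get(MODE_KEY, "single_writer")
    return has_lock_keys and mode.lower() == "single_writer"
```

In a PySpark job the resulting dict would typically be passed to the writer with `df.write.format("hudi").options(**opts)`. Running `lock_settings_ignored` on the config from the original post flags the SINGLE_WRITER plus lock-provider combination described in the question.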
   





[GitHub] [hudi] nsivabalan commented on issue #4748: [SUPPORT] Question: Does "SINGLE_WRITER" also utilize if hoodie.write.lock.* is provided?

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4748:
URL: https://github.com/apache/hudi/issues/4748#issuecomment-1039524963


   Q1, Q2: @prashantwason, can you help answer these?
   Q3, Q4: yes.
    





[GitHub] [hudi] nsivabalan commented on issue #4748: [SUPPORT] Question: Does "SINGLE_WRITER" also utilize if hoodie.write.lock.* is provided?

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4748:
URL: https://github.com/apache/hudi/issues/4748#issuecomment-1030873828


   CC @yihua @manojpec @codope 





[GitHub] [hudi] nsivabalan commented on issue #4748: [SUPPORT] Question: Does "SINGLE_WRITER" also utilize if hoodie.write.lock.* is provided?

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4748:
URL: https://github.com/apache/hudi/issues/4748#issuecomment-1030873438


   We recently put up a patch that adds a validation tool for the metadata table. You can give this one a try as well.
   https://github.com/apache/hudi/pull/4721
   





[GitHub] [hudi] Gatsby-Lee commented on issue #4748: [SUPPORT] Question: Does "SINGLE_WRITER" also utilize if hoodie.write.lock.* is provided?

Posted by GitBox <gi...@apache.org>.
Gatsby-Lee commented on issue #4748:
URL: https://github.com/apache/hudi/issues/4748#issuecomment-1032019905


   @nsivabalan 
   
   Again, thank you for your reply.
   
   First, you're right.
   I can't find **hoodie.metadata.validate** in 0.10.0. (It's still available in 0.9.)
   Thank you for making it clear.
   
   About the metadata table (since the current AWS Marketplace provides Hudi 0.9, let me ask questions about both 0.9 and 0.10):
   Q1. On Hudi 0.9, to use async Hudi table services, does the lock provider have to be configured?
   Q2. On Hudi 0.9, to use async Hudi table services, does "hoodie.write.concurrency.mode" have to be OPTIMISTIC_CONCURRENCY_CONTROL?
   Q3. On Hudi 0.10, to use async Hudi table services, does the lock provider have to be configured?
   (I think this is YES.)
   Q4. On Hudi 0.10, to use async Hudi table services, does "hoodie.write.concurrency.mode" have to be OPTIMISTIC_CONCURRENCY_CONTROL?
   
   Thank you for your help!!
   Gatsby





[GitHub] [hudi] Gatsby-Lee commented on issue #4748: [SUPPORT] Question: Does "SINGLE_WRITER" also utilize if hoodie.write.lock.* is provided?

Posted by GitBox <gi...@apache.org>.
Gatsby-Lee commented on issue #4748:
URL: https://github.com/apache/hudi/issues/4748#issuecomment-1031121393


   @nsivabalan 
   
   Thank you very much for your reply.
   
   I started asking about this because of the following statement in the Hudi documentation.
   
   Reference: https://hudi.apache.org/docs/0.10.0/metadata
   """
   If your current deployment model is single writer along with async table services (such as cleaning, clustering, compaction) configured, then it is a must to have 'lock providers configured' before turning on the metadata table.
   """
   Q1. I assume this statement is accurate as of Hudi 0.10.0?
   
   As for metadata validation, since I use AWS Glue, I have no way to run the Hudi CLI.
   Q2. Can "hoodie.metadata.validate=true" be used as an alternative?
   
   Thank you
   Gatsby





[GitHub] [hudi] prashantwason commented on issue #4748: [SUPPORT] Question: Does "SINGLE_WRITER" also utilize if hoodie.write.lock.* is provided?

Posted by GitBox <gi...@apache.org>.
prashantwason commented on issue #4748:
URL: https://github.com/apache/hudi/issues/4748#issuecomment-1039671459


   Q1, Q2: YES
   
   





[GitHub] [hudi] Gatsby-Lee commented on issue #4748: [SUPPORT] Question: Does "SINGLE_WRITER" also utilize if hoodie.write.lock.* is provided?

Posted by GitBox <gi...@apache.org>.
Gatsby-Lee commented on issue #4748:
URL: https://github.com/apache/hudi/issues/4748#issuecomment-1040017002


   @nsivabalan @prashantwason 
   Thank you very much for all your answers. 





[GitHub] [hudi] nsivabalan commented on issue #4748: [SUPPORT] Question: Does "SINGLE_WRITER" also utilize if hoodie.write.lock.* is provided?

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4748:
URL: https://github.com/apache/hudi/issues/4748#issuecomment-1032000450


   Q1: Yes.
   Q2: We do not honor the validation config in 0.10.0. We had it in 0.8.0, but not in 0.10.0.
   So once the patch lands, you can probably make use of it.
   






[GitHub] [hudi] Gatsby-Lee closed issue #4748: [SUPPORT] Question: Does "SINGLE_WRITER" also utilize if hoodie.write.lock.* is provided?

Posted by GitBox <gi...@apache.org>.
Gatsby-Lee closed issue #4748:
URL: https://github.com/apache/hudi/issues/4748


   




