Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/23 00:37:37 UTC

[GitHub] [hudi] yihua opened a new pull request, #6756: [HUDI-4805] Update FAQ with workarounds for HBase issues

yihua opened a new pull request, #6756:
URL: https://github.com/apache/hudi/pull/6756

   ### Change Logs
   
   As stated in the title: update the FAQ with workarounds for HBase issues.

   ### Impact
   
   **Risk level: none**
   
   The website can be built and previewed locally with `npm start`, and the changes render correctly.
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   




[GitHub] [hudi] bhasudha merged pull request #6756: [HUDI-4805] Update FAQ with workarounds for HBase issues

Posted by GitBox <gi...@apache.org>.
bhasudha merged PR #6756:
URL: https://github.com/apache/hudi/pull/6756




[GitHub] [hudi] jiangbiao910 commented on pull request #6756: [HUDI-4805] Update FAQ with workarounds for HBase issues

Posted by GitBox <gi...@apache.org>.
jiangbiao910 commented on PR #6756:
URL: https://github.com/apache/hudi/pull/6756#issuecomment-1260592363

   @yihua Hello, we hit `java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()`, so I set `hoodie.metadata.enable=false` as a workaround.
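   
   Concretely, I disabled it as a write option, along these lines (an illustrative sketch; the table name, field names, and path are placeholders, not our actual job):
   
   ```scala
   // Sketch: disable the Hudi metadata table for a datasource write.
   // `df` is an existing DataFrame; all option values below are placeholders.
   df.write.format("hudi").
     option("hoodie.table.name", "my_table").                   // placeholder
     option("hoodie.datasource.write.recordkey.field", "uuid"). // placeholder
     option("hoodie.datasource.write.precombine.field", "ts").  // placeholder
     option("hoodie.metadata.enable", "false").                 // the workaround
     mode("append").
     save("hdfs:///tmp/my_table")                               // placeholder path
   ```
   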
   Does this mean that version **0.12** has resolved the HBase/Hadoop version incompatibility?
   Do we still need to align the HBase and Hadoop versions ourselves?
   Looking forward to your reply.
   




[GitHub] [hudi] xuzifu666 commented on a diff in pull request #6756: [HUDI-4805] Update FAQ with workarounds for HBase issues

Posted by "xuzifu666 (via GitHub)" <gi...@apache.org>.
xuzifu666 commented on code in PR #6756:
URL: https://github.com/apache/hudi/pull/6756#discussion_r1135876302


##########
website/versioned_docs/version-0.12.0/faq.md:
##########
@@ -581,6 +581,48 @@ After the second write:
 |  20220622204044318|20220622204044318...|                 1|                      |890aafc0-d897-44d...|hudi.apache.com|  1|   1|
 |  20220622204208997|20220622204208997...|                 2|                      |890aafc0-d897-44d...|             null|  1|   2|
 
+### I see two different records with the same record key value, each with a different timestamp format. How is this possible?
+
+This is a known issue with enabling the row writer for the bulk_insert operation. When a bulk_insert is followed by another
+write operation such as upsert/insert, this can be observed specifically for timestamp fields. For example, bulk_insert might produce
+the timestamp `2016-12-29 09:54:00.0` for a record key, whereas a non-bulk_insert write operation might produce a long value like
+`1483023240000000` for the same record key, thus creating two different records. To fix this, starting with 0.10.1, a new config, [hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled](https://hudi.apache.org/docs/configurations/#hoodiedatasourcewritekeygeneratorconsistentlogicaltimestampenabled),
+was introduced to bring consistency irrespective of whether row writing is enabled or not. However, for the sake of
+backwards compatibility and to avoid breaking existing pipelines, this config is set to false by default and has to be enabled explicitly.
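+
+For example, enabling it on a Spark datasource write looks roughly like this (an illustrative sketch; `df`, the table name, the record key field, and the path are placeholders):
+
+```scala
+// Sketch: make the row-writer and non-row-writer paths generate the same
+// record key values for logical timestamp fields.
+df.write.format("hudi").
+  option("hoodie.table.name", "orders").                                                        // placeholder
+  option("hoodie.datasource.write.recordkey.field", "event_ts").                                // placeholder timestamp field
+  option("hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled", "true").  // the fix
+  mode("append").
+  save("/tmp/orders")                                                                           // placeholder path
+```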
+
+
+### Can I switch from one index type to another without having to rewrite the entire table?
+
+It should be okay to switch between the Bloom index and the Simple index as long as they are not global.
+Moving from a global index to a non-global one, and vice versa, may not work. Also, switching between the HBase index (a global index) and the regular Bloom index might not work.
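+
+The index type is chosen per write via the `hoodie.index.type` config, for example (an illustrative sketch; only the index option is the point here, the rest are placeholders):
+
+```scala
+// Sketch: switch a non-global Bloom-indexed table to the Simple index
+// by changing the index type on subsequent writes.
+df.write.format("hudi").
+  option("hoodie.index.type", "SIMPLE").  // previously BLOOM
+  mode("append").
+  save("/tmp/orders")                     // placeholder path
+```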
+
+### How can I resolve the NoSuchMethodError from HBase when using Hudi with the metadata table on HDFS?
+
+Since the 0.11.0 release, we have upgraded the HBase version to 2.4.9, which is built against Hadoop 2.x. Hudi's metadata
+table uses HFile as the base file format, relying on the HBase library. When the metadata table is enabled for a Hudi table on
+HDFS using Hadoop 3.x, a NoSuchMethodError can be thrown due to compatibility issues between Hadoop 2.x and 3.x.
+To address this, use the following workaround:
+
+(1) Clone the HBase source code from `https://github.com/apache/hbase`;
+
+(2) Check out the source code of the 2.4.9 release using the tag `rel/2.4.9`:
+```shell
+git checkout rel/2.4.9
+```
+
+(3) Build and install a new version of HBase 2.4.9 with the Hadoop 3 profile:
+```shell
+mvn clean install -Denforcer.skip -DskipTests -Dhadoop.profile=3.0 -Psite-install-step

Review Comment:
   this method may not compile HBase successfully @yihua


