You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/04/21 00:01:26 UTC

[GitHub] [incubator-hudi] afilipchik opened a new pull request #1541: [Minor] Add ability to specify time unit for TimestampBasedKeyGenerator

afilipchik opened a new pull request #1541:
URL: https://github.com/apache/incubator-hudi/pull/1541


   ## What is the purpose of the pull request
   
   Adding a way to specify any source time unit for TimestampBasedKeyGenerator. 
   Properties probably need some refactoring, kept unix timestamp for backward compatibility
   ## Brief change log
   
   *(for example:)*
     - updated TimestampBasedKeyGenerator
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1541: [MINOR] Add ability to specify time unit for TimestampBasedKeyGenerator

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1541:
URL: https://github.com/apache/incubator-hudi/pull/1541#discussion_r412685370



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/keygen/TimestampBasedKeyGenerator.java
##########
@@ -123,4 +132,8 @@ public HoodieKey getKey(GenericRecord record) {
       throw new HoodieDeltaStreamerException("Unable to parse input partition field :" + partitionVal, pe);
     }
   }
+
+  private long convertLongTimeToSeconds(Long partitionVal) {

Review comment:
       this is converting to ms, rather? 

##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/keygen/TimestampBasedKeyGenerator.java
##########
@@ -35,16 +35,21 @@
 import java.util.Collections;
 import java.util.Date;
 import java.util.TimeZone;
+import java.util.concurrent.TimeUnit;
+
+import static java.util.concurrent.TimeUnit.MILLISECONDS;
 
 /**
  * Key generator, that relies on timestamps for partitioning field. Still picks record key by name.
  */
 public class TimestampBasedKeyGenerator extends SimpleKeyGenerator {
 
   enum TimestampType implements Serializable {
-    UNIX_TIMESTAMP, DATE_STRING, MIXED, EPOCHMILLISECONDS

Review comment:
       this will break users setting EPOCHMILLISECONDS today.. Could we avoid this pain  and transparently handle this 

##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/keygen/TimestampBasedKeyGenerator.java
##########
@@ -35,16 +35,21 @@
 import java.util.Collections;
 import java.util.Date;
 import java.util.TimeZone;
+import java.util.concurrent.TimeUnit;
+
+import static java.util.concurrent.TimeUnit.MILLISECONDS;
 
 /**
  * Key generator, that relies on timestamps for partitioning field. Still picks record key by name.
  */
 public class TimestampBasedKeyGenerator extends SimpleKeyGenerator {
 
   enum TimestampType implements Serializable {
-    UNIX_TIMESTAMP, DATE_STRING, MIXED, EPOCHMILLISECONDS

Review comment:
       If this is set, we can make the timeunit be MILLISECONDS.. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-hudi] afilipchik edited a comment on pull request #1541: [HUDI-843] Add ability to specify time unit for TimestampBasedKeyGenerator

Posted by GitBox <gi...@apache.org>.
afilipchik edited a comment on pull request #1541:
URL: https://github.com/apache/incubator-hudi/pull/1541#issuecomment-620719213


   addressed comments
   @yanghua 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-hudi] yanghua commented on issue #1541: [MINOR] Add ability to specify time unit for TimestampBasedKeyGenerator

Posted by GitBox <gi...@apache.org>.
yanghua commented on issue #1541:
URL: https://github.com/apache/incubator-hudi/pull/1541#issuecomment-618193932


   @afilipchik Firstly, You may need to figure out why the Travis is red.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-hudi] afilipchik commented on pull request #1541: [HUDI-843] Add ability to specify time unit for TimestampBasedKeyGenerator

Posted by GitBox <gi...@apache.org>.
afilipchik commented on pull request #1541:
URL: https://github.com/apache/incubator-hudi/pull/1541#issuecomment-620719213


   addressed comments


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-hudi] yanghua commented on issue #1541: [Minor] Add ability to specify time unit for TimestampBasedKeyGenerator

Posted by GitBox <gi...@apache.org>.
yanghua commented on issue #1541:
URL: https://github.com/apache/incubator-hudi/pull/1541#issuecomment-616938485


   Hi @afilipchik thanks for your contribution. The correct prefix is "MINOR". IMHO, the change of this PR should be filed in JIRA.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-hudi] vinothchandar merged pull request #1541: [HUDI-843] Add ability to specify time unit for TimestampBasedKeyGenerator

Posted by GitBox <gi...@apache.org>.
vinothchandar merged pull request #1541:
URL: https://github.com/apache/incubator-hudi/pull/1541


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-hudi] afilipchik commented on a change in pull request #1541: [MINOR] Add ability to specify time unit for TimestampBasedKeyGenerator

Posted by GitBox <gi...@apache.org>.
afilipchik commented on a change in pull request #1541:
URL: https://github.com/apache/incubator-hudi/pull/1541#discussion_r414977643



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/keygen/TimestampBasedKeyGenerator.java
##########
@@ -35,16 +35,21 @@
 import java.util.Collections;
 import java.util.Date;
 import java.util.TimeZone;
+import java.util.concurrent.TimeUnit;
+
+import static java.util.concurrent.TimeUnit.MILLISECONDS;
 
 /**
  * Key generator, that relies on timestamps for partitioning field. Still picks record key by name.
  */
 public class TimestampBasedKeyGenerator extends SimpleKeyGenerator {
 
   enum TimestampType implements Serializable {
-    UNIX_TIMESTAMP, DATE_STRING, MIXED, EPOCHMILLISECONDS

Review comment:
       yep, will do 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org