You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Vaibhav Gumashta <vg...@hortonworks.com> on 2017/01/03 22:27:56 UTC

Review Request 55154: HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/
-----------------------------------------------------------

Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.


Bugs: HIVE-15366
    https://issues.apache.org/jira/browse/HIVE-15366


Repository: hive-git


Description
-------

https://issues.apache.org/jira/browse/HIVE-15366


Diffs
-----

  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java fe747df 
  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java bd9f9ec 
  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java 9954902 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
  ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java f61274b 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 5561e06 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 9b83407 

Diff: https://reviews.apache.org/r/55154/diff/


Testing
-------


Thanks,

Vaibhav Gumashta


Re: Review Request 55154: HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events

Posted by Vaibhav Gumashta <vg...@hortonworks.com>.

> On Jan. 9, 2017, 7:07 p.m., Sushanth Sowmyan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 2352
> > <https://reviews.apache.org/r/55154/diff/4/?file=1598535#file1598535line2352>
> >
> >     We probably don't have a problem here, in that all entries in the list of newFiles are all probably in the same filesystem, but if ever that changes, we can have off-by-one issues here wherein we cannot line up the file to its checksum, if some files have checksums and others in the middle don't. Would it make sense to put in a  "" or something like that to indicate that there was no checksum for this file?
> >     
> >     Note - this is not a blocker issue, and the patch can continue as-is. I mention more because this is something that might change in the future.

Fixing in this jira itself since it's a tiny change.


- Vaibhav


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/#review160929
-----------------------------------------------------------


On Jan. 6, 2017, 6:43 a.m., Vaibhav Gumashta wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55154/
> -----------------------------------------------------------
> 
> (Updated Jan. 6, 2017, 6:43 a.m.)
> 
> 
> Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.
> 
> 
> Bugs: HIVE-15366
>     https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Diffs
> -----
> 
>   itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/TestDbNotificationListener.java 39356ae 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestReplicationScenarios.java e29aa22 
>   metastore/if/hive_metastore.thrift 79592ea 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 1311b20 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/InsertEventRequestData.java 39a607d 
>   metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb ebed504 
>   metastore/src/java/org/apache/hadoop/hive/metastore/events/InsertEvent.java d9a42a7 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java fe747df 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/MessageFactory.java fdb8e80 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java bd9f9ec 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java 9954902 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java be5a6a9 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java f61274b 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 5561e06 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 9b83407 
> 
> Diff: https://reviews.apache.org/r/55154/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>


Re: Review Request 55154: HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events

Posted by Sushanth Sowmyan <kh...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/#review160929
-----------------------------------------------------------


Fix it, then Ship it!




Looks good to me. I have one potential issue marked, but that can be solved in a future patch. Thanks, Vaibhav!


ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2352)
<https://reviews.apache.org/r/55154/#comment232164>

    We probably don't have a problem here, in that all entries in the list of newFiles are all probably in the same filesystem, but if ever that changes, we can have off-by-one issues here wherein we cannot line up the file to its checksum, if some files have checksums and others in the middle don't. Would it make sense to put in a  "" or something like that to indicate that there was no checksum for this file?
    
    Note - this is not a blocker issue, and the patch can continue as-is. I mention more because this is something that might change in the future.


- Sushanth Sowmyan


On Jan. 6, 2017, 6:43 a.m., Vaibhav Gumashta wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55154/
> -----------------------------------------------------------
> 
> (Updated Jan. 6, 2017, 6:43 a.m.)
> 
> 
> Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.
> 
> 
> Bugs: HIVE-15366
>     https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Diffs
> -----
> 
>   itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/TestDbNotificationListener.java 39356ae 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestReplicationScenarios.java e29aa22 
>   metastore/if/hive_metastore.thrift 79592ea 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 1311b20 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/InsertEventRequestData.java 39a607d 
>   metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb ebed504 
>   metastore/src/java/org/apache/hadoop/hive/metastore/events/InsertEvent.java d9a42a7 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java fe747df 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/MessageFactory.java fdb8e80 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java bd9f9ec 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java 9954902 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java be5a6a9 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java f61274b 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 5561e06 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 9b83407 
> 
> Diff: https://reviews.apache.org/r/55154/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>


Re: Review Request 55154: HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events

Posted by Vaibhav Gumashta <vg...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/
-----------------------------------------------------------

(Updated Jan. 6, 2017, 6:43 a.m.)


Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.


Bugs: HIVE-15366
    https://issues.apache.org/jira/browse/HIVE-15366


Repository: hive-git


Description
-------

https://issues.apache.org/jira/browse/HIVE-15366


Diffs (updated)
-----

  itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/TestDbNotificationListener.java 39356ae 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestReplicationScenarios.java e29aa22 
  metastore/if/hive_metastore.thrift 79592ea 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 1311b20 
  metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/InsertEventRequestData.java 39a607d 
  metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb ebed504 
  metastore/src/java/org/apache/hadoop/hive/metastore/events/InsertEvent.java d9a42a7 
  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java fe747df 
  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/MessageFactory.java fdb8e80 
  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java bd9f9ec 
  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java 9954902 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java be5a6a9 
  ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java f61274b 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 5561e06 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 9b83407 

Diff: https://reviews.apache.org/r/55154/diff/


Testing
-------


Thanks,

Vaibhav Gumashta


Re: Review Request 55154: HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events

Posted by Vaibhav Gumashta <vg...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/
-----------------------------------------------------------

(Updated Jan. 5, 2017, 10:47 a.m.)


Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.


Bugs: HIVE-15366
    https://issues.apache.org/jira/browse/HIVE-15366


Repository: hive-git


Description
-------

https://issues.apache.org/jira/browse/HIVE-15366


Diffs (updated)
-----

  itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/TestDbNotificationListener.java 39356ae 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestReplicationScenarios.java e29aa22 
  metastore/if/hive_metastore.thrift 79592ea 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 1311b20 
  metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/InsertEventRequestData.java 39a607d 
  metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb ebed504 
  metastore/src/java/org/apache/hadoop/hive/metastore/events/InsertEvent.java d9a42a7 
  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java fe747df 
  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/MessageFactory.java fdb8e80 
  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java bd9f9ec 
  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java 9954902 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java be5a6a9 
  ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java f61274b 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 5561e06 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 9b83407 

Diff: https://reviews.apache.org/r/55154/diff/


Testing
-------


Thanks,

Vaibhav Gumashta


Re: Review Request 55154: HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events

Posted by Sushanth Sowmyan <kh...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/#review160533
-----------------------------------------------------------




metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java (line 53)
<https://reviews.apache.org/r/55154/#comment231682>

    I'm not convinced that this is a good method to add, since this is repl-specific, and adds complexity. Any presence of checksum must be encoded into the uris, so that when we call getFiles(), it contains it. Also, the files have no explicit meaning without the checksum, since they will not be stable uris. The getFiles() returned by InsertMessage should already be a CM uri that encodes the checksum, for eg: cm://hdfs%3A%2F%2Fblah$2Ffile1#abcdef1234567890 might imply the file hdfs://blah/file1 with checksum "abcdef1234567890". I'm not super pick on the actual encoding mechanism used, but we want the getFiles() results to be uris that are stable uris - ones which, even if we don't have a FileSystem object associated with it directly, we can extract the info we want from it at the endpoint when we use it, and generate it when we generate it, and all areas in between simply pass it on without doing anything additional with it.
    
    Thus, the places I see "generating" this are either DbNotificationListener or fireInsertEvent(), or ReplCopyTask during a bootstrap dump. The only place I see extracting/consuming this uri would be in ReplCopyTask on destination. All other areas should not split this.



metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java (line 376)
<https://reviews.apache.org/r/55154/#comment231687>

    We should not be adding more of these methods into JSONMessageFactory that add field names here. That knowledge should belong to the domain of the message itself. The existing methods that do this are currently slated for removal once we refactor DbNotificationListener to not depend on them.



ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java (line 576)
<https://reviews.apache.org/r/55154/#comment231678>

    The partspec can be obtained from insertMsg.getPartitionKeyValues() - we should'nt make calls to JSONMessageFactory here.
    
    JSONInsertMessage, in its implementation of getPartitionKeyValues, can, in turn, then call generic functions from JSONMessageFactory using knowledge it has about itself.
    
    There should'nt be any explicit calls to JSONMessageFactory from any class which is not a JSON*Message.
    
    See the previous ALTER patch and how it changed the ADD_PTNS/CREATE_TABLE processing for a reference.



ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java (line 595)
<https://reviews.apache.org/r/55154/#comment231681>

    We should not be making calls to JSONMessageFactory, or getting fields with knowledge of names such as "fileChecksums" or "files". Knowledge of fieldnames should be restricted to inside the message itself, which exposes api via its parent Message class.
    
    This should simply be a dump of what the InsertMessage.getFiles() returns and no more. Any encoding of checksum/etc that we do must happen in DbNotificationListener, or even possibly in fireInsertEvent, since the location is meaningless without the checksum.


- Sushanth Sowmyan


On Jan. 4, 2017, 12:59 p.m., Vaibhav Gumashta wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55154/
> -----------------------------------------------------------
> 
> (Updated Jan. 4, 2017, 12:59 p.m.)
> 
> 
> Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.
> 
> 
> Bugs: HIVE-15366
>     https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Diffs
> -----
> 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestReplicationScenarios.java e29aa22 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java fe747df 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java bd9f9ec 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java 9954902 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java f61274b 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 5561e06 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 9b83407 
> 
> Diff: https://reviews.apache.org/r/55154/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>


Re: Review Request 55154: HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events

Posted by Vaibhav Gumashta <vg...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/
-----------------------------------------------------------

(Updated Jan. 4, 2017, 12:59 p.m.)


Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.


Bugs: HIVE-15366
    https://issues.apache.org/jira/browse/HIVE-15366


Repository: hive-git


Description
-------

https://issues.apache.org/jira/browse/HIVE-15366


Diffs (updated)
-----

  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestReplicationScenarios.java e29aa22 
  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java fe747df 
  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java bd9f9ec 
  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java 9954902 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
  ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java f61274b 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 5561e06 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 9b83407 

Diff: https://reviews.apache.org/r/55154/diff/


Testing
-------


Thanks,

Vaibhav Gumashta


Re: Review Request 55154: HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events

Posted by Sushanth Sowmyan <kh...@gmail.com>.

> On Jan. 3, 2017, 11:55 p.m., Sushanth Sowmyan wrote:
> >

Note - the following is not exhaustive, and I know this patch has already been updated, but wanted to mention a few things that I noticed.


- Sushanth


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/#review160447
-----------------------------------------------------------


On Jan. 3, 2017, 10:27 p.m., Vaibhav Gumashta wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55154/
> -----------------------------------------------------------
> 
> (Updated Jan. 3, 2017, 10:27 p.m.)
> 
> 
> Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.
> 
> 
> Bugs: HIVE-15366
>     https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Diffs
> -----
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java fe747df 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java bd9f9ec 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java 9954902 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java f61274b 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 5561e06 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 9b83407 
> 
> Diff: https://reviews.apache.org/r/55154/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>


Re: Review Request 55154: HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events

Posted by Sushanth Sowmyan <kh...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/#review160447
-----------------------------------------------------------




metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java (line 58)
<https://reviews.apache.org/r/55154/#comment231590>

    Rather than a List<byte[]> getFileChecksums, I was really visualizing a List<String> getFiles(), where each file listed is a URI that has the checksum encoded into it.
    
    The reason is that a list of checksums is too highly bound to our replication usecase only, and has nothing to do with a more generic "Message" that could be used for other purposes as well. Messages are currently used for things like audit as well, and not just replication.
    
    Having the checksums coded in the URLs makes the message interface consistent without knowing more on how to actually read the url.



metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java (line 115)
<https://reviews.apache.org/r/55154/#comment231591>

    Same comment as with InsertMessage - this should not be a list of checksums but a list of pathnames (urls)



ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 
<https://reviews.apache.org/r/55154/#comment231592>

    removing this is incorrect and breaks current EXPORT in replv1 - this is used to basically noop-out things like non-storagehandler based tables, views, etc.


- Sushanth Sowmyan


On Jan. 3, 2017, 10:27 p.m., Vaibhav Gumashta wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55154/
> -----------------------------------------------------------
> 
> (Updated Jan. 3, 2017, 10:27 p.m.)
> 
> 
> Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.
> 
> 
> Bugs: HIVE-15366
>     https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Diffs
> -----
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java fe747df 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java bd9f9ec 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java 9954902 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java f61274b 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 5561e06 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 9b83407 
> 
> Diff: https://reviews.apache.org/r/55154/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>


Re: Review Request 55154: HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events

Posted by Sushanth Sowmyan <kh...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/#review160450
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java (line 573)
<https://reviews.apache.org/r/55154/#comment231594>

    Instead of using JSONMessageFactory.getTableName, please instantiate the InsertMessage (not JSONInsertMessage) and ask it for getTableName() - that way, we stick to portable MessageFactory based api. Also, if you look at the alter patch and how it changes add_ptns, you'll see how to get the partitions objects/etc generically.


- Sushanth Sowmyan


On Jan. 3, 2017, 10:27 p.m., Vaibhav Gumashta wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55154/
> -----------------------------------------------------------
> 
> (Updated Jan. 3, 2017, 10:27 p.m.)
> 
> 
> Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.
> 
> 
> Bugs: HIVE-15366
>     https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Diffs
> -----
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java fe747df 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java bd9f9ec 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java 9954902 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java f61274b 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 5561e06 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 9b83407 
> 
> Diff: https://reviews.apache.org/r/55154/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>


Re: Review Request 55154: HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events

Posted by Vaibhav Gumashta <vg...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/#review160476
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 
<https://reviews.apache.org/r/55154/#comment231633>

    Sorry, this change was left over from some of the debugging I was doing. Thanks for catching.


- Vaibhav Gumashta


On Jan. 3, 2017, 10:27 p.m., Vaibhav Gumashta wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55154/
> -----------------------------------------------------------
> 
> (Updated Jan. 3, 2017, 10:27 p.m.)
> 
> 
> Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.
> 
> 
> Bugs: HIVE-15366
>     https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Diffs
> -----
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java fe747df 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java bd9f9ec 
>   metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java 9954902 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java f61274b 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 5561e06 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 9b83407 
> 
> Diff: https://reviews.apache.org/r/55154/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>