Posted to dev@sqoop.apache.org by kopal niranjan <ni...@gmail.com> on 2016/04/19 10:58:05 UTC

Review Request 46379: SQOOP-2585 Add hive-merge support to sqoop

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46379/
-----------------------------------------------------------

Review request for Sqoop and Venkat Ranganathan.


Repository: sqoop-trunk


Description
-------

Sqoop currently does not support merging two Hive tables. This change implements a new Sqoop tool which:
1. Merges two Hive tables. (The most common use is merging new incremental data into an existing Hive table.)
2. Supports both partitioned and non-partitioned tables.
3. Supports merging on composite keys.
4. For partitioned tables, merges existing partitions and also adds new partitions.
5. Supports the Text, RC, ORC and Sequence file formats.
6. Ensures that only one process performs the merge at a time.
7. Is atomic: if the merge fails at any point, the target Hive table is reverted to its original state.
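
To illustrate items 1 and 3, here is a minimal in-memory sketch of a merge on a composite key. It is not the code from this patch (which runs as a MapReduce job over Hive table files); the class name CompositeKeyMergeSketch, the helper names keyOf and merge, and the sample rows are all hypothetical. The sketch assumes that a row from the new data replaces an existing row with the same key, and that rows whose key is not present are added.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the Sqoop patch.
class CompositeKeyMergeSketch {

    // Builds the composite key from the values at the given column indexes.
    public static List<String> keyOf(String[] row, int[] keyColumns) {
        List<String> key = new ArrayList<>();
        for (int column : keyColumns) {
            key.add(row[column]);
        }
        return key;
    }

    // Merges incoming rows into existing rows. A row from "incoming" replaces
    // an existing row with the same composite key; rows with unseen keys are
    // appended. (Assumed semantics: the newer record wins.)
    public static List<String[]> merge(List<String[]> existing,
                                       List<String[]> incoming,
                                       int[] keyColumns) {
        Map<List<String>, String[]> merged = new LinkedHashMap<>();
        for (String[] row : existing) {
            merged.put(keyOf(row, keyColumns), row);
        }
        for (String[] row : incoming) {
            merged.put(keyOf(row, keyColumns), row);
        }
        return new ArrayList<>(merged.values());
    }

    public static void main(String[] args) {
        // Columns: id, country, name. The composite key is (id, country).
        List<String[]> existing = Arrays.asList(
            new String[] {"1", "US", "alice"},
            new String[] {"2", "US", "bob"});
        List<String[]> incoming = Arrays.asList(
            new String[] {"2", "US", "robert"},
            new String[] {"3", "IN", "carol"});

        for (String[] row : merge(existing, incoming, new int[] {0, 1})) {
            System.out.println(String.join(",", row));
        }
    }
}
```

With the sample rows above, the row keyed (2, US) is replaced by its newer version and the row keyed (3, IN) is added, so the program prints 1,US,alice then 2,US,robert then 3,IN,carol.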


Diffs
-----

  ivy.xml d84b88f 
  ivy/libraries.properties 2e3d884 
  src/docs/user/SqoopUserGuide.txt 8d9c12d 
  src/docs/user/hive-merge-purpose.txt PRE-CREATION 
  src/docs/user/hive-merge.txt PRE-CREATION 
  src/java/org/apache/sqoop/concurrency/HiveMergeTableLock.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/merge/record/strategy/AbstractMergeRecord.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/merge/record/strategy/MergeRecordFactory.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/merge/record/strategy/ORCMergeRecord.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/merge/record/strategy/RCMergeRecord.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/merge/record/strategy/RecordInspector.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/merge/record/strategy/TaggedMergeRecord.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/merge/record/strategy/TextSequenceMergeRecord.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/partition/util/NewPartitionsHandler.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/partition/util/NonPartitionTableHandler.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/partition/util/PartitionFilter.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/partition/util/PartitionHandler.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/partition/util/UpdatedPartitionsHandler.java PRE-CREATION 
  src/java/org/apache/sqoop/hooks/ShutdownCleanupHook.java PRE-CREATION 
  src/java/org/apache/sqoop/io/CodecMap.java cec9358 
  src/java/org/apache/sqoop/io/FILE_FORMAT.java PRE-CREATION 
  src/java/org/apache/sqoop/io/OriginalStateRestorer.java PRE-CREATION 
  src/java/org/apache/sqoop/io/VersionHandler.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/AbstractHiveMergeMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/ConfigurationConstants.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/HiveMergeJob.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/HiveMergeReducer.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/ORCHiveMergeMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/RCHiveMergeMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/SerdeFactory.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/TextSequenceHiveMergeMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/HiveMergeException.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/HiveMergeTool.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/SqoopTool.java 5b8453d 
  src/java/org/apache/sqoop/util/HDFSUtil.java PRE-CREATION 
  src/java/org/apache/sqoop/util/HiveUtil.java PRE-CREATION 
  src/java/org/apache/sqoop/util/IOStreamUtils.java PRE-CREATION 
  src/java/org/apache/sqoop/util/JSONUtil.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/merge/record/strategy/TestAbstractMergeRecord.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/merge/record/strategy/TestORCMergeRecord.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/merge/record/strategy/TestRCMergeRecord.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/merge/record/strategy/TestTaggedMergeRecord.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/merge/record/strategy/TestTextSequenceMergeRecord.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/partition/util/TestNewPartitionHandler.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/partition/util/TestNonPartitionTableHandler.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/partition/util/TestUpdatedPartitionsHandler.java PRE-CREATION 
  src/test/org/apache/sqoop/io/TestVersionHandler.java PRE-CREATION 
  src/test/org/apache/sqoop/mapreduce/hivemerge/TestHiveMergeJob.java PRE-CREATION 
  src/test/org/apache/sqoop/mapreduce/hivemerge/TestORCHiveMergeMapper.java PRE-CREATION 
  src/test/org/apache/sqoop/mapreduce/hivemerge/TestRCHiveMergeMapper.java PRE-CREATION 
  src/test/org/apache/sqoop/mapreduce/hivemerge/TestTextSequenceHiveMergeMapper.java PRE-CREATION 
  src/test/org/apache/sqoop/util/TestHDFSUtil.java PRE-CREATION 
  src/test/org/apache/sqoop/util/TestJSONUtil.java PRE-CREATION 
  src/test/org/apache/sqoop/util/clusters/SqoopMiniDFSCluster.java PRE-CREATION 

Diff: https://reviews.apache.org/r/46379/diff/


Testing
-------

Yes


Thanks,

kopal niranjan


Re: Review Request 46379: SQOOP-2585 Add hive-merge support to sqoop

Posted by kopal niranjan <ni...@gmail.com>.

(Updated April 26, 2016, 9:04 a.m.)


Review request for Sqoop and Venkat Ranganathan.


Changes
-------

Removed the @author tag from TestTextSequenceMergeRecord.java.


Repository: sqoop-trunk


Diffs (updated)
-----

  ivy.xml d84b88f 
  ivy/libraries.properties 847b6e1 
  src/docs/user/SqoopUserGuide.txt 8d9c12d 
  src/docs/user/hive-merge-purpose.txt PRE-CREATION 
  src/docs/user/hive-merge.txt PRE-CREATION 
  src/java/org/apache/sqoop/concurrency/HiveMergeTableLock.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/merge/record/strategy/AbstractMergeRecord.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/merge/record/strategy/MergeRecordFactory.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/merge/record/strategy/ORCMergeRecord.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/merge/record/strategy/RCMergeRecord.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/merge/record/strategy/RecordInspector.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/merge/record/strategy/TaggedMergeRecord.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/merge/record/strategy/TextSequenceMergeRecord.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/partition/util/NewPartitionsHandler.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/partition/util/NonPartitionTableHandler.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/partition/util/PartitionFilter.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/partition/util/PartitionHandler.java PRE-CREATION 
  src/java/org/apache/sqoop/hive/partition/util/UpdatedPartitionsHandler.java PRE-CREATION 
  src/java/org/apache/sqoop/hooks/ShutdownCleanupHook.java PRE-CREATION 
  src/java/org/apache/sqoop/io/CodecMap.java cec9358 
  src/java/org/apache/sqoop/io/FILE_FORMAT.java PRE-CREATION 
  src/java/org/apache/sqoop/io/OriginalStateRestorer.java PRE-CREATION 
  src/java/org/apache/sqoop/io/VersionHandler.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/AbstractHiveMergeMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/ConfigurationConstants.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/HiveMergeJob.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/HiveMergeReducer.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/ORCHiveMergeMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/RCHiveMergeMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/SerdeFactory.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hivemerge/TextSequenceHiveMergeMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/HiveMergeException.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/HiveMergeTool.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/SqoopTool.java 5b8453d 
  src/java/org/apache/sqoop/util/HDFSUtil.java PRE-CREATION 
  src/java/org/apache/sqoop/util/HiveUtil.java PRE-CREATION 
  src/java/org/apache/sqoop/util/IOStreamUtils.java PRE-CREATION 
  src/java/org/apache/sqoop/util/JSONUtil.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/merge/record/strategy/TestAbstractMergeRecord.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/merge/record/strategy/TestORCMergeRecord.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/merge/record/strategy/TestRCMergeRecord.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/merge/record/strategy/TestTaggedMergeRecord.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/merge/record/strategy/TestTextSequenceMergeRecord.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/partition/util/TestNewPartitionHandler.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/partition/util/TestNonPartitionTableHandler.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/partition/util/TestUpdatedPartitionsHandler.java PRE-CREATION 
  src/test/org/apache/sqoop/io/TestVersionHandler.java PRE-CREATION 
  src/test/org/apache/sqoop/mapreduce/hivemerge/TestHiveMergeJob.java PRE-CREATION 
  src/test/org/apache/sqoop/mapreduce/hivemerge/TestORCHiveMergeMapper.java PRE-CREATION 
  src/test/org/apache/sqoop/mapreduce/hivemerge/TestRCHiveMergeMapper.java PRE-CREATION 
  src/test/org/apache/sqoop/mapreduce/hivemerge/TestTextSequenceHiveMergeMapper.java PRE-CREATION 
  src/test/org/apache/sqoop/util/TestHDFSUtil.java PRE-CREATION 
  src/test/org/apache/sqoop/util/TestJSONUtil.java PRE-CREATION 
  src/test/org/apache/sqoop/util/clusters/SqoopMiniDFSCluster.java PRE-CREATION 

Diff: https://reviews.apache.org/r/46379/diff/


Testing
-------

Yes


Thanks,

kopal niranjan


Re: Review Request 46379: SQOOP-2585 Add hive-merge support to sqoop

Posted by kopal niranjan <ni...@gmail.com>.

(Updated April 19, 2016, 9:46 a.m.)


Review request for Sqoop and Venkat Ranganathan.


Changes
-------

The changes here are relative to the earlier review request at https://reviews.apache.org/r/39240/


Repository: sqoop-trunk


Testing
-------

Yes


Thanks,

kopal niranjan