Posted to dev@hbase.apache.org by "Sean Busbey (JIRA)" <ji...@apache.org> on 2017/08/27 21:58:00 UTC

[jira] [Reopened] (HBASE-18640) Move mapreduce out of hbase-server into separate hbase-mapreduce module

     [ https://issues.apache.org/jira/browse/HBASE-18640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Busbey reopened HBASE-18640:
---------------------------------

A few problems:

1. This broke the shaded hbase-server module, which exists to provide MapReduce classes:

{code}
$ mvn -DskipTests -Prelease package
...
$ jar tf hbase-shaded/hbase-shaded-server/target/hbase-shaded-server-3.0.0-SNAPSHOT.jar | grep "org/apache/hadoop/hbase/mapred"
org/apache/hadoop/hbase/mapreduce/
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles$2.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles$3.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles$LoadQueueItem.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles$4.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles$5.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles$1.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles$BulkHFileVisitor.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.class
org/apache/hadoop/hbase/mapreduce/JobUtil.class
{code}

I don't see an {{hbase-shaded-mapreduce}} module, or any other equivalent, that provides the classes the shaded jar included prior to the change:

{code}
$ git checkout 664b6be^
...
$ mvn -DskipTests -Prelease package
...
$ jar tf hbase-shaded/hbase-shaded-server/target/hbase-shaded-server-3.0.0-SNAPSHOT.jar | grep "org/apache/hadoop/hbase/mapred"
org/apache/hadoop/hbase/mapred/
org/apache/hadoop/hbase/mapred/IdentityTableReduce.class
org/apache/hadoop/hbase/mapred/MultiTableSnapshotInputFormat.class
org/apache/hadoop/hbase/mapred/TableInputFormat.class
org/apache/hadoop/hbase/mapred/TableInputFormatBase.class
org/apache/hadoop/hbase/mapred/TableReduce.class
org/apache/hadoop/hbase/mapred/TableSnapshotInputFormat$TableSnapshotRecordReader.class
org/apache/hadoop/hbase/mapreduce/
org/apache/hadoop/hbase/mapreduce/CellCounter$CellCounterMapper.class
org/apache/hadoop/hbase/mapreduce/DefaultVisibilityExpressionResolver$1.class
org/apache/hadoop/hbase/mapreduce/Export.class
org/apache/hadoop/hbase/mapreduce/HashTable$ResultHasher.class
org/apache/hadoop/hbase/mapreduce/HFileInputFormat$1.class
org/apache/hadoop/hbase/mapreduce/HFileInputFormat.class
org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2$WriterLength.class
org/apache/hadoop/hbase/mapreduce/IdentityTableMapper.class
org/apache/hadoop/hbase/mapreduce/IdentityTableReducer.class
org/apache/hadoop/hbase/mapreduce/Import$KeyValueSortImporter.class
org/apache/hadoop/hbase/mapreduce/ImportTsv$TsvParser$ParsedLine.class
org/apache/hadoop/hbase/mapreduce/JarFinder.class
org/apache/hadoop/hbase/mapreduce/KeyValueSerialization$KeyValueDeserializer.class
org/apache/hadoop/hbase/mapreduce/KeyValueSortReducer.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles$5.class
org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase$1.class
org/apache/hadoop/hbase/mapreduce/MultiTableOutputFormat.class
org/apache/hadoop/hbase/mapreduce/MultiTableSnapshotInputFormat.class
org/apache/hadoop/hbase/mapreduce/MultithreadedTableMapper$1.class
org/apache/hadoop/hbase/mapreduce/MultithreadedTableMapper$MapRunner.class
org/apache/hadoop/hbase/mapreduce/MultithreadedTableMapper$SubMapStatusReporter.class
org/apache/hadoop/hbase/mapreduce/MultithreadedTableMapper.class
org/apache/hadoop/hbase/mapreduce/PutCombiner.class
org/apache/hadoop/hbase/mapreduce/ResultSerialization$1.class
org/apache/hadoop/hbase/mapreduce/ResultSerialization$Result94Deserializer.class
org/apache/hadoop/hbase/mapreduce/RowCounter$RowCounterMapper$Counters.class
org/apache/hadoop/hbase/mapreduce/RowCounter.class
org/apache/hadoop/hbase/mapreduce/SyncTable$SyncMapper$Counter.class
org/apache/hadoop/hbase/mapreduce/TableInputFormatBase$1.class
org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.class
org/apache/hadoop/hbase/mapreduce/TableRecordReader.class
org/apache/hadoop/hbase/mapreduce/TableReducer.class
org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormat$TableSnapshotRegionRecordReader.class
org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormatImpl$RecordReader.class
org/apache/hadoop/hbase/mapreduce/TableSplit.class
org/apache/hadoop/hbase/mapreduce/TsvImporterTextMapper.class
org/apache/hadoop/hbase/mapreduce/WALInputFormat$WALRecordReader.class
org/apache/hadoop/hbase/mapreduce/WALPlayer.class
org/apache/hadoop/hbase/mapred/HRegionPartitioner.class
org/apache/hadoop/hbase/mapred/RowCounter$RowCounterMapper$Counters.class
org/apache/hadoop/hbase/mapred/TableMap.class
org/apache/hadoop/hbase/mapred/TableOutputFormat.class
org/apache/hadoop/hbase/mapred/TableRecordReader.class
org/apache/hadoop/hbase/mapred/TableSnapshotInputFormat$TableSnapshotRegionSplit.class
org/apache/hadoop/hbase/mapred/TableSplit.class
org/apache/hadoop/hbase/mapreduce/CellCounter$IntSumReducer.class
org/apache/hadoop/hbase/mapreduce/CopyTable.class
org/apache/hadoop/hbase/mapreduce/HashTable$HashMapper.class
org/apache/hadoop/hbase/mapreduce/HashTable$TableHash.class
org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2$1.class
org/apache/hadoop/hbase/mapreduce/Import$KeyValueImporter.class
org/apache/hadoop/hbase/mapreduce/Import$KeyValueWritableComparable.class
org/apache/hadoop/hbase/mapreduce/ImportTsv$TsvParser$BadTsvLineException.class
org/apache/hadoop/hbase/mapreduce/ImportTsv$TsvParser.class
org/apache/hadoop/hbase/mapreduce/KeyValueSerialization$KeyValueSerializer.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles$1.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles$4.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles$LoadQueueItem.class
org/apache/hadoop/hbase/mapreduce/MultiTableHFileOutputFormat.class
org/apache/hadoop/hbase/mapreduce/MultiTableOutputFormat$MultiTableRecordWriter.class
org/apache/hadoop/hbase/mapreduce/MultiTableSnapshotInputFormatImpl.class
org/apache/hadoop/hbase/mapreduce/MutationSerialization$MutationDeserializer.class
org/apache/hadoop/hbase/mapreduce/PutSortReducer.class
org/apache/hadoop/hbase/mapreduce/ResultSerialization$ResultDeserializer.class
org/apache/hadoop/hbase/mapreduce/ResultSerialization.class
org/apache/hadoop/hbase/mapreduce/SimpleTotalOrderPartitioner.class
org/apache/hadoop/hbase/mapreduce/TableInputFormat.class
org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.class
org/apache/hadoop/hbase/mapreduce/TableSplit$Version.class
org/apache/hadoop/hbase/mapreduce/WALInputFormat$WALKeyRecordReader.class
org/apache/hadoop/hbase/mapreduce/WALInputFormat.class
org/apache/hadoop/hbase/mapred/TableInputFormatBase$1.class
org/apache/hadoop/hbase/mapred/TableOutputFormat$TableRecordWriter.class
org/apache/hadoop/hbase/mapred/TableRecordReaderImpl.class
org/apache/hadoop/hbase/mapreduce/CellCounter$CellCounterMapper$Counters.class
org/apache/hadoop/hbase/mapreduce/CellCounter.class
org/apache/hadoop/hbase/mapreduce/Driver.class
org/apache/hadoop/hbase/mapreduce/GroupingTableMapper.class
org/apache/hadoop/hbase/mapreduce/HFileInputFormat$HFileRecordReader.class
org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.class
org/apache/hadoop/hbase/mapreduce/Import$KeyValueWritableComparablePartitioner.class
org/apache/hadoop/hbase/mapreduce/ImportTsv.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles$3.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles$BulkHFileVisitor.class
org/apache/hadoop/hbase/mapreduce/MultiTableInputFormat.class
org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase.class
org/apache/hadoop/hbase/mapreduce/MultithreadedTableMapper$SubMapRecordReader.class
org/apache/hadoop/hbase/mapreduce/MultithreadedTableMapper$SubMapRecordWriter.class
org/apache/hadoop/hbase/mapreduce/MutationSerialization$1.class
org/apache/hadoop/hbase/mapreduce/MutationSerialization$MutationSerializer.class
org/apache/hadoop/hbase/mapreduce/MutationSerialization.class
org/apache/hadoop/hbase/mapreduce/replication/
org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication$1.class
org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication$Verifier.class
org/apache/hadoop/hbase/mapreduce/ResultSerialization$ResultSerializer.class
org/apache/hadoop/hbase/mapreduce/RowCounter$RowCounterMapper.class
org/apache/hadoop/hbase/mapreduce/SyncTable$SyncMapper$CellScanner.class
org/apache/hadoop/hbase/mapreduce/SyncTable$SyncMapper.class
org/apache/hadoop/hbase/mapreduce/TableOutputCommitter.class
org/apache/hadoop/hbase/mapreduce/TableOutputFormat.class
org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormat$TableSnapshotRegionSplit.class
org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormat.class
org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormatImpl.class
org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.class
org/apache/hadoop/hbase/mapreduce/WALPlayer$WALMapper.class
org/apache/hadoop/hbase/mapred/Driver.class
org/apache/hadoop/hbase/mapred/GroupingTableMap.class
org/apache/hadoop/hbase/mapred/IdentityTableMap.class
org/apache/hadoop/hbase/mapred/RowCounter$RowCounterMapper.class
org/apache/hadoop/hbase/mapred/RowCounter.class
org/apache/hadoop/hbase/mapred/TableMapReduceUtil.class
org/apache/hadoop/hbase/mapred/TableSnapshotInputFormat.class
org/apache/hadoop/hbase/mapreduce/CellCreator.class
org/apache/hadoop/hbase/mapreduce/DefaultVisibilityExpressionResolver.class
org/apache/hadoop/hbase/mapreduce/HashTable$TableHash$Reader.class
org/apache/hadoop/hbase/mapreduce/HashTable.class
org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2$TableInfo.class
org/apache/hadoop/hbase/mapreduce/HRegionPartitioner.class
org/apache/hadoop/hbase/mapreduce/Import$Importer.class
org/apache/hadoop/hbase/mapreduce/Import$KeyValueReducer.class
org/apache/hadoop/hbase/mapreduce/Import$KeyValueWritableComparable$KeyValueWritableComparator.class
org/apache/hadoop/hbase/mapreduce/Import.class
org/apache/hadoop/hbase/mapreduce/KeyValueSerialization.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles$2.class
org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.class
org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication$Verifier$Counters.class
org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.class
org/apache/hadoop/hbase/mapreduce/SyncTable.class
org/apache/hadoop/hbase/mapreduce/TableMapper.class
org/apache/hadoop/hbase/mapreduce/TableOutputFormat$TableRecordWriter.class
org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.class
org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormatImpl$InputSplit.class
org/apache/hadoop/hbase/mapreduce/TextSortReducer.class
org/apache/hadoop/hbase/mapreduce/VisibilityExpressionResolver.class
org/apache/hadoop/hbase/mapreduce/WALInputFormat$WALSplit.class
org/apache/hadoop/hbase/mapreduce/WALPlayer$WALKeyValueMapper.class
org/apache/hadoop/hbase/mapreduce/JobUtil.class
{code}

2. The reference guide chapter on mapreduce support wasn't updated and still tells folks to rely on the hbase-server jar.

3. The utility for downstream folks to get the classpath needed to use our mapreduce stuff, {{bin/hbase mapredcp}}, doesn't list the hbase-server jar:
{code}
$ ./bin/hbase mapredcp 2>/dev/null | grep -o -E "(hbase-mapreduce[^:]*.jar|hbase-server[^:]*.jar)" 
hbase-mapreduce-3.0.0-SNAPSHOT.jar
{code}

That should mean that the {{TableMapReduceUtil.initTableMapperJob}} utility also won't include the hbase-server jar. From scanning through the references in the hbase-mapreduce module, some of its classes reference classes in hbase-server, so they will fail at runtime.
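
For anyone re-checking problems 1 and 3 once a fix is up, here is a minimal sketch of two shell helpers. The function names and the listing file names are hypothetical and not part of HBase; only {{jar tf}} and {{bin/hbase mapredcp}} come from the commands above. The second helper assumes jar names follow the {{<module>-<version>.jar}} pattern with a version that starts with a digit.

```shell
#!/bin/sh
# Hypothetical helpers (not part of HBase) for re-checking problems 1 and 3.

# missing_entries BEFORE AFTER
# Print entries that appear in the BEFORE listing but not in the AFTER listing.
# Each argument is a text file of `jar tf ... | grep ...` output, one entry per line.
missing_entries() {
  before_sorted=$(mktemp) || return 1
  after_sorted=$(mktemp) || { rm -f "$before_sorted"; return 1; }
  sort -u "$1" > "$before_sorted"
  sort -u "$2" > "$after_sorted"
  # comm -23 keeps only the lines unique to the first sorted file.
  comm -23 "$before_sorted" "$after_sorted"
  rm -f "$before_sorted" "$after_sorted"
}

# classpath_has_module CLASSPATH MODULE
# Succeed if the colon-separated CLASSPATH has an entry named MODULE-<version>.jar.
# Assumes the version starts with a digit, as in hbase-mapreduce-3.0.0-SNAPSHOT.jar.
classpath_has_module() {
  printf '%s\n' "$1" | tr ':' '\n' | grep -q -E "(^|/)$2-[0-9][^/]*\.jar\$"
}
```

One possible use: save the two listings above as {{before.txt}} and {{after.txt}} and run {{missing_entries before.txt after.txt}} to see which classes the shaded jar lost. For problem 3, {{classpath_has_module "$(./bin/hbase mapredcp 2>/dev/null)" hbase-server || echo "hbase-server missing"}} reports whether the jar is on the generated classpath.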


Can we either revert until we fix these things, or link blocker JIRA(s) for alpha3 to fix them?

> Move mapreduce out of hbase-server into separate hbase-mapreduce module
> -----------------------------------------------------------------------
>
>                 Key: HBASE-18640
>                 URL: https://issues.apache.org/jira/browse/HBASE-18640
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Appy
>            Assignee: Appy
>             Fix For: 2.0.0-alpha-3
>
>         Attachments: HBASE-18640.branch-2.001.patch, HBASE-18640.master.001.patch, HBASE-18640.master.002.patch, HBASE-18640.master.003.patch, HBASE-18640.master.003.patch, HBASE-18640.master.004.patch, HBASE-18640.master.004.patch, HBASE-18640.master.005.patch, HBASE-18640.master.006.patch, HBASE-18640.master.007.patch, HBASE-18640.master.008.patch
>
>
> (Couldn't find another dedicated jira, so creating a new one.)
> Uploaded a patch that moves ~60 files to the new module. A few notes:
> - The classes remaining in hbase-server are the ones that are tightly coupled with visibility labels/wal/filesystem/hfile. These cannot be migrated to the new module until the corresponding subcomponents are untangled out of hbase-server into their own separate modules.
> - Almost all mapreduce tests use HBaseTestingUtil, so they can't be moved to the hbase-mapreduce module. Given these dependency constraints, one way would be to have a separate module for tests:
> hbase-mapreduce <---- hbase-server <------- hbase-mapreduce-tests
> Imo, this makes sense and looks fine.
> The only issue is yetus' pre-commit: it won't run tests in the hbase-mapreduce-tests module if something changed in just hbase-mapreduce. However, yetus' limitation shouldn't argue against the idea.
> So I'd say that we should go that way, unless there are better suggestions.


