You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2016/01/24 21:31:39 UTC

[jira] [Updated] (PHOENIX-2209) Building Local Index Asynchronously via IndexTool fails to populate index table

     [ https://issues.apache.org/jira/browse/PHOENIX-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor updated PHOENIX-2209:
----------------------------------
    Fix Version/s: 4.8.0

> Building Local Index Asynchronously via IndexTool fails to populate index table
> -------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2209
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2209
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.5.0
>         Environment: CDH: 5.4.4
> HBase: 1.0.0
> Phoenix: 4.5.0 (https://github.com/SiftScience/phoenix/tree/4.5-HBase-1.0) with hacks for CDH compatibility. 
>            Reporter: Keren Gu
>            Assignee: Rajeshbabu Chintaguntla
>              Labels: IndexTool, LocalIndex, index
>             Fix For: 4.8.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Using the Asynchronous Index population tool to create local index (of 1 column) on tables with 10 columns, and 65M, 250M, 340M, and 1.3B rows respectively. 
> Table Schema as follows (with generic column names): 
> {quote}
> CREATE TABLE PH_SOJU_SHORT (
> id INT PRIMARY KEY,
> c2 VARCHAR NULL,
> c3 VARCHAR NULL,
> c4 VARCHAR NULL,
> c5 VARCHAR NULL,
> c6 VARCHAR NULL,
> c7 DOUBLE NULL,
> c8 VARCHAR NULL,
> c9 VARCHAR NULL,
> c10 BIGINT NULL
> )
> {quote}
> Example command used (for 65M row table): 
> {quote}
> 0: jdbc:phoenix:localhost> create local index LC_INDEX_SOJU_EVAL_FN on PH_SOJU_SHORT(C4) async;
> {quote}
> And MR job started with command: 
> {quote}
> $ hbase org.apache.phoenix.mapreduce.index.IndexTool --data-table PH_SOJU_SHORT --index-table LC_INDEX_SOJU_EVAL_FN --output-path LC_INDEX_SOJU_EVAL_FN_HFILE
> {quote}
> The IndexTool MR jobs finished in 18min, 77min, 77min, and 2hr 34min respectively, but all index tables where empty. 
> For the table with 65M rows, IndexTool had 12 mappers and reducers. MR Counters show Map input and output records = 65M, Reduce Input and output records = 65M. PhoenixJobCounters input and output records are all 65M. 
> IndexTool Reducer Log tail: 
> {quote}
> ...
> 2015-08-25 00:26:44,687 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 32 segments left of total size: 22805636866 bytes
> 2015-08-25 00:26:44,693 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1
> 2015-08-25 00:26:44,765 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
> 2015-08-25 00:26:44,908 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
> 2015-08-25 00:26:45,060 INFO [main] org.apache.hadoop.hbase.io.hfile.CacheConfig: CacheConfig:disabled
> 2015-08-25 00:36:43,880 INFO [main] org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2: Writer=hdfs://nameservice/user/ubuntu/LC_INDEX_SOJU_EVAL_FN/_LOCAL_IDX_PH_SOJU_EVAL/_temporary/1/_temporary/attempt_1440094483400_5974_r_000000_0/0/496b926ad624438fa08626ac213d0f92, wrote=10737418236
> 2015-08-25 00:36:45,967 INFO [main] org.apache.hadoop.hbase.io.hfile.CacheConfig: CacheConfig:disabled
> 2015-08-25 00:38:43,095 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1440094483400_5974_r_000000_0 is done. And is in the process of committing
> 2015-08-25 00:38:43,123 INFO [main] org.apache.hadoop.mapred.Task: Task attempt_1440094483400_5974_r_000000_0 is allowed to commit now
> 2015-08-25 00:38:43,132 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_1440094483400_5974_r_000000_0' to hdfs://nameservice/user/ubuntu/LC_INDEX_SOJU_EVAL_FN/_LOCAL_IDX_PH_SOJU_EVAL/_temporary/1/task_1440094483400_5974_r_000000
> 2015-08-25 00:38:43,158 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1440094483400_5974_r_000000_0' done.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)