Posted to user@hbase.apache.org by "Murali Krishna. P" <mu...@yahoo.com> on 2011/01/13 00:48:17 UTC

BuildTableIndex creates no segments_2 file

Hi,
    I am using BuildTableIndex to create indices and then push them to Katta for 
index serving. In some cases, for a few reduce partitions, the .cfs and 
segments_2 files are not generated, and hence the index deployment fails. Is this 
because the partition doesn't get any keys? How do we fix the partition, or at 
least generate an empty valid shard in that case?


Thanks,
Murali Krishna

Re: BuildTableIndex creates no segments_2 file

Posted by Stack <sa...@gmail.com>.
Interesting that it happens when there is a companion speculative reduce.  Though the probability of a clash would seem low, maybe it's not, as you are finding.  What you suggest for a fix sounds reasonable.

Stack



On Jan 14, 2011, at 2:54, "Murali Krishna. P" <mu...@yahoo.com> wrote:

> [...]

Re: BuildTableIndex creates no segments_2 file

Posted by "Murali Krishna. P" <mu...@yahoo.com>.
Hi Stack,

The reducer partition is getting some data. Whenever this happens, the 
corresponding task had a speculative execution attempt which was killed. But it 
doesn't happen in every partition which had speculative execution. 


The 'temp' directory on the local filesystem in IndexOutputFormat.java just uses a 
random number, which could be a problem. When we are running multiple index build 
jobs in parallel, it is possible that they conflict here. We should probably make 
this robust by using 'task_attempt_id' in the local path as well?
 
 
  

  
    final Path temp = context.getConfiguration().getLocalPath(
        "mapred.local.dir", "index/_" + Integer.toString(random.nextInt()));
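
A minimal sketch of the idea proposed above, written without any Hadoop dependency so it can be run on its own. The class and method names (IndexTempDir, tempDirFor) and the sample attempt IDs are hypothetical. In a real job the ID would presumably come from context.getTaskAttemptID().toString(), and the resulting "index/_<attempt id>" string would be passed to getLocalPath in place of the random number. This is not the actual IndexOutputFormat code.

```java
import java.io.File;

final class IndexTempDir {
    /**
     * Builds the local scratch directory for one task attempt.
     * Using the attempt ID instead of random.nextInt() means two attempts
     * (for example a task and its speculative twin) can never share a path.
     */
    static File tempDirFor(String localDir, String taskAttemptId) {
        return new File(new File(localDir, "index"), "_" + taskAttemptId);
    }

    public static void main(String[] args) {
        // Hypothetical attempt IDs: same reduce task, attempts 0 and 1.
        String first = "attempt_201101130048_0001_r_000003_0";
        String speculative = "attempt_201101130048_0001_r_000003_1";
        System.out.println(tempDirFor("/tmp/mapred/local", first));
        System.out.println(tempDirFor("/tmp/mapred/local", speculative));
    }
}
```

Since attempt IDs also embed the job ID, the same naming should keep parallel index build jobs on one node apart.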

 
 
 
 
 
 Thanks,
Murali Krishna




________________________________
From: Stack <st...@duboce.net>
To: user@hbase.apache.org
Sent: Thu, 13 January, 2011 11:39:21 AM
Subject: Re: BuildTableIndex creates no segments_2 file

[...]

Re: BuildTableIndex creates no segments_2 file

Posted by Stack <st...@duboce.net>.
On Wed, Jan 12, 2011 at 3:48 PM, Murali Krishna. P
<mu...@yahoo.com> wrote:
> Hi,
>    I am using BuildTableIndex to create indices and then push them to Katta for
> index serving. In some cases, for a few reduce partitions, the .cfs and
> segments_2 files are not generated, and hence the index deployment fails.

Can you add logging or counts to see if the reducer partition is
getting data?  Or perhaps the reducer is crashing in a way that does
not look like a crash to MapReduce.  Is this possible?  So the close
is not doing a proper index close?  Add some try/catches?


> Is this because
> the partition doesn't get any keys? How do we fix the partition, or at least
> generate an empty valid shard in that case?
>

You might want to run a checkup script that checks partitions before
they are passed to Katta?

St.Ack
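
The checkup suggested above could look roughly like the following sketch. It assumes the shards have been copied to a local directory with one subdirectory per reduce partition, and it only tests for the two files named in this thread (a segments_N file and a .cfs file). The class and method names are hypothetical. An empty but valid index may have no .cfs file, so the rule might need relaxing for that case.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

final class ShardCheck {
    /** True if the names include a segments_N file and at least one .cfs file. */
    static boolean looksComplete(Collection<String> fileNames) {
        boolean hasSegments = false;
        boolean hasCompound = false;
        for (String name : fileNames) {
            if (name.startsWith("segments_")) hasSegments = true;
            if (name.endsWith(".cfs")) hasCompound = true;
        }
        return hasSegments && hasCompound;
    }

    /** Returns the names of shard directories under root that fail the check. */
    static List<String> incompleteShards(File root) {
        List<String> bad = new ArrayList<>();
        File[] entries = root.listFiles();
        if (entries == null) return bad; // root is missing or not a directory
        for (File entry : entries) {
            if (!entry.isDirectory()) continue;
            String[] names = entry.list();
            if (names == null || !looksComplete(Arrays.asList(names))) {
                bad.add(entry.getName());
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java ShardCheck <shard-root-dir>");
            return;
        }
        for (String shard : incompleteShards(new File(args[0]))) {
            System.out.println("incomplete shard: " + shard);
        }
    }
}
```

Running it against the job output before deployment would list the partitions to rebuild or skip, rather than letting the Katta deployment fail.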