Posted to dev@hbase.apache.org by "Ruslan Salyakhov (JIRA)" <ji...@apache.org> on 2010/03/02 22:41:27 UTC

[jira] Updated: (HBASE-2063) For hfileoutputformat, on timeout/failure/kill clean up half-written hfile

     [ https://issues.apache.org/jira/browse/HBASE-2063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ruslan Salyakhov updated HBASE-2063:
------------------------------------

    Attachment: HBASE-2063-v1.patch

Attached is the patch (HBASE-2063-v1.patch) for HFileOutputFormat.
To avoid this kind of problem, HFileOutputFormat.RecordWriter needs to write to the task's working path instead of the output path.
That way, temporary data from failed attempts is removed, corrupted HFiles do not appear in the output folder, and the loadtable.rb script runs without exceptions.
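
For readers without the patch at hand, here is a minimal sketch of the idea, not the patch itself. It uses only java.nio so it runs standalone; the class AttemptScopedWriter, its methods, and the "_temporary/<attemptId>" layout are hypothetical names chosen for illustration. In Hadoop the equivalent role is presumably played by the output committer's work path (for example FileOutputCommitter.getWorkPath()), which the framework promotes on commit and discards on failure or kill.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Comparator;
import java.util.stream.Stream;

/** Stdlib-only illustration; all names are hypothetical, not HBase or Hadoop API. */
class AttemptScopedWriter {
    private final Path outputDir; // final job output, the directory the loader reads
    private final Path workDir;   // private to one task attempt
    private final String fileName;

    AttemptScopedWriter(Path outputDir, String attemptId, String fileName) {
        this.outputDir = outputDir;
        this.workDir = outputDir.resolve("_temporary").resolve(attemptId);
        this.fileName = fileName;
        try {
            Files.createDirectories(workDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Appends data to the file under the attempt's work directory only. */
    void write(String data) {
        try {
            Files.write(workDir.resolve(fileName),
                    data.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Successful attempt: move the finished file into the output directory. */
    void commit() {
        try {
            Files.move(workDir.resolve(fileName), outputDir.resolve(fileName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        deleteTree(workDir);
    }

    /** Failed or killed attempt: discard everything the attempt wrote. */
    void abort() {
        deleteTree(workDir);
    }

    /** Deletes a directory tree, children before parents. */
    private static void deleteTree(Path root) {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(root)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempDirectory("hfile-out");

        // A killed attempt leaves nothing visible in the output directory.
        AttemptScopedWriter killed = new AttemptScopedWriter(out, "attempt_0", "part-0");
        killed.write("half-written");
        killed.abort();
        System.out.println("visible after abort: " + Files.exists(out.resolve("part-0")));

        // A successful retry publishes its file only once it is complete.
        AttemptScopedWriter retry = new AttemptScopedWriter(out, "attempt_1", "part-0");
        retry.write("complete");
        retry.commit();
        System.out.println("visible after commit: " + Files.exists(out.resolve("part-0")));
    }
}
```

The point of the design is that a reader of the output folder (here, loadtable.rb) only ever sees files that a successful attempt finished and promoted; a half-written file can exist only under the attempt's own directory, which is thrown away.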

> For hfileoutputformat, on timeout/failure/kill clean up half-written hfile
> --------------------------------------------------------------------------
>
>                 Key: HBASE-2063
>                 URL: https://issues.apache.org/jira/browse/HBASE-2063
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>         Attachments: HBASE-2063-v1.patch
>
>
> Below is from mailing list.  Read from bottom to top:
> {code}
>  I was going to write that perhaps you needed to turn mapred.reduce.tasks.speculative.execution off, but if enabling it makes things work, that would seem to indicate, first, that our reducer takes longer than the task timeout maximum and, second, that on failure we should clean up the hfile.
> On the first issue, you are using KeyValueSortReducer?  Are your values large?  We set reducer status every 100 values.  Maybe this is not enough?  We should set status more frequently?  If you call context setstatus more frequently, do things work w/o speculative execution?
>  
> On the second, HFileOutputFormat close will set the metadata on the hfile and then close it.  On kill, this code is not being called.  Let me see if I can do something about that (e.g. register a shutdown hook to clean away incomplete files).
> Thanks,
> St.Ack
> On Sun, Dec 20, 2009 at 11:26 PM, ChingShen <ch...@gmail.com> wrote:
> I think I found a way.
> I set "mapred.reduce.tasks.speculative.execution" to true, generated the
> hfiles again, and then successfully loaded them into hbase.
> Is this the best solution, or is it an HFileOutputFormat bug?
> Shen
> On Mon, Dec 21, 2009 at 8:25 AM, ChingShen <ch...@gmail.com> wrote:
> > Thanks, stack.
> >
> > I checked this file, and it isn't empty. But I found that whenever
> > "Killed Task Attempts" > 0 in the reduce phase, running the loadtable.rb
> > script to load the hfiles fails.
> > How to avoid this problem?
> >
> > Thanks.
> >
> > Shen
> >
> >
> > On Sat, Dec 19, 2009 at 3:49 AM, stack <st...@duboce.net> wrote:
> >
> >> Check the
> >> file
> >> hdfs://domU-12-31-39-09-C5-54.compute-1.internal/osm2_hfile/Level4/197894389945760574.
> >>  Is it empty?  Was there an error during running of your MR job?  Perhaps
> >> a
> >> task failed?
> >>
> >> St.Ack
> >>
> >>
> >>
> >> On Thu, Dec 17, 2009 at 9:46 PM, ChingShen <ch...@gmail.com>
> >> wrote:
> >>
> >> > Hi,
> >> >  I use the script loadtable.rb to load my hfiles into hbase, but I got
> >> an
> >> > exception as below.
> >> >  Does anyone have any suggestions?
> >> >
> >> > 09/12/17 23:59:33 INFO loadtable: 18 read firstkey of -3.9290_52.5534
> >> from
> >> >
> >> >
> >> hdfs://domU-12-31-39-09-C5-54.compute-1.internal/osm2_hfile/Level4/1978943899457605747
> >> > org/apache/hadoop/hbase/io/hfile/HFile.java:1335:in `deserialize':
> >> > java.io.IOException: Trailer 'header' is wrong; does the trailer size
> >> match
> >> > content? (NativeException)
> >> >    from org/apache/hadoop/hbase/io/hfile/HFile.java:813:in `readTrailer'
> >> >    from org/apache/hadoop/hbase/io/hfile/HFile.java:758:in
> >> `loadFileInfo'
> >> >    from sun.reflect.GeneratedMethodAccessor7:-1:in `invoke'
> >> >    from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
> >> >    from java/lang/reflect/Method.java:597:in `invoke'
> >> >    from org/jruby/javasupport/JavaMethod.java:298:in
> >> > `invokeWithExceptionHandling'
> >> >    from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
> >> >    from org/jruby/java/invokers/InstanceMethodInvoker.java:36:in `call'
> >> >     ... 18 levels...
> >> >    from org/jruby/Main.java:94:in `main'
> >> >    from loadtable.rb:83:in `each'
> >> >    from loadtable.rb:83
> >> > Complete Java stackTrace
> >> > java.io.IOException: Trailer 'header' is wrong; does the trailer size
> >> match
> >> > content?
> >> >    at
> >> >
> >> >
> >> org.apache.hadoop.hbase.io.hfile.HFile$FixedFileTrailer.deserialize(HFile.java:1335)
> >> >    at
> >> >
> >> org.apache.hadoop.hbase.io.hfile.HFile$Reader.readTrailer(HFile.java:813)
> >> >    at
> >> >
> >> org.apache.hadoop.hbase.io.hfile.HFile$Reader.loadFileInfo(HFile.java:758)
> >> >    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
> >> >    at
> >> >
> >> >
> >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >> >    at java.lang.reflect.Method.invoke(Method.java:597)
> >> >    at
> >> >
> >> >
> >> org.jruby.javasupport.JavaMethod.invokeWithExceptionHandling(JavaMethod.java:298)
> >> >    at org.jruby.javasupport.JavaMethod.invoke(JavaMethod.java:259)
> >> >    at
> >> >
> >> >
> >> org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:36)
> >> >    at
> >> > org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:70)
> >> >    at loadtable.ensure_1$RUBY$__ensure___2(loadtable.rb:86)
> >> >    at loadtable.block_0$RUBY$__for__(loadtable.rb:85)
> >> >    at loadtableBlockCallback$block_0$RUBY$__for__xx1.call(Unknown
> >> Source)
> >> >    at org.jruby.runtime.CompiledBlock.yield(CompiledBlock.java:102)
> >> >    at org.jruby.runtime.Block.yield(Block.java:100)
> >> >    at
> >> org.jruby.java.proxies.ArrayJavaProxy.each(ArrayJavaProxy.java:112)
> >> >    at
> >> >
> >> >
> >> org.jruby.java.proxies.ArrayJavaProxy$i_method_0_0$RUBYINVOKER$each.call(org/jruby/java/proxies/ArrayJavaProxy$i_method_0_0$RUBYINVOKER$each.gen)
> >> >    at
> >> >
> >> >
> >> org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:263)
> >> >    at
> >> >
> >> >
> >> org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:81)
> >> >    at
> >> >
> >> >
> >> org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:96)
> >> >    at loadtable.__file__(loadtable.rb:83)
> >> >    at loadtable.load(loadtable.rb)
> >> >    at org.jruby.Ruby.runScript(Ruby.java:577)
> >> >    at org.jruby.Ruby.runNormally(Ruby.java:480)
> >> >    at org.jruby.Ruby.runFromMain(Ruby.java:354)
> >> >    at org.jruby.Main.run(Main.java:229)
> >> >    at org.jruby.Main.run(Main.java:110)
> >> >    at org.jruby.Main.main(Main.java:94)
> >> >
> >>
> >
> >
> >
> >
> --
> *****************************************************
> Ching-Shen Chen
> Advanced Technology Center,
> Information & Communications Research Lab.
> E-mail: chenchingshen@itri.org.tw
> Tel:+886-3-5915542
> *****************************************************
> {code}
