Posted to dev@hive.apache.org by Zheng Shao <zs...@gmail.com> on 2009/11/01 23:34:08 UTC

[VOTE] hive release candidate 0.4.1-rc0

I have made a release candidate 0.4.1-rc0.

We've fixed several critical bugs in Hive release 0.4.0. We need Hive
release 0.4.1 out asap.

Here is the list of changes:

    HIVE-884. Metastore Server should call System.exit() on error.
    (Zheng Shao via pchakka)

    HIVE-864. Fix map-join memory-leak.
    (Namit Jain via zshao)

    HIVE-878. Update the hash table entry before flushing in Group By
    hash aggregation (Zheng Shao via namit)

    HIVE-882. Create a new directory every time for scratch.
    (Namit Jain via zshao)

    HIVE-890. Fix cli.sh for detecting Hadoop versions. (Paul Huff via zshao)

    HIVE-892. Hive to kill hadoop jobs using POST. (Dhruba Borthakur via zshao)

    HIVE-883. URISyntaxException when partition value contains special chars.
    (Zheng Shao via namit)


Please vote.

--
Yours,
Zheng

Re: [VOTE] hive release candidate 0.4.1-rc0

Posted by Zheng Shao <zs...@gmail.com>.
Hi Min,

What is "zip"? Which codec does it use?
I think the problem is most likely with the codec.

Can you try GzipCodec? That will most probably work fine.
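
For anyone following along, a minimal sketch of how the output codec could be
switched to GzipCodec. The property names assume the Hadoop 0.19/0.20-era
configuration this Hive release runs on, and the class name ForceGzipOutput is
only illustrative:

    // Sketch: force GzipCodec for compressed query output.
    // From the Hive CLI the equivalent would be:
    //   set hive.exec.compress.output=true;
    //   set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.GzipCodec;

    public class ForceGzipOutput {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.setBoolean("hive.exec.compress.output", true);
        conf.setClass("mapred.output.compression.codec",
            GzipCodec.class, CompressionCodec.class);
        System.out.println(conf.get("mapred.output.compression.codec"));
      }
    }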

Zheng

On Mon, Nov 2, 2009 at 1:08 AM, Min Zhou <co...@gmail.com> wrote:

> If  it returns more than 0 rows, that error will never happen.
>
> Thanks,
> Min
>
> On Mon, Nov 2, 2009 at 5:06 PM, Min Zhou <co...@gmail.com> wrote:
> > No, it's zip.
> >
> > On Mon, Nov 2, 2009 at 4:03 PM, Zheng Shao <zs...@gmail.com> wrote:
> >> Do you mean gzip codec?
> >> I think an empty gzip file should be 20 bytes. There might be some
> >> problem with the gzip codec (or native gzip codec) on your cluster.
> >> Can you check the log message of the map tasks whether it has a line
> >> called "Successfully loaded native gzip lib"?
> >>
> >> You can try any query that produces empty results - it should go
> >> through the same code path.
> >>
> >> Zheng
> >>
> >> On Sun, Nov 1, 2009 at 11:07 PM, Min Zhou <co...@gmail.com> wrote:
> >>> we use zip codec in default.
> >>> Some of the same lines were omitted from the error stack:
> >>> at
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)
> >>>
> >>>
> >>> Thanks,
> >>> Min
> >>>
> >>> On Mon, Nov 2, 2009 at 2:57 PM, Zheng Shao <zs...@gmail.com> wrote:
> >>>> Min, can you check the default compression codec in your hadoop conf?
> >>>> The 8-byte file must be a compressed file using the codec which
> >>>> represents 0-length file.
> >>>>
> >>>> It seems that codec was not able to decompress the stream.
> >>>>
> >>>> Zheng
> >>>>
> >>>> On Sun, Nov 1, 2009 at 10:49 PM, Min Zhou <co...@gmail.com>
> wrote:
> >>>>> I think there may be a bug still in this release.
> >>>>>
> >>>>> hive>select stuff_status from auctions where auction_id='2591238417'
> >>>>> and pt='20091027';
> >>>>>
> >>>>> auctions is a table partitioned by date, it stored as a textfile w/o
> >>>>> compression. The query above should return 0 rows.
> >>>>> but when hive.exec.compress.output=true,  hive will crash with a
> >>>>> StackOverflowError
> >>>>>
> >>>>> java.lang.StackOverflowError
> >>>>>        at java.lang.ref.FinalReference.<init>(FinalReference.java:16)
> >>>>>        at java.lang.ref.Finalizer.<init>(Finalizer.java:66)
> >>>>>        at java.lang.ref.Finalizer.register(Finalizer.java:72)
> >>>>>        at java.lang.Object.<init>(Object.java:20)
> >>>>>        at java.net.SocketImpl.<init>(SocketImpl.java:27)
> >>>>>        at java.net.PlainSocketImpl.<init>(PlainSocketImpl.java:90)
> >>>>>        at java.net.SocksSocketImpl.<init>(SocksSocketImpl.java:33)
> >>>>>        at java.net.Socket.setImpl(Socket.java:434)
> >>>>>        at java.net.Socket.<init>(Socket.java:68)
> >>>>>        at sun.nio.ch.SocketAdaptor.<init>(SocketAdaptor.java:50)
> >>>>>        at sun.nio.ch.SocketAdaptor.create(SocketAdaptor.java:55)
> >>>>>        at
> sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:105)
> >>>>>        at
> org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58)
> >>>>>        at
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1540)
> >>>>>        at
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1662)
> >>>>>        at java.io.DataInputStream.read(DataInputStream.java:132)
> >>>>>        at
> org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:96)
> >>>>>        at
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:86)
> >>>>>        at
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
> >>>>>        at java.io.InputStream.read(InputStream.java:85)
> >>>>>        at
> org.apache.hadoop.util.LineReader.backfill(LineReader.java:82)
> >>>>>        at
> org.apache.hadoop.util.LineReader.readLine(LineReader.java:112)
> >>>>>        at
> org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
> >>>>>        at
> org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
> >>>>>        at
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:256)
> >>>>>        at
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)
> >>>>>
> >>>>> Each mapper will produce a 8 bytes deflate file on hdfs(we set
> >>>>> hive.merge.mapfiles=false), their hex representation  is like below:
> >>>>>
> >>>>> 78 9C 03 00 00 00 00 01
> >>>>>
> >>>>> This is the reason why FetchOperator:272 is called recursively, and
> >>>>> caused a stack overflow error.
> >>>>>
> >>>>> Regards,
> >>>>> Min
> >>>>>
> >>>>>
> >>>>> On Mon, Nov 2, 2009 at 6:34 AM, Zheng Shao <zs...@gmail.com> wrote:
> >>>>>> I have made a release candidate 0.4.1-rc0.
> >>>>>>
> >>>>>> We've fixed several critical bugs to hive release 0.4.0. We need
> hive
> >>>>>> release 0.4.1 out asap.
> >>>>>>
> >>>>>> Here are the list of changes:
> >>>>>>
> >>>>>>    HIVE-884. Metastore Server should call System.exit() on error.
> >>>>>>    (Zheng Shao via pchakka)
> >>>>>>
> >>>>>>    HIVE-864. Fix map-join memory-leak.
> >>>>>>    (Namit Jain via zshao)
> >>>>>>
> >>>>>>    HIVE-878. Update the hash table entry before flushing in Group By
> >>>>>>    hash aggregation (Zheng Shao via namit)
> >>>>>>
> >>>>>>    HIVE-882. Create a new directory every time for scratch.
> >>>>>>    (Namit Jain via zshao)
> >>>>>>
> >>>>>>    HIVE-890. Fix cli.sh for detecting Hadoop versions. (Paul Huff
> via zshao)
> >>>>>>
> >>>>>>    HIVE-892. Hive to kill hadoop jobs using POST. (Dhruba Borthakur
> via zshao)
> >>>>>>
> >>>>>>    HIVE-883. URISyntaxException when partition value contains
> special chars.
> >>>>>>    (Zheng Shao via namit)
> >>>>>>
> >>>>>>
> >>>>>> Please vote.
> >>>>>>
> >>>>>> --
> >>>>>> Yours,
> >>>>>> Zheng
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> My research interests are distributed systems, parallel computing and
> >>>>> bytecode based virtual machine.
> >>>>>
> >>>>> My profile:
> >>>>> http://www.linkedin.com/in/coderplay
> >>>>> My blog:
> >>>>> http://coderplay.javaeye.com
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Yours,
> >>>> Zheng
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> My research interests are distributed systems, parallel computing and
> >>> bytecode based virtual machine.
> >>>
> >>> My profile:
> >>> http://www.linkedin.com/in/coderplay
> >>> My blog:
> >>> http://coderplay.javaeye.com
> >>>
> >>
> >>
> >>
> >> --
> >> Yours,
> >> Zheng
> >>
> >
> >
> >
> > --
> > My research interests are distributed systems, parallel computing and
> > bytecode based virtual machine.
> >
> > My profile:
> > http://www.linkedin.com/in/coderplay
> > My blog:
> > http://coderplay.javaeye.com
> >
>
>
>
> --
> My research interests are distributed systems, parallel computing and
> bytecode based virtual machine.
>
> My profile:
> http://www.linkedin.com/in/coderplay
> My blog:
> http://coderplay.javaeye.com
>



-- 
Yours,
Zheng

Re: [VOTE] hive release candidate 0.4.1-rc0

Posted by Min Zhou <co...@gmail.com>.
If it returns more than 0 rows, that error never happens.

Thanks,
Min

On Mon, Nov 2, 2009 at 5:06 PM, Min Zhou <co...@gmail.com> wrote:
> No, it's zip.
>
> On Mon, Nov 2, 2009 at 4:03 PM, Zheng Shao <zs...@gmail.com> wrote:
>> Do you mean gzip codec?
>> I think an empty gzip file should be 20 bytes. There might be some
>> problem with the gzip codec (or native gzip codec) on your cluster.
>> Can you check the log message of the map tasks whether it has a line
>> called "Successfully loaded native gzip lib"?
>>
>> You can try any query that produces empty results - it should go
>> through the same code path.
>>
>> Zheng
>>
>> On Sun, Nov 1, 2009 at 11:07 PM, Min Zhou <co...@gmail.com> wrote:
>>> we use zip codec in default.
>>> Some of the same lines were omitted from the error stack:
>>> at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)
>>>
>>>
>>> Thanks,
>>> Min
>>>
>>> On Mon, Nov 2, 2009 at 2:57 PM, Zheng Shao <zs...@gmail.com> wrote:
>>>> Min, can you check the default compression codec in your hadoop conf?
>>>> The 8-byte file must be a compressed file using the codec which
>>>> represents 0-length file.
>>>>
>>>> It seems that codec was not able to decompress the stream.
>>>>
>>>> Zheng
>>>>
>>>> On Sun, Nov 1, 2009 at 10:49 PM, Min Zhou <co...@gmail.com> wrote:
>>>>> I think there may be a bug still in this release.
>>>>>
>>>>> hive>select stuff_status from auctions where auction_id='2591238417'
>>>>> and pt='20091027';
>>>>>
>>>>> auctions is a table partitioned by date, it stored as a textfile w/o
>>>>> compression. The query above should return 0 rows.
>>>>> but when hive.exec.compress.output=true,  hive will crash with a
>>>>> StackOverflowError
>>>>>
>>>>> java.lang.StackOverflowError
>>>>>        at java.lang.ref.FinalReference.<init>(FinalReference.java:16)
>>>>>        at java.lang.ref.Finalizer.<init>(Finalizer.java:66)
>>>>>        at java.lang.ref.Finalizer.register(Finalizer.java:72)
>>>>>        at java.lang.Object.<init>(Object.java:20)
>>>>>        at java.net.SocketImpl.<init>(SocketImpl.java:27)
>>>>>        at java.net.PlainSocketImpl.<init>(PlainSocketImpl.java:90)
>>>>>        at java.net.SocksSocketImpl.<init>(SocksSocketImpl.java:33)
>>>>>        at java.net.Socket.setImpl(Socket.java:434)
>>>>>        at java.net.Socket.<init>(Socket.java:68)
>>>>>        at sun.nio.ch.SocketAdaptor.<init>(SocketAdaptor.java:50)
>>>>>        at sun.nio.ch.SocketAdaptor.create(SocketAdaptor.java:55)
>>>>>        at sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:105)
>>>>>        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58)
>>>>>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1540)
>>>>>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1662)
>>>>>        at java.io.DataInputStream.read(DataInputStream.java:132)
>>>>>        at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:96)
>>>>>        at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:86)
>>>>>        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
>>>>>        at java.io.InputStream.read(InputStream.java:85)
>>>>>        at org.apache.hadoop.util.LineReader.backfill(LineReader.java:82)
>>>>>        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:112)
>>>>>        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
>>>>>        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
>>>>>        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:256)
>>>>>        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)
>>>>>
>>>>> Each mapper will produce a 8 bytes deflate file on hdfs(we set
>>>>> hive.merge.mapfiles=false), their hex representation  is like below:
>>>>>
>>>>> 78 9C 03 00 00 00 00 01
>>>>>
>>>>> This is the reason why FetchOperator:272 is called recursively, and
>>>>> caused a stack overflow error.
>>>>>
>>>>> Regards,
>>>>> Min
>>>>>
>>>>>
>>>>> On Mon, Nov 2, 2009 at 6:34 AM, Zheng Shao <zs...@gmail.com> wrote:
>>>>>> I have made a release candidate 0.4.1-rc0.
>>>>>>
>>>>>> We've fixed several critical bugs to hive release 0.4.0. We need hive
>>>>>> release 0.4.1 out asap.
>>>>>>
>>>>>> Here are the list of changes:
>>>>>>
>>>>>>    HIVE-884. Metastore Server should call System.exit() on error.
>>>>>>    (Zheng Shao via pchakka)
>>>>>>
>>>>>>    HIVE-864. Fix map-join memory-leak.
>>>>>>    (Namit Jain via zshao)
>>>>>>
>>>>>>    HIVE-878. Update the hash table entry before flushing in Group By
>>>>>>    hash aggregation (Zheng Shao via namit)
>>>>>>
>>>>>>    HIVE-882. Create a new directory every time for scratch.
>>>>>>    (Namit Jain via zshao)
>>>>>>
>>>>>>    HIVE-890. Fix cli.sh for detecting Hadoop versions. (Paul Huff via zshao)
>>>>>>
>>>>>>    HIVE-892. Hive to kill hadoop jobs using POST. (Dhruba Borthakur via zshao)
>>>>>>
>>>>>>    HIVE-883. URISyntaxException when partition value contains special chars.
>>>>>>    (Zheng Shao via namit)
>>>>>>
>>>>>>
>>>>>> Please vote.
>>>>>>
>>>>>> --
>>>>>> Yours,
>>>>>> Zheng
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> My research interests are distributed systems, parallel computing and
>>>>> bytecode based virtual machine.
>>>>>
>>>>> My profile:
>>>>> http://www.linkedin.com/in/coderplay
>>>>> My blog:
>>>>> http://coderplay.javaeye.com
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Yours,
>>>> Zheng
>>>>
>>>
>>>
>>>
>>> --
>>> My research interests are distributed systems, parallel computing and
>>> bytecode based virtual machine.
>>>
>>> My profile:
>>> http://www.linkedin.com/in/coderplay
>>> My blog:
>>> http://coderplay.javaeye.com
>>>
>>
>>
>>
>> --
>> Yours,
>> Zheng
>>
>
>
>
> --
> My research interests are distributed systems, parallel computing and
> bytecode based virtual machine.
>
> My profile:
> http://www.linkedin.com/in/coderplay
> My blog:
> http://coderplay.javaeye.com
>



-- 
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com

Re: [VOTE] hive release candidate 0.4.1-rc0

Posted by Min Zhou <co...@gmail.com>.
No, it's zip.

On Mon, Nov 2, 2009 at 4:03 PM, Zheng Shao <zs...@gmail.com> wrote:
> Do you mean gzip codec?
> I think an empty gzip file should be 20 bytes. There might be some
> problem with the gzip codec (or native gzip codec) on your cluster.
> Can you check the log message of the map tasks whether it has a line
> called "Successfully loaded native gzip lib"?
>
> You can try any query that produces empty results - it should go
> through the same code path.
>
> Zheng
>
> On Sun, Nov 1, 2009 at 11:07 PM, Min Zhou <co...@gmail.com> wrote:
>> we use zip codec in default.
>> Some of the same lines were omitted from the error stack:
>> at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)
>>
>>
>> Thanks,
>> Min
>>
>> On Mon, Nov 2, 2009 at 2:57 PM, Zheng Shao <zs...@gmail.com> wrote:
>>> Min, can you check the default compression codec in your hadoop conf?
>>> The 8-byte file must be a compressed file using the codec which
>>> represents 0-length file.
>>>
>>> It seems that codec was not able to decompress the stream.
>>>
>>> Zheng
>>>
>>> On Sun, Nov 1, 2009 at 10:49 PM, Min Zhou <co...@gmail.com> wrote:
>>>> I think there may be a bug still in this release.
>>>>
>>>> hive>select stuff_status from auctions where auction_id='2591238417'
>>>> and pt='20091027';
>>>>
>>>> auctions is a table partitioned by date, it stored as a textfile w/o
>>>> compression. The query above should return 0 rows.
>>>> but when hive.exec.compress.output=true,  hive will crash with a
>>>> StackOverflowError
>>>>
>>>> java.lang.StackOverflowError
>>>>        at java.lang.ref.FinalReference.<init>(FinalReference.java:16)
>>>>        at java.lang.ref.Finalizer.<init>(Finalizer.java:66)
>>>>        at java.lang.ref.Finalizer.register(Finalizer.java:72)
>>>>        at java.lang.Object.<init>(Object.java:20)
>>>>        at java.net.SocketImpl.<init>(SocketImpl.java:27)
>>>>        at java.net.PlainSocketImpl.<init>(PlainSocketImpl.java:90)
>>>>        at java.net.SocksSocketImpl.<init>(SocksSocketImpl.java:33)
>>>>        at java.net.Socket.setImpl(Socket.java:434)
>>>>        at java.net.Socket.<init>(Socket.java:68)
>>>>        at sun.nio.ch.SocketAdaptor.<init>(SocketAdaptor.java:50)
>>>>        at sun.nio.ch.SocketAdaptor.create(SocketAdaptor.java:55)
>>>>        at sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:105)
>>>>        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58)
>>>>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1540)
>>>>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1662)
>>>>        at java.io.DataInputStream.read(DataInputStream.java:132)
>>>>        at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:96)
>>>>        at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:86)
>>>>        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
>>>>        at java.io.InputStream.read(InputStream.java:85)
>>>>        at org.apache.hadoop.util.LineReader.backfill(LineReader.java:82)
>>>>        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:112)
>>>>        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
>>>>        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
>>>>        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:256)
>>>>        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)
>>>>
>>>> Each mapper will produce a 8 bytes deflate file on hdfs(we set
>>>> hive.merge.mapfiles=false), their hex representation  is like below:
>>>>
>>>> 78 9C 03 00 00 00 00 01
>>>>
>>>> This is the reason why FetchOperator:272 is called recursively, and
>>>> caused a stack overflow error.
>>>>
>>>> Regards,
>>>> Min
>>>>
>>>>
>>>> On Mon, Nov 2, 2009 at 6:34 AM, Zheng Shao <zs...@gmail.com> wrote:
>>>>> I have made a release candidate 0.4.1-rc0.
>>>>>
>>>>> We've fixed several critical bugs to hive release 0.4.0. We need hive
>>>>> release 0.4.1 out asap.
>>>>>
>>>>> Here are the list of changes:
>>>>>
>>>>>    HIVE-884. Metastore Server should call System.exit() on error.
>>>>>    (Zheng Shao via pchakka)
>>>>>
>>>>>    HIVE-864. Fix map-join memory-leak.
>>>>>    (Namit Jain via zshao)
>>>>>
>>>>>    HIVE-878. Update the hash table entry before flushing in Group By
>>>>>    hash aggregation (Zheng Shao via namit)
>>>>>
>>>>>    HIVE-882. Create a new directory every time for scratch.
>>>>>    (Namit Jain via zshao)
>>>>>
>>>>>    HIVE-890. Fix cli.sh for detecting Hadoop versions. (Paul Huff via zshao)
>>>>>
>>>>>    HIVE-892. Hive to kill hadoop jobs using POST. (Dhruba Borthakur via zshao)
>>>>>
>>>>>    HIVE-883. URISyntaxException when partition value contains special chars.
>>>>>    (Zheng Shao via namit)
>>>>>
>>>>>
>>>>> Please vote.
>>>>>
>>>>> --
>>>>> Yours,
>>>>> Zheng
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> My research interests are distributed systems, parallel computing and
>>>> bytecode based virtual machine.
>>>>
>>>> My profile:
>>>> http://www.linkedin.com/in/coderplay
>>>> My blog:
>>>> http://coderplay.javaeye.com
>>>>
>>>
>>>
>>>
>>> --
>>> Yours,
>>> Zheng
>>>
>>
>>
>>
>> --
>> My research interests are distributed systems, parallel computing and
>> bytecode based virtual machine.
>>
>> My profile:
>> http://www.linkedin.com/in/coderplay
>> My blog:
>> http://coderplay.javaeye.com
>>
>
>
>
> --
> Yours,
> Zheng
>



-- 
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com

Re: [VOTE] hive release candidate 0.4.1-rc0

Posted by Zheng Shao <zs...@gmail.com>.
Do you mean the gzip codec?
I think an empty gzip file should be 20 bytes. There might be some
problem with the gzip codec (or the native gzip codec) on your cluster.
Can you check whether the map task logs contain a line that says
"Successfully loaded native gzip lib"?

You can try any query that produces empty results - it should go
through the same code path.
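
As a sanity check of the 20-byte figure, here is a small JDK-only sketch (no
Hadoop required) that writes an empty gzip stream and prints its size; the
exact byte count can vary slightly between zlib implementations:

    import java.io.ByteArrayOutputStream;
    import java.util.zip.GZIPOutputStream;

    public class EmptyGzipSize {
      public static void main(String[] args) throws Exception {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        // Close without writing any data: header + empty deflate block + trailer.
        new GZIPOutputStream(buf).close();
        byte[] bytes = buf.toByteArray();
        System.out.println("empty gzip stream: " + bytes.length + " bytes");
        for (byte b : bytes) {
          System.out.printf("%02X ", b);
        }
        System.out.println();
      }
    }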

Zheng

On Sun, Nov 1, 2009 at 11:07 PM, Min Zhou <co...@gmail.com> wrote:
> we use zip codec in default.
> Some of the same lines were omitted from the error stack:
> at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)
>
>
> Thanks,
> Min
>
> On Mon, Nov 2, 2009 at 2:57 PM, Zheng Shao <zs...@gmail.com> wrote:
>> Min, can you check the default compression codec in your hadoop conf?
>> The 8-byte file must be a compressed file using the codec which
>> represents 0-length file.
>>
>> It seems that codec was not able to decompress the stream.
>>
>> Zheng
>>
>> On Sun, Nov 1, 2009 at 10:49 PM, Min Zhou <co...@gmail.com> wrote:
>>> I think there may be a bug still in this release.
>>>
>>> hive>select stuff_status from auctions where auction_id='2591238417'
>>> and pt='20091027';
>>>
>>> auctions is a table partitioned by date, it stored as a textfile w/o
>>> compression. The query above should return 0 rows.
>>> but when hive.exec.compress.output=true,  hive will crash with a
>>> StackOverflowError
>>>
>>> java.lang.StackOverflowError
>>>        at java.lang.ref.FinalReference.<init>(FinalReference.java:16)
>>>        at java.lang.ref.Finalizer.<init>(Finalizer.java:66)
>>>        at java.lang.ref.Finalizer.register(Finalizer.java:72)
>>>        at java.lang.Object.<init>(Object.java:20)
>>>        at java.net.SocketImpl.<init>(SocketImpl.java:27)
>>>        at java.net.PlainSocketImpl.<init>(PlainSocketImpl.java:90)
>>>        at java.net.SocksSocketImpl.<init>(SocksSocketImpl.java:33)
>>>        at java.net.Socket.setImpl(Socket.java:434)
>>>        at java.net.Socket.<init>(Socket.java:68)
>>>        at sun.nio.ch.SocketAdaptor.<init>(SocketAdaptor.java:50)
>>>        at sun.nio.ch.SocketAdaptor.create(SocketAdaptor.java:55)
>>>        at sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:105)
>>>        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58)
>>>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1540)
>>>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1662)
>>>        at java.io.DataInputStream.read(DataInputStream.java:132)
>>>        at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:96)
>>>        at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:86)
>>>        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
>>>        at java.io.InputStream.read(InputStream.java:85)
>>>        at org.apache.hadoop.util.LineReader.backfill(LineReader.java:82)
>>>        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:112)
>>>        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
>>>        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
>>>        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:256)
>>>        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)
>>>
>>> Each mapper will produce a 8 bytes deflate file on hdfs(we set
>>> hive.merge.mapfiles=false), their hex representation  is like below:
>>>
>>> 78 9C 03 00 00 00 00 01
>>>
>>> This is the reason why FetchOperator:272 is called recursively, and
>>> caused a stack overflow error.
>>>
>>> Regards,
>>> Min
>>>
>>>
>>> On Mon, Nov 2, 2009 at 6:34 AM, Zheng Shao <zs...@gmail.com> wrote:
>>>> I have made a release candidate 0.4.1-rc0.
>>>>
>>>> We've fixed several critical bugs to hive release 0.4.0. We need hive
>>>> release 0.4.1 out asap.
>>>>
>>>> Here are the list of changes:
>>>>
>>>>    HIVE-884. Metastore Server should call System.exit() on error.
>>>>    (Zheng Shao via pchakka)
>>>>
>>>>    HIVE-864. Fix map-join memory-leak.
>>>>    (Namit Jain via zshao)
>>>>
>>>>    HIVE-878. Update the hash table entry before flushing in Group By
>>>>    hash aggregation (Zheng Shao via namit)
>>>>
>>>>    HIVE-882. Create a new directory every time for scratch.
>>>>    (Namit Jain via zshao)
>>>>
>>>>    HIVE-890. Fix cli.sh for detecting Hadoop versions. (Paul Huff via zshao)
>>>>
>>>>    HIVE-892. Hive to kill hadoop jobs using POST. (Dhruba Borthakur via zshao)
>>>>
>>>>    HIVE-883. URISyntaxException when partition value contains special chars.
>>>>    (Zheng Shao via namit)
>>>>
>>>>
>>>> Please vote.
>>>>
>>>> --
>>>> Yours,
>>>> Zheng
>>>>
>>>
>>>
>>>
>>> --
>>> My research interests are distributed systems, parallel computing and
>>> bytecode based virtual machine.
>>>
>>> My profile:
>>> http://www.linkedin.com/in/coderplay
>>> My blog:
>>> http://coderplay.javaeye.com
>>>
>>
>>
>>
>> --
>> Yours,
>> Zheng
>>
>
>
>
> --
> My research interests are distributed systems, parallel computing and
> bytecode based virtual machine.
>
> My profile:
> http://www.linkedin.com/in/coderplay
> My blog:
> http://coderplay.javaeye.com
>



-- 
Yours,
Zheng

Re: [VOTE] hive release candidate 0.4.1-rc0

Posted by Min Zhou <co...@gmail.com>.
We use the zip codec by default.
Some repeated lines were omitted from the error stack trace:
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)


Thanks,
Min

On Mon, Nov 2, 2009 at 2:57 PM, Zheng Shao <zs...@gmail.com> wrote:
> Min, can you check the default compression codec in your hadoop conf?
> The 8-byte file must be a compressed file using the codec which
> represents 0-length file.
>
> It seems that codec was not able to decompress the stream.
>
> Zheng
>
> On Sun, Nov 1, 2009 at 10:49 PM, Min Zhou <co...@gmail.com> wrote:
>> I think there may be a bug still in this release.
>>
>> hive>select stuff_status from auctions where auction_id='2591238417'
>> and pt='20091027';
>>
>> auctions is a table partitioned by date, it stored as a textfile w/o
>> compression. The query above should return 0 rows.
>> but when hive.exec.compress.output=true,  hive will crash with a
>> StackOverflowError
>>
>> java.lang.StackOverflowError
>>        at java.lang.ref.FinalReference.<init>(FinalReference.java:16)
>>        at java.lang.ref.Finalizer.<init>(Finalizer.java:66)
>>        at java.lang.ref.Finalizer.register(Finalizer.java:72)
>>        at java.lang.Object.<init>(Object.java:20)
>>        at java.net.SocketImpl.<init>(SocketImpl.java:27)
>>        at java.net.PlainSocketImpl.<init>(PlainSocketImpl.java:90)
>>        at java.net.SocksSocketImpl.<init>(SocksSocketImpl.java:33)
>>        at java.net.Socket.setImpl(Socket.java:434)
>>        at java.net.Socket.<init>(Socket.java:68)
>>        at sun.nio.ch.SocketAdaptor.<init>(SocketAdaptor.java:50)
>>        at sun.nio.ch.SocketAdaptor.create(SocketAdaptor.java:55)
>>        at sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:105)
>>        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58)
>>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1540)
>>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1662)
>>        at java.io.DataInputStream.read(DataInputStream.java:132)
>>        at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:96)
>>        at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:86)
>>        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
>>        at java.io.InputStream.read(InputStream.java:85)
>>        at org.apache.hadoop.util.LineReader.backfill(LineReader.java:82)
>>        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:112)
>>        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
>>        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
>>        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:256)
>>        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)
>>
>> Each mapper will produce a 8 bytes deflate file on hdfs(we set
>> hive.merge.mapfiles=false), their hex representation  is like below:
>>
>> 78 9C 03 00 00 00 00 01
>>
>> This is the reason why FetchOperator:272 is called recursively, and
>> caused a stack overflow error.
>>
>> Regards,
>> Min
>>
>>
>> On Mon, Nov 2, 2009 at 6:34 AM, Zheng Shao <zs...@gmail.com> wrote:
>>> I have made a release candidate 0.4.1-rc0.
>>>
>>> We've fixed several critical bugs to hive release 0.4.0. We need hive
>>> release 0.4.1 out asap.
>>>
>>> Here are the list of changes:
>>>
>>>    HIVE-884. Metastore Server should call System.exit() on error.
>>>    (Zheng Shao via pchakka)
>>>
>>>    HIVE-864. Fix map-join memory-leak.
>>>    (Namit Jain via zshao)
>>>
>>>    HIVE-878. Update the hash table entry before flushing in Group By
>>>    hash aggregation (Zheng Shao via namit)
>>>
>>>    HIVE-882. Create a new directory every time for scratch.
>>>    (Namit Jain via zshao)
>>>
>>>    HIVE-890. Fix cli.sh for detecting Hadoop versions. (Paul Huff via zshao)
>>>
>>>    HIVE-892. Hive to kill hadoop jobs using POST. (Dhruba Borthakur via zshao)
>>>
>>>    HIVE-883. URISyntaxException when partition value contains special chars.
>>>    (Zheng Shao via namit)
>>>
>>>
>>> Please vote.
>>>
>>> --
>>> Yours,
>>> Zheng
>>>
>>
>>
>>
>> --
>> My research interests are distributed systems, parallel computing and
>> bytecode based virtual machine.
>>
>> My profile:
>> http://www.linkedin.com/in/coderplay
>> My blog:
>> http://coderplay.javaeye.com
>>
>
>
>
> --
> Yours,
> Zheng
>



-- 
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com

Re: [VOTE] hive release candidate 0.4.1-rc0

Posted by Zheng Shao <zs...@gmail.com>.
Min, can you check the default compression codec in your Hadoop conf?
The 8-byte file must be what that codec produces when it compresses a
0-length (empty) output.

It seems that the codec was not able to decompress the stream.
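
To make that comparison concrete, here is a sketch (assuming the Hadoop
0.19/0.20 API) that instantiates whichever codec the conf names and prints
what it emits for a zero-length output; the bytes can then be compared with
the 8-byte files on HDFS:

    import java.io.ByteArrayOutputStream;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.DefaultCodec;
    import org.apache.hadoop.util.ReflectionUtils;

    public class EmptyCodecOutput {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Codec class configured for job output; DefaultCodec (zlib) if unset.
        Class<?> codecClass = conf.getClass(
            "mapred.output.compression.codec", DefaultCodec.class);
        CompressionCodec codec =
            (CompressionCodec) ReflectionUtils.newInstance(codecClass, conf);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        codec.createOutputStream(buf).close();  // compress zero bytes
        System.out.println(codecClass.getName() + " empty output:");
        for (byte b : buf.toByteArray()) {
          System.out.printf("%02X ", b);
        }
        System.out.println();
      }
    }

For DefaultCodec this would most likely print the same 8-byte sequence Min
reported, while GzipCodec should produce a noticeably longer stream.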

Zheng

On Sun, Nov 1, 2009 at 10:49 PM, Min Zhou <co...@gmail.com> wrote:
> I think there may be a bug still in this release.
>
> hive>select stuff_status from auctions where auction_id='2591238417'
> and pt='20091027';
>
> auctions is a table partitioned by date, it stored as a textfile w/o
> compression. The query above should return 0 rows.
> but when hive.exec.compress.output=true,  hive will crash with a
> StackOverflowError
>
> java.lang.StackOverflowError
>        at java.lang.ref.FinalReference.<init>(FinalReference.java:16)
>        at java.lang.ref.Finalizer.<init>(Finalizer.java:66)
>        at java.lang.ref.Finalizer.register(Finalizer.java:72)
>        at java.lang.Object.<init>(Object.java:20)
>        at java.net.SocketImpl.<init>(SocketImpl.java:27)
>        at java.net.PlainSocketImpl.<init>(PlainSocketImpl.java:90)
>        at java.net.SocksSocketImpl.<init>(SocksSocketImpl.java:33)
>        at java.net.Socket.setImpl(Socket.java:434)
>        at java.net.Socket.<init>(Socket.java:68)
>        at sun.nio.ch.SocketAdaptor.<init>(SocketAdaptor.java:50)
>        at sun.nio.ch.SocketAdaptor.create(SocketAdaptor.java:55)
>        at sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:105)
>        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58)
>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1540)
>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1662)
>        at java.io.DataInputStream.read(DataInputStream.java:132)
>        at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:96)
>        at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:86)
>        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
>        at java.io.InputStream.read(InputStream.java:85)
>        at org.apache.hadoop.util.LineReader.backfill(LineReader.java:82)
>        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:112)
>        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
>        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
>        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:256)
>        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)
>
> Each mapper will produce a 8 bytes deflate file on hdfs(we set
> hive.merge.mapfiles=false), their hex representation  is like below:
>
> 78 9C 03 00 00 00 00 01
>
> This is the reason why FetchOperator:272 is called recursively, and
> caused a stack overflow error.
>
> Regards,
> Min
>
>
> On Mon, Nov 2, 2009 at 6:34 AM, Zheng Shao <zs...@gmail.com> wrote:
>> I have made a release candidate 0.4.1-rc0.
>>
>> We've fixed several critical bugs to hive release 0.4.0. We need hive
>> release 0.4.1 out asap.
>>
>> Here are the list of changes:
>>
>>    HIVE-884. Metastore Server should call System.exit() on error.
>>    (Zheng Shao via pchakka)
>>
>>    HIVE-864. Fix map-join memory-leak.
>>    (Namit Jain via zshao)
>>
>>    HIVE-878. Update the hash table entry before flushing in Group By
>>    hash aggregation (Zheng Shao via namit)
>>
>>    HIVE-882. Create a new directory every time for scratch.
>>    (Namit Jain via zshao)
>>
>>    HIVE-890. Fix cli.sh for detecting Hadoop versions. (Paul Huff via zshao)
>>
>>    HIVE-892. Hive to kill hadoop jobs using POST. (Dhruba Borthakur via zshao)
>>
>>    HIVE-883. URISyntaxException when partition value contains special chars.
>>    (Zheng Shao via namit)
>>
>>
>> Please vote.
>>
>> --
>> Yours,
>> Zheng
>>
>
>
>
> --
> My research interests are distributed systems, parallel computing and
> bytecode based virtual machine.
>
> My profile:
> http://www.linkedin.com/in/coderplay
> My blog:
> http://coderplay.javaeye.com
>



-- 
Yours,
Zheng

Re: [VOTE] hive release candidate 0.4.1-rc0

Posted by Min Zhou <co...@gmail.com>.
I think there may still be a bug in this release.

hive>select stuff_status from auctions where auction_id='2591238417'
and pt='20091027';

auctions is a table partitioned by date, stored as a textfile without
compression. The query above should return 0 rows,
but when hive.exec.compress.output=true, Hive crashes with a
StackOverflowError:

java.lang.StackOverflowError
        at java.lang.ref.FinalReference.<init>(FinalReference.java:16)
        at java.lang.ref.Finalizer.<init>(Finalizer.java:66)
        at java.lang.ref.Finalizer.register(Finalizer.java:72)
        at java.lang.Object.<init>(Object.java:20)
        at java.net.SocketImpl.<init>(SocketImpl.java:27)
        at java.net.PlainSocketImpl.<init>(PlainSocketImpl.java:90)
        at java.net.SocksSocketImpl.<init>(SocksSocketImpl.java:33)
        at java.net.Socket.setImpl(Socket.java:434)
        at java.net.Socket.<init>(Socket.java:68)
        at sun.nio.ch.SocketAdaptor.<init>(SocketAdaptor.java:50)
        at sun.nio.ch.SocketAdaptor.create(SocketAdaptor.java:55)
        at sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:105)
        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1540)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1662)
        at java.io.DataInputStream.read(DataInputStream.java:132)
        at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:96)
        at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:86)
        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
        at java.io.InputStream.read(InputStream.java:85)
        at org.apache.hadoop.util.LineReader.backfill(LineReader.java:82)
        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:112)
        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:256)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)

Each mapper produces an 8-byte deflate file on HDFS (we set
hive.merge.mapfiles=false); the hex representation of each file is shown below:

78 9C 03 00 00 00 00 01

This is the reason why FetchOperator.java:272 is called recursively,
which causes the stack overflow error.
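
A quick, hedged way to confirm which codec wrote these files is to look at
their leading bytes: 1F 8B is the gzip magic, while a first byte of 78 (as in
the dump above) is the standard zlib header that Hadoop's DefaultCodec writes.
A small sketch, with the file path taken from the command line:

    import java.io.FileInputStream;

    public class SniffCodec {
      public static void main(String[] args) throws Exception {
        FileInputStream in = new FileInputStream(args[0]);
        byte[] head = new byte[2];
        in.read(head);  // good enough for a 2-byte sniff
        in.close();
        int b0 = head[0] & 0xFF, b1 = head[1] & 0xFF;
        if (b0 == 0x1F && b1 == 0x8B) {
          System.out.println("gzip stream (GzipCodec)");
        } else if (b0 == 0x78) {
          System.out.println("zlib stream (likely DefaultCodec)");
        } else {
          System.out.printf("unknown header: %02X %02X%n", b0, b1);
        }
      }
    }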

Regards,
Min


On Mon, Nov 2, 2009 at 6:34 AM, Zheng Shao <zs...@gmail.com> wrote:
> I have made a release candidate 0.4.1-rc0.
>
> We've fixed several critical bugs to hive release 0.4.0. We need hive
> release 0.4.1 out asap.
>
> Here are the list of changes:
>
>    HIVE-884. Metastore Server should call System.exit() on error.
>    (Zheng Shao via pchakka)
>
>    HIVE-864. Fix map-join memory-leak.
>    (Namit Jain via zshao)
>
>    HIVE-878. Update the hash table entry before flushing in Group By
>    hash aggregation (Zheng Shao via namit)
>
>    HIVE-882. Create a new directory every time for scratch.
>    (Namit Jain via zshao)
>
>    HIVE-890. Fix cli.sh for detecting Hadoop versions. (Paul Huff via zshao)
>
>    HIVE-892. Hive to kill hadoop jobs using POST. (Dhruba Borthakur via zshao)
>
>    HIVE-883. URISyntaxException when partition value contains special chars.
>    (Zheng Shao via namit)
>
>
> Please vote.
>
> --
> Yours,
> Zheng
>



-- 
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com