Posted to dev@hive.apache.org by Mafish Liu <ma...@gmail.com> on 2010/01/18 05:07:27 UTC

Unit test result depends on platform.

Hi, All:

I have been working on Hive this month to make it support multiple
DISTINCT clauses in one statement. I have almost achieved my goal,
except for some unit test failures, and I believe these failures are
caused by the system environment.

The Hive version is Hive release 0.4.0 rc2. My colleagues and I have
done some work on this version, such as adding an ACL layer to Hive.
The failures appear while running the unit test TestCliDriver, and the
failing .q files are input41.q and input_part9.q. The expected result
of query "" is 500,0, while the actual result is 0,500 on one node
called "hdp2". But on another machine (called "dw81"), the unit test
passes with the same code.

There are three stages for this query. The first two stages get the
results of "select count(1) as cnt from src" and "select count(1) as
cnt from srcpart where ds = '2009-08-09'", which are 500 and 0, and
store them as files. The third stage reads the two files and joins
them to get the final result "500 0". I compared the two nodes and
found that they return entries in a different order when listing a
directory. The listing programs, written in C and Java, are attached.
On "hdp2", the results are

$ ./a.out test
filename:..             d_info:13700925  d_reclen:24s
filename:.              d_info:13977094  d_reclen:24s
filename:file1          d_info:13979509  d_reclen:32s
filename:file0          d_info:13977965  d_reclen:32s

and

$ java -cp . ls
file1
file0

while on "dw81", the results are

$ ./a.out  test
filename:..             d_info:79729027  d_reclen:24s
filename:file0          d_info:79729540  d_reclen:32s
filename:.              d_info:79729531  d_reclen:24s
filename:file1          d_info:79729536  d_reclen:32s

and

$ java -cp . ls
file0
file1
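
The attached programs are not included in this text. As a rough illustration only, a Java listing program of this kind could look like the sketch below. The class name `ListDir` and the helper `rawListing` are hypothetical names chosen for this example. The relevant point is that `java.io.File.list()` is documented as giving no guarantee about the order of the returned names, so the order seen on "hdp2" and on "dw81" are both valid results.

```java
import java.io.File;

public class ListDir {
    // Returns the entry names exactly as the JVM reports them.
    // File.list() makes no promise about ordering; in particular the
    // names are not guaranteed to be alphabetical.
    static String[] rawListing(File dir) {
        String[] names = dir.list();
        // list() returns null if dir is not a directory or an I/O error occurs.
        return names == null ? new String[0] : names;
    }

    public static void main(String[] args) {
        // Hypothetical usage: java -cp . ListDir test
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (String name : rawListing(dir)) {
            System.out.println(name);
        }
    }
}
```

Unlike the C output above, `File.list()` does not return the "." and ".." entries.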

Stage three also performs a listing operation, and its result depends
on the order in which the files are listed. I traced the output files:
the output file of stage-1 is
"build/ql/tmp/485971631/10001/attempt_local_0001_m_000000_0" with a
value of 500, and the corresponding name and value for stage-2 are
"build/ql/tmp/485971631/10001/attempt_local_0001_m_000001_0" and 0. On
"hdp2", when I list the directory "build/ql/tmp/485971631/10001/", I
get

build/ql/tmp/485971631/10001/attempt_local_0001_m_000001_0
build/ql/tmp/485971631/10001/attempt_local_0001_m_000000_0

and values are

0
500

This is the wrong output.

On "dw81", I get

build/ql/tmp/485971631/10001/attempt_local_0001_m_000000_0
build/ql/tmp/485971631/10001/attempt_local_0001_m_000001_0

and values are

500
0

which is the expected result.

So I believe that the output of this operation depends on the order in
which the files of a given directory are listed, and that order may
not be controllable across platforms.
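
One way to remove this dependence, shown here only as a minimal sketch and not as Hive's actual code, is to sort the listing by file name before reading the files. The class `SortedStageReader` and the method `readInNameOrder` are hypothetical names for this example, and it assumes the `java.nio.file` API (Java 8 or later for the stream calls).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SortedStageReader {
    // Reads every regular file in dir and returns all lines, visiting the
    // files in sorted path order instead of the order the OS reports them.
    static List<String> readInNameOrder(Path dir) throws IOException {
        List<Path> files;
        try (Stream<Path> entries = Files.list(dir)) {
            files = entries.filter(Files::isRegularFile)
                           .sorted()  // Path is Comparable; gives a stable order
                           .collect(Collectors.toList());
        }
        List<String> lines = new ArrayList<>();
        for (Path file : files) {
            lines.addAll(Files.readAllLines(file, StandardCharsets.UTF_8));
        }
        return lines;
    }
}
```

With the two files named above, the file ending in "m_000000_0" sorts before the one ending in "m_000001_0", so the values would come back as 500 and then 0 on both machines.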

Regards.

Mafish

RE: Unit test result depends on platform.

Posted by Ashish Thusoo <at...@facebook.com>.
Can you file a JIRA and give us the unit tests that fail? That would be very helpful. I suspect some of the test queries may be missing a sort by clause, so their output order could differ from the expected output.

Ashish 

-----Original Message-----
From: Mafish Liu [mailto:mafish@gmail.com] 
Sent: Monday, January 18, 2010 5:30 PM
To: hive-dev@hadoop.apache.org
Subject: Re: Unit test result depends on platform.

Attachments are listing programs.
--
Mafish@gmail.com

Re: Unit test result depends on platform.

Posted by Mafish Liu <ma...@gmail.com>.
Attachments are listing programs.
-- 
Mafish@gmail.com