Posted to user@hadoop.apache.org by Tao Xiao <xi...@gmail.com> on 2014/04/08 10:35:53 UTC

A non-empty file's size is reported as 0

I wrote some data into a file using MultipleOutputs in mappers. I can see
the contents of this file using "hadoop fs -cat <file>", but its size is
reported as zero by both "hadoop fs -du <file>" and "hadoop fs -ls
<file>", as follows:

-rw-r--r--   3 hadoop hadoop         0 2014-04-07 22:06
/test/xt/out/2014-01-20/-m-00000

BTW, when I download this file from HDFS to the local file system, it has
the correct size. Why is the size reported as zero by the Hadoop CLI?

Re: A non-empty file's size is reported as 0

Posted by Tao Xiao <xi...@gmail.com>.
My mapper code is as follows; I don't know whether any file is left
unclosed.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class TheMapper
        extends Mapper<LongWritable, Text, Text, NullWritable> {
    private MultipleOutputs<Text, NullWritable> outputs;

    @Override
    protected void setup(Context ctx) {
        outputs = new MultipleOutputs<Text, NullWritable>(ctx);
    }

    @Override
    protected void map(LongWritable o, Text t, Context ctx)
            throws IOException, InterruptedException {
        // The trailing slash makes the part files land in a subdirectory,
        // e.g. 2014-01-20/-m-00000.
        outputs.write(t, NullWritable.get(), "2014-01-20/");
    }

    @Override
    protected void cleanup(Context ctx)
            throws IOException, InterruptedException {
        // Without this close() the output streams are never flushed and the
        // NameNode is never told the final file length.
        outputs.close();
    }
}




2014-04-08 21:17 GMT+08:00 Peyman Mohajerian <mo...@gmail.com>:

> If you didn't close the file correctly, then the NameNode wouldn't be
> notified of the final size of the file. The file size is metadata coming
> from the NameNode.
>
>
> On Tue, Apr 8, 2014 at 4:35 AM, Tao Xiao <xi...@gmail.com> wrote:
>
>> I wrote some data into a file using MultipleOutputs in mappers. I can see
>> the contents of this file using "hadoop fs -cat <file>", but its size is
>> reported as zero by both "hadoop fs -du <file>" and "hadoop fs -ls
>> <file>", as follows:
>>
>> -rw-r--r--   3 hadoop hadoop         0 2014-04-07 22:06
>> /test/xt/out/2014-01-20/-m-00000
>>
>> BTW, when I download this file from HDFS to the local file system, it has
>> the correct size. Why is the size reported as zero by the Hadoop CLI?
>>
>>
>>
>

Re: A non-empty file's size is reported as 0

Posted by Peyman Mohajerian <mo...@gmail.com>.
If you didn't close the file correctly, then the NameNode wouldn't be
notified of the final size of the file. The file size is metadata coming
from the NameNode.


On Tue, Apr 8, 2014 at 4:35 AM, Tao Xiao <xi...@gmail.com> wrote:

> I wrote some data into a file using MultipleOutputs in mappers. I can see
> the contents of this file using "hadoop fs -cat <file>", but its size is
> reported as zero by both "hadoop fs -du <file>" and "hadoop fs -ls
> <file>", as follows:
>
> -rw-r--r--   3 hadoop hadoop         0 2014-04-07 22:06
> /test/xt/out/2014-01-20/-m-00000
>
> BTW, when I download this file from HDFS to the local file system, it has
> the correct size. Why is the size reported as zero by the Hadoop CLI?
>
>
>
