Posted to users@kafka.apache.org by Jan Filipiak <Ja...@trivago.com> on 2015/07/24 16:44:28 UTC
HDFS FsShell getmerge
Hello Hadoop users,
I have an idea for a small feature for the getmerge tool. I recently
needed the newline option -nl because the files I needed to merge
did not end with a newline.
I was merging all the files from one directory, and unfortunately this
directory also included empty files, which led to multiple consecutive
newlines being appended after some files.
I had to remove them manually afterwards.
In this situation it might be good to have another argument that allows
skipping empty files. I wrote down two changes one could try at the
end. Do you consider this a good improvement to the command line
tools?
Things one could try to implement this feature:
The call IOUtils.copyBytes(in, out, getConf(), false); doesn't
return the number of bytes copied. If it did, one could skip
appending the newline when 0 bytes were copied.
Or one could check the file size beforehand.
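To make the idea concrete, here is a rough sketch of the byte-count variant. It is not the Hadoop code: it uses plain java.nio.file on local files as a stand-in for the FileSystem API, and the class name MergeSkippingEmpty, the method merge, and the skipEmpty flag are made up for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical stand-in for getmerge on local files; not Hadoop's implementation.
class MergeSkippingEmpty {

    /**
     * Concatenates the sources into target and returns the total number of
     * content bytes copied (appended newlines are not counted).
     *
     * addNewline mimics -nl: write '\n' after each file.
     * skipEmpty is the proposed (hypothetical) option: when set, no newline
     * is written for a file that contributed 0 bytes.
     */
    static long merge(List<Path> sources, Path target,
                      boolean addNewline, boolean skipEmpty) throws IOException {
        long total = 0;
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path src : sources) {
                // Files.copy returns the number of bytes copied, which is the
                // information the proposal wishes copyBytes would return.
                long copied = Files.copy(src, out);
                total += copied;

                // Alternative: check Files.size(src) == 0 before copying.
                boolean emptyAndSkipped = skipEmpty && copied == 0;
                if (addNewline && !emptyAndSkipped) {
                    out.write('\n');
                }
            }
        }
        return total;
    }
}
```

With sources containing "foo", an empty file, and "bar", merging with addNewline and skipEmpty both set should give "foo\nbar\n", while leaving skipEmpty unset gives "foo\n\nbar\n", which is the behaviour I ran into.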
Please let me know if you would consider this useful and worth a
feature ticket in Jira.
Thank you
Jan
Re: HDFS FsShell getmerge
Posted by Jan Filipiak <Ja...@trivago.com>.
Sorry, wrong mailing list.