Posted to mapreduce-user@hadoop.apache.org by Manikanda Prabhu <gm...@gmail.com> on 2014/07/29 18:29:50 UTC

Is hdfs "Append to file" command ready for production?

Hi,

We are planning to use the HDFS "appendToFile" command in our file
processing. Could someone confirm whether it is production ready, or whether
there are any open issues still under discussion?

In my research, I found the following JIRAs that relate directly or indirectly
to this command, and all of them are closed (except HDFS-1060). Please let me
know if I missed anything.

JIRA Id                 Status  Description
HDFS-1060               Open    Append/flush should support concurrent "tailer" use case
HADOOP-6239, HDFS-4905  Fixed   Command-line for append
HDFS-744                Fixed   Support hsync in HDFS
HDFS-222                Fixed   Support for concatenating of files into a single file
HDFS-265                Fixed   Revisit append - This jira revisits append, aiming for a
                                design and implementation supporting a semantics that are
                                acceptable to its users.
HADOOP-5224             Fixed   Disable append
HADOOP-5332             Fixed   Make support for file append API configurable
HDFS-200                Fixed   In HDFS, sync() not yet guarantees data available to the new readers
HADOOP-1708             Fixed   Make files visible in the namespace as soon as they are created
HADOOP-1700             Fixed   Append to files in HDFS

Regards,
Mani
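
For readers following the thread, here is a minimal sketch of how the appendToFile command asked about above might be wrapped in a script. Only the `hdfs dfs -appendToFile <localsrc> ... <dst>` form comes from the Hadoop CLI. The function name append_to_hdfs, the HDFS_CMD and DRY_RUN variables, and the example paths are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: append one or more local files to a file in HDFS.
# HDFS_CMD and DRY_RUN are hypothetical knobs for this example, not Hadoop settings.

HDFS_CMD="${HDFS_CMD:-hdfs}"

append_to_hdfs() {
    # Usage: append_to_hdfs <hdfs-dst> <local-src>...
    dst="$1"
    shift
    if [ "$#" -eq 0 ]; then
        echo "usage: append_to_hdfs <hdfs-dst> <local-src>..." >&2
        return 2
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it, so it can be reviewed first.
        printf '%s\n' "$HDFS_CMD dfs -appendToFile $* $dst"
        return 0
    fi
    # The CLI takes the local sources first and the HDFS destination last.
    "$HDFS_CMD" dfs -appendToFile "$@" "$dst"
}
```

With DRY_RUN=1 set, `append_to_hdfs /data/events.log part1.log part2.log` only prints the command it would run; without it, the exit status of the hdfs command is passed back to the caller, which a production script would want to check.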

Re: Is hdfs "Append to file" command ready for production?

Posted by Arpit Agarwal <aa...@hortonworks.com>.
By "tested" I meant tested for performance. It is fine functionally.


On Tue, Jul 29, 2014 at 3:04 PM, Arpit Agarwal <aa...@hortonworks.com>
wrote:

> Most of those Jiras are for the append feature in general and not the
> appendToFile CLI.
>
> The append feature used via the FileSystem API is stable in Apache Hadoop
> 2.2 and later. I added the appendToFile CLI as a convenience and it has not
> been tested/tuned for performance so YMMV.

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: Is hdfs "Append to file" command ready for production?

Posted by Arpit Agarwal <aa...@hortonworks.com>.
Most of those Jiras are for the append feature in general and not the
appendToFile CLI.

The append feature used via the FileSystem API is stable in Apache Hadoop
2.2 and later. I added the appendToFile CLI as a convenience and it has not
been tested/tuned for performance so YMMV.


