Posted to user@hadoop.apache.org by Vijaya Narayana Reddy Bhoomi Reddy <vi...@whishworks.com> on 2015/03/25 10:50:44 UTC
Identifying new files in HDFS
Hi,
We have a requirement to process only new files in HDFS on a daily basis. I
am sure this is a common requirement in many ETL processing scenarios. Is
there a way to identify new files that have been added to a path in HDFS?
For example, assume some files have already been present for some time, and
I have added new files today. I want to process only those new files. What
is the best way to achieve this?
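
To make the requirement concrete, here is a minimal sketch of one possible
approach: compare each file's modification time against a checkpoint saved
by the previous run. It is an illustration under assumptions, not a
recommended or confirmed solution. The function name select_new_files, the
paths, and the timestamps are all hypothetical, and the listing is hard-coded
where a real job would read it from HDFS (for example from the modification
times that a directory listing reports, in milliseconds since the epoch).

```python
from typing import Iterable, List, Tuple


def select_new_files(listing: Iterable[Tuple[str, int]], last_run_ms: int) -> List[str]:
    """Return the paths whose modification time is strictly after last_run_ms."""
    return sorted(path for path, mtime_ms in listing if mtime_ms > last_run_ms)


# Hypothetical listing of (path, modification time in ms since the epoch).
listing = [
    ("/data/in/a.csv", 1427190000000),
    ("/data/in/b.csv", 1427270000000),
    ("/data/in/c.csv", 1427280000000),
]

# Hypothetical checkpoint written at the end of the previous daily run.
last_run_ms = 1427240000000

new_files = select_new_files(listing, last_run_ms)
print(new_files)
```

A timestamp comparison like this can miss or double-count files that are
still being written, or that were copied with their original timestamps
preserved, so it would need to be checked against how files actually arrive.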
Thanks & Regards
Vijay