Posted to user@flume.apache.org by "Guyle M. Taber" <gu...@gmtech.net> on 2015/05/29 19:46:44 UTC

Re: HDFS sink: "clever" routing

OK, I figured this out by using the %{basename} placeholder.
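
For reference, a minimal sketch of the %{basename} approach. It assumes the events come from a spooling directory source; the source name srcSG and the spoolDir path are hypothetical, while the sink name and HDFS path are taken from the earlier message quoted below.

```
# Hypothetical spooling directory source; the name and spoolDir are assumptions.
dp1.sources.srcSG.type = spooldir
dp1.sources.srcSG.spoolDir = /var/log/fe_event
# Add a header that holds only the file name, without the directory part.
dp1.sources.srcSG.basenameHeader = true
dp1.sources.srcSG.basenameHeaderKey = basename

# Sink settings: same path as before, prefix switched from %{file} to %{basename}.
dp1.sinks.sinkSG.hdfs.path = hdfs://hadoopnn1.company.com/flume/events/fe_event/%{host}/%y-%m-%d
dp1.sinks.sinkSG.hdfs.filePrefix = %{basename}
```

With fileHeader the header carries the absolute path, which would explain the source directory structure being recreated under the HDFS path; basenameHeader carries only the file name.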

However, I’m trying to figure out how to prevent the epoch suffix from being applied to every file as it’s written to HDFS.

Example:
20150528133001.txt-.1432920411283

How do I prevent the epoch timestamp from being appended to every file name?

> On May 28, 2015, at 3:23 PM, <gu...@gmtech.net> wrote:
> 
> I’m using the %{file} var to hold and preserve the file/log name as it’s stored in HDFS, but it seems to be recreating the entire directory structure from the source side.
> How can I simply write the filename as-is into the HDFS path specified?
> 
> # Just want the file name and not the entire path+filename.
> dp1.sinks.sinkSG.hdfs.filePrefix = %{file}
> 
> dp1.sinks.sinkSG.hdfs.path = hdfs://hadoopnn1.company.com/flume/events/fe_event/%{host}/%y-%m-%d

Re: HDFS sink: "clever" routing

Posted by Johny Rufus <jr...@cloudera.com>.
The completed file name will always have the epoch timestamp/counter
appended to it; this is what uniquely distinguishes the rolled files.
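
If the goal is mainly to have the name end in the right extension, one option that may help is hdfs.fileSuffix, which the sink appends after the counter. A sketch, assuming the same sink as in the quoted config and that every file should end in .txt:

```
dp1.sinks.sinkSG.hdfs.filePrefix = %{basename}
# Appended after the timestamp-based counter, so the name ends in .txt.
dp1.sinks.sinkSG.hdfs.fileSuffix = .txt
```

The counter still stays in the name; this only changes where it sits. Note that %{basename} already includes the original extension, so the extension would appear twice unless it is stripped upstream.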

Thanks,
Rufus
