Posted to dev@nifi.apache.org by ra...@bt.com on 2015/10/22 08:35:30 UTC

NiFi 0.3 : Query regarding HDFS processor

Hi Team,

I have been exploring NiFi for a couple of days now.

NiFi is running on a machine which is not a part of the Hadoop cluster. I want to put files into HDFS from an external source. How do I configure the Hadoop cluster host details in NiFi?

I can get the file from the remote source using the GetSFTP processor and write it to the Hadoop edge node using PutSFTP, followed by PutHDFS.
But I would like to know if there is any way to write the FlowFile directly to HDFS using PutHDFS without writing to the Hadoop name node.

Could you please help me with this?

Regards,
Ramkishan Betta,
Consultant, BT e-serv India,
Bengaluru - INDIA


Re: NiFi 0.3 : Query regarding HDFS processor

Posted by Ricky Saltzer <ri...@cloudera.com>.
Bryan is correct in that you must be able to connect to the NameNode in
order to perform an HDFS write request. Although the write does not go
*through* the NameNode, you must first consult with it in order to obtain a
DataNode write path. You will need to obtain the client configuration files
(core-site.xml, hdfs-site.xml) for your cluster; how you get them depends on
which distribution of Hadoop you are using. Place them in a readable
directory on your NiFi server, then modify your PutHDFS processor and supply
them in its Hadoop Configuration Resources property (e.g.
/path/to/core-site.xml,/path/to/hdfs-site.xml).
That should be all you need to do.
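
As a quick sanity check before pasting the paths into PutHDFS, something
like the following could confirm that every file in the comma-separated
value exists and is readable. This is only a sketch in Python and is not
part of NiFi; the function name unreadable_resources and the example paths
are made up for illustration.

```python
import os


def unreadable_resources(value):
    """Return the paths in a comma-separated list that are missing or unreadable."""
    # Split the value the way it is typed into the processor property:
    # one path per comma, ignoring surrounding whitespace and empty entries.
    paths = [p.strip() for p in value.split(",") if p.strip()]
    # Keep only the paths that are not regular files readable by this user.
    return [p for p in paths if not (os.path.isfile(p) and os.access(p, os.R_OK))]


if __name__ == "__main__":
    # Placeholder paths; substitute wherever the client configs were copied.
    value = "/path/to/core-site.xml,/path/to/hdfs-site.xml"
    problems = unreadable_resources(value)
    print("all readable" if not problems else "check these: " + ", ".join(problems))
```

Run it as the same OS user that runs NiFi, since readability depends on
that user's permissions.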


ricky

On Thu, Oct 22, 2015 at 10:30 AM, Bryan Bende <bb...@gmail.com> wrote:

> Hello,
>
> There is a property on PutHDFS where you can specify the Hadoop
> configuration files which tell the processor about your HDFS installation:
>
> Hadoop Configuration Resources - A file or comma separated list of files
> which contains the Hadoop file system configuration. Without this, Hadoop
> will search the classpath for a 'core-site.xml' and 'hdfs-site.xml' file or
> will revert to a default configuration.
>
> I don't think it is possible to bypass the name node since the name node
> tracks where all the files are in HDFS.
>
> -Bryan
>
>



-- 
Ricky Saltzer
http://www.cloudera.com

Re: NiFi 0.3 : Query regarding HDFS processor

Posted by Bryan Bende <bb...@gmail.com>.
Hello,

There is a property on PutHDFS where you can specify the Hadoop
configuration files which tell the processor about your HDFS installation:

Hadoop Configuration Resources - A file or comma separated list of files
which contains the Hadoop file system configuration. Without this, Hadoop
will search the classpath for a 'core-site.xml' and 'hdfs-site.xml' file or
will revert to a default configuration.

I don't think it is possible to bypass the name node since the name node
tracks where all the files are in HDFS.
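
For context, the name node address is one of the things the client reads
from core-site.xml, normally under the fs.defaultFS property. The sketch
below (plain Python, not NiFi or Hadoop code) shows how such a file is laid
out and how the value could be pulled out to check which name node a client
would contact. The sample host and port and the helper name
hadoop_properties are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Made-up example of a core-site.xml; the host and port are placeholders.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
"""


def hadoop_properties(xml_text):
    """Return the name/value pairs of a Hadoop *-site.xml document as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    # Each setting is a <property> element holding a <name> and a <value>.
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


if __name__ == "__main__":
    props = hadoop_properties(SAMPLE_CORE_SITE)
    # fs.defaultFS is the URI an HDFS client uses to locate the name node.
    print(props.get("fs.defaultFS"))
```

To inspect a real file, read its contents and pass the text to
hadoop_properties in place of the sample string.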

-Bryan

