Posted to dev@nifi.apache.org by Sudheer Nulu <su...@saksoft.co.in> on 2018/04/19 05:55:55 UTC

Apache NiFi issues while ingesting data to HDFS on an AWS instance

Hi Team,

 

I have recently started exploring Apache NiFi for use in our upcoming
projects.

The issue is:

I have NiFi installed on my Windows machine to transfer data from an Oracle
database to the HDFS layer of an AWS instance using PutHDFS. Files are
transferred to HDFS, but they end up 0 bytes in size. (Snapshot attached for
reference.)

I was able to place files on the AWS instance using PutSFTP without any
issue, but the same fails for the HDFS layer.

 

 

In the logs I see the following:

 

<logs>
WARN [Thread-5441] org.apache.hadoop.hdfs.DFSClient DataStreamer Exception
java.nio.channels.UnresolvedAddressException: null
    at sun.nio.ch.Net.checkAddress(Unknown Source)
    at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
</logs>

 

Procedure followed:

I copied hdfs-site.xml and core-site.xml from the AWS instance and replaced
every private hostname with the corresponding public hostname, since the
private DNS names are not resolvable from my machine. I then provided those
two files to the Apache NiFi configuration.
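For illustration, the kind of edit described above looks like this in core-site.xml (both hostnames below are hypothetical placeholders; the real file carries the cluster's own values):

```xml
<property>
  <name>fs.defaultFS</name>
  <!-- was: hdfs://ip-10-0-0-1.ec2.internal:8020 (private name, unresolvable off-VPC) -->
  <value>hdfs://ec2-203-0-113-10.compute-1.amazonaws.com:8020</value>
</property>
```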

 

The HDFS layer has full write permissions.

 

Could someone help with this? I have been stuck at this point for a long
time and have made many changes to my hosts file trying to resolve the
hostname issue.

 

Regards,

Sudheer Nulu

 

 

 


RE: Apache NiFi issues while ingesting data to HDFS on an AWS instance

Posted by Sudheer Nulu <su...@saksoft.co.in>.
Hi Matt,

Thanks for looking into this issue.

I had previously made this entry in the hdfs-site.xml file, but there was no luck.

Regards,
Sudheer Nulu

-----Original Message-----
From: Matt Burgess [mailto:mattyb149@apache.org] 
Sent: Thursday, April 19, 2018 6:00 PM
To: dev@nifi.apache.org
Subject: Re: Apache NiFi issues while ingesting data to HDFS on an AWS instance

If that's the case, you may be able to fix that by adding the following property to your hdfs-site.xml:

dfs.client.use.datanode.hostname = true

It should cause the hostnames of the data nodes to be returned to the client, rather than private/inaccessible IPs.

Regards,
Matt


On Thu, Apr 19, 2018 at 8:22 AM, Bryan Bende <bb...@gmail.com> wrote:
> Hello,
>
> I have no idea how AWS works, but most likely what is happening is the 
> Hadoop client in NiFi asks the name node to write a file, and the name 
> node then responds with the data nodes to write to, but it is 
> responding with the private IPs/hostnames of the data nodes which you 
> can't reach from your windows machines.
>
> You would probably have the same problem if you tried installing the 
> hadoop client on your windows machine and issue commands like "hadoop 
> fs -ls /".
>
> -Bryan
>
>
> On Thu, Apr 19, 2018 at 1:55 AM, Sudheer Nulu <su...@saksoft.co.in> wrote:
>> [quoted text of the original message snipped; see the top of the thread]


Re: Apache NiFi issues while ingesting data to HDFS on an AWS instance

Posted by Matt Burgess <ma...@apache.org>.
If that's the case, you may be able to fix that by adding the
following property to your hdfs-site.xml:

dfs.client.use.datanode.hostname = true

It should cause the hostnames of the data nodes to be returned to the
client, rather than private/inaccessible IPs.
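In hdfs-site.xml property form, that setting looks like the sketch below (only the property name and value come from this thread; the description text is added):

```xml
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
  <description>Have the HDFS client connect to datanodes via their
  hostnames (which the client can resolve to public addresses) instead
  of the private IPs reported by the namenode.</description>
</property>
```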

Regards,
Matt


On Thu, Apr 19, 2018 at 8:22 AM, Bryan Bende <bb...@gmail.com> wrote:
> Hello,
>
> I have no idea how AWS works, but most likely what is happening is the
> Hadoop client in NiFi asks the name node to write a file, and the name
> node then responds with the data nodes to write to, but it is
> responding with the private IPs/hostnames of the data nodes which you
> can't reach from your windows machines.
>
> You would probably have the same problem if you tried installing the
> hadoop client on your windows machine and issue commands like "hadoop
> fs -ls /".
>
> -Bryan
>
>
> On Thu, Apr 19, 2018 at 1:55 AM, Sudheer Nulu <su...@saksoft.co.in> wrote:
>> [quoted text of the original message snipped; see the top of the thread]

Re: Apache NiFi issues while ingesting data to HDFS on an AWS instance

Posted by Bryan Bende <bb...@gmail.com>.
Hello,

I have no idea how AWS works, but most likely what is happening is that the
Hadoop client in NiFi asks the name node to write a file, and the name node
responds with the data nodes to write to, but it is responding with the
private IPs/hostnames of the data nodes, which you can't reach from your
Windows machine.

You would probably have the same problem if you installed the Hadoop client
on your Windows machine and issued commands like "hadoop fs -ls /".

-Bryan


On Thu, Apr 19, 2018 at 1:55 AM, Sudheer Nulu <su...@saksoft.co.in> wrote:
> [quoted text of the original message snipped; see the top of the thread]