Posted to dev@nifi.apache.org by Sudheer Nulu <su...@saksoft.co.in> on 2018/04/19 05:55:55 UTC
Apache NiFi issues while ingesting data to HDFS on an AWS instance
Hi Team,
I have recently started exploring Apache NiFi to implement it in our
upcoming projects.
The issue is:
I have NiFi installed on my Windows machine to transfer data from an Oracle
database to the HDFS layer of an AWS instance using PutHDFS. Files are
transferred to HDFS, but they arrive with a size of 0 bytes. (Attached a
snapshot for reference.)
I was able to place files on the AWS instance using PutSFTP without any
issue, but failed to do the same in the HDFS layer.
In logs I could see this:
<logs>
WARN [Thread-5441] org.apache.hadoop.hdfs.DFSClient DataStreamer Exception
java.nio.channels.UnresolvedAddressException: null
        at sun.nio.ch.Net.checkAddress(Unknown Source)
        at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
</logs>
Procedure followed:
I copied hdfs-site.xml and core-site.xml from the AWS instance and replaced
every private hostname with the public hostname, since I cannot resolve the
private DNS names; I then provided those two files as inputs to the Apache
NiFi configuration.
The HDFS layer has full permissions for writing to it.
Could someone help with this? I have been stuck at this point for a long
time and have made many changes to the hosts file to resolve the hostname
issue.
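For reference, the hosts-file workaround described above would look something like this on Windows (the hostname and IP below are made up, standing in for the cluster's actual private DNS name and public address):

```
# C:\Windows\System32\drivers\etc\hosts
# Map the cluster's private EC2 hostname to its public IP so the
# Windows client can resolve the names the namenode hands back.
54.12.34.56   ip-10-0-0-12.ec2.internal
```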
Regards,
Sudheer Nulu
RE: Apache NiFi issues while ingesting data to HDFS on an AWS instance
Posted by Sudheer Nulu <su...@saksoft.co.in>.
Hi Matt,
Thanks for looking into this issue.
Previously, I had already made this entry in the hdfs-site.xml file, but there was no luck.
Regards,
Sudheer Nulu
-----Original Message-----
From: Matt Burgess [mailto:mattyb149@apache.org]
Sent: Thursday, April 19, 2018 6:00 PM
To: dev@nifi.apache.org
Subject: Re: Apache NiFi issues while ingesting data to HDFS on an AWS instance
If that's the case, you may be able to fix that by adding the following property to your hdfs-site.xml:
dfs.client.use.datanode.hostname = true
It should cause the hostnames of the data nodes to be returned to the client, rather than private/inaccessible IPs.
Regards,
Matt
On Thu, Apr 19, 2018 at 8:22 AM, Bryan Bende <bb...@gmail.com> wrote:
> Hello,
>
> I have no idea how AWS works, but most likely what is happening is that
> the Hadoop client in NiFi asks the name node to write a file, and the
> name node responds with the data nodes to write to, but it is responding
> with the private IPs/hostnames of the data nodes, which you can't reach
> from your Windows machine.
>
> You would probably have the same problem if you installed the hadoop
> client on your Windows machine and issued commands like "hadoop fs -ls /".
>
> -Bryan
>
>
> On Thu, Apr 19, 2018 at 1:55 AM, Sudheer Nulu <su...@saksoft.co.in> wrote:
>> [...]
Re: Apache NiFi issues while ingesting data to HDFS on an AWS instance
Posted by Matt Burgess <ma...@apache.org>.
If that's the case, you may be able to fix that by adding the
following property to your hdfs-site.xml:
dfs.client.use.datanode.hostname = true
It should cause the hostnames of the data nodes to be returned to the
client, rather than private/inaccessible IPs.
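A minimal sketch of that entry as it might appear in the hdfs-site.xml that NiFi reads (the property name is the one documented in Hadoop's hdfs-default.xml; any other properties your cluster copy already contains stay as they are):

```xml
<!-- hdfs-site.xml: tell the DFSClient to connect to datanodes by
     hostname (which the client's DNS or hosts file can resolve)
     instead of by the private IP the namenode reports. -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
```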
Regards,
Matt
On Thu, Apr 19, 2018 at 8:22 AM, Bryan Bende <bb...@gmail.com> wrote:
> Hello,
>
> I have no idea how AWS works, but most likely what is happening is that
> the Hadoop client in NiFi asks the name node to write a file, and the
> name node responds with the data nodes to write to, but it is responding
> with the private IPs/hostnames of the data nodes, which you can't reach
> from your Windows machine.
>
> You would probably have the same problem if you installed the hadoop
> client on your Windows machine and issued commands like "hadoop fs -ls /".
>
> -Bryan
>
>
> On Thu, Apr 19, 2018 at 1:55 AM, Sudheer Nulu <su...@saksoft.co.in> wrote:
>> [...]
Re: Apache NiFi issues while ingesting data to HDFS on an AWS instance
Posted by Bryan Bende <bb...@gmail.com>.
Hello,
I have no idea how AWS works, but most likely what is happening is that
the Hadoop client in NiFi asks the name node to write a file, and the
name node responds with the data nodes to write to, but it is responding
with the private IPs/hostnames of the data nodes, which you can't reach
from your Windows machine.
You would probably have the same problem if you installed the hadoop
client on your Windows machine and issued commands like "hadoop fs -ls /".
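The failure mode described here can be reproduced outside Hadoop; a small Python sketch of what the DFSClient runs into (the hostname is made up, standing in for the private EC2 names the namenode returns; `.invalid` is a reserved TLD, so resolution fails just as a VPC-private name does from outside AWS):

```python
import socket

# Stand-in for a private datanode hostname returned by the namenode;
# 50010 is the default datanode transfer port in Hadoop 2.x.
host, port = "ip-10-0-0-12.internal.invalid", 50010
try:
    socket.getaddrinfo(host, port)
    print("resolved", host)
except socket.gaierror as err:
    # Resolution failure: this is the condition the DFSClient surfaces
    # as java.nio.channels.UnresolvedAddressException.
    print("cannot resolve datanode hostname:", err)
```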
-Bryan
On Thu, Apr 19, 2018 at 1:55 AM, Sudheer Nulu <su...@saksoft.co.in> wrote:
> [...]