Posted to hdfs-user@hadoop.apache.org by Alex Clark <al...@bitstew.com> on 2014/04/26 07:58:55 UTC
Reading remote HDFS file with Java Client
Hello all, I'm having a bit of trouble with a simple Hadoop install. I've
downloaded Hadoop 2.4.0 and installed it on a single CentOS Linux node (a
virtual machine). I've configured Hadoop for a single node in
pseudo-distributed mode as described on the Apache site
(http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-common/SingleCluster.html).
It starts with no issues in the logs and I can read and write files using
the "hadoop fs" commands from the command line.
I'm attempting to read a file in HDFS from a remote machine using the
Java API. The remote machine can connect and list directory contents. It can
also determine whether a file exists with the code:
Path p=new Path("hdfs://test.server:9000/usr/test/test_file.txt");
FileSystem fs = FileSystem.get(new Configuration());
System.out.println(p.getName() + " exists: " + fs.exists(p));
The system prints "true", indicating it exists. However, when I attempt to
read the file with:
BufferedReader br = null;
try {
    Path p = new Path("hdfs://test.server:9000/usr/test/test_file.txt");
    FileSystem fs = FileSystem.get(CONFIG);
    System.out.println(p.getName() + " exists: " + fs.exists(p));
    br = new BufferedReader(new InputStreamReader(fs.open(p)));
    String line = br.readLine();
    while (line != null) {
        System.out.println(line);
        line = br.readLine();
    }
}
finally {
    if (br != null) br.close();
}
This code throws the exception:
Exception in thread "main" org.apache.hadoop.hdfs.BlockMissingException:
Could not obtain block:
BP-13917963-127.0.0.1-1398476189167:blk_1073741831_1007
file=/usr/test/test_file.txt
Googling gave some possible causes, but everything I checked looks fine. The
data node is connected, active, and has enough space. The report from
"hdfs dfsadmin -report" shows:
Configured Capacity: 52844687360 (49.22 GB)
Present Capacity: 48507940864 (45.18 GB)
DFS Remaining: 48507887616 (45.18 GB)
DFS Used: 53248 (52 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)
Live datanodes:
Name: 127.0.0.1:50010 (test.server)
Hostname: test.server
Decommission Status : Normal
Configured Capacity: 52844687360 (49.22 GB)
DFS Used: 53248 (52 KB)
Non DFS Used: 4336746496 (4.04 GB)
DFS Remaining: 48507887616 (45.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 91.79%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Fri Apr 25 22:16:56 PDT 2014
The client jars were copied directly from the Hadoop install, so there is no
version mismatch. I can browse the file system with my Java class and read
file attributes. I just can't read the file contents without getting the
exception. If I try to write a file with the code:
FileSystem fs = null;
BufferedWriter br = null;
System.setProperty("HADOOP_USER_NAME", "root");
try {
    fs = FileSystem.get(new Configuration());
    //Path p = new Path(dir, file);
    Path p = new Path("hdfs://test.server:9000/usr/test/test.txt");
    br = new BufferedWriter(new OutputStreamWriter(fs.create(p, true)));
    br.write("Hello World");
}
finally {
    if (br != null) br.close();
    if (fs != null) fs.close();
}
This creates the file but doesn't write any bytes and throws the exception:
Exception in thread "main"
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
/usr/test/test.txt could only be replicated to 0 nodes instead of
minReplication (=1). There are 1 datanode(s) running and 1 node(s) are
excluded in this operation.
Googling for this indicated a possible space issue, but according to the
dfsadmin report there is plenty of space. This is a plain vanilla install
and I can't get past this issue.
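One thing that may be worth checking: listing directories and calling
exists() only involve the namenode (port 9000 here), whereas reading or
writing file contents requires the client to connect to the datanode
directly, and the report above lists the datanode as 127.0.0.1:50010. Below
is a minimal sketch, using only the JDK, for testing from the client whether
those two ports are reachable. The class name DataNodeReachability and the
3-second timeout are made up for illustration; the host and ports are the
ones shown above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class DataNodeReachability {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host defaults to the server name used in this post (an assumption for other setups).
        String host = args.length > 0 ? args[0] : "test.server";
        // 9000 = namenode RPC port from the hdfs:// URL, 50010 = datanode port from the report.
        int[] ports = {9000, 50010};
        for (int port : ports) {
            System.out.println(host + ":" + port + " reachable: "
                    + canConnect(host, port, 3000));
        }
    }
}
```

If 9000 is reachable but 50010 is not, or if the namenode is handing the
client 127.0.0.1 as the datanode address, that would be consistent with both
exceptions, though this is only a guess and not a confirmed diagnosis.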
The environment summary is:
SERVER:
Hadoop 2.4.0 with pseudo-distribution
(http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-common/SingleCluster.html)
CentOS 6.5 Virtual Machine 64 bit server
Java 1.7.0_55
CLIENT:
Windows 8 (Virtual Machine)
Java 1.7.0_51
Any help is greatly appreciated.