Posted to user@hadoop.apache.org by Roger Whitcomb <Ro...@actian.com> on 2014/04/11 22:20:06 UTC

Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

Hi,
I'm fairly new to Hadoop, but not to Apache, and I'm having a newbie kind of issue browsing HDFS files.  I have written an Apache Commons VFS (Virtual File System) browser for the Apache Pivot GUI framework (full disclosure: I'm the PMC Chair for Pivot), and now I'm trying to get that browser working against HDFS so our application can browse HDFS directly.  I'm running into a problem that seems pretty basic, so I thought I'd ask here...

So, I downloaded Hadoop 2.3.0 from one of the mirrors and tracked down what seems to be the minimum set of .jars needed to at least attempt a connection using Commons VFS 2.1:
commons-collections-3.2.1.jar
commons-configuration-1.6.jar
commons-lang-2.6.jar
commons-vfs2-2.1-SNAPSHOT.jar
guava-11.0.2.jar
hadoop-auth-2.3.0.jar
hadoop-common-2.3.0.jar
log4j-1.2.17.jar
slf4j-api-1.7.5.jar
slf4j-log4j12-1.7.5.jar

What's happening now is that I instantiate the HdfsFileProvider this way:
	private static DefaultFileSystemManager manager = null;

	static
	{
	    manager = new DefaultFileSystemManager();
	    try {
		manager.setFilesCache(new DefaultFilesCache());
		manager.addProvider("hdfs", new HdfsFileProvider());
		manager.setFileContentInfoFactory(new FileContentInfoFilenameFactory());
		manager.setFilesCache(new SoftRefFilesCache());
		manager.setReplicator(new DefaultFileReplicator());
		manager.setCacheStrategy(CacheStrategy.ON_RESOLVE);
		manager.init();
	    }
	    catch (final FileSystemException e) {
		throw new RuntimeException(Intl.getString("object#manager.setupError"), e);
	    }
	}

Then, I try to browse into an HDFS system this way:
	    String url = String.format("hdfs://%1$s:%2$d/%3$s", "hadoop-master", 50070, hdfsPath);
	    return manager.resolveFile(url);
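
For reference, here is a minimal sketch of how the resolved FileObject would then be browsed with the Commons VFS API (illustrative only: listChildren is just a hypothetical helper, it assumes the connection succeeds, and error handling is omitted):

	// Sketch: list the immediate children of a resolved directory,
	// using only standard Commons VFS 2 types (FileObject, FileType).
	static void listChildren(org.apache.commons.vfs2.FileObject dir)
	        throws org.apache.commons.vfs2.FileSystemException {
	    if (dir.getType() == org.apache.commons.vfs2.FileType.FOLDER) {
	        for (org.apache.commons.vfs2.FileObject child : dir.getChildren()) {
	            String name = child.getName().getBaseName();
	            boolean isDir = child.getType() == org.apache.commons.vfs2.FileType.FOLDER;
	            System.out.println(name + (isDir ? "/" : ""));
	        }
	    }
	}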

Note: the client is running on Windows 7 (but could be any system that runs Java), and the target has been one of several Hadoop clusters on Ubuntu VMs (basically the same thing happens no matter which Hadoop installation I try to hit).  So I'm guessing the problem is in my client configuration.

Simply attempting to connect to HDFS produces a bunch of error messages in the log file, and it looks like user validation is being attempted on the local machine instead of against the remote Hadoop cluster.
Apr 11,2014 18:27:38.640 GMT T[AWT-EventQueue-0](26) DEBUG FileObjectManager: Trying to resolve file reference 'hdfs://hadoop-master:50070/'
Apr 11,2014 18:27:38.953 GMT T[AWT-EventQueue-0](26)  INFO org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
Apr 11,2014 18:27:39.078 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[GetGroups], about=, type=DEFAULT, always=false, sampleName=Ops)
Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MetricsSystemImpl: UgiMetrics, User and group related metrics
Apr 11,2014 18:27:39.344 GMT T[AWT-EventQueue-0](26) DEBUG Groups:  Creating new Groups object
Apr 11,2014 18:27:39.344 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: java.library.path=.... <bunch of stuff>
Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26)  WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) DEBUG JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) DEBUG JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) ERROR Shell: Failed to detect a valid hadoop home directory: HADOOP_HOME or hadoop.home.dir are not set.
java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
	at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:265)
	at org.apache.hadoop.util.Shell.<clinit>(Shell.java:290)
	at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
	at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:92)
	at org.apache.hadoop.security.Groups.<init>(Groups.java:76)
	at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:239)
	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
	at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:232)
	at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:718)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:703)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:605)
	at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2473)
	at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2465)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2331)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:168)
	at org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem.resolveFile(HdfsFileSystem.java:115)
	at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:84)
	at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:64)
	at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:700)
	at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:656)
	at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:609)

Apr 11,2014 18:27:39.391 GMT T[AWT-EventQueue-0](26) ERROR Shell: Failed to locate the winutils binary in the hadoop binary path: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

Apr 11,2014 18:27:39.391 GMT T[AWT-EventQueue-0](26) DEBUG Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
Apr 11,2014 18:27:39.469 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: hadoop login
Apr 11,2014 18:27:39.469 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: hadoop login commit
Apr 11,2014 18:27:39.751 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: using local user:NTUserPrincipal: <user_name>
Apr 11,2014 18:27:39.751 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: UGI loginUser:whiro01 (auth:SIMPLE)
Apr 11,2014 18:27:39.813 GMT T[AWT-EventQueue-0](26) ERROR HdfsFileSystem: Error connecting to filesystem hdfs://hadoop-master:50070/: No FileSystem for scheme: hdfs
java.io.IOException: No FileSystem for scheme: hdfs
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2304)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2311)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:90)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2350)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2332)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:168)
	at org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem.resolveFile(HdfsFileSystem.java:115)
	at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:84)
	at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:64)
	at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:700)
	at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:656)
	at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:609)

So, my guess is that my client machine is missing some configuration that tells Hadoop the authentication should be done at the remote end.  So I'm trying to track down what that configuration might be.
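
For illustration, here is a minimal sketch of the client-side settings the log seems to point at, using the plain Hadoop client API rather than VFS.  The host, port, and home-directory values are assumptions, not a verified fix:

	// Sketch only: illustrative client-side Hadoop configuration.
	static org.apache.hadoop.fs.FileSystem openHdfs() throws java.io.IOException {
	    // Addresses the "HADOOP_HOME or hadoop.home.dir are not set" message
	    // on a client without a local Hadoop install (the path is an assumption;
	    // on Windows the winutils warning also wants <dir>\bin\winutils.exe).
	    System.setProperty("hadoop.home.dir", "C:/hadoop-2.3.0");
	    org.apache.hadoop.conf.Configuration conf = new org.apache.hadoop.conf.Configuration();
	    // NameNode RPC endpoint; 8020 is the usual RPC port, 50070 is the web UI.
	    conf.set("fs.defaultFS", "hdfs://hadoop-master:8020");
	    // Map the hdfs:// scheme to its implementation class, which ships in
	    // hadoop-hdfs-<version>.jar rather than hadoop-common.
	    conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
	    return org.apache.hadoop.fs.FileSystem.get(conf);
	}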

I'm hoping someone here can see past the Commons VFS details (which may be unfamiliar) and tell me what other Hadoop/HDFS jars or configuration I need to get this working.

Note: I want to build a GUI component that can browse to arbitrary HDFS installations, so I can't really set up a hard-coded XML configuration file for each potential Hadoop cluster I might connect to.

Thanks,
~Roger Whitcomb


RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

Posted by Roger Whitcomb <Ro...@actian.com>.
Hi Dave,

    Thanks for the responses.  I guess I have a small question then: what exact class(es) would it be looking for that it can't find?  I have all the .jar files I mentioned below on the classpath, and it is loading and executing code in the org.apache.hadoop.fs.FileSystem class (according to the stack trace below), so there must be implementing classes somewhere; which .jar file would they be in?


Thanks,

~Roger
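
(For reference, a quick classpath probe along the lines of the question above.  The usual suspect is org.apache.hadoop.hdfs.DistributedFileSystem, which ships in hadoop-hdfs-<version>.jar rather than hadoop-common; whether that is the only missing piece here is an assumption.)

	// Sketch: check whether the HDFS FileSystem implementation is on the classpath.
	try {
	    Class.forName("org.apache.hadoop.hdfs.DistributedFileSystem");
	    System.out.println("hadoop-hdfs appears to be on the classpath");
	} catch (ClassNotFoundException e) {
	    System.out.println("hadoop-hdfs-<version>.jar is not on the classpath");
	}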


________________________________
From: david marion <dl...@hotmail.com>
Sent: Friday, April 11, 2014 4:55 PM
To: user@hadoop.apache.org
Subject: RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

Also, make sure that the jars on the classpath actually contain the HDFS file system. I'm looking at:

No FileSystem for scheme: hdfs

which is an indicator for this condition.

Dave

________________________________
From: dlmarion@hotmail.com
To: user@hadoop.apache.org
Subject: RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?
Date: Fri, 11 Apr 2014 23:48:48 +0000

Hi Roger,

  I wrote the HDFS provider for Commons VFS. I went back and looked at the source and tests, and I don't see anything wrong with what you are doing. I did develop it against Hadoop 1.1.2 at the time, so there might be an issue that is not accounted for with Hadoop 2. It was also not tested with security turned on. Are you using security?

Dave

> From: Roger.Whitcomb@actian.com
> To: user@hadoop.apache.org
> Subject: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?
> Date: Fri, 11 Apr 2014 20:20:06 +0000
>
> Hi,
> I'm fairly new to Hadoop, but not to Apache, and I'm having a newbie kind of issue browsing HDFS files. I have written an Apache Commons VFS (Virtual File System) browser for the Apache Pivot GUI framework (I'm the PMC Chair for Pivot: full disclosure). And now I'm trying to get this browser to work with HDFS to do HDFS browsing from our application. I'm running into a problem, which seems sort of basic, so I thought I'd ask here...
>
> So, I downloaded Hadoop 2.3.0 from one of the mirrors, and was able to track down sort of the minimum set of .jars necessary to at least (try to) connect using Commons VFS 2.1:
> commons-collections-3.2.1.jar
> commons-configuration-1.6.jar
> commons-lang-2.6.jar
> commons-vfs2-2.1-SNAPSHOT.jar
> guava-11.0.2.jar
> hadoop-auth-2.3.0.jar
> hadoop-common-2.3.0.jar
> log4j-1.2.17.jar
> slf4j-api-1.7.5.jar
> slf4j-log4j12-1.7.5.jar
>
> What's happening now is that I instantiated the HdfsProvider this way:
> private static DefaultFileSystemManager manager = null;
>
> static
> {
> manager = new DefaultFileSystemManager();
> try {
> manager.setFilesCache(new DefaultFilesCache());
> manager.addProvider("hdfs", new HdfsFileProvider());
> manager.setFileContentInfoFactory(new FileContentInfoFilenameFactory());
> manager.setFilesCache(new SoftRefFilesCache());
> manager.setReplicator(new DefaultFileReplicator());
> manager.setCacheStrategy(CacheStrategy.ON_RESOLVE);
> manager.init();
> }
> catch (final FileSystemException e) {
> throw new RuntimeException(Intl.getString("object#manager.setupError"), e);
> }
> }
>
> Then, I try to browse into an HDFS system this way:
> String url = String.format("hdfs://%1$s:%2$d/%3$s", "hadoop-master ", 50070, hdfsPath);
> return manager.resolveFile(url);
>
> Note: the client is running on Windows 7 (but could be any system that runs Java), and the target has been one of several Hadoop clusters on Ubuntu VMs (basically the same thing happens no matter which Hadoop installation I try to hit). So I'm guessing the problem is in my client configuration.
>
> This attempt to basically just connect to HDFS results in a bunch of error messages in the log file, which looks like it is trying to do user validation on the local machine instead of against the Hadoop (remote) cluster.
> Apr 11,2014 18:27:38.640 GMT T[AWT-EventQueue-0](26) DEBUG FileObjectManager: Trying to resolve file reference 'hdfs://hadoop-master:50070/'
> Apr 11,2014 18:27:38.953 GMT T[AWT-EventQueue-0](26) INFO org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
> Apr 11,2014 18:27:39.078 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
> Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
> Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[GetGroups], about=, type=DEFAULT, always=false, sampleName=Ops)
> Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MetricsSystemImpl: UgiMetrics, User and group related metrics
> Apr 11,2014 18:27:39.344 GMT T[AWT-EventQueue-0](26) DEBUG Groups: Creating new Groups object
> Apr 11,2014 18:27:39.344 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
> Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
> Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: java.library.path=.... <bunch of stuff>
> Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) DEBUG JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) DEBUG JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) ERROR Shell: Failed to detect a valid hadoop home directory: HADOOP_HOME or hadoop.home.dir are not set.
> java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
> at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:265)
> at org.apache.hadoop.util.Shell.<clinit>(Shell.java:290)
> at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
> at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:92)
> at org.apache.hadoop.security.Groups.<init>(Groups.java:76)
> at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:239)
> at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
> at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:232)
> at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:718)
> at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:703)
> at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:605)
> at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2473)
> at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2465)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2331)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:168)
> at org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem.resolveFile(HdfsFileSystem.java:115)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:84)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:64)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:700)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:656)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:609)
>
> Apr 11,2014 18:27:39.391 GMT T[AWT-EventQueue-0](26) ERROR Shell: Failed to locate the winutils binary in the hadoop binary path: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
> java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
>
> Apr 11,2014 18:27:39.391 GMT T[AWT-EventQueue-0](26) DEBUG Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
> Apr 11,2014 18:27:39.469 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: hadoop login
> Apr 11,2014 18:27:39.469 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: hadoop login commit
> Apr 11,2014 18:27:39.751 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: using local user:NTUserPrincipal: <user_name>
> Apr 11,2014 18:27:39.751 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: UGI loginUser:whiro01 (auth:SIMPLE)
> Apr 11,2014 18:27:39.813 GMT T[AWT-EventQueue-0](26) ERROR HdfsFileSystem: Error connecting to filesystem hdfs://hadoop-master:50070/: No FileSystem for scheme: hdfs
> java.io.IOException: No FileSystem for scheme: hdfs
> at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2304)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2311)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:90)
> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2350)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2332)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:168)
> at org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem.resolveFile(HdfsFileSystem.java:115)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:84)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:64)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:700)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:656)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:609)
>
> So, my guess is that I don't have enough configuration setup on my client machine to tell Hadoop that the authentication is to be done at the remote end ....?? So, I'm trying to track down what the configuration info might be.
>
> Hoping to see if anyone here can see past the Commons VFS stuff that you probably don't understand to be able to tell me what other Hadoop/HDFS files / configuration I need to get this working.
>
> Note: I want to build a GUI component that can browse to arbitrary HDFS installations, so I can't really be setting up a hard-coded XML file for each potential Hadoop cluster I might connect to ....
>
> Thanks,
> ~Roger Whitcomb
>

RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

Posted by Roger Whitcomb <Ro...@actian.com>.
Hi Dave,

    ​Thanks for the responses.  I guess I have a small question then:  what exact class(es) would it be looking for that it can't find?  I have all the .jar files I mentioned below on the classpath, and it is loading and executing stuff in the "org.apache.hadoop.fs.FileSystem" class (according to the stack trace below), so .... there are implementing classes I would guess, so what .jar file would they be in?


Thanks,

~Roger


________________________________
From: david marion <dl...@hotmail.com>
Sent: Friday, April 11, 2014 4:55 PM
To: user@hadoop.apache.org
Subject: RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

Also, make sure that the jars on the classpath actually contain the HDFS file system. I'm looking at:

No FileSystem for scheme: hdfs

which is an indicator for this condition.

Dave

________________________________
From: dlmarion@hotmail.com
To: user@hadoop.apache.org
Subject: RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?
Date: Fri, 11 Apr 2014 23:48:48 +0000

Hi Roger,

  I wrote the HDFS provider for Commons VFS. I went back and looked at the source and tests, and I don't see anything wrong with what you are doing. I did develop it against Hadoop 1.1.2 at the time, so there might be an issue that is not accounted for with Hadoop 2. It was also not tested with security turned on. Are you using security?

Dave

> From: Roger.Whitcomb@actian.com
> To: user@hadoop.apache.org
> Subject: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?
> Date: Fri, 11 Apr 2014 20:20:06 +0000
>
> Hi,
> I'm fairly new to Hadoop, but not to Apache, and I'm having a newbie kind of issue browsing HDFS files. I have written an Apache Commons VFS (Virtual File System) browser for the Apache Pivot GUI framework (I'm the PMC Chair for Pivot: full disclosure). And now I'm trying to get this browser to work with HDFS to do HDFS browsing from our application. I'm running into a problem, which seems sort of basic, so I thought I'd ask here...
>
> So, I downloaded Hadoop 2.3.0 from one of the mirrors, and was able to track down sort of the minimum set of .jars necessary to at least (try to) connect using Commons VFS 2.1:
> commons-collections-3.2.1.jar
> commons-configuration-1.6.jar
> commons-lang-2.6.jar
> commons-vfs2-2.1-SNAPSHOT.jar
> guava-11.0.2.jar
> hadoop-auth-2.3.0.jar
> hadoop-common-2.3.0.jar
> log4j-1.2.17.jar
> slf4j-api-1.7.5.jar
> slf4j-log4j12-1.7.5.jar
>
> What's happening now is that I instantiated the HdfsProvider this way:
> private static DefaultFileSystemManager manager = null;
>
> static
> {
> manager = new DefaultFileSystemManager();
> try {
> manager.setFilesCache(new DefaultFilesCache());
> manager.addProvider("hdfs", new HdfsFileProvider());
> manager.setFileContentInfoFactory(new FileContentInfoFilenameFactory());
> manager.setFilesCache(new SoftRefFilesCache());
> manager.setReplicator(new DefaultFileReplicator());
> manager.setCacheStrategy(CacheStrategy.ON_RESOLVE);
> manager.init();
> }
> catch (final FileSystemException e) {
> throw new RuntimeException(Intl.getString("object#manager.setupError"), e);
> }
> }
>
> Then, I try to browse into an HDFS system this way:
> String url = String.format("hdfs://%1$s:%2$d/%3$s", "hadoop-master ", 50070, hdfsPath);
> return manager.resolveFile(url);
>
> Note: the client is running on Windows 7 (but could be any system that runs Java), and the target has been one of several Hadoop clusters on Ubuntu VMs (basically the same thing happens no matter which Hadoop installation I try to hit). So I'm guessing the problem is in my client configuration.
>
> This attempt to basically just connect to HDFS results in a bunch of error messages in the log file, which looks like it is trying to do user validation on the local machine instead of against the Hadoop (remote) cluster.
> Apr 11,2014 18:27:38.640 GMT T[AWT-EventQueue-0](26) DEBUG FileObjectManager: Trying to resolve file reference 'hdfs://hadoop-master:50070/'
> Apr 11,2014 18:27:38.953 GMT T[AWT-EventQueue-0](26) INFO org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
> Apr 11,2014 18:27:39.078 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
> Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
> Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[GetGroups], about=, type=DEFAULT, always=false, sampleName=Ops)
> Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MetricsSystemImpl: UgiMetrics, User and group related metrics
> Apr 11,2014 18:27:39.344 GMT T[AWT-EventQueue-0](26) DEBUG Groups: Creating new Groups object
> Apr 11,2014 18:27:39.344 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
> Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
> Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: java.library.path=.... <bunch of stuff>
> Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) DEBUG JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) DEBUG JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) ERROR Shell: Failed to detect a valid hadoop home directory: HADOOP_HOME or hadoop.home.dir are not set.
> java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
> at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:265)
> at org.apache.hadoop.util.Shell.<clinit>(Shell.java:290)
> at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
> at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:92)
> at org.apache.hadoop.security.Groups.<init>(Groups.java:76)
> at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:239)
> at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
> at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:232)
> at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:718)
> at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:703)
> at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:605)
> at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2473)
> at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2465)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2331)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:168)
> at org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem.resolveFile(HdfsFileSystem.java:115)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:84)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:64)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:700)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:656)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:609)
>
> Apr 11,2014 18:27:39.391 GMT T[AWT-EventQueue-0](26) ERROR Shell: Failed to locate the winutils binary in the hadoop binary path: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
> java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
>
> Apr 11,2014 18:27:39.391 GMT T[AWT-EventQueue-0](26) DEBUG Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
> Apr 11,2014 18:27:39.469 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: hadoop login
> Apr 11,2014 18:27:39.469 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: hadoop login commit
> Apr 11,2014 18:27:39.751 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: using local user:NTUserPrincipal: <user_name>
> Apr 11,2014 18:27:39.751 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: UGI loginUser:whiro01 (auth:SIMPLE)
> Apr 11,2014 18:27:39.813 GMT T[AWT-EventQueue-0](26) ERROR HdfsFileSystem: Error connecting to filesystem hdfs://hadoop-master:50070/: No FileSystem for scheme: hdfs
> java.io.IOException: No FileSystem for scheme: hdfs
> at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2304)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2311)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:90)
> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2350)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2332)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:168)
> at org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem.resolveFile(HdfsFileSystem.java:115)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:84)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:64)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:700)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:656)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:609)
>
> So, my guess is that I don't have enough configuration setup on my client machine to tell Hadoop that the authentication is to be done at the remote end ....?? So, I'm trying to track down what the configuration info might be.
>
> Hoping to see if anyone here can see past the Commons VFS stuff that you probably don't understand to be able to tell me what other Hadoop/HDFS files / configuration I need to get this working.
>
> Note: I want to build a GUI component that can browse to arbitrary HDFS installations, so I can't really be setting up a hard-coded XML file for each potential Hadoop cluster I might connect to ....
>
> Thanks,
> ~Roger Whitcomb
>

RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

Posted by Roger Whitcomb <Ro...@actian.com>.
Hi Dave,

    ​Thanks for the responses.  I guess I have a small question then:  what exact class(es) would it be looking for that it can't find?  I have all the .jar files I mentioned below on the classpath, and it is loading and executing stuff in the "org.apache.hadoop.fs.FileSystem" class (according to the stack trace below), so .... there are implementing classes I would guess, so what .jar file would they be in?


Thanks,

~Roger


________________________________
From: david marion <dl...@hotmail.com>
Sent: Friday, April 11, 2014 4:55 PM
To: user@hadoop.apache.org
Subject: RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

Also, make sure that the jars on the classpath actually contain the HDFS file system. I'm looking at:

No FileSystem for scheme: hdfs

which is an indicator for this condition.

Dave

________________________________
From: dlmarion@hotmail.com
To: user@hadoop.apache.org
Subject: RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?
Date: Fri, 11 Apr 2014 23:48:48 +0000

Hi Roger,

  I wrote the HDFS provider for Commons VFS. I went back and looked at the source and tests, and I don't see anything wrong with what you are doing. I did develop it against Hadoop 1.1.2 at the time, so there might be an issue that is not accounted for with Hadoop 2. It was also not tested with security turned on. Are you using security?

Dave

> From: Roger.Whitcomb@actian.com
> To: user@hadoop.apache.org
> Subject: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?
> Date: Fri, 11 Apr 2014 20:20:06 +0000
>
> Hi,
> I'm fairly new to Hadoop, but not to Apache, and I'm having a newbie kind of issue browsing HDFS files. I have written an Apache Commons VFS (Virtual File System) browser for the Apache Pivot GUI framework (I'm the PMC Chair for Pivot: full disclosure). And now I'm trying to get this browser to work with HDFS to do HDFS browsing from our application. I'm running into a problem, which seems sort of basic, so I thought I'd ask here...
>
> So, I downloaded Hadoop 2.3.0 from one of the mirrors, and was able to track down sort of the minimum set of .jars necessary to at least (try to) connect using Commons VFS 2.1:
> commons-collections-3.2.1.jar
> commons-configuration-1.6.jar
> commons-lang-2.6.jar
> commons-vfs2-2.1-SNAPSHOT.jar
> guava-11.0.2.jar
> hadoop-auth-2.3.0.jar
> hadoop-common-2.3.0.jar
> log4j-1.2.17.jar
> slf4j-api-1.7.5.jar
> slf4j-log4j12-1.7.5.jar
>
> What's happening now is that I instantiated the HdfsProvider this way:
> private static DefaultFileSystemManager manager = null;
>
> static
> {
> manager = new DefaultFileSystemManager();
> try {
> manager.setFilesCache(new DefaultFilesCache());
> manager.addProvider("hdfs", new HdfsFileProvider());
> manager.setFileContentInfoFactory(new FileContentInfoFilenameFactory());
> manager.setFilesCache(new SoftRefFilesCache());
> manager.setReplicator(new DefaultFileReplicator());
> manager.setCacheStrategy(CacheStrategy.ON_RESOLVE);
> manager.init();
> }
> catch (final FileSystemException e) {
> throw new RuntimeException(Intl.getString("object#manager.setupError"), e);
> }
> }
>
> Then, I try to browse into an HDFS system this way:
> String url = String.format("hdfs://%1$s:%2$d/%3$s", "hadoop-master ", 50070, hdfsPath);
> return manager.resolveFile(url);
>
> Note: the client is running on Windows 7 (but could be any system that runs Java), and the target has been one of several Hadoop clusters on Ubuntu VMs (basically the same thing happens no matter which Hadoop installation I try to hit). So I'm guessing the problem is in my client configuration.
>
> This attempt to basically just connect to HDFS results in a bunch of error messages in the log file, which looks like it is trying to do user validation on the local machine instead of against the Hadoop (remote) cluster.
> Apr 11,2014 18:27:38.640 GMT T[AWT-EventQueue-0](26) DEBUG FileObjectManager: Trying to resolve file reference 'hdfs://hadoop-master:50070/'
> Apr 11,2014 18:27:38.953 GMT T[AWT-EventQueue-0](26) INFO org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
> Apr 11,2014 18:27:39.078 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
> Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
> Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[GetGroups], about=, type=DEFAULT, always=false, sampleName=Ops)
> Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MetricsSystemImpl: UgiMetrics, User and group related metrics
> Apr 11,2014 18:27:39.344 GMT T[AWT-EventQueue-0](26) DEBUG Groups: Creating new Groups object
> Apr 11,2014 18:27:39.344 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
> Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
> Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: java.library.path=.... <bunch of stuff>
> Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) DEBUG JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) DEBUG JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) ERROR Shell: Failed to detect a valid hadoop home directory: HADOOP_HOME or hadoop.home.dir are not set.
> java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
> at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:265)
> at org.apache.hadoop.util.Shell.<clinit>(Shell.java:290)
> at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
> at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:92)
> at org.apache.hadoop.security.Groups.<init>(Groups.java:76)
> at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:239)
> at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
> at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:232)
> at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:718)
> at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:703)
> at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:605)
> at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2473)
> at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2465)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2331)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:168)
> at org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem.resolveFile(HdfsFileSystem.java:115)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:84)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:64)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:700)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:656)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:609)
>
> Apr 11,2014 18:27:39.391 GMT T[AWT-EventQueue-0](26) ERROR Shell: Failed to locate the winutils binary in the hadoop binary path: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
> java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
>
> Apr 11,2014 18:27:39.391 GMT T[AWT-EventQueue-0](26) DEBUG Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
> Apr 11,2014 18:27:39.469 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: hadoop login
> Apr 11,2014 18:27:39.469 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: hadoop login commit
> Apr 11,2014 18:27:39.751 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: using local user:NTUserPrincipal: <user_name>
> Apr 11,2014 18:27:39.751 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: UGI loginUser:whiro01 (auth:SIMPLE)
> Apr 11,2014 18:27:39.813 GMT T[AWT-EventQueue-0](26) ERROR HdfsFileSystem: Error connecting to filesystem hdfs://hadoop-master:50070/: No FileSystem for scheme: hdfs
> java.io.IOException: No FileSystem for scheme: hdfs
> at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2304)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2311)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:90)
> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2350)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2332)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:168)
> at org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem.resolveFile(HdfsFileSystem.java:115)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:84)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:64)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:700)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:656)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:609)
>
> So, my guess is that I don't have enough configuration setup on my client machine to tell Hadoop that the authentication is to be done at the remote end ....?? So, I'm trying to track down what the configuration info might be.
>
> Hoping to see if anyone here can see past the Commons VFS stuff that you probably don't understand to be able to tell me what other Hadoop/HDFS files / configuration I need to get this working.
>
> Note: I want to build a GUI component that can browse to arbitrary HDFS installations, so I can't really be setting up a hard-coded XML file for each potential Hadoop cluster I might connect to ....
>
> Thanks,
> ~Roger Whitcomb
>

RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

Posted by Roger Whitcomb <Ro...@actian.com>.
Hi Dave,

    ​Thanks for the responses.  I guess I have a small question then:  what exact class(es) would it be looking for that it can't find?  I have all the .jar files I mentioned below on the classpath, and it is loading and executing stuff in the "org.apache.hadoop.fs.FileSystem" class (according to the stack trace below), so .... there are implementing classes I would guess, so what .jar file would they be in?


Thanks,

~Roger


________________________________
From: david marion <dl...@hotmail.com>
Sent: Friday, April 11, 2014 4:55 PM
To: user@hadoop.apache.org
Subject: RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

Also, make sure that the jars on the classpath actually contain the HDFS file system. I'm looking at:

No FileSystem for scheme: hdfs

which is an indicator for this condition.

Dave

________________________________
From: dlmarion@hotmail.com
To: user@hadoop.apache.org
Subject: RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?
Date: Fri, 11 Apr 2014 23:48:48 +0000

Hi Roger,

  I wrote the HDFS provider for Commons VFS. I went back and looked at the source and tests, and I don't see anything wrong with what you are doing. I did develop it against Hadoop 1.1.2 at the time, so there might be an issue that is not accounted for with Hadoop 2. It was also not tested with security turned on. Are you using security?

Dave

> From: Roger.Whitcomb@actian.com
> To: user@hadoop.apache.org
> Subject: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?
> Date: Fri, 11 Apr 2014 20:20:06 +0000
>
> Hi,
> I'm fairly new to Hadoop, but not to Apache, and I'm having a newbie kind of issue browsing HDFS files. I have written an Apache Commons VFS (Virtual File System) browser for the Apache Pivot GUI framework (I'm the PMC Chair for Pivot: full disclosure). And now I'm trying to get this browser to work with HDFS to do HDFS browsing from our application. I'm running into a problem, which seems sort of basic, so I thought I'd ask here...
>
> So, I downloaded Hadoop 2.3.0 from one of the mirrors, and was able to track down sort of the minimum set of .jars necessary to at least (try to) connect using Commons VFS 2.1:
> commons-collections-3.2.1.jar
> commons-configuration-1.6.jar
> commons-lang-2.6.jar
> commons-vfs2-2.1-SNAPSHOT.jar
> guava-11.0.2.jar
> hadoop-auth-2.3.0.jar
> hadoop-common-2.3.0.jar
> log4j-1.2.17.jar
> slf4j-api-1.7.5.jar
> slf4j-log4j12-1.7.5.jar
>
> What's happening now is that I instantiated the HdfsProvider this way:
> private static DefaultFileSystemManager manager = null;
>
> static
> {
> manager = new DefaultFileSystemManager();
> try {
> manager.setFilesCache(new DefaultFilesCache());
> manager.addProvider("hdfs", new HdfsFileProvider());
> manager.setFileContentInfoFactory(new FileContentInfoFilenameFactory());
> manager.setFilesCache(new SoftRefFilesCache());
> manager.setReplicator(new DefaultFileReplicator());
> manager.setCacheStrategy(CacheStrategy.ON_RESOLVE);
> manager.init();
> }
> catch (final FileSystemException e) {
> throw new RuntimeException(Intl.getString("object#manager.setupError"), e);
> }
> }
>
> Then, I try to browse into an HDFS system this way:
> String url = String.format("hdfs://%1$s:%2$d/%3$s", "hadoop-master ", 50070, hdfsPath);
> return manager.resolveFile(url);
>
> Note: the client is running on Windows 7 (but could be any system that runs Java), and the target has been one of several Hadoop clusters on Ubuntu VMs (basically the same thing happens no matter which Hadoop installation I try to hit). So I'm guessing the problem is in my client configuration.
>
> This attempt to basically just connect to HDFS results in a bunch of error messages in the log file, which looks like it is trying to do user validation on the local machine instead of against the Hadoop (remote) cluster.
> Apr 11,2014 18:27:38.640 GMT T[AWT-EventQueue-0](26) DEBUG FileObjectManager: Trying to resolve file reference 'hdfs://hadoop-master:50070/'
> Apr 11,2014 18:27:38.953 GMT T[AWT-EventQueue-0](26) INFO org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
> Apr 11,2014 18:27:39.078 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
> Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
> Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[GetGroups], about=, type=DEFAULT, always=false, sampleName=Ops)
> Apr 11,2014 18:27:39.094 GMT T[AWT-EventQueue-0](26) DEBUG MetricsSystemImpl: UgiMetrics, User and group related metrics
> Apr 11,2014 18:27:39.344 GMT T[AWT-EventQueue-0](26) DEBUG Groups: Creating new Groups object
> Apr 11,2014 18:27:39.344 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
> Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
> Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: java.library.path=.... <bunch of stuff>
> Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) DEBUG JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) DEBUG JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) ERROR Shell: Failed to detect a valid hadoop home directory: HADOOP_HOME or hadoop.home.dir are not set.
> java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
> at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:265)
> at org.apache.hadoop.util.Shell.<clinit>(Shell.java:290)
> at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
> at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:92)
> at org.apache.hadoop.security.Groups.<init>(Groups.java:76)
> at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:239)
> at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
> at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:232)
> at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:718)
> at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:703)
> at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:605)
> at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2473)
> at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2465)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2331)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:168)
> at org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem.resolveFile(HdfsFileSystem.java:115)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:84)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:64)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:700)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:656)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:609)
>
> Apr 11,2014 18:27:39.391 GMT T[AWT-EventQueue-0](26) ERROR Shell: Failed to locate the winutils binary in the hadoop binary path: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
> java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
>
> Apr 11,2014 18:27:39.391 GMT T[AWT-EventQueue-0](26) DEBUG Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
> Apr 11,2014 18:27:39.469 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: hadoop login
> Apr 11,2014 18:27:39.469 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: hadoop login commit
> Apr 11,2014 18:27:39.751 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: using local user:NTUserPrincipal: <user_name>
> Apr 11,2014 18:27:39.751 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: UGI loginUser:whiro01 (auth:SIMPLE)
> Apr 11,2014 18:27:39.813 GMT T[AWT-EventQueue-0](26) ERROR HdfsFileSystem: Error connecting to filesystem hdfs://hadoop-master:50070/: No FileSystem for scheme: hdfs
> java.io.IOException: No FileSystem for scheme: hdfs
> at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2304)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2311)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:90)
> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2350)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2332)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:168)
> at org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem.resolveFile(HdfsFileSystem.java:115)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:84)
> at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:64)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:700)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:656)
> at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:609)
>
> So, my guess is that I don't have enough configuration setup on my client machine to tell Hadoop that the authentication is to be done at the remote end ....?? So, I'm trying to track down what the configuration info might be.
>
> Hoping to see if anyone here can see past the Commons VFS stuff that you probably don't understand to be able to tell me what other Hadoop/HDFS files / configuration I need to get this working.
>
> Note: I want to build a GUI component that can browse to arbitrary HDFS installations, so I can't really be setting up a hard-coded XML file for each potential Hadoop cluster I might connect to ....
>
> Thanks,
> ~Roger Whitcomb
>
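On the "no hard-coded XML per cluster" point above: the Hadoop client does not require a core-site.xml on the client side, and the target cluster can be chosen per connection in code. Below is a minimal sketch against the plain Hadoop 2.x API, not something from the thread; the host name, RPC port 8020 and the listed path are placeholders, and note that hdfs:// URIs go to the NameNode RPC port (typically 8020 or 9000), not the 50070 web UI port used in the snippet above.

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsBrowseSketch {
        public static void main(String[] args) throws Exception {
            // Loads the built-in *-default.xml resources from the Hadoop jars;
            // no cluster-specific core-site.xml is needed on the client.
            Configuration conf = new Configuration();

            // The cluster is selected by the URI, so this can be computed per
            // connection instead of coming from a hard-coded XML file.
            // "hadoop-master" and 8020 are placeholders for the NameNode RPC address.
            URI cluster = URI.create("hdfs://hadoop-master:8020/");
            FileSystem fs = FileSystem.get(cluster, conf);
            try {
                for (FileStatus status : fs.listStatus(new Path("/"))) {
                    System.out.println(status.getPath());
                }
            } finally {
                fs.close();
            }
        }
    }

Presumably the same host:port (the RPC port, not 50070) is also what the hdfs:// URL handed to Commons VFS should use.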

RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

Posted by david marion <dl...@hotmail.com>.
Also, make sure that the jars on the classpath actually contain the HDFS file system. I'm looking at:

No FileSystem for scheme: hdfs

which is an indicator for this condition.

Dave
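A concrete way to check Dave's point: the jar list in the original message includes hadoop-common but not hadoop-hdfs, and hadoop-hdfs-2.3.0.jar is what ships org.apache.hadoop.hdfs.DistributedFileSystem along with the META-INF/services entry that registers the hdfs scheme. The sketch below uses only standard Hadoop 2.x classes and property names; the fs.hdfs.impl setting is the usual extra safeguard for the case where a repackaged or uber jar loses the services entry.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class HdfsSchemeCheck {
        public static void main(String[] args) throws Exception {
            // Throws ClassNotFoundException immediately if hadoop-hdfs-x.y.z.jar
            // is missing from the classpath.
            Class.forName("org.apache.hadoop.hdfs.DistributedFileSystem");

            Configuration conf = new Configuration();
            // Usually optional: map the scheme to its implementation explicitly,
            // in case the META-INF/services registration has been lost.
            conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");

            // Same lookup that failed in the stack trace above; prints the
            // implementation class once the jar is present.
            System.out.println(FileSystem.getFileSystemClass("hdfs", conf));
        }
    }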

From: dlmarion@hotmail.com
To: user@hadoop.apache.org
Subject: RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?
Date: Fri, 11 Apr 2014 23:48:48 +0000




Hi Roger,

  I wrote the HDFS provider for Commons VFS. I went back and looked at the source and tests, and I don't see anything wrong with what you are doing. I did develop it against Hadoop 1.1.2 at the time, so there might be an issue that is not accounted for with Hadoop 2. It was also not tested with security turned on. Are you using security?

Dave

RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

Posted by david marion <dl...@hotmail.com>.
Hi Roger,

  I wrote the HDFS provider for Commons VFS. I went back and looked at the source and tests, and I don't see anything wrong with what you are doing. I did develop it against Hadoop 1.1.2 at the time, so there might be an issue that is not accounted for with Hadoop 2. It was also not tested with security turned on. Are you using security?

Dave
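On the earlier observation that the client seems to validate the user locally: with an unsecured cluster (the log shows auth:SIMPLE) that is expected behaviour; the client does not authenticate against the cluster at all, it simply sends a user name, by default the local OS login. Below is a hedged sketch of acting as a specific HDFS user from the client side; the user name "hdfs-browser" and the NameNode address are placeholders, and exporting HADOOP_USER_NAME is an environment-variable alternative to doing it in code.

    import java.net.URI;
    import java.security.PrivilegedExceptionAction;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.security.UserGroupInformation;

    public class HdfsAsRemoteUser {
        public static void main(String[] args) throws Exception {
            // "hdfs-browser" is a placeholder; on a SIMPLE-auth cluster the
            // NameNode takes this name at face value (no Kerberos involved).
            UserGroupInformation ugi = UserGroupInformation.createRemoteUser("hdfs-browser");
            FileSystem fs = ugi.doAs(new PrivilegedExceptionAction<FileSystem>() {
                @Override
                public FileSystem run() throws Exception {
                    Configuration conf = new Configuration();
                    return FileSystem.get(URI.create("hdfs://hadoop-master:8020/"), conf);
                }
            });
            System.out.println(fs.getUri());
            fs.close();
        }
    }

On a Kerberos-secured cluster this would not be enough; the client would need a real login (for example UserGroupInformation.loginUserFromKeytab), which is why the question above about whether security is enabled matters.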


> Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26) DEBUG NativeCodeLoader: java.library.path=.... <bunch of stuff>
> Apr 11,2014 18:27:39.360 GMT T[AWT-EventQueue-0](26)  WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) DEBUG JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) DEBUG JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
> Apr 11,2014 18:27:39.375 GMT T[AWT-EventQueue-0](26) ERROR Shell: Failed to detect a valid hadoop home directory: HADOOP_HOME or hadoop.home.dir are not set.
> java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
> 	at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:265)
> 	at org.apache.hadoop.util.Shell.<clinit>(Shell.java:290)
> 	at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
> 	at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:92)
> 	at org.apache.hadoop.security.Groups.<init>(Groups.java:76)
> 	at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:239)
> 	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
> 	at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:232)
> 	at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:718)
> 	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:703)
> 	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:605)
> 	at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2473)
> 	at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2465)
> 	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2331)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:168)
> 	at org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem.resolveFile(HdfsFileSystem.java:115)
> 	at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:84)
> 	at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:64)
> 	at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:700)
> 	at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:656)
> 	at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:609)
> 
> Apr 11,2014 18:27:39.391 GMT T[AWT-EventQueue-0](26) ERROR Shell: Failed to locate the winutils binary in the hadoop binary path: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
> java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
> 
> Apr 11,2014 18:27:39.391 GMT T[AWT-EventQueue-0](26) DEBUG Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
> Apr 11,2014 18:27:39.469 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: hadoop login
> Apr 11,2014 18:27:39.469 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: hadoop login commit
> Apr 11,2014 18:27:39.751 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: using local user:NTUserPrincipal: <user_name>
> Apr 11,2014 18:27:39.751 GMT T[AWT-EventQueue-0](26) DEBUG UserGroupInformation: UGI loginUser:whiro01 (auth:SIMPLE)
> Apr 11,2014 18:27:39.813 GMT T[AWT-EventQueue-0](26) ERROR HdfsFileSystem: Error connecting to filesystem hdfs://hadoop-master:50070/: No FileSystem for scheme: hdfs
> java.io.IOException: No FileSystem for scheme: hdfs
> 	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2304)
> 	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2311)
> 	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:90)
> 	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2350)
> 	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2332)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:168)
> 	at org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem.resolveFile(HdfsFileSystem.java:115)
> 	at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:84)
> 	at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:64)
> 	at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:700)
> 	at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:656)
> 	at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:609)
> 
> So my guess is that I don't have enough configuration set up on my client machine to tell Hadoop that authentication is to be done at the remote end. I'm trying to track down what that configuration might be.
> 
> I'm hoping someone here can see past the Commons VFS details (which you may not be familiar with) and tell me what other Hadoop/HDFS jars or configuration I need to get this working.
> 
> Note: I want to build a GUI component that can browse to arbitrary HDFS installations, so I can't set up a hard-coded XML configuration file for each potential Hadoop cluster I might connect to.
> 
> Thanks,
> ~Roger Whitcomb
> 
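
Two smaller things in the quoted log are worth separating from the main failure. The HADOOP_HOME / winutils.exe errors come from hadoop-common's Shell class on Windows clients and, in this scenario, appear to be noise rather than the cause; they can be silenced by pointing hadoop.home.dir at a directory containing bin\winutils.exe. The "using local user" lines are simply how SIMPLE authentication works: the client sends whatever user name it is running as, so wrapping the call in a remote-user UGI is the usual way to browse as a particular HDFS user from an arbitrary desktop. A rough sketch, assuming the manager field from the earlier post, with the directory path and the "hdfs" user name as placeholders (it uses java.security.PrivilegedExceptionAction, org.apache.commons.vfs2.FileObject, and org.apache.hadoop.security.UserGroupInformation):

	// Must run before any Hadoop class triggers Shell's static initializer
	// (placeholder path; any local directory that contains bin\winutils.exe will do).
	System.setProperty("hadoop.home.dir", "C:\\hadoop");

	// Under SIMPLE auth the server trusts the client-supplied user name, so resolve the
	// VFS file as a chosen remote user instead of the local Windows account ("hdfs" is a placeholder).
	UserGroupInformation ugi = UserGroupInformation.createRemoteUser("hdfs");
	FileObject root = ugi.doAs(new PrivilegedExceptionAction<FileObject>() {
	    @Override
	    public FileObject run() throws Exception {
	        return manager.resolveFile("hdfs://hadoop-master:8020/");
	    }
	});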

