Posted to common-user@hadoop.apache.org by Andreas Kostyrka <an...@kostyrka.org> on 2008/04/07 16:48:02 UTC

HDFS access/Jython examples

Hi!

I just wondered if there is some Jython example that shows how to access
the HDFS from Jython, without running a mapreduce?

Andreas

Re: HDFS access/Jython examples

Posted by Andreas Kostyrka <an...@kostyrka.org>.
Ok, I've traced it down to the following problem in
S3FileSystem.initialize():

    this.localFs = get(URI.create("file:///"), conf);

Why this bombs, I have no idea.

OTOH, the cool thing here is that it's the last line, which means that
catching this exception and ignoring it might help me in my quest to
read/write data to HDFS/S3.
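In Jython, that catch-and-ignore idea might look like the following sketch. This is untested and hedged: it assumes the same Configuration `c` and sys.path setup as the script quoted below, that `S3FileSystem.initialize(URI, Configuration)` is callable directly, and that the half-initialized filesystem is actually usable afterwards.

```python
# Hypothetical sketch: construct the S3FileSystem directly and swallow
# the failure from the last line of initialize().  Requires Jython with
# hadoop-0.16.2-core.jar and lib/*.jar on sys.path; will not run under
# plain CPython.
from java.lang import Exception as JException
from java.net import URI
from org.apache.hadoop.fs.s3 import S3FileSystem

fs = S3FileSystem()
try:
    fs.initialize(URI.create("s3://loohad"), c)
except JException, e:
    # initialize() bombs on its final statement; the S3 store may
    # already be wired up by then, so ignore and hope for the best.
    print "ignoring:", e
```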

Andreas

Am Montag, den 07.04.2008, 18:21 +0200 schrieb Andreas Kostyrka:
> Well, I'm getting the following funny errors:
> 
> Traceback (innermost last):
>   File "test.py", line 13, in ?
> 	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1177)
> 	at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:53)
> 	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1191)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:148)
> 	at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:122)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:94)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:585)
> 
> My source code is:
> 
> import sys, glob, os
> 
> sys.path.append("./hadoop-0.16.2-core.jar")
> sys.path.extend(glob.glob("lib/*.jar"))
> from org.apache.hadoop.fs import *
> from org.apache.hadoop.conf import Configuration, Configured
> from org.apache.hadoop.fs.s3 import S3FileSystem, Jets3tFileSystemStore
> from java.net import URI
> c = Configuration()
> c.set("fs.default.name", "s3://loohad")
> c.set("fs.s3.awsAccessKeyId", "1SKTFXXJKF5EXJ7S5202")
> c.set("fs.s3.awsSecretAccessKey", "MyKEY")
> fs = FileSystem.get(c)
> print fs
> 

Re: HDFS access/Jython examples

Posted by Andreas Kostyrka <an...@kostyrka.org>.
Well, I'm getting the following funny errors:

Traceback (innermost last):
  File "test.py", line 13, in ?
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1177)
	at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:53)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1191)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:148)
	at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:122)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:94)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)

My source code is:

import sys, glob, os

sys.path.append("./hadoop-0.16.2-core.jar")
sys.path.extend(glob.glob("lib/*.jar"))
from org.apache.hadoop.fs import *
from org.apache.hadoop.conf import Configuration, Configured
from org.apache.hadoop.fs.s3 import S3FileSystem, Jets3tFileSystemStore
from java.net import URI
c = Configuration()
c.set("fs.default.name", "s3://loohad")
c.set("fs.s3.awsAccessKeyId", "1SKTFXXJKF5EXJ7S5202")
c.set("fs.s3.awsSecretAccessKey", "MyKEY")
fs = FileSystem.get(c)
print fs


Re: HDFS access/Jython examples

Posted by Ted Dunning <td...@veoh.com>.

Here is the implementation I use in Groovy.  Jython should be nearly as
concise except that Jython may expect more out of a file reader.  These are
methods on my HadoopFile abstraction.

    void withPrintWriter(Closure action) {
        OutputStream os = outputStream()
        PrintWriter pw = new PrintWriter(os)
        try {
            action(pw)
        } finally {
            pw?.close()
        }
    }

    private OutputStream outputStream() {
        Configuration conf = new Configuration()
        //        conf.set("fs.default.name", "metricsapp4:50020")
        FileSystem fs = org.apache.hadoop.fs.FileSystem.get(conf)
        if (fs instanceof LocalFileSystem) {
            new File(name).deleteOnExit()
        }
        fs.create(new Path(name));
    }

    /**
     * Same as for File
     */
    public void eachLine(Closure action) {
        Hadoop.local {
            Configuration conf = new Configuration()
            //        conf.set("fs.default.name", "metricsapp4:50020")
            FileSystem fs = org.apache.hadoop.fs.FileSystem.get(conf)
            def read = {part ->
                def f = new BufferedReader(new InputStreamReader(fs.open(part)))
                try {
                    f.eachLine(action)
                } finally {
                    f?.close()
                }
            }
            // sometimes the file is really a directory with part-* files,
            // sometimes it is a file.
            if (fs.isFile(new Path(name))) {
                for (part in fs.globPaths(new Path(name))) {read(part)}
            } else {
                for (part in fs.globPaths(new Path(name, "part-*"))) {read(part)}
            }
        }
    }
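For the original question's sake, a rough Jython translation of the eachLine method above might read as follows. This is an untested sketch: it assumes Jython 2.2 launched with the Hadoop 0.16 jars on sys.path (so `globPaths` and `isFile` are available as in the Groovy version), and it reads lines by hand since Jython may indeed expect more out of a file reader.

```python
# Hedged Jython sketch of the Groovy eachLine above; needs the Hadoop
# jars on sys.path and will not run under plain CPython.
from java.io import BufferedReader, InputStreamReader
from org.apache.hadoop.conf import Configuration
from org.apache.hadoop.fs import FileSystem, Path

def each_line(name, action):
    fs = FileSystem.get(Configuration())
    # sometimes the file is really a directory with part-* files,
    # sometimes it is a file
    if fs.isFile(Path(name)):
        parts = fs.globPaths(Path(name))
    else:
        parts = fs.globPaths(Path(name, "part-*"))
    for part in parts:
        f = BufferedReader(InputStreamReader(fs.open(part)))
        try:
            line = f.readLine()
            while line is not None:
                action(line)
                line = f.readLine()
        finally:
            f.close()
```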

On 4/7/08 7:48 AM, "Andreas Kostyrka" <an...@kostyrka.org> wrote:

> Hi!
> 
> I just wondered if there is some Jython example that shows how to access
> the HDFS from Jython, without running a mapreduce?
> 
> Andreas