Posted to common-user@hadoop.apache.org by Ralf Heyde <ra...@gmx.de> on 2011/09/09 17:41:21 UTC

Native HDFS Write Text & JAQL Execution

Hello again,

 

I think I have misunderstood something about writing files to HDFS and
processing them in JAQL.

 

I have some sample data, represented by a set of objects.

I transform these objects into a JSON string.

 

I'm writing the JSON data directly to an HDFS file through my HDFS client code:

 

-------------------------------------------
Configuration config = new Configuration();
// add the hadoop configuration files residing in the installation path of hadoop
config.addResource(new Path("core-site.xml"));
// pass the username and password required to access the HDFS (set up on the namenode)
config.set("hadoop.job.ugi", "hadoop, password");
FileSystem fs = FileSystem.get(config);

Path path = new Path("/sampledata");
fs.mkdirs( path );

Path file = new Path( path, "samplefile.json" );

FSDataOutputStream fos = fs.create( file );

// Collect Sample Data and
Collection<Entry> entries = MockFactory.createEntries();
// Build JSON and
JSONArray jsonArray = JSONBuilder.buildSomeTwitterJSON(entries);

// write JSON to HDFS
fos.writeBytes( jsonArray.toString() );
fos.flush();
fos.close();

fs.close();
-------------------------------------------
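
A side note on the writeBytes call above: FSDataOutputStream extends java.io.DataOutputStream, whose writeBytes keeps only the low eight bits of each character. For Twitter-style JSON with non-ASCII text this can silently corrupt the file. The sketch below uses a plain DataOutputStream over a byte array as a stand-in for the HDFS stream; the class name WriteBytesDemo and the sample string are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Illustrative stand-in: DataOutputStream over a byte array instead of an HDFS stream.
final class WriteBytesDemo {

    // Writes the string the way the post does: one byte per char, high bits dropped.
    static byte[] viaWriteBytes(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeBytes(s);
        }
        return buf.toByteArray();
    }

    // Encodes the string as UTF-8 first and writes the resulting bytes.
    static byte[] viaUtf8(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.write(s.getBytes(StandardCharsets.UTF_8));
        }
        return buf.toByteArray();
    }

    // Decodes the bytes as UTF-8 and checks whether the original text survived.
    static boolean roundTrips(byte[] written, String original) {
        return new String(written, StandardCharsets.UTF_8).equals(original);
    }

    public static void main(String[] args) throws IOException {
        String json = "[{\"text\":\"caf\u00e9 \u20ac\"}]"; // hypothetical sample record
        System.out.println("writeBytes round-trips: " + roundTrips(viaWriteBytes(json), json));
        System.out.println("UTF-8 round-trips:      " + roundTrips(viaUtf8(json), json));
    }
}
```

If the data is pure ASCII, both variants produce identical bytes, so this only matters once real tweet text is involved.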

Now I would like to run a JAQL script, but I get an error saying that the
input file is not a SequenceFile.

-------------------------------------------
// Read
$sampledata = read(hdfs("/sampledata/samplefile.json"));

// Query 1: filter and transform
$sampledata
  -> filter $.status_id == 1
  -> transform { $.authorurl, $.datum };
-------------------------------------------

 

Can someone give me a hint to correct my misunderstanding?

 

Thanks,

 

Ralf

RE: Native HDFS Write Text & JAQL Execution

Posted by Ralf Heyde <ra...@gmx.de>.

In addition: I found my mistake.

The JAQL documentation gave me a hint:
http://code.google.com/p/jaql/wiki/IO

// Options needed to read data as JSON from Hadoop Text File
jaql> txtInOpt = {format: "org.apache.hadoop.mapred.TextInputFormat",
                  converter: "com.ibm.jaql.io.hadoop.converter.FromJsonTextConverter"};
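
For anyone hitting the same error: it suggests that read(hdfs(...)) without options expects a Hadoop SequenceFile, while the client code above writes plain JSON text. A SequenceFile begins with the magic bytes 'S', 'E', 'Q' followed by a version byte, so a file starting with '[' or '{' is rejected. The following is only a rough sketch of that check, not the actual Hadoop code; the class and method names are invented for illustration.

```java
import java.nio.charset.StandardCharsets;

// Illustrative check, not Hadoop's implementation: does a file header look like a SequenceFile?
final class SequenceFileMagic {

    // A SequenceFile header starts with 'S','E','Q' and then a single version byte.
    static boolean looksLikeSequenceFile(byte[] head) {
        return head.length >= 4
                && head[0] == 'S'
                && head[1] == 'E'
                && head[2] == 'Q';
    }

    public static void main(String[] args) {
        // First bytes of a plain JSON text file, as written by the client code in this thread.
        byte[] jsonText = "[{\"status_id\":1}]".getBytes(StandardCharsets.UTF_8);
        // First bytes of a SequenceFile (the version byte 6 is just an example value).
        byte[] seqHeader = {'S', 'E', 'Q', 6};

        System.out.println("JSON text:    " + looksLikeSequenceFile(jsonText));
        System.out.println("SequenceFile: " + looksLikeSequenceFile(seqHeader));
    }
}
```

The txtInOpt options above avoid the problem from the other side: they tell JAQL to read the file with TextInputFormat and convert the text to JSON instead of expecting a SequenceFile.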


