Posted to common-user@hadoop.apache.org by Andrew Nguyen <an...@ucsfcti.org> on 2010/05/03 19:36:34 UTC

HDF5 and Hadoop

Does anyone know of any existing work integrating HDF5 (http://www.hdfgroup.org/HDF5/whatishdf5.html) with Hadoop?

I don't know much about HDF5, but it was recently brought to my attention as a way to store high-density scientific data. Since I've confirmed that Hadoop dramatically speeds up our analysis, marrying the two seems like it might have some benefits.

I've searched Google, but not much turns up.

Thanks!

--Andrew

Re: HDF5 and Hadoop

Posted by Andrew Nguyen <an...@ucsfcti.org>.
Chris,

Thanks for the heads up!

--Andrew

On May 3, 2010, at 10:45 AM, Mattmann, Chris A (388J) wrote:

> Hi Andrew,
> 
> There has been some recent work in the Tika [1] project on parsing NetCDF4 [2] and HDF4/5 [3] files and extracting metadata/text content from them. Though this doesn't directly answer your question below, it might be worth looking at how to marry Tika and Hadoop in that regard.
> 
> HTH!
> 
> Cheers,
> Chris
> 
> [1] http://lucene.apache.org/tika/
> [2] http://issues.apache.org/jira/browse/TIKA-400
> [3] https://issues.apache.org/jira/browse/TIKA-399

Re: HDF5 and Hadoop

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi Andrew,

There has been some recent work in the Tika [1] project on parsing NetCDF4 [2] and HDF4/5 [3] files and extracting metadata/text content from them. Though this doesn't directly answer your question below, it might be worth looking at how to marry Tika and Hadoop in that regard.

HTH!

Cheers,
Chris

[1] http://lucene.apache.org/tika/
[2] http://issues.apache.org/jira/browse/TIKA-400
[3] https://issues.apache.org/jira/browse/TIKA-399


On 5/3/10 10:36 AM, "Andrew Nguyen" <an...@ucsfcti.org> wrote:

Does anyone know of any existing work integrating HDF5 (http://www.hdfgroup.org/HDF5/whatishdf5.html) with Hadoop?

I don't know much about HDF5, but it was recently brought to my attention as a way to store high-density scientific data. Since I've confirmed that Hadoop dramatically speeds up our analysis, marrying the two seems like it might have some benefits.

I've searched Google, but not much turns up.

Thanks!

--Andrew



++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: Chris.Mattmann@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++