Posted to mapreduce-user@hadoop.apache.org by Kevin Savage <ke...@hotmail.com> on 2012/04/04 23:00:11 UTC

Sharing data between maps

Hi,

I'm currently working on some simulation software that models engineering facilities. As input we have two big chunks of data: one about the design of the site and one about the climate the site is in.

As we have an extensive set of climate data (about 1000 locations), we thought it would be interesting to ask "given this design, where would be best to build it?". Hadoop seemed like a good fit.

However, what we have is one big file of design data that needs to go to all the maps and many big files of climate data that need to go to one map each. I've not been able to work out if there is a good way of doing this in Hadoop. 

Is there a good way of doing something like this?

Kevin

Re: Sharing data between maps

Posted by Kevin Savage <ke...@hotmail.com>.
On 4 Apr 2012, at 22:07, John Armstrong <jr...@ccri.com> wrote:

> On 04/04/2012 05:00 PM, Kevin Savage wrote:
>> However, what we have is one big file of design data that needs to go to all the maps and many big files of climate data that need to go to one map each. I've not been able to work out if there is a good way of doing this in Hadoop.
> 
> It sounds like "one big file" belongs on the DistributedCache, while the "many big files" should be set up as the input using some subclass of FileInputFormat
> 
> hth
> 
> 

Hi John, 

Sounds good, sounds like I just needed pointing at the right words!

Thanks,
Kevin

Re: Sharing data between maps

Posted by John Armstrong <jr...@ccri.com>.
On 04/04/2012 05:00 PM, Kevin Savage wrote:
> However, what we have is one big file of design data that needs to go to all the maps and many big files of climate data that need to go to one map each. I've not been able to work out if there is a good way of doing this in Hadoop.

It sounds like "one big file" belongs on the DistributedCache, while the 
"many big files" should be set up as the input using some subclass of 
FileInputFormat.

hth
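
For anyone finding this thread later, here is a rough sketch of how that suggestion might be wired up. It is not code from the thread. It assumes the org.apache.hadoop.mapreduce API, and the names SiteSelectionJob, WholeFileTextInputFormat and SimulationMapper, the line-oriented climate files, and the three command-line paths are all made up for illustration. The mapper only counts climate records where the real simulation would go. In later Hadoop releases the DistributedCache class is deprecated in favour of Job.addCacheFile, so treat this as a starting point only.

```java
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver: one shared design file, one climate file per map task.
class SiteSelectionJob {

    /** Refuses to split files, so each climate file goes to exactly one map task. */
    public static class WholeFileTextInputFormat extends TextInputFormat {
        @Override
        protected boolean isSplitable(JobContext context, Path file) {
            return false;
        }
    }

    /** Placeholder mapper: finds the cached design file, then reads one climate file. */
    public static class SimulationMapper extends Mapper<LongWritable, Text, Text, Text> {
        private Path designFile;    // local copy of the shared design data
        private long climateLines;  // stand-in for real simulation state

        @Override
        protected void setup(Context context) throws IOException {
            Path[] cached = DistributedCache.getLocalCacheFiles(context.getConfiguration());
            if (cached == null || cached.length == 0) {
                throw new IOException("design file not found in the distributed cache");
            }
            designFile = cached[0];
            // A real job would parse the design data from designFile here.
        }

        @Override
        protected void map(LongWritable offset, Text line, Context context) {
            climateLines++; // a real job would feed this climate record to the simulation
        }

        @Override
        protected void cleanup(Context context) throws IOException, InterruptedException {
            // Emit one result per climate file, keyed by the file name.
            String climateFile = ((FileSplit) context.getInputSplit()).getPath().getName();
            context.write(new Text(climateFile), new Text(climateLines + " climate records read"));
        }
    }

    /** Builds the job: design file on the cache, climate directory as the input. */
    static Job configure(Configuration conf, String designUri, String climateDir, String outDir)
            throws IOException {
        Job job = new Job(conf, "site-selection");
        job.setJarByClass(SiteSelectionJob.class);

        // The one big design file is shipped to every node that runs a task.
        DistributedCache.addCacheFile(URI.create(designUri), job.getConfiguration());

        // The many climate files are the job input, one unsplit file per map.
        job.setInputFormatClass(WholeFileTextInputFormat.class);
        FileInputFormat.addInputPath(job, new Path(climateDir));

        job.setMapperClass(SimulationMapper.class);
        job.setNumReduceTasks(0); // map-only; add a reducer to rank the sites
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path(outDir));
        return job;
    }

    public static void main(String[] args) throws Exception {
        // args: <design file URI> <climate input dir> <output dir>
        Job job = configure(new Configuration(), args[0], args[1], args[2]);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Overriding isSplitable is what gives the "one climate file per map" behaviour. With a stock TextInputFormat, a climate file larger than one block would be split across several maps. If the climate files are not line-oriented, a custom RecordReader that hands over the whole file would be the next thing to look at.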