Posted to common-user@hadoop.apache.org by Alex Parvulescu <al...@gmail.com> on 2010/03/12 17:45:52 UTC

http alternative for 'hadoop dfs -getmerge'?

Hello,

I'd like to know if there is any alternative to 'hadoop dfs -getmerge' over
HTTP. The closest I could find is the 'Download this file' link, but this is
available only for individual parts, not for the whole directory:
http://hadoop:50075/streamFile?filename=%2Fuser%2Fhadoop-user%2Foutput%2Fsolr%2F%2Fpart-00000
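
For reference, that link looks like nothing more than the HDFS path
percent-encoded into the 'filename' query parameter. A minimal sketch in
Python, assuming that pattern holds for every part; the function name
part_url is made up for illustration, and the host, port and path are taken
from the link above:

```python
from urllib.parse import quote


def part_url(host, port, hdfs_path):
    """Build a per-part 'Download this file' style URL (hypothetical helper).

    Assumes the streamFile link is simply the HDFS path with every
    reserved character, including '/', percent-encoded.
    """
    encoded = quote(hdfs_path, safe="")  # safe="" so '/' becomes %2F
    return "http://%s:%d/streamFile?filename=%s" % (host, port, encoded)


if __name__ == "__main__":
    print(part_url("hadoop", 50075, "/user/hadoop-user/output/solr//part-00000"))
```

This only reproduces the per-part link; it does not give a single URL for
the whole directory, which is what I am actually after.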

What I'd like to do is push something from Hadoop to Solr.
The options are:

1. run 'hadoop dfs -getmerge', which will get me a unified file (all the
parts as one file), scp that to the Solr server, then run the actual push to
Solr.

or

2. find a way to provide this file (unified, not just parts of it) via HTTP,
so that Solr (1.4) will be able to stream it.
One could add an HTTP server, but I see no point in adding Apache to this
mix, as long as the HDFS browser is already running on 50070.
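
To make the "unified" part concrete: what I want is roughly the result of
fetching each part over HTTP and appending the bodies in order. A rough
client-side sketch in Python, assuming the part URLs are already known (I
have not found a way to list them over HTTP); merge_parts and its opener
parameter are hypothetical names, not anything Hadoop or Solr provides:

```python
import shutil
from urllib.request import urlopen


def merge_parts(part_urls, out, opener=urlopen):
    """Append the body of each URL to `out`, in the order given.

    `out` is any binary file-like object. `opener` defaults to
    urllib's urlopen and exists only so the fetch can be swapped out.
    """
    for url in part_urls:
        with opener(url) as resp:
            # Stream in chunks so large parts are not held in memory.
            shutil.copyfileobj(resp, out)
```

For example, opening a local file with open("merged.out", "wb") and passing
it as `out` would produce something equivalent to the getmerge result, but
it still leaves a local copy to hand to Solr, so it is closer to option 1
than to a true streaming URL.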

I'd really like to go with 2, as it seems a lot easier.

What do you think?

thanks,
alex