Posted to user@hbase.apache.org by Stuart Scott <St...@e-mis.com> on 2011/01/10 07:09:05 UTC

Connecting to Hadoop from Windows

Hi,

 

Can anyone offer any advice please?

We are successfully running a Hadoop cluster (HBase/Hive etc.) but need
to access it from a Windows operating system (for various legacy
reasons!). We need to be able to put and get files in and out of HDFS
and also access HBase.

 

Has anyone achieved this? 

Am I on to a non-starter?

 

I have managed to build a Java application using the latest
hadoop-core-0.20.2+320.jar etc. It compiles and runs but gives various
errors when trying to reference Hadoop, e.g. opening an InputStream:

Exception in thread "main"
java.net.MalformedURLException: unknown protocol: hdfs
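(Aside: this exception usually means nothing has registered a handler for the hdfs:// scheme with java.net.URL; Hadoop ships org.apache.hadoop.fs.FsUrlStreamHandlerFactory for exactly that. A self-contained sketch of the mechanism, using a hypothetical toy handler in place of Hadoop's so it runs without a cluster:)

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;

public class SchemeDemo {
    public static void main(String[] args) throws Exception {
        // Without a registered handler, java.net.URL rejects unknown schemes.
        boolean failed = false;
        try {
            new URL("hdfs://namenode:9000/files/movie.avi");
        } catch (MalformedURLException e) {
            failed = true; // "unknown protocol: hdfs"
        }
        System.out.println("before registration failed=" + failed);

        // Register a factory for the scheme. This is what Hadoop's
        // FsUrlStreamHandlerFactory does for hdfs:// (it may be set only
        // once per JVM). The toy handler below just serves a stub stream.
        URL.setURLStreamHandlerFactory(new URLStreamHandlerFactory() {
            public URLStreamHandler createURLStreamHandler(String protocol) {
                if (!"hdfs".equals(protocol)) return null;
                return new URLStreamHandler() {
                    protected URLConnection openConnection(URL u) {
                        return new URLConnection(u) {
                            public void connect() { }
                            public InputStream getInputStream() {
                                return new ByteArrayInputStream("stub".getBytes());
                            }
                        };
                    }
                };
            }
        });

        // Now the URL parses, and openStream() would go through our handler.
        URL url = new URL("hdfs://namenode:9000/files/movie.avi");
        System.out.println("after registration host=" + url.getHost());
    }
}
```

In practice, calling the FileSystem API directly (as in the example further down the thread) avoids java.net.URL altogether and is the more common route.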

 

Would we need the full Hadoop install on the client (which isn't part
of the cluster)?

 

Will this actually work?

 

Any advice would be gratefully received.

 

Regards

 

Stuart Scott

System Architect 
emis intellectual technology 
Fulford Grange, Micklefield Lane 
Rawdon Leeds LS19 6BA 
E-mail: stuart.scott@e-mis.com
Website: www.emisit.com

Privileged and/or Confidential information may be contained in this
message. If you are not the original addressee indicated in this message
(or responsible for delivery of the message to such person), you may not
copy or deliver this message to anyone. In such case, please delete this
message, and notify us immediately. Opinions, conclusions and other
information expressed in this message are not given or endorsed by EMIS
nor can I conclude contracts on its behalf unless otherwise indicated by
an authorised representative independently of this message. 

EMIS reserves the right to monitor, intercept and (where appropriate)
read all incoming and outgoing communications. By replying to this
message and where necessary you are taken as being aware of and giving
consent to such monitoring, interception and reading.


EMIS is a trading name of Egton Medical Information Systems Limited.
Registered in England. No 2117205. Registered Office: Fulford Grange,
Micklefield Lane, Rawdon, Leeds, LS19 6BA

 


Re: Connecting to Hadoop from Windows

Posted by Tost <nc...@gmail.com>.
Use the Hadoop client library; see the example below.



======================
package com.hadoop;

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Hadoop file upload/download test.
 * (If you want to store big files, don't put them in HBase; write them
 * to HDFS directly.)
 * @author ncanis@gmail.com
 */
public class HadoopTest {

    private Configuration conf = null;

    public HadoopTest() {
        conf = new Configuration();
    }

    private FileSystem getDFS() throws IOException {
        return FileSystem.get(conf);
    }

    public void close(OutputStream os) {
        if (os == null) return;
        try { os.close(); } catch (IOException e) { /* ignore on close */ }
    }

    public void close(InputStream is) {
        if (is == null) return;
        try { is.close(); } catch (IOException e) { /* ignore on close */ }
    }

    public boolean uploadFile(Path directory, String filename, InputStream is,
            boolean overwrite) {
        boolean isSuccess = false;
        FSDataOutputStream fos = null;
        try {
            Path p = new Path(directory, new Path(filename));
            FileSystem fs = getDFS();
            if (!fs.getFileStatus(directory).isDir()) {
                throw new IOException(directory + " isn't a directory.");
            } else if (fs.exists(p)) {
                if (overwrite) {
                    delete(p, true);
                } else {
                    throw new IOException(p + " already exists.");
                }
            }
            fos = fs.create(p);
            BufferedInputStream bis = new BufferedInputStream(is);
            // In real code, prefer IOUtils.copyBytes(bis, fos, 8192, true);
            copyBytes(bis, fos, 8192, true);
            isSuccess = true;
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            close(fos);
        }
        return isSuccess;
    }

    /**
     * Copy bytes from is to os, optionally logging progress.
     * (You should use IOUtils.copyBytes(is, os, 8192, true) instead.)
     * @param is source stream
     * @param os destination stream
     * @param bufferSize copy buffer size in bytes
     * @param isLog whether to log percentage progress
     * @throws IOException on read/write failure
     */
    private void copyBytes(InputStream is, OutputStream os, int bufferSize,
            boolean isLog) throws IOException {
        int length = 0;
        long current = 0;
        long total = is.available(); // estimate only; fine for local files
        long progress = 0;
        byte[] buf = new byte[bufferSize];
        while ((length = is.read(buf)) != -1) {
            os.write(buf, 0, length);
            current += length;
            if (isLog && total > 0) {
                long now = current * 100 / total;
                if (progress != now) {
                    log("progress=", now, "%");
                    progress = now;
                }
            }
        }
    }

    public void delete(Path p, boolean recursive) throws IOException {
        FileSystem fs = getDFS();
        fs.delete(p, recursive);
    }

    public static final void log(Object... os) {
        StringBuilder sb = new StringBuilder();
        for (Object s : os) {
            sb.append(s);
        }
        System.out.println(sb.toString());
    }

    /**
     * Open an HDFS path as a stream.
     * After use, the caller must close the returned stream.
     * @param p path to open
     * @param bufferSize read buffer size
     * @return an open input stream on p
     * @throws IOException if the path cannot be opened
     */
    public InputStream path2Stream(Path p, int bufferSize) throws IOException {
        FileSystem fs = getDFS();
        FSDataInputStream fis = fs.open(p, bufferSize);
        return fis;
    }

    public boolean copyFile2Local(Path path, File f) {
        boolean isSuccess = false;
        try {
            InputStream is = path2Stream(path, 8192);
            write2File(is, f);
            isSuccess = true;
        } catch (IOException e) {
            e.printStackTrace();
        }
        return isSuccess;
    }

    /**
     * Read a stream (e.g. opened from HDFS) and write it to a local file.
     * @param is source stream (closed on return)
     * @param f destination file
     * @return true on success
     */
    public boolean write2File(InputStream is, File f) {
        boolean isSuccess = false;
        FileOutputStream fos = null;
        try {
            BufferedInputStream bis = new BufferedInputStream(is);
            fos = new FileOutputStream(f);
            // In real code, prefer IOUtils.copyBytes(bis, fos, 8192, true);
            copyBytes(bis, fos, 8192, true);
            isSuccess = true;
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            close(is);
            close(fos);
        }
        return isSuccess;
    }

    long start = 0L;

    private void start() {
        start = System.currentTimeMillis();
    }

    private long end() {
        return System.currentTimeMillis() - start;
    }

    /**
     * @param args unused
     * @throws IOException on upload failure
     */
    public static void main(String[] args) throws IOException {
        HadoopTest ht = new HadoopTest();

        File file = new File("doc/movie.avi");
        FileInputStream fis = new FileInputStream(file);
        Path rootPath = new Path("/files/");
        String filename = "movie.avi";

        ht.start();
        boolean isUploaded = ht.uploadFile(rootPath, filename, fis, true);
        log("uploaded=", isUploaded, " size=", file.length(),
                " bytes. time=", ht.end(), " ms");

        ht.start();
        ht.copyFile2Local(new Path(rootPath, filename),
                new File("test/" + filename));
        log("saved file time=", ht.end());
    }
}
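To Stuart's question about needing the full Hadoop install on the client: a client machine outside the cluster only needs the Hadoop jars on its classpath plus a configuration that points FileSystem.get(conf) at the cluster. A sketch of the relevant core-site.xml entry (the hostname and port are placeholders -- substitute your NameNode's address):

```xml
<configuration>
  <!-- Tell the client where the remote NameNode lives (placeholder address) -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
</configuration>
```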



2011/1/10 Stuart Scott <St...@e-mis.com>

> [quoted text of the original message trimmed]

Re: Connecting to Hadoop from Windows

Posted by Stack <st...@duboce.net>.
Does this help, http://hbase.apache.org/docs/r0.89.20100924/cygwin.html?
St.Ack

On Sun, Jan 9, 2011 at 10:41 PM, Ryan Rawson <ry...@gmail.com> wrote:
> [quoted text trimmed]

Re: Connecting to Hadoop from Windows

Posted by Ryan Rawson <ry...@gmail.com>.
People have run Hadoop on Windows; the primary code base is pure Java.
The main issue is that the launch scripts are bash, but running the apps
without the scripts isn't too hard.
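For example, a client app can be launched with plain java instead of the bin/hadoop bash wrapper; a sketch (the jar name, paths, and NameNode address are assumptions for illustration -- adjust for your install):

```shell
rem Windows-style classpath (';' separator, cmd line continuation with ^).
java -cp "hadoop-core-0.20.2+320.jar;lib\*;conf" ^
  org.apache.hadoop.fs.FsShell -fs hdfs://namenode:9000 -ls /
```

The -fs flag is one of Hadoop's generic options, so FsShell run this way behaves like "hadoop fs" pointed at the remote cluster.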
On Jan 9, 2011 10:09 PM, "Stuart Scott" <St...@e-mis.com> wrote:
> [quoted text of the original message trimmed]