Posted to user@flume.apache.org by kashif khan <dr...@gmail.com> on 2012/11/19 11:44:34 UTC

Automatically upload files into HDFS

Hi,

I am continuously generating files in a local folder on my base machine. How
can I use Flume to stream the generated files from the local folder to
HDFS?
I don't know exactly how to configure the sources, sinks, and HDFS.

1) location of the folder where files are generated: /usr/datastorage/
2) name node address: hdfs://hadoop1.example.com:8020

Please help me.

Many thanks

Best regards,
KK
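
A minimal flume-ng configuration for this kind of pipeline could look like the
following. This is a sketch, assuming Flume 1.3+ (which ships the spooling
directory source); the agent name, channel capacity, and target HDFS path are
illustrative, not taken from this thread:

agent.sources = spool
agent.channels = mem
agent.sinks = hdfs-out

# Watch the local folder for new, fully written files
agent.sources.spool.type = spooldir
agent.sources.spool.spoolDir = /usr/datastorage
agent.sources.spool.channels = mem

# Buffer events in memory between source and sink
agent.channels.mem.type = memory
agent.channels.mem.capacity = 10000

# Write the events into HDFS as plain data streams
agent.sinks.hdfs-out.type = hdfs
agent.sinks.hdfs-out.channel = mem
agent.sinks.hdfs-out.hdfs.path = hdfs://hadoop1.example.com:8020/user/root/datastorage
agent.sinks.hdfs-out.hdfs.fileType = DataStream

Note that the spooling directory source expects files to be complete and
immutable once they land in the watched folder; ingested files are renamed
with a .COMPLETED suffix so they are not shipped twice.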

Re: Automatically upload files into HDFS

Posted by Mohammad Tariq <do...@gmail.com>.
BTW, Alex has got a point. You could write a cronjob or something, as you
just have to move data from your local FS to HDFS.
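
For example, a single crontab line can ship whatever has accumulated (a
sketch; -moveFromLocal deletes the local copy after a successful upload, which
keeps the same files from being re-sent, cron's finest granularity is one
minute, and files still being written when the job fires would need extra
care):

* * * * * hadoop fs -moveFromLocal /usr/datastorage/* /user/root/datastorage/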

Regards,
    Mohammad Tariq




Re: Automatically upload files into HDFS

Posted by Mohammad Tariq <do...@gmail.com>.
I am so sorry for the blunder. I was doing something with the Twitter
API and copied that link by mistake. Apologies. Please use this link:
http://cloudfront.blogspot.in/2012/06/how-to-build-and-use-flume-ng.html

Regards,
    Mohammad Tariq




Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
Thanks, M. Tariq.

I have tried to visit the link, but I think it is not accessible, as it
generates the following error message:

 Whoa there!

The request token for this page is invalid. It may have already been used,
or expired because it is too old. Please go back to the site or application
that sent you here and try again; it was probably just a mistake.

   - Go to Twitter <http://twitter.com/home>.

 You can revoke access to any application at any time from the Applications
tab <http://twitter.com/settings/applications> of your Settings page.

By authorizing an application you continue to operate under Twitter's Terms
of Service <http://twitter.com/tos>. In particular, some usage information
will be shared back with Twitter. For more, see our Privacy
Policy <http://twitter.com/privacy>.



Best regards,

KK





Re: Automatically upload files into HDFS

Posted by Mohammad Tariq <do...@gmail.com>.
Hello Kashif,

    You can visit this link and see if it is of any help to you. I have
shared some of my initial experience here.
http://api.twitter.com/oauth/authorize?oauth_token=ndACNGIkLSeMJdeMIeQYowyzpjDtvvmqo5ja9We7zo

You may want to skip the build part and download the release directly and
start off with that.
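
Once a release is unpacked, an agent is started along these lines (a sketch;
the --name value must match the agent name used inside the configuration
file, and the paths are illustrative):

$ bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name agent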

Regards,
    Mohammad Tariq




Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
Dear Shekhar,

I am still struggling. I have written some Java code; it does work, but not
100%, as it sometimes sends files with 0 KB of data. If you have any solution,
please let me know. I shall be very grateful to you.
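
For what it is worth, the 0 KB symptom usually means a file was picked up
while the producer was still writing it. A minimal sketch of one way to avoid
that, by only shipping files whose size has stopped changing (assumptions:
the poller runs on the machine generating the files; the paths and the
STABLE_MS threshold are illustrative):

import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StableFileUploader {

    // A file is treated as complete once its length is unchanged and
    // non-zero across two polls this many milliseconds apart.
    private static final long STABLE_MS = 5000;

    public static void main(String[] args) throws IOException, InterruptedException {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);

        File localDir = new File("/usr/datastorage");
        Map<String, Long> lastSizes = new HashMap<String, Long>();

        while (true) {
            File[] files = localDir.listFiles();
            if (files != null) {
                for (File f : files) {
                    Long previous = lastSizes.get(f.getName());
                    if (previous != null && previous == f.length() && f.length() > 0) {
                        // Stable and non-zero: safe to ship, then drop the
                        // local copy so it is not uploaded twice.
                        fs.copyFromLocalFile(new Path(f.getAbsolutePath()),
                                             new Path("/user/root/datastorage/" + f.getName()));
                        f.delete();
                        lastSizes.remove(f.getName());
                    } else {
                        lastSizes.put(f.getName(), f.length());
                    }
                }
            }
            Thread.sleep(STABLE_MS);
        }
    }
}

Deleting after upload plays the same role as hadoop fs -moveFromLocal in the
cron variant suggested earlier: it keeps a file from being shipped twice.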

Many thanks

Best regards

On Mon, Nov 26, 2012 at 4:42 PM, shekhar sharma <sh...@gmail.com> wrote:

> Hello Kashif,
> Sorry for the late reply... Are you done, or are you still struggling?
>
> Mail me: shekhar2581@gmail.com
> Regards,
> Som Shekhar Sharma
>
>
>
> On Wed, Nov 21, 2012 at 6:06 PM, kashif khan <dr...@gmail.com> wrote:
>
>> Dear Shekhar Sharma,
>>
>> I am using Eclipse as my IDE. I don't have any idea how to create the
>> project as a Maven project. I have downloaded Maven 2, but it gave me some
>> strange errors, so if you can help me, I will try Maven. Actually, I am
>> trying to automatically upload the files into HDFS and will then apply some
>> algorithms to analyze the data. The algorithms will be implemented in
>> MapReduce. So if you think Maven will be good for me, please let me know
>> how I can create the project as a Maven project.
>>
>>
>> Many thanks
>>
>> Best regards,
>>
>> KK
>>
>>
>>
>> On Tue, Nov 20, 2012 at 7:06 PM, shekhar sharma <sh...@gmail.com> wrote:
>>
>>> By the way, how are you building and running your project? Are you
>>> running it from an IDE?
>>> The best practices you can follow:
>>>
>>> (1) Create your project as a Maven project and declare a dependency on
>>> hadoop-X.Y.Z (see the pom.xml sketch below). Your project will then
>>> automatically have all the necessary jars, and I am sure you will not
>>> face these kinds of errors.
>>> (2) In your HADOOP_CLASSPATH, provide the path for $HADOOP_LIB.
>>>
>>> Regards,
>>> Som
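
For CDH4, that dependency would look something like the following in pom.xml
(a sketch; the version string and the Cloudera repository entry should be
checked against the exact CDH release in use):

<repositories>
    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.0.0-cdh4.1.2</version>
    </dependency>
</dependencies>

Mixing these with the hadoop-1.0.4 jars mentioned later in this thread would
reintroduce the IPC version mismatch, so any old jars should be removed from
the build path.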
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Nov 20, 2012 at 9:52 PM, kashif khan <dr...@gmail.com> wrote:
>>>
>>>> Dear Tariq
>>>>
>>>> Many thanks, finally I have created the directory and uploaded the file.
>>>>
>>>> Once again many thanks
>>>>
>>>> Best regards
>>>>
>>>>
>>>> On Tue, Nov 20, 2012 at 3:04 PM, kashif khan <dr...@gmail.com> wrote:
>>>>
>>>>> Dear Tariq, many thanks.
>>>>>
>>>>>
>>>>> I have downloaded the jar file and added it to the project. Now I am
>>>>> getting another error:
>>>>>
>>>>> log4j:WARN No appenders could be found for logger
>>>>> (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
>>>>> log4j:WARN Please initialize the log4j system properly.
>>>>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
>>>>> Exception in thread "main" java.io.IOException: No FileSystem for
>>>>> scheme: hdfs
>>>>>     at
>>>>> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2206)
>>>>>     at
>>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2213)
>>>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
>>>>>     at
>>>>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
>>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
>>>>>
>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>>>     at CopyFile.main(CopyFile.java:14)
>>>>>
>>>>> Do you have any idea about this?
>>>>>
>>>>> Thanks again
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Nov 20, 2012 at 2:53 PM, Mohammad Tariq <do...@gmail.com> wrote:
>>>>>
>>>>>> You can download the jar here :
>>>>>> http://search.maven.org/remotecontent?filepath=com/google/guava/guava/13.0.1/guava-13.0.1.jar
>>>>>>
>>>>>> Regards,
>>>>>>     Mohammad Tariq
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 20, 2012 at 8:06 PM, kashif khan <dr...@gmail.com> wrote:
>>>>>>
>>>>>>> Could you please let me know the name of the jar file and its location?
>>>>>>>
>>>>>>> Many thanks
>>>>>>>
>>>>>>> Best regards
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 20, 2012 at 2:33 PM, Mohammad Tariq <do...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Download the required jar and include it in your project.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>     Mohammad Tariq
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Nov 20, 2012 at 7:57 PM, kashif khan <
>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Dear Tariq, thanks.
>>>>>>>>>
>>>>>>>>> I have added the jar files from CDH, downloaded the CDH4 Eclipse
>>>>>>>>> plugin, and copied it into the Eclipse plugins folder. The previous error
>>>>>>>>> is, I think, sorted out, but now I am getting another strange error.
>>>>>>>>>
>>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>>> com/google/common/collect/Maps
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.metrics2.lib.MetricsRegistry.<init>(MetricsRegistry.java:42)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:87)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:133)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2373)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2365)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2233)
>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>>>>>>>     at CopyFile.main(CopyFile.java:14)
>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>> com.google.common.collect.Maps
>>>>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>>>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>>>>>>>>>     at
>>>>>>>>> sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>>>>>>>>>     ... 13 more
>>>>>>>>>
>>>>>>>>> Do you have any idea about this error?
>>>>>>>>>
>>>>>>>>> Many thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Nov 20, 2012 at 2:19 PM, Mohammad Tariq <
>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hello Kashif,
>>>>>>>>>>
>>>>>>>>>>      You are correct. This is because of a version mismatch. I am
>>>>>>>>>> not using CDH personally, but AFAIK CDH4 uses Hadoop 2.x.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <
>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi M. Tariq,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I am trying the following program to create a directory and
>>>>>>>>>>> copy a file to HDFS, but I am getting the following errors:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Program:
>>>>>>>>>>>
>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>
>>>>>>>>>>> public class CopyFile {
>>>>>>>>>>>
>>>>>>>>>>>     public static void main(String[] args) throws IOException {
>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>         // The default FS needs the hdfs:// scheme on the URI.
>>>>>>>>>>>         conf.set("fs.default.name", "hdfs://hadoop1.example.com:8020");
>>>>>>>>>>>         FileSystem dfs = FileSystem.get(conf);
>>>>>>>>>>>         String dirName = "Test1";
>>>>>>>>>>>         Path src = new Path(dfs.getWorkingDirectory() + "/" + dirName);
>>>>>>>>>>>         dfs.mkdirs(src);
>>>>>>>>>>>         // Copy the local file (not the freshly created directory)
>>>>>>>>>>>         // into the new HDFS directory.
>>>>>>>>>>>         Path src1 = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>         Path dst = new Path(dfs.getWorkingDirectory() + "/Test1/");
>>>>>>>>>>>         dfs.copyFromLocalFile(src1, dst);
>>>>>>>>>>>     }
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>     Exception in thread "main"
>>>>>>>>>>> org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot
>>>>>>>>>>> communicate with client version 4
>>>>>>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>>>>>>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>>>>>     at $Proxy1.getProtocolVersion(Unknown Source)
>>>>>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>>>>>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>>>>>>>>>>>     at
>>>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>>>>>>>>>>>     at
>>>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>>>>>>>>>>>     at
>>>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>>>>>>>>>>>     at
>>>>>>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>>>>>>>>>>>     at
>>>>>>>>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>>>>>>>>>>>     at
>>>>>>>>>>> org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>>>>>>>>>     at
>>>>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>>>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>>>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>>>>>>>>>>>     at CopyFile.main(CopyFile.java:11)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I am using CDH4.1. I have downloaded the source of hadoop-1.0.4
>>>>>>>>>>> and imported the jar files into Eclipse. I think it is due to a
>>>>>>>>>>> version problem. Could you please let me know which version is
>>>>>>>>>>> correct for CDH4.1?
>>>>>>>>>>>
>>>>>>>>>>> Many thanks
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <
>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> It should work; the same code is working fine for me. Try creating
>>>>>>>>>>>> some other directory in your HDFS and use it as your output path. Also see
>>>>>>>>>>>> if you find anything in the datanode logs.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <
>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> The input path is fine; the problem is in the output path. I just
>>>>>>>>>>>>> wonder why it copies the data onto the local disk (/user/root/), not into
>>>>>>>>>>>>> HDFS. I don't know why. Did we give the correct statement to point to HDFS?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <
>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Try this as your input file path
>>>>>>>>>>>>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <
>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When I apply the command
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> $ hadoop fs -put /usr/Eclipse/Output.csv /user/root/Output.csv
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> it works fine and the file shows up when browsing HDFS. But I don't
>>>>>>>>>>>>>>> know why it does not work in the program.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Many thanks for your cooperation.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <
>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It would be good if I could have a look at the files.
>>>>>>>>>>>>>>>> Meanwhile, try some other directories. Also, check the directory
>>>>>>>>>>>>>>>> permissions once.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <
>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have tried it as the root user and made the following
>>>>>>>>>>>>>>>>> changes:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> No result. The following is the log output; the log shows that
>>>>>>>>>>>>>>>>> the destination is null.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>>>>>>>>>>>>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>>>>>>>>>>>>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>>>>>>>>>>>>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <
>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yeah, my cluster is running. When I browse
>>>>>>>>>>>>>>>>>> http://hadoop1.example.com:50070/dfshealth.jsp I get the main
>>>>>>>>>>>>>>>>>> page. Then I click on "Browse the filesystem" and get the
>>>>>>>>>>>>>>>>>> following:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> hbase
>>>>>>>>>>>>>>>>>> tmp
>>>>>>>>>>>>>>>>>> user
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> And when I click on user I get:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> beeswax
>>>>>>>>>>>>>>>>>> huuser (I have created)
>>>>>>>>>>>>>>>>>> root (I have created)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Would you like to see my configuration files? I did not change
>>>>>>>>>>>>>>>>>> anything; it is all at the defaults. I have installed CDH4.1,
>>>>>>>>>>>>>>>>>> running on VMs.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Is your cluster running fine? Are you able to browse
>>>>>>>>>>>>>>>>>>> HDFS through the HDFS web console at 50070?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <
>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Many thanks.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I have changed the program accordingly. It does not
>>>>>>>>>>>>>>>>>>>> show any error, just one warning; but when I browse the HDFS folder,
>>>>>>>>>>>>>>>>>>>> the file has not been copied.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>>>>>>>> public static void main(String[] args) throws
>>>>>>>>>>>>>>>>>>>> IOException{
>>>>>>>>>>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>>>>         //Configuration configuration = new
>>>>>>>>>>>>>>>>>>>> Configuration();
>>>>>>>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>         conf.addResource(new
>>>>>>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>>         conf.addResource(new Path
>>>>>>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>>>>>>>> Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>>>>>>>         Path outputFile = new
>>>>>>>>>>>>>>>>>>>> Path("/user/hduser/Output1.csv");
>>>>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 19-Nov-2012 13:50:32
>>>>>>>>>>>>>>>>>>>> org.apache.hadoop.util.NativeCodeLoader <clinit>
>>>>>>>>>>>>>>>>>>>> WARNING: Unable to load native-hadoop library for your
>>>>>>>>>>>>>>>>>>>> platform... using builtin-java classes where applicable
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Any idea?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> If it is just copying the files without any processing
>>>>>>>>>>>>>>>>>>>>> or change, you can use something like this:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>  public class CopyData {
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>     public static void main(String[] args) throws
>>>>>>>>>>>>>>>>>>>>> IOException{
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>         Configuration configuration = new
>>>>>>>>>>>>>>>>>>>>> Configuration();
>>>>>>>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>>>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Obviously you have to modify it as per your requirements,
>>>>>>>>>>>>>>>>>>>>> such as continuously polling the targeted directory for new files.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <
>>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks, M. Tariq.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> As I am new to Java and Hadoop and do not have much
>>>>>>>>>>>>>>>>>>>>>> experience, I am trying first to write a simple program to upload data into
>>>>>>>>>>>>>>>>>>>>>> HDFS and gradually move forward. I have written the following simple
>>>>>>>>>>>>>>>>>>>>>> program to upload a file into HDFS; I don't know why it does not work.
>>>>>>>>>>>>>>>>>>>>>> Could you please check it, if you have time?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> public static void main(String[] args) throws IOException
>>>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>>>     try {
>>>>>>>>>>>>>>>>>>>>>>     Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>>>>>>     conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>>>>     conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>>>>>>>>>>>>     String dest = "/user/hduser/input/";
>>>>>>>>>>>>>>>>>>>>>>     String fileName = "Output1.csv";
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>     if (dest.charAt(dest.length() - 1) != '/')
>>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>>         dest = dest + "/" + fileName;
>>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>>     else
>>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>>         dest = dest + fileName;
>>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>>     Path path = new Path(dest);
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>     if (fileSystem.exists(path))
>>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>>         System.out.println("File " + dest + " already exists");
>>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>>>>>>>>>>>>    InputStream in = new BufferedInputStream(new
>>>>>>>>>>>>>>>>>>>>>> FileInputStream(new File(source)));
>>>>>>>>>>>>>>>>>>>>>>    // Stream with a fixed-size buffer. Sizing the buffer from the
>>>>>>>>>>>>>>>>>>>>>>    // file's current length spins forever on an empty (0 KB) file,
>>>>>>>>>>>>>>>>>>>>>>    // since read() on a zero-length buffer always returns 0.
>>>>>>>>>>>>>>>>>>>>>>    byte[] b = new byte[4096];
>>>>>>>>>>>>>>>>>>>>>>    int numbytes;
>>>>>>>>>>>>>>>>>>>>>>    while ((numbytes = in.read(b)) > 0)
>>>>>>>>>>>>>>>>>>>>>>    {
>>>>>>>>>>>>>>>>>>>>>>        out.write(b, 0, numbytes);
>>>>>>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>>>>>>    in.close();
>>>>>>>>>>>>>>>>>>>>>>    out.close();
>>>>>>>>>>>>>>>>>>>>>>    fileSystem.close();
>>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>>     catch (Exception e)
>>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>>         System.out.println(e.toString());
>>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> KK
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> You can set your cronjob to execute the program
>>>>>>>>>>>>>>>>>>>>>>> every 5 seconds.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <
>>>>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Well, I want to upload the files automatically, as
>>>>>>>>>>>>>>>>>>>>>>>> the files are generated about every 3-5 seconds and each file is about
>>>>>>>>>>>>>>>>>>>>>>>> 3 MB.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>  Is it possible to automate the system using the put or
>>>>>>>>>>>>>>>>>>>>>>>> cp command?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I have read about Flume and WebHDFS, but I am not
>>>>>>>>>>>>>>>>>>>>>>>> sure whether they will work or not.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander
>>>>>>>>>>>>>>>>>>>>>>>> Alten-Lorenz <wg...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Why don't you use HDFS-related tools like put
>>>>>>>>>>>>>>>>>>>>>>>>> or cp?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <
>>>>>>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> > Hi,
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > I am continuously generating files in a local
>>>>>>>>>>>>>>>>>>>>>>>>> folder on my base machine. How
>>>>>>>>>>>>>>>>>>>>>>>>> > can I use Flume to stream the generated
>>>>>>>>>>>>>>>>>>>>>>>>> files from the local folder to
>>>>>>>>>>>>>>>>>>>>>>>>> > HDFS?
>>>>>>>>>>>>>>>>>>>>>>>>> > I don't know exactly how to configure the
>>>>>>>>>>>>>>>>>>>>>>>>> sources, sinks and HDFS.
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > 1) location of the folder where files are
>>>>>>>>>>>>>>>>>>>>>>>>> generated: /usr/datastorage/
>>>>>>>>>>>>>>>>>>>>>>>>> > 2) name node address: hdfs://
>>>>>>>>>>>>>>>>>>>>>>>>> hadoop1.example.com:8020
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > Please help me.
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by shekhar sharma <sh...@gmail.com>.
Hello Khasif,
Sorry for late reply...Are you done? or u still struggling?

mail me: shekhar2581@gmail.com
Regards,
Som Shekhar Sharma



On Wed, Nov 21, 2012 at 6:06 PM, kashif khan <dr...@gmail.com> wrote:

> Dear Shankar Sharma.
>
> I am using Eclipse as IDE. I dont have any idea, how to create the project
> as maven project. I have downloaded Mave2 but given me some strange error.
> So if you can help me then I will try the maven. Actually, I am trying to
> automatically upload the files into HDFS and then will apply some
> algorithms to analyze the data. The algorithms will implement in mapreduce
> . So if you think maven will good for me then please let me know how I can
> create the project as maven project.
>
>
> Many thanks
>
> Best regards,
>
> KK
>
>
>
>  On Tue, Nov 20, 2012 at 7:06 PM, shekhar sharma <sh...@gmail.com>wrote:
>
>> By the way how are you building and running your project.Are u running
>> from any IDE?
>> The best practises you can follow:
>>
>> (1) Create your project as maven project and give the dependency of
>> hadoop-X.Y.Z . So your project will automatically will have all the
>> necessary jars
>> and i am sure you will not face these kind of errors
>> (2) And In your HADOOP_CLASSPATH provide the path for $HADOOP_LIB.
>>
>> Regards,
>> SOm
>>
>>
>>
>>
>>
>> On Tue, Nov 20, 2012 at 9:52 PM, kashif khan <dr...@gmail.com>wrote:
>>
>>> Dear Tariq
>>>
>>> Many thanks, finally I have created the directory and upload the file.
>>>
>>> Once again many thanks
>>>
>>> Best regards
>>>
>>>
>>> On Tue, Nov 20, 2012 at 3:04 PM, kashif khan <dr...@gmail.com>wrote:
>>>
>>>> Dear Many thanks
>>>>
>>>>
>>>> I have downloaded the jar file and added to project. Now getting
>>>> another error as:
>>>>
>>>> og4j:WARN No appenders could be found for logger
>>>> (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
>>>> log4j:WARN Please initialize the log4j system properly.
>>>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfigfor more info.
>>>> Exception in thread "main" java.io.IOException: No FileSystem for
>>>> scheme: hdfs
>>>>     at
>>>> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2206)
>>>>     at
>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2213)
>>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
>>>>     at
>>>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
>>>>
>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>>     at CopyFile.main(CopyFile.java:14)
>>>>
>>>> Have any idea about this?
>>>>
>>>> Thanks again
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Nov 20, 2012 at 2:53 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>
>>>>> You can download the jar here :
>>>>> http://search.maven.org/remotecontent?filepath=com/google/guava/guava/13.0.1/guava-13.0.1.jar
>>>>>
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Nov 20, 2012 at 8:06 PM, kashif khan <dr...@gmail.com>wrote:
>>>>>
>>>>>> Could please let me know the name of jar file and location
>>>>>>
>>>>>> Many thanks
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 20, 2012 at 2:33 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>>
>>>>>>> Download the required jar and include it in your project.
>>>>>>>
>>>>>>> Regards,
>>>>>>>     Mohammad Tariq
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 20, 2012 at 7:57 PM, kashif khan <drkashif8310@gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Dear Tariq Thanks
>>>>>>>>
>>>>>>>> I have added the jar files from Cdh and download the cdh4 eclipse
>>>>>>>> plugin and copied into eclipse plugin folder. The previous error I think
>>>>>>>> sorted out but now I am getting another strange error.
>>>>>>>>
>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>> com/google/common/collect/Maps
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.metrics2.lib.MetricsRegistry.<init>(MetricsRegistry.java:42)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:87)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:133)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2373)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2365)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2233)
>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>>>>>>     at CopyFile.main(CopyFile.java:14)
>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>> com.google.common.collect.Maps
>>>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>>>>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>>>>>>>>     ... 13 more
>>>>>>>>
>>>>>>>> Have any idea about this error.
>>>>>>>>
>>>>>>>> Many thanks
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Nov 20, 2012 at 2:19 PM, Mohammad Tariq <dontariq@gmail.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> Hello Kashif,
>>>>>>>>>
>>>>>>>>>      You are correct. This because of some version mismatch. I am
>>>>>>>>> not using CDH personally but AFAIK, CDH4 uses Hadoop-2.x.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>     Mohammad Tariq
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <
>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> HI M Tariq
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I am trying the following the program to create directory and
>>>>>>>>>> copy file to hdfs. But I am getting the following errors
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Program:
>>>>>>>>>>
>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>
>>>>>>>>>> public class CopyFile {
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>         public static void main(String[] args) throws IOException{
>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>         conf.set("fs.default.name", "hadoop1.example.com:8020");
>>>>>>>>>>         FileSystem dfs = FileSystem.get(conf);
>>>>>>>>>>         String dirName = "Test1";
>>>>>>>>>>         Path src = new Path(dfs.getWorkingDirectory() + "/" +
>>>>>>>>>> dirName);
>>>>>>>>>>         dfs.mkdirs(src);
>>>>>>>>>>         Path scr1 = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>         Path dst = new Path(dfs.getWorkingDirectory() +
>>>>>>>>>> "/Test1/");
>>>>>>>>>>         dfs.copyFromLocalFile(src, dst);
>>>>>>>>>>
>>>>>>>>>>         }
>>>>>>>>>>         }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>     Exception in thread "main"
>>>>>>>>>> org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot
>>>>>>>>>> communicate with client version 4
>>>>>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>>>>>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>>>>     at $Proxy1.getProtocolVersion(Unknown Source)
>>>>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>>>>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>>>>>>>>>>     at
>>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>>>>>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>>>>>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>>>>>>>>>>     at
>>>>>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>>>>>>>>>>     at
>>>>>>>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>>>>>>>>>>     at
>>>>>>>>>> org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>>>>>>>>     at
>>>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>>>>>>>>>>     at CopyFile.main(CopyFile.java:11)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I am using CDH4.1. i have download the source file of
>>>>>>>>>> hadoop-1.0.4 and import the jar files into Eclipse. I think it is due to
>>>>>>>>>> version problem. Could you please let me know what will be correct version
>>>>>>>>>> for the CDH4.1?
>>>>>>>>>>
>>>>>>>>>> Many thanks
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <
>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> It should work. Same code is working fine for me. Try to create
>>>>>>>>>>> some other directory in your Hdfs and use it as your output path. Also see
>>>>>>>>>>> if you find something in datanode logs.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <
>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> The input path is fine. Problem in output path. I am just
>>>>>>>>>>>> wonder that it copy the data into local disk  (/user/root/) not into hdfs.
>>>>>>>>>>>> I dont know why? Is it we give the correct statement to point to hdfs?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <
>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Try this as your input file path
>>>>>>>>>>>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <
>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> when I am applying the command as
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> $ hadoop fs -put /usr/Eclipse/Output.csv
>>>>>>>>>>>>>> /user/root/Output.csv.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> its work fine and file browsing in the hdfs. But i dont know
>>>>>>>>>>>>>> why its not work in program.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Many thanks for your cooperation.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <
>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It would be good if I could have a look on the files.
>>>>>>>>>>>>>>> Meantime try some other directories. Also, check the directory permissions
>>>>>>>>>>>>>>> once.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <
>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have tried through root user and made the following
>>>>>>>>>>>>>>>> changes:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> No result. The following is the log output. The log shows
>>>>>>>>>>>>>>>> the destination is null.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>>>>>>>>>>>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>>>>>>>>>>>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>>>>>>>>>>>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <
>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yeah, My cluster running. When brows
>>>>>>>>>>>>>>>>> http://hadoop1.example.com: 50070/dfshealth.jsp. I am
>>>>>>>>>>>>>>>>> getting the main page. Then click on Brows file system. I am getting the
>>>>>>>>>>>>>>>>> following:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> hbase
>>>>>>>>>>>>>>>>> tmp
>>>>>>>>>>>>>>>>> user
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> And when click on user getting:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> beeswax
>>>>>>>>>>>>>>>>> huuser (I have created)
>>>>>>>>>>>>>>>>> root (I have created)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Would you like to see my configuration file. As did not
>>>>>>>>>>>>>>>>> change any things, all by default. I have installed CDH4.1 and running on
>>>>>>>>>>>>>>>>> VMs.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is your cluster running fine? Are you able to browse Hdfs
>>>>>>>>>>>>>>>>>> through the Hdfs Web Console at 50070?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <
>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Many thanks.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have changed the program accordingly. It does not show
>>>>>>>>>>>>>>>>>>> any error but one warring , but when I am browsing the HDFS folder, file is
>>>>>>>>>>>>>>>>>>> not copied.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>>>>>>> public static void main(String[] args) throws
>>>>>>>>>>>>>>>>>>> IOException{
>>>>>>>>>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>>>         //Configuration configuration = new
>>>>>>>>>>>>>>>>>>> Configuration();
>>>>>>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>         conf.addResource(new
>>>>>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>         conf.addResource(new Path
>>>>>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>>>>>>> Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>>>>>>         Path outputFile = new
>>>>>>>>>>>>>>>>>>> Path("/user/hduser/Output1.csv");
>>>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 19-Nov-2012 13:50:32
>>>>>>>>>>>>>>>>>>> org.apache.hadoop.util.NativeCodeLoader <clinit>
>>>>>>>>>>>>>>>>>>> WARNING: Unable to load native-hadoop library for your
>>>>>>>>>>>>>>>>>>> platform... using builtin-java classes where applicable
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Have any idea?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> If it is just copying the files without any processing
>>>>>>>>>>>>>>>>>>>> or change, you can use something like this :
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>  public class CopyData {
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>     public static void main(String[] args) throws
>>>>>>>>>>>>>>>>>>>> IOException{
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>         Configuration configuration = new
>>>>>>>>>>>>>>>>>>>> Configuration();
>>>>>>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Obviously you have to modify it as per your
>>>>>>>>>>>>>>>>>>>> requirements like continuously polling the targeted directory for new files.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <
>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks M  Tariq
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> As I am new in  Java and Hadoop and have no much
>>>>>>>>>>>>>>>>>>>>> experience. I am trying to first write a simple program to upload data into
>>>>>>>>>>>>>>>>>>>>> HDFS and gradually move forward. I have written the following simple
>>>>>>>>>>>>>>>>>>>>> program to upload the file into HDFS, I dont know why it does not working.
>>>>>>>>>>>>>>>>>>>>> could you please check it, if have time.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>>>>>>>>>>>> import java.io.BufferedOutputStream;
>>>>>>>>>>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>>>>>>>>>>>> import java.io.FileOutputStream;
>>>>>>>>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>>>>>>>>>>> import java.io.OutputStream;
>>>>>>>>>>>>>>>>>>>>> import java.nio.*;
>>>>>>>>>>>>>>>>>>>>> //import java.nio.file.Path;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> public static void main(String [] args) throws
>>>>>>>>>>>>>>>>>>>>> IOException
>>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>>     try{
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>     Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>>>>>     conf.addResource(new
>>>>>>>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>>>     conf.addResource(new Path
>>>>>>>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>>>>>>>>>>>     String dest = "/user/hduser/input/";
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>     //String fileName =
>>>>>>>>>>>>>>>>>>>>> source.substring(source.lastIndexOf('/') + source.length());
>>>>>>>>>>>>>>>>>>>>>     String fileName = "Output1.csv";
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>         dest = dest + "/" +fileName;
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>     else
>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>         dest = dest + fileName;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>     Path path = new Path(dest);
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>     if(fileSystem.exists(path))
>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>         System.out.println("File" + dest + " already
>>>>>>>>>>>>>>>>>>>>> exists");
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>>>>>>>>>>>    InputStream in = new BufferedInputStream(new
>>>>>>>>>>>>>>>>>>>>> FileInputStream(new File(source)));
>>>>>>>>>>>>>>>>>>>>>    File myfile = new File(source);
>>>>>>>>>>>>>>>>>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>>>>>>>>>>>>>>>>>    int numbytes = 0;
>>>>>>>>>>>>>>>>>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>    {
>>>>>>>>>>>>>>>>>>>>>        out.write(b,0,numbytes);
>>>>>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>>>>>    in.close();
>>>>>>>>>>>>>>>>>>>>>    out.close();
>>>>>>>>>>>>>>>>>>>>>    //bos.close();
>>>>>>>>>>>>>>>>>>>>>    fileSystem.close();
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>     catch(Exception e)
>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>         System.out.println(e.toString());
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> KK
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> You can set your cronjob to execute the program after
>>>>>>>>>>>>>>>>>>>>>> every 5 sec.
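Standard cron only fires at one-minute granularity, so for a five-second
interval the schedule usually has to live inside the program itself. A
minimal sketch, where copyOnce() is a hypothetical wrapper around the copy
logic shown elsewhere in this thread:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class CopyScheduler {

    public static void main(String[] args) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        Runnable copyTask = new Runnable() {
            public void run() {
                try {
                    // hypothetical wrapper around the HDFS copy shown
                    // elsewhere in this thread
                    CopyData.copyOnce();
                } catch (Exception e) {
                    e.printStackTrace();   // keep the schedule alive on failure
                }
            }
        };
        scheduler.scheduleAtFixedRate(copyTask, 0, 5, TimeUnit.SECONDS);
    }
}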
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <
>>>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Well, I want to automatically upload the files as
>>>>>>>>>>>>>>>>>>>>>>> the files are generating about every 3-5 sec and each file has size about
>>>>>>>>>>>>>>>>>>>>>>> 3MB.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>  Is it possible to automate the system using put or
>>>>>>>>>>>>>>>>>>>>>>> cp command?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I read about the flume and webHDFS but I am not sure
>>>>>>>>>>>>>>>>>>>>>>> it will work or not.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander
>>>>>>>>>>>>>>>>>>>>>>> Alten-Lorenz <wg...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Why do you don't use HDFS related tools like put or
>>>>>>>>>>>>>>>>>>>>>>>> cp?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <
>>>>>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> > HI,
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > I am generating files continuously in local
>>>>>>>>>>>>>>>>>>>>>>>> folder of my base machine. How
>>>>>>>>>>>>>>>>>>>>>>>> > I can now use the flume to stream the generated
>>>>>>>>>>>>>>>>>>>>>>>> files from local folder to
>>>>>>>>>>>>>>>>>>>>>>>> > HDFS.
>>>>>>>>>>>>>>>>>>>>>>>> > I dont know how exactly configure the sources,
>>>>>>>>>>>>>>>>>>>>>>>> sinks and hdfs.
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > 1) location of folder where files are generating:
>>>>>>>>>>>>>>>>>>>>>>>> /usr/datastorage/
>>>>>>>>>>>>>>>>>>>>>>>> > 2) name node address: htdfs://
>>>>>>>>>>>>>>>>>>>>>>>> hadoop1.example.com:8020
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > Please let me help.
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
Dear Shekhar Sharma,

I am using Eclipse as my IDE. I have no idea how to create the project as
a Maven project; I downloaded Maven 2, but it gave me some strange errors.
So if you can help me, I will try Maven. Actually, I am trying to
automatically upload the files into HDFS and will then apply some
algorithms to analyze the data. The algorithms will be implemented in
MapReduce. So if you think Maven is a good fit for me, please let me know
how I can create the project as a Maven project.

Many thanks

Best regards,

KK



On Tue, Nov 20, 2012 at 7:06 PM, shekhar sharma <sh...@gmail.com>wrote:

> By the way, how are you building and running your project? Are you
> running it from an IDE?
> The best practices you can follow:
>
> (1) Create your project as a Maven project and declare a dependency on
> hadoop-X.Y.Z. Your project will then automatically have all the
> necessary jars, and I am sure you will not face these kinds of errors.
> (2) In your HADOOP_CLASSPATH, provide the path to $HADOOP_LIB.
>
> Regards,
> SOm
>
>
>
>
>
> On Tue, Nov 20, 2012 at 9:52 PM, kashif khan <dr...@gmail.com>wrote:
>
>> Dear Tariq
>>
>> Many thanks; I have finally created the directory and uploaded the file.
>>
>> Once again many thanks
>>
>> Best regards
>>
>>
>> On Tue, Nov 20, 2012 at 3:04 PM, kashif khan <dr...@gmail.com>wrote:
>>
>>> Dear Tariq, many thanks.
>>>
>>>
>>> I have downloaded the jar file and added it to the project. Now I am
>>> getting another error:
>>>
>>> log4j:WARN No appenders could be found for logger
>>> (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
>>> log4j:WARN Please initialize the log4j system properly.
>>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfigfor more info.
>>> Exception in thread "main" java.io.IOException: No FileSystem for
>>> scheme: hdfs
>>>     at
>>> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2206)
>>>     at
>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2213)
>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
>>>     at
>>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
>>>
>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>     at CopyFile.main(CopyFile.java:14)
>>>
>>> Do you have any idea about this?
>>>
>>> Thanks again
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Nov 20, 2012 at 2:53 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>
>>>> You can download the jar here :
>>>> http://search.maven.org/remotecontent?filepath=com/google/guava/guava/13.0.1/guava-13.0.1.jar
>>>>
>>>> Regards,
>>>>     Mohammad Tariq
>>>>
>>>>
>>>>
>>>> On Tue, Nov 20, 2012 at 8:06 PM, kashif khan <dr...@gmail.com>wrote:
>>>>
>>>>> Could you please let me know the name of the jar file and its location?
>>>>>
>>>>> Many thanks
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>> On Tue, Nov 20, 2012 at 2:33 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>
>>>>>> Download the required jar and include it in your project.
>>>>>>
>>>>>> Regards,
>>>>>>     Mohammad Tariq
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 20, 2012 at 7:57 PM, kashif khan <dr...@gmail.com>wrote:
>>>>>>
>>>>>>> Dear Tariq, thanks.
>>>>>>>
>>>>>>> I have added the jar files from CDH and downloaded the CDH4 Eclipse
>>>>>>> plugin, copying it into the Eclipse plugins folder. The previous error
>>>>>>> is sorted out, I think, but now I am getting another strange error.
>>>>>>>
>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>> com/google/common/collect/Maps
>>>>>>>     at
>>>>>>> org.apache.hadoop.metrics2.lib.MetricsRegistry.<init>(MetricsRegistry.java:42)
>>>>>>>     at
>>>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:87)
>>>>>>>     at
>>>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:133)
>>>>>>>     at
>>>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
>>>>>>>     at
>>>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
>>>>>>>     at
>>>>>>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
>>>>>>>     at
>>>>>>> org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
>>>>>>>     at
>>>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2373)
>>>>>>>     at
>>>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2365)
>>>>>>>     at
>>>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2233)
>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>>>>>     at CopyFile.main(CopyFile.java:14)
>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>> com.google.common.collect.Maps
>>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>>>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>>>>>>>     ... 13 more
>>>>>>>
>>>>>>> Have any idea about this error.
>>>>>>>
>>>>>>> Many thanks
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 20, 2012 at 2:19 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>>>
>>>>>>>> Hello Kashif,
>>>>>>>>
>>>>>>>>      You are correct. This is because of a version mismatch. I am
>>>>>>>> not using CDH personally, but AFAIK CDH4 uses Hadoop 2.x.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>     Mohammad Tariq
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <
>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> HI M Tariq
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I am trying the following the program to create directory and copy
>>>>>>>>> file to hdfs. But I am getting the following errors
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Program:
>>>>>>>>>
>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>> import java.io.IOException;
>>>>>>>>>
>>>>>>>>> public class CopyFile {
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>         public static void main(String[] args) throws IOException{
>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>         conf.set("fs.default.name", "hadoop1.example.com:8020");
>>>>>>>>>         FileSystem dfs = FileSystem.get(conf);
>>>>>>>>>         String dirName = "Test1";
>>>>>>>>>         Path src = new Path(dfs.getWorkingDirectory() + "/" +
>>>>>>>>> dirName);
>>>>>>>>>         dfs.mkdirs(src);
>>>>>>>>>         Path scr1 = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>         Path dst = new Path(dfs.getWorkingDirectory() + "/Test1/");
>>>>>>>>>         dfs.copyFromLocalFile(src, dst);
>>>>>>>>>
>>>>>>>>>         }
>>>>>>>>>         }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>     Exception in thread "main"
>>>>>>>>> org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot
>>>>>>>>> communicate with client version 4
>>>>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>>>>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>>>     at $Proxy1.getProtocolVersion(Unknown Source)
>>>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>>>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>>>>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>>>>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>>>>>>>>>     at CopyFile.main(CopyFile.java:11)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I am using CDH4.1. i have download the source file of hadoop-1.0.4
>>>>>>>>> and import the jar files into Eclipse. I think it is due to version
>>>>>>>>> problem. Could you please let me know what will be correct version for the
>>>>>>>>> CDH4.1?
>>>>>>>>>
>>>>>>>>> Many thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <
>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> It should work. Same code is working fine for me. Try to create
>>>>>>>>>> some other directory in your Hdfs and use it as your output path. Also see
>>>>>>>>>> if you find something in datanode logs.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <
>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> The input path is fine. Problem in output path. I am just wonder
>>>>>>>>>>> that it copy the data into local disk  (/user/root/) not into hdfs. I dont
>>>>>>>>>>> know why? Is it we give the correct statement to point to hdfs?
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <
>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Try this as your input file path
>>>>>>>>>>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <
>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> when I am applying the command as
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ hadoop fs -put /usr/Eclipse/Output.csv /user/root/Output.csv.
>>>>>>>>>>>>>
>>>>>>>>>>>>> its work fine and file browsing in the hdfs. But i dont know
>>>>>>>>>>>>> why its not work in program.
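A note for clarity: the shell command typically works here because the
hadoop launcher script reads the cluster configuration from /etc/hadoop/conf
on its own, while a standalone Java program must load that configuration
itself; later replies in this thread fix exactly that. For comparison, the
same copy written with fully qualified URIs (host and port as given in the
first message), making explicit which filesystem each path names:

$ hadoop fs -put file:///usr/Eclipse/Output.csv hdfs://hadoop1.example.com:8020/user/root/Output.csv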
>>>>>>>>>>>>>
>>>>>>>>>>>>> Many thanks for your cooperation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <
>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> It would be good if I could have a look on the files.
>>>>>>>>>>>>>> Meantime try some other directories. Also, check the directory permissions
>>>>>>>>>>>>>> once.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <
>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have tried through root user and made the following
>>>>>>>>>>>>>>> changes:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> No result. The following is the log output. The log shows
>>>>>>>>>>>>>>> the destination is null.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>>>>>>>>>>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>>>>>>>>>>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>>>>>>>>>>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <
>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yeah, My cluster running. When brows
>>>>>>>>>>>>>>>> http://hadoop1.example.com: 50070/dfshealth.jsp. I am
>>>>>>>>>>>>>>>> getting the main page. Then click on Brows file system. I am getting the
>>>>>>>>>>>>>>>> following:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> hbase
>>>>>>>>>>>>>>>> tmp
>>>>>>>>>>>>>>>> user
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> And when click on user getting:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> beeswax
>>>>>>>>>>>>>>>> huuser (I have created)
>>>>>>>>>>>>>>>> root (I have created)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Would you like to see my configuration file. As did not
>>>>>>>>>>>>>>>> change any things, all by default. I have installed CDH4.1 and running on
>>>>>>>>>>>>>>>> VMs.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Is your cluster running fine? Are you able to browse Hdfs
>>>>>>>>>>>>>>>>> through the Hdfs Web Console at 50070?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <
>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Many thanks.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I have changed the program accordingly. It does not show
>>>>>>>>>>>>>>>>>> any error but one warring , but when I am browsing the HDFS folder, file is
>>>>>>>>>>>>>>>>>> not copied.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>>>>>> public static void main(String[] args) throws IOException{
>>>>>>>>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>>         //Configuration configuration = new
>>>>>>>>>>>>>>>>>> Configuration();
>>>>>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>         conf.addResource(new
>>>>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>         conf.addResource(new Path
>>>>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>>>>>> Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>>>>>         Path outputFile = new
>>>>>>>>>>>>>>>>>> Path("/user/hduser/Output1.csv");
>>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 19-Nov-2012 13:50:32
>>>>>>>>>>>>>>>>>> org.apache.hadoop.util.NativeCodeLoader <clinit>
>>>>>>>>>>>>>>>>>> WARNING: Unable to load native-hadoop library for your
>>>>>>>>>>>>>>>>>> platform... using builtin-java classes where applicable
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Have any idea?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> If it is just copying the files without any processing
>>>>>>>>>>>>>>>>>>> or change, you can use something like this :
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>  public class CopyData {
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     public static void main(String[] args) throws
>>>>>>>>>>>>>>>>>>> IOException{
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>         Configuration configuration = new
>>>>>>>>>>>>>>>>>>> Configuration();
>>>>>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>>>>>>> Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Obviously you have to modify it as per your requirements
>>>>>>>>>>>>>>>>>>> like continuously polling the targeted directory for new files.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <
>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks M  Tariq
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> As I am new in  Java and Hadoop and have no much
>>>>>>>>>>>>>>>>>>>> experience. I am trying to first write a simple program to upload data into
>>>>>>>>>>>>>>>>>>>> HDFS and gradually move forward. I have written the following simple
>>>>>>>>>>>>>>>>>>>> program to upload the file into HDFS, I dont know why it does not working.
>>>>>>>>>>>>>>>>>>>> could you please check it, if have time.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>>>>>>>>>>> import java.io.BufferedOutputStream;
>>>>>>>>>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>>>>>>>>>>> import java.io.FileOutputStream;
>>>>>>>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>>>>>>>>>> import java.io.OutputStream;
>>>>>>>>>>>>>>>>>>>> import java.nio.*;
>>>>>>>>>>>>>>>>>>>> //import java.nio.file.Path;
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> public static void main(String [] args) throws
>>>>>>>>>>>>>>>>>>>> IOException
>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>     try{
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>     Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>>>>     conf.addResource(new
>>>>>>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>>     conf.addResource(new Path
>>>>>>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>>>>>>>>>>     String dest = "/user/hduser/input/";
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>     //String fileName =
>>>>>>>>>>>>>>>>>>>> source.substring(source.lastIndexOf('/') + source.length());
>>>>>>>>>>>>>>>>>>>>     String fileName = "Output1.csv";
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>         dest = dest + "/" +fileName;
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>     else
>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>         dest = dest + fileName;
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>     Path path = new Path(dest);
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>     if(fileSystem.exists(path))
>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>         System.out.println("File" + dest + " already
>>>>>>>>>>>>>>>>>>>> exists");
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>>>>>>>>>>    InputStream in = new BufferedInputStream(new
>>>>>>>>>>>>>>>>>>>> FileInputStream(new File(source)));
>>>>>>>>>>>>>>>>>>>>    File myfile = new File(source);
>>>>>>>>>>>>>>>>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>>>>>>>>>>>>>>>>    int numbytes = 0;
>>>>>>>>>>>>>>>>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>    {
>>>>>>>>>>>>>>>>>>>>        out.write(b,0,numbytes);
>>>>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>>>>    in.close();
>>>>>>>>>>>>>>>>>>>>    out.close();
>>>>>>>>>>>>>>>>>>>>    //bos.close();
>>>>>>>>>>>>>>>>>>>>    fileSystem.close();
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>     catch(Exception e)
>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>         System.out.println(e.toString());
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> KK
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> You can set your cronjob to execute the program after
>>>>>>>>>>>>>>>>>>>>> every 5 sec.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <
>>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Well, I want to automatically upload the files as
>>>>>>>>>>>>>>>>>>>>>> the files are generating about every 3-5 sec and each file has size about
>>>>>>>>>>>>>>>>>>>>>> 3MB.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>  Is it possible to automate the system using put or
>>>>>>>>>>>>>>>>>>>>>> cp command?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I read about the flume and webHDFS but I am not sure
>>>>>>>>>>>>>>>>>>>>>> it will work or not.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander
>>>>>>>>>>>>>>>>>>>>>> Alten-Lorenz <wg...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Why do you don't use HDFS related tools like put or
>>>>>>>>>>>>>>>>>>>>>>> cp?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <
>>>>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> > HI,
>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>> > I am generating files continuously in local folder
>>>>>>>>>>>>>>>>>>>>>>> of my base machine. How
>>>>>>>>>>>>>>>>>>>>>>> > I can now use the flume to stream the generated
>>>>>>>>>>>>>>>>>>>>>>> files from local folder to
>>>>>>>>>>>>>>>>>>>>>>> > HDFS.
>>>>>>>>>>>>>>>>>>>>>>> > I dont know how exactly configure the sources,
>>>>>>>>>>>>>>>>>>>>>>> sinks and hdfs.
>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>> > 1) location of folder where files are generating:
>>>>>>>>>>>>>>>>>>>>>>> /usr/datastorage/
>>>>>>>>>>>>>>>>>>>>>>> > 2) name node address: htdfs://
>>>>>>>>>>>>>>>>>>>>>>> hadoop1.example.com:8020
>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>> > Please let me help.
>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by shekhar sharma <sh...@gmail.com>.
By the way, how are you building and running your project? Are you running
it from an IDE?
The best practices you can follow:

(1) Create your project as a Maven project and declare a dependency on
hadoop-X.Y.Z. Your project will then automatically have all the necessary
jars, and I am sure you will not face these kinds of errors.
(2) In your HADOOP_CLASSPATH, provide the path to $HADOOP_LIB. (A sketch
of both steps follows below.)
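A sketch of what those two steps can look like; the version string below is
a placeholder that must match the Hadoop release on your cluster, and
$HADOOP_LIB stands for wherever your Hadoop jars are installed:

    <!-- pom.xml: the client artifact pulls in the needed Hadoop jars
         transitively -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>X.Y.Z</version>
    </dependency>

    # shell: make the Hadoop lib jars visible to the hadoop launcher
    export HADOOP_CLASSPATH="$HADOOP_LIB/*:$HADOOP_CLASSPATH"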

Regards,
SOm





On Tue, Nov 20, 2012 at 9:52 PM, kashif khan <dr...@gmail.com> wrote:

> Dear Tariq
>
> Many thanks; I have finally created the directory and uploaded the file.
>
> Once again many thanks
>
> Best regards
>
>
> On Tue, Nov 20, 2012 at 3:04 PM, kashif khan <dr...@gmail.com>wrote:
>
>> Dear Tariq, many thanks.
>>
>>
>> I have downloaded the jar file and added it to the project. Now I am
>> getting another error:
>>
>> log4j:WARN No appenders could be found for logger
>> (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
>> log4j:WARN Please initialize the log4j system properly.
>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
>> more info.
>> Exception in thread "main" java.io.IOException: No FileSystem for scheme:
>> hdfs
>>     at
>> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2206)
>>     at
>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2213)
>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
>>     at
>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
>>
>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>     at CopyFile.main(CopyFile.java:14)
>>
>> Do you have any idea about this?
>>
>> Thanks again
>>
>>
>>
>>
>>
>> On Tue, Nov 20, 2012 at 2:53 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>
>>> You can download the jar here :
>>> http://search.maven.org/remotecontent?filepath=com/google/guava/guava/13.0.1/guava-13.0.1.jar
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Tue, Nov 20, 2012 at 8:06 PM, kashif khan <dr...@gmail.com>wrote:
>>>
>>>> Could you please let me know the name of the jar file and its location?
>>>>
>>>> Many thanks
>>>>
>>>> Best regards
>>>>
>>>>
>>>> On Tue, Nov 20, 2012 at 2:33 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>
>>>>> Download the required jar and include it in your project.
>>>>>
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Nov 20, 2012 at 7:57 PM, kashif khan <dr...@gmail.com>wrote:
>>>>>
>>>>>> Dear Tariq, thanks.
>>>>>>
>>>>>> I have added the jar files from CDH and downloaded the CDH4 Eclipse
>>>>>> plugin, copying it into the Eclipse plugins folder. The previous error
>>>>>> is sorted out, I think, but now I am getting another strange error.
>>>>>>
>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>> com/google/common/collect/Maps
>>>>>>     at
>>>>>> org.apache.hadoop.metrics2.lib.MetricsRegistry.<init>(MetricsRegistry.java:42)
>>>>>>     at
>>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:87)
>>>>>>     at
>>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:133)
>>>>>>     at
>>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
>>>>>>     at
>>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
>>>>>>     at
>>>>>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
>>>>>>     at
>>>>>> org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
>>>>>>     at
>>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2373)
>>>>>>     at
>>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2365)
>>>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2233)
>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>>>>     at CopyFile.main(CopyFile.java:14)
>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>> com.google.common.collect.Maps
>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>>>>>>     ... 13 more
>>>>>>
>>>>>> Have any idea about this error.
>>>>>>
>>>>>> Many thanks
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 20, 2012 at 2:19 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>>
>>>>>>> Hello Kashif,
>>>>>>>
>>>>>>>      You are correct. This is because of a version mismatch. I am
>>>>>>> not using CDH personally, but AFAIK CDH4 uses Hadoop 2.x.
>>>>>>>
>>>>>>> Regards,
>>>>>>>     Mohammad Tariq
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <drkashif8310@gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> HI M Tariq
>>>>>>>>
>>>>>>>>
>>>>>>>> I am trying the following the program to create directory and copy
>>>>>>>> file to hdfs. But I am getting the following errors
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Program:
>>>>>>>>
>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>> import java.io.IOException;
>>>>>>>>
>>>>>>>> public class CopyFile {
>>>>>>>>
>>>>>>>>
>>>>>>>>         public static void main(String[] args) throws IOException{
>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>         conf.set("fs.default.name", "hadoop1.example.com:8020");
>>>>>>>>         FileSystem dfs = FileSystem.get(conf);
>>>>>>>>         String dirName = "Test1";
>>>>>>>>         Path src = new Path(dfs.getWorkingDirectory() + "/" +
>>>>>>>> dirName);
>>>>>>>>         dfs.mkdirs(src);
>>>>>>>>         Path scr1 = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>         Path dst = new Path(dfs.getWorkingDirectory() + "/Test1/");
>>>>>>>>         dfs.copyFromLocalFile(src, dst);
>>>>>>>>
>>>>>>>>         }
>>>>>>>>         }
>>>>>>>>
>>>>>>>>
>>>>>>>>     Exception in thread "main"
>>>>>>>> org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot
>>>>>>>> communicate with client version 4
>>>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>>>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>>     at $Proxy1.getProtocolVersion(Unknown Source)
>>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>>>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>>>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>>>>>>     at
>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>>>>>>>>     at CopyFile.main(CopyFile.java:11)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I am using CDH4.1. i have download the source file of hadoop-1.0.4
>>>>>>>> and import the jar files into Eclipse. I think it is due to version
>>>>>>>> problem. Could you please let me know what will be correct version for the
>>>>>>>> CDH4.1?
>>>>>>>>
>>>>>>>> Many thanks
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <dontariq@gmail.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> It should work. Same code is working fine for me. Try to create
>>>>>>>>> some other directory in your Hdfs and use it as your output path. Also see
>>>>>>>>> if you find something in datanode logs.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>     Mohammad Tariq
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <
>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> The input path is fine. Problem in output path. I am just wonder
>>>>>>>>>> that it copy the data into local disk  (/user/root/) not into hdfs. I dont
>>>>>>>>>> know why? Is it we give the correct statement to point to hdfs?
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <
>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Try this as your input file path
>>>>>>>>>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <
>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> when I am applying the command as
>>>>>>>>>>>>
>>>>>>>>>>>> $ hadoop fs -put /usr/Eclipse/Output.csv /user/root/Output.csv.
>>>>>>>>>>>>
>>>>>>>>>>>> its work fine and file browsing in the hdfs. But i dont know
>>>>>>>>>>>> why its not work in program.
>>>>>>>>>>>>
>>>>>>>>>>>> Many thanks for your cooperation.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <
>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> It would be good if I could have a look on the files. Meantime
>>>>>>>>>>>>> try some other directories. Also, check the directory permissions once.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <
>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have tried through root user and made the following
>>>>>>>>>>>>>> changes:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> No result. The following is the log output. The log shows the
>>>>>>>>>>>>>> destination is null.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>>>>>>>>>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>>>>>>>>>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>>>>>>>>>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <
>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yeah, My cluster running. When brows
>>>>>>>>>>>>>>> http://hadoop1.example.com: 50070/dfshealth.jsp. I am
>>>>>>>>>>>>>>> getting the main page. Then click on Brows file system. I am getting the
>>>>>>>>>>>>>>> following:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> hbase
>>>>>>>>>>>>>>> tmp
>>>>>>>>>>>>>>> user
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And when click on user getting:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> beeswax
>>>>>>>>>>>>>>> huuser (I have created)
>>>>>>>>>>>>>>> root (I have created)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Would you like to see my configuration file. As did not
>>>>>>>>>>>>>>> change any things, all by default. I have installed CDH4.1 and running on
>>>>>>>>>>>>>>> VMs.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <
>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is your cluster running fine? Are you able to browse Hdfs
>>>>>>>>>>>>>>>> through the Hdfs Web Console at 50070?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <
>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Many thanks.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have changed the program accordingly. It does not show
>>>>>>>>>>>>>>>>> any error but one warring , but when I am browsing the HDFS folder, file is
>>>>>>>>>>>>>>>>> not copied.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>>>>> public static void main(String[] args) throws IOException{
>>>>>>>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>         //Configuration configuration = new
>>>>>>>>>>>>>>>>> Configuration();
>>>>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>         conf.addResource(new
>>>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>         conf.addResource(new Path
>>>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>>>>> Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>>>>         Path outputFile = new
>>>>>>>>>>>>>>>>> Path("/user/hduser/Output1.csv");
>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 19-Nov-2012 13:50:32
>>>>>>>>>>>>>>>>> org.apache.hadoop.util.NativeCodeLoader <clinit>
>>>>>>>>>>>>>>>>> WARNING: Unable to load native-hadoop library for your
>>>>>>>>>>>>>>>>> platform... using builtin-java classes where applicable
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Have any idea?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If it is just copying the files without any processing or
>>>>>>>>>>>>>>>>>> change, you can use something like this :
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>  public class CopyData {
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>     public static void main(String[] args) throws
>>>>>>>>>>>>>>>>>> IOException{
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>         Configuration configuration = new Configuration();
>>>>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>>>>>> Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Obviously you have to modify it as per your requirements
>>>>>>>>>>>>>>>>>> like continuously polling the targeted directory for new files.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <
>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks M  Tariq
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> As I am new in  Java and Hadoop and have no much
>>>>>>>>>>>>>>>>>>> experience. I am trying to first write a simple program to upload data into
>>>>>>>>>>>>>>>>>>> HDFS and gradually move forward. I have written the following simple
>>>>>>>>>>>>>>>>>>> program to upload the file into HDFS, I dont know why it does not working.
>>>>>>>>>>>>>>>>>>> could you please check it, if have time.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>>>>>>>>>> import java.io.BufferedOutputStream;
>>>>>>>>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>>>>>>>>>> import java.io.FileOutputStream;
>>>>>>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>>>>>>>>> import java.io.OutputStream;
>>>>>>>>>>>>>>>>>>> import java.nio.*;
>>>>>>>>>>>>>>>>>>> //import java.nio.file.Path;
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> public static void main(String [] args) throws
>>>>>>>>>>>>>>>>>>> IOException
>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>     try{
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>>>     conf.addResource(new
>>>>>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>     conf.addResource(new Path
>>>>>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>>>>>>>>>     String dest = "/user/hduser/input/";
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     //String fileName =
>>>>>>>>>>>>>>>>>>> source.substring(source.lastIndexOf('/') + source.length());
>>>>>>>>>>>>>>>>>>>     String fileName = "Output1.csv";
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>         dest = dest + "/" +fileName;
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>     else
>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>         dest = dest + fileName;
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>     Path path = new Path(dest);
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     if(fileSystem.exists(path))
>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>         System.out.println("File" + dest + " already
>>>>>>>>>>>>>>>>>>> exists");
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>>>>>>>>>    InputStream in = new BufferedInputStream(new
>>>>>>>>>>>>>>>>>>> FileInputStream(new File(source)));
>>>>>>>>>>>>>>>>>>>    File myfile = new File(source);
>>>>>>>>>>>>>>>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>>>>>>>>>>>>>>>    int numbytes = 0;
>>>>>>>>>>>>>>>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    {
>>>>>>>>>>>>>>>>>>>        out.write(b,0,numbytes);
>>>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>>>    in.close();
>>>>>>>>>>>>>>>>>>>    out.close();
>>>>>>>>>>>>>>>>>>>    //bos.close();
>>>>>>>>>>>>>>>>>>>    fileSystem.close();
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>     catch(Exception e)
>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>         System.out.println(e.toString());
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> KK
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> You can set your cronjob to execute the program after
>>>>>>>>>>>>>>>>>>>> every 5 sec.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <
>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Well, I want to automatically upload the files as  the
>>>>>>>>>>>>>>>>>>>>> files are generating about every 3-5 sec and each file has size about 3MB.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>  Is it possible to automate the system using put or cp
>>>>>>>>>>>>>>>>>>>>> command?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I read about the flume and webHDFS but I am not sure
>>>>>>>>>>>>>>>>>>>>> it will work or not.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander
>>>>>>>>>>>>>>>>>>>>> Alten-Lorenz <wg...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Why do you don't use HDFS related tools like put or
>>>>>>>>>>>>>>>>>>>>>> cp?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <
>>>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> > HI,
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > I am generating files continuously in local folder
>>>>>>>>>>>>>>>>>>>>>> of my base machine. How
>>>>>>>>>>>>>>>>>>>>>> > I can now use the flume to stream the generated
>>>>>>>>>>>>>>>>>>>>>> files from local folder to
>>>>>>>>>>>>>>>>>>>>>> > HDFS.
>>>>>>>>>>>>>>>>>>>>>> > I dont know how exactly configure the sources,
>>>>>>>>>>>>>>>>>>>>>> sinks and hdfs.
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > 1) location of folder where files are generating:
>>>>>>>>>>>>>>>>>>>>>> /usr/datastorage/
>>>>>>>>>>>>>>>>>>>>>> > 2) name node address: htdfs://
>>>>>>>>>>>>>>>>>>>>>> hadoop1.example.com:8020
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > Please let me help.
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
Dear Tariq

Many thanks; I have finally created the directory and uploaded the file.

Once again many thanks

Best regards

On Tue, Nov 20, 2012 at 3:04 PM, kashif khan <dr...@gmail.com> wrote:

> Dear Tariq, many thanks.
>
>
> I have downloaded the jar file and added it to the project. Now I am
> getting another error:
>
> log4j:WARN No appenders could be found for logger
> (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
> more info.
> Exception in thread "main" java.io.IOException: No FileSystem for scheme:
> hdfs
>     at
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2206)
>     at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2213)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
>     at
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
>
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>     at CopyFile.main(CopyFile.java:14)
>
> Do you have any idea about this?
>
> Thanks again
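A note on this error: "No FileSystem for scheme: hdfs" means no
implementation is registered for the hdfs:// scheme on the client
classpath. In Hadoop 2.x / CDH4 that registration ships in the hadoop-hdfs
jar, so the first fix is making sure that jar is on the build path. When
the jar is present but the registration is still not picked up (a common
packaging problem), naming the class explicitly also works; a minimal
sketch, where the class name SchemeCheck is an illustration rather than
anything from the thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SchemeCheck {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        // Point the hdfs scheme at its implementation directly; this only
        // helps when the DistributedFileSystem class itself is present.
        conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
        FileSystem fs = FileSystem.get(conf);
        System.out.println("Connected to: " + fs.getUri());
    }
}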
>
>
>
>
>
> On Tue, Nov 20, 2012 at 2:53 PM, Mohammad Tariq <do...@gmail.com>wrote:
>
>> You can download the jar here :
>> http://search.maven.org/remotecontent?filepath=com/google/guava/guava/13.0.1/guava-13.0.1.jar
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Tue, Nov 20, 2012 at 8:06 PM, kashif khan <dr...@gmail.com>wrote:
>>
>>> Could you please let me know the name of the jar file and its location?
>>>
>>> Many thanks
>>>
>>> Best regards
>>>
>>>
>>> On Tue, Nov 20, 2012 at 2:33 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>
>>>> Download the required jar and include it in your project.
>>>>
>>>> Regards,
>>>>     Mohammad Tariq
>>>>
>>>>
>>>>
>>>> On Tue, Nov 20, 2012 at 7:57 PM, kashif khan <dr...@gmail.com>wrote:
>>>>
>>>>> Dear Tariq Thanks
>>>>>
>>>>> I have added the jar files from CDH and downloaded the CDH4 Eclipse
>>>>> plugin and copied it into the Eclipse plugin folder. The previous error
>>>>> is, I think, sorted out, but now I am getting another strange error.
>>>>>
>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>> com/google/common/collect/Maps
>>>>>     at
>>>>> org.apache.hadoop.metrics2.lib.MetricsRegistry.<init>(MetricsRegistry.java:42)
>>>>>     at
>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:87)
>>>>>     at
>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:133)
>>>>>     at
>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
>>>>>     at
>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
>>>>>     at
>>>>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
>>>>>     at
>>>>> org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
>>>>>     at
>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2373)
>>>>>     at
>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2365)
>>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2233)
>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>>>     at CopyFile.main(CopyFile.java:14)
>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>> com.google.common.collect.Maps
>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>>>>>     ... 13 more
>>>>>
>>>>> Have any idea about this error.
>>>>>
>>>>> Many thanks
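
Since NoClassDefFoundError: com/google/common/collect/Maps means the Guava
jar is missing at runtime, a one-line probe can confirm whether it is on the
classpath before running the copy program again. This is only an illustrative
sketch built around the class name from the stack trace above:

    public class GuavaCheck {
        public static void main(String[] args) throws Exception {
            // Throws ClassNotFoundException if no guava jar is on the classpath
            Class.forName("com.google.common.collect.Maps");
            System.out.println("Guava is on the classpath");
        }
    }

If this throws, the NoClassDefFoundError will reappear as soon as any Hadoop
metrics class is touched, because hadoop-common depends on Guava.
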
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Nov 20, 2012 at 2:19 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>
>>>>>> Hello Kashif,
>>>>>>
>>>>>>      You are correct. This is because of some version mismatch. I am not
>>>>>> using CDH personally but AFAIK, CDH4 uses Hadoop-2.x.
>>>>>>
>>>>>> Regards,
>>>>>>     Mohammad Tariq
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <dr...@gmail.com>wrote:
>>>>>>
>>>>>>> HI M Tariq
>>>>>>>
>>>>>>>
>>>>>>> I am trying the following program to create a directory and copy a
>>>>>>> file to HDFS, but I am getting the following errors:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Program:
>>>>>>>
>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>> import java.io.IOException;
>>>>>>>
>>>>>>> public class CopyFile {
>>>>>>>
>>>>>>>
>>>>>>>         public static void main(String[] args) throws IOException{
>>>>>>>         Configuration conf = new Configuration();
>>>>>>>          conf.set("fs.default.name", "hdfs://hadoop1.example.com:8020");
>>>>>>>         FileSystem dfs = FileSystem.get(conf);
>>>>>>>         String dirName = "Test1";
>>>>>>>         Path src = new Path(dfs.getWorkingDirectory() + "/" +
>>>>>>> dirName);
>>>>>>>         dfs.mkdirs(src);
>>>>>>>         Path scr1 = new Path("/usr/Eclipse/Output.csv");
>>>>>>>         Path dst = new Path(dfs.getWorkingDirectory() + "/Test1/");
>>>>>>>         dfs.copyFromLocalFile(scr1, dst); // copy the local file scr1, not src
>>>>>>>
>>>>>>>         }
>>>>>>>         }
>>>>>>>
>>>>>>>
>>>>>>>     Exception in thread "main"
>>>>>>> org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot
>>>>>>> communicate with client version 4
>>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>     at $Proxy1.getProtocolVersion(Unknown Source)
>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>>>>>>>     at
>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>>>>>>>     at
>>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>>>>>>>     at
>>>>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>>>>>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>>>>>     at
>>>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>>>>>>>     at CopyFile.main(CopyFile.java:11)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I am using CDH4.1. I have downloaded the source of hadoop-1.0.4
>>>>>>> and imported the jar files into Eclipse. I think it is due to a version
>>>>>>> problem. Could you please let me know what the correct version for
>>>>>>> CDH4.1 would be?
>>>>>>>
>>>>>>> Many thanks
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
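
The "Server IPC version 7 cannot communicate with client version 4" trace
above is the classic symptom of hadoop-1.x client jars talking to a
Hadoop-2.x (CDH4) NameNode. A quick way to check which client jars are really
on the classpath is to print the client-side version; the class name is an
illustrative sketch, but org.apache.hadoop.util.VersionInfo is a real Hadoop
API:

    import org.apache.hadoop.util.VersionInfo;

    public class WhichHadoop {
        public static void main(String[] args) {
            // Prints e.g. "2.0.0-cdh4.1.1" with CDH4 client jars,
            // "1.0.4" with the stock Apache hadoop-1.0.4 jars
            System.out.println(VersionInfo.getVersion());
        }
    }

If this prints 1.0.4, swapping in the client jars shipped with the CDH4
installation (typically under /usr/lib/hadoop on a package-based install)
should make the RemoteException go away.
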
>>>>>>>
>>>>>>> On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>>>
>>>>>>>> It should work. Same code is working fine for me. Try to create
>>>>>>>> some other directory in your Hdfs and use it as your output path. Also see
>>>>>>>> if you find something in datanode logs.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>     Mohammad Tariq
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <
>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> The input path is fine; the problem is in the output path. I just
>>>>>>>>> wonder why it copies the data onto the local disk (/user/root/) and not
>>>>>>>>> into HDFS. I don't know why. Did we give the correct statement to point
>>>>>>>>> to HDFS?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <
>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Try this as your input file path
>>>>>>>>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <
>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> When I apply the command
>>>>>>>>>>>
>>>>>>>>>>> $ hadoop fs -put /usr/Eclipse/Output.csv /user/root/Output.csv
>>>>>>>>>>>
>>>>>>>>>>> it works fine and the file shows up when browsing HDFS. But I don't know
>>>>>>>>>>> why it does not work in the program.
>>>>>>>>>>>
>>>>>>>>>>> Many thanks for your cooperation.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <
>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> It would be good if I could have a look on the files. Meantime
>>>>>>>>>>>> try some other directories. Also, check the directory permissions once.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <
>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have tried through root user and made the following changes:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>>>>>>>>
>>>>>>>>>>>>> No result. The following is the log output. The log shows the
>>>>>>>>>>>>> destination is null.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>>>>>>>>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>>>>>>>>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>>>>>>>>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <
>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yeah, my cluster is running. When I browse
>>>>>>>>>>>>>> http://hadoop1.example.com:50070/dfshealth.jsp I get
>>>>>>>>>>>>>> the main page. Then I click on Browse the filesystem and get the
>>>>>>>>>>>>>> following:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> hbase
>>>>>>>>>>>>>> tmp
>>>>>>>>>>>>>> user
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> And when click on user getting:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> beeswax
>>>>>>>>>>>>>> huuser (I have created)
>>>>>>>>>>>>>> root (I have created)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Would you like to see my configuration file? I did not
>>>>>>>>>>>>>> change anything; it is all by default. I have installed CDH4.1, running on
>>>>>>>>>>>>>> VMs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <
>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is your cluster running fine? Are you able to browse Hdfs
>>>>>>>>>>>>>>> through the Hdfs Web Console at 50070?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <
>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Many thanks.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have changed the program accordingly. It does not show
>>>>>>>>>>>>>>>> any error, only one warning, but when I browse the HDFS folder the file is
>>>>>>>>>>>>>>>> not copied.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>>>> public static void main(String[] args) throws IOException{
>>>>>>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>         //Configuration configuration = new Configuration();
>>>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         conf.addResource(new
>>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>         conf.addResource(new Path
>>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>          FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>>>> Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>>>         Path outputFile = new
>>>>>>>>>>>>>>>> Path("/user/hduser/Output1.csv");
>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 19-Nov-2012 13:50:32
>>>>>>>>>>>>>>>> org.apache.hadoop.util.NativeCodeLoader <clinit>
>>>>>>>>>>>>>>>> WARNING: Unable to load native-hadoop library for your
>>>>>>>>>>>>>>>> platform... using builtin-java classes where applicable
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Have any idea?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If it is just copying the files without any processing or
>>>>>>>>>>>>>>>>> change, you can use something like this :
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     public static void main(String[] args) throws
>>>>>>>>>>>>>>>>> IOException{
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>         Configuration configuration = new Configuration();
>>>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>>>>> Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Obviously you have to modify it as per your requirements
>>>>>>>>>>>>>>>>> like continuously polling the targeted directory for new files.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>     Mohammad Tariq
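
A minimal sketch of such a polling loop (the class name, the /user/root/
target and the 5-second interval are illustrative; /usr/datastorage/ is the
directory from the original question, the /etc/hadoop/conf paths follow the
layout used elsewhere in the thread, and files still being written when they
are picked up are not handled):

    import java.io.File;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DirectoryPoller {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
            conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
            FileSystem fs = FileSystem.get(conf);
            File localDir = new File("/usr/datastorage");
            while (true) {
                File[] files = localDir.listFiles();
                if (files != null) {
                    for (File f : files) {
                        Path dst = new Path("/user/root/" + f.getName());
                        if (!fs.exists(dst)) { // skip files already uploaded
                            fs.copyFromLocalFile(new Path(f.getAbsolutePath()), dst);
                        }
                    }
                }
                Thread.sleep(5000); // roughly matches the 3-5 s file rate
            }
        }
    }
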
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <
>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks M  Tariq
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> As I am new to Java and Hadoop and do not have much
>>>>>>>>>>>>>>>>>> experience, I am trying first to write a simple program to upload data into
>>>>>>>>>>>>>>>>>> HDFS and gradually move forward. I have written the following simple
>>>>>>>>>>>>>>>>>> program to upload a file into HDFS; I don't know why it does not work.
>>>>>>>>>>>>>>>>>> Could you please check it if you have time.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>>>>>>>>> import java.io.BufferedOutputStream;
>>>>>>>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>>>>>>>>> import java.io.FileOutputStream;
>>>>>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>>>>>>>> import java.io.OutputStream;
>>>>>>>>>>>>>>>>>> import java.nio.*;
>>>>>>>>>>>>>>>>>> //import java.nio.file.Path;
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> public static void main(String [] args) throws IOException
>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>     try{
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>     Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>>     conf.addResource(new
>>>>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>     conf.addResource(new Path
>>>>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>>>>>>>>     String dest = "/user/hduser/input/";
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>     //String fileName =
>>>>>>>>>>>>>>>>>> source.substring(source.lastIndexOf('/') + source.length());
>>>>>>>>>>>>>>>>>>     String fileName = "Output1.csv";
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>         dest = dest + "/" +fileName;
>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>     else
>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>         dest = dest + fileName;
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>     Path path = new Path(dest);
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>     if(fileSystem.exists(path))
>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>         System.out.println("File" + dest + " already
>>>>>>>>>>>>>>>>>> exists");
>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>>>>>>>>    InputStream in = new BufferedInputStream(new
>>>>>>>>>>>>>>>>>> FileInputStream(new File(source)));
>>>>>>>>>>>>>>>>>>    File myfile = new File(source);
>>>>>>>>>>>>>>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>>>>>>>>>>>>>>    int numbytes = 0;
>>>>>>>>>>>>>>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>    {
>>>>>>>>>>>>>>>>>>        out.write(b,0,numbytes);
>>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>>    in.close();
>>>>>>>>>>>>>>>>>>    out.close();
>>>>>>>>>>>>>>>>>>    //bos.close();
>>>>>>>>>>>>>>>>>>    fileSystem.close();
>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>     catch(Exception e)
>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>         System.out.println(e.toString());
>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> KK
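
A side note on the byte-copy loop above: Hadoop ships a helper that does the
same copy with less room for error. A sketch of the equivalent, assuming the
same source string and path variable as in the program above (the 4096-byte
buffer size is an arbitrary choice):

    import java.io.BufferedInputStream;
    import java.io.FileInputStream;
    import java.io.InputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class StreamCopy {
        static void copyToHdfs(FileSystem fileSystem, String source, Path path)
                throws java.io.IOException {
            InputStream in = new BufferedInputStream(new FileInputStream(source));
            FSDataOutputStream out = fileSystem.create(path);
            // Copies with a 4 KB buffer and closes both streams when done
            IOUtils.copyBytes(in, out, 4096, true);
        }
    }
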
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> You can set your cronjob to execute the program after
>>>>>>>>>>>>>>>>>>> every 5 sec.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <
>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Well, I want to upload the files automatically, as the
>>>>>>>>>>>>>>>>>>>> files are generated about every 3-5 sec and each file is about 3 MB in size.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Is it possible to automate the system using the put or cp
>>>>>>>>>>>>>>>>>>>> command?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I read about the flume and webHDFS but I am not sure it
>>>>>>>>>>>>>>>>>>>> will work or not.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander
>>>>>>>>>>>>>>>>>>>> Alten-Lorenz <wg...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Why don't you use HDFS-related tools like put or cp?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <
>>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> > HI,
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > I am generating files continuously in local folder
>>>>>>>>>>>>>>>>>>>>> of my base machine. How
>>>>>>>>>>>>>>>>>>>>> > I can now use the flume to stream the generated
>>>>>>>>>>>>>>>>>>>>> files from local folder to
>>>>>>>>>>>>>>>>>>>>> > HDFS.
>>>>>>>>>>>>>>>>>>>>> > I don't know how exactly to configure the sources, sinks
>>>>>>>>>>>>>>>>>>>>> and HDFS.
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > 1) location of folder where files are generating:
>>>>>>>>>>>>>>>>>>>>> /usr/datastorage/
>>>>>>>>>>>>>>>>>>>>> > 2) name node address: hdfs://
>>>>>>>>>>>>>>>>>>>>> hadoop1.example.com:8020
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > Please help me.
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>

Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
Dear Tariq, many thanks.


I have downloaded the jar file and added to project. Now getting another
error as:

log4j:WARN No appenders could be found for logger
(org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
more info.
Exception in thread "main" java.io.IOException: No FileSystem for scheme:
hdfs
    at
org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2206)
    at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2213)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
    at
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
    at CopyFile.main(CopyFile.java:14)

Have any idea about this?

Thanks again
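
For the record, "No FileSystem for scheme: hdfs" on Hadoop 2.x / CDH4 usually
means the hadoop-hdfs jar (which provides DistributedFileSystem) is not on
the classpath: the FileSystem.getFileSystemClass call in the trace above
resolves the implementation per URI scheme. A minimal sketch of a check,
assuming the CDH4 hadoop-hdfs jar has been added next to hadoop-common and
guava (the class name is illustrative; setting fs.hdfs.impl explicitly also
works around repackaged jars that lost their service metadata):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsSchemeCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
            conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
            conf.set("fs.hdfs.impl",
                    "org.apache.hadoop.hdfs.DistributedFileSystem");
            FileSystem fs = FileSystem.get(conf);
            // Should print the hdfs:// URI from core-site.xml once resolution works
            System.out.println(fs.getUri());
        }
    }
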




On Tue, Nov 20, 2012 at 2:53 PM, Mohammad Tariq <do...@gmail.com> wrote:

> You can download the jar here :
> http://search.maven.org/remotecontent?filepath=com/google/guava/guava/13.0.1/guava-13.0.1.jar
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Tue, Nov 20, 2012 at 8:06 PM, kashif khan <dr...@gmail.com>wrote:
>
>> Could you please let me know the name of the jar file and its location.
>>
>> Many thanks
>>
>> Best regards
>>
>>
>> On Tue, Nov 20, 2012 at 2:33 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>
>>> Download the required jar and include it in your project.
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Tue, Nov 20, 2012 at 7:57 PM, kashif khan <dr...@gmail.com>wrote:
>>>
>>>> Dear Tariq Thanks
>>>>
>>>> I have added the jar files from CDH and downloaded the CDH4 Eclipse
>>>> plugin and copied it into the Eclipse plugin folder. The previous error
>>>> is, I think, sorted out, but now I am getting another strange error.
>>>>
>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>> com/google/common/collect/Maps
>>>>     at
>>>> org.apache.hadoop.metrics2.lib.MetricsRegistry.<init>(MetricsRegistry.java:42)
>>>>     at
>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:87)
>>>>     at
>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:133)
>>>>     at
>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
>>>>     at
>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
>>>>     at
>>>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
>>>>     at
>>>> org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
>>>>     at
>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2373)
>>>>     at
>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2365)
>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2233)
>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>>     at CopyFile.main(CopyFile.java:14)
>>>> Caused by: java.lang.ClassNotFoundException:
>>>> com.google.common.collect.Maps
>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>>>>     ... 13 more
>>>>
>>>> Have any idea about this error.
>>>>
>>>> Many thanks
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Nov 20, 2012 at 2:19 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>
>>>>> Hello Kashif,
>>>>>
>>>>>      You are correct. This is because of some version mismatch. I am not
>>>>> using CDH personally but AFAIK, CDH4 uses Hadoop-2.x.
>>>>>
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <dr...@gmail.com>wrote:
>>>>>
>>>>>> HI M Tariq
>>>>>>
>>>>>>
>>>>>> I am trying the following program to create a directory and copy a
>>>>>> file to HDFS, but I am getting the following errors:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Program:
>>>>>>
>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>> import org.apache.hadoop.fs.Path;
>>>>>> import java.io.IOException;
>>>>>>
>>>>>> public class CopyFile {
>>>>>>
>>>>>>
>>>>>>         public static void main(String[] args) throws IOException{
>>>>>>         Configuration conf = new Configuration();
>>>>>>          conf.set("fs.default.name", "hdfs://hadoop1.example.com:8020");
>>>>>>         FileSystem dfs = FileSystem.get(conf);
>>>>>>         String dirName = "Test1";
>>>>>>         Path src = new Path(dfs.getWorkingDirectory() + "/" +
>>>>>> dirName);
>>>>>>         dfs.mkdirs(src);
>>>>>>         Path scr1 = new Path("/usr/Eclipse/Output.csv");
>>>>>>         Path dst = new Path(dfs.getWorkingDirectory() + "/Test1/");
>>>>>>         dfs.copyFromLocalFile(scr1, dst); // copy the local file scr1, not src
>>>>>>
>>>>>>         }
>>>>>>         }
>>>>>>
>>>>>>
>>>>>>     Exception in thread "main" org.apache.hadoop.ipc.RemoteException:
>>>>>> Server IPC version 7 cannot communicate with client version 4
>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>     at $Proxy1.getProtocolVersion(Unknown Source)
>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>>>>>>     at
>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>>>>>>     at
>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>>>>>>     at
>>>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>>>>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>>>>>>     at CopyFile.main(CopyFile.java:11)
>>>>>>
>>>>>>
>>>>>>
>>>>>> I am using CDH4.1. I have downloaded the source of hadoop-1.0.4
>>>>>> and imported the jar files into Eclipse. I think it is due to a version
>>>>>> problem. Could you please let me know what the correct version for
>>>>>> CDH4.1 would be?
>>>>>>
>>>>>> Many thanks
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>>
>>>>>>> It should work. Same code is working fine for me. Try to create some
>>>>>>> other directory in your Hdfs and use it as your output path. Also see if
>>>>>>> you find something in datanode logs.
>>>>>>>
>>>>>>> Regards,
>>>>>>>     Mohammad Tariq
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <drkashif8310@gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> The input path is fine; the problem is in the output path. I just
>>>>>>>> wonder why it copies the data onto the local disk (/user/root/) and not
>>>>>>>> into HDFS. I don't know why. Did we give the correct statement to point
>>>>>>>> to HDFS?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <dontariq@gmail.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> Try this as your input file path
>>>>>>>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>     Mohammad Tariq
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <
>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> When I apply the command
>>>>>>>>>>
>>>>>>>>>> $ hadoop fs -put /usr/Eclipse/Output.csv /user/root/Output.csv
>>>>>>>>>>
>>>>>>>>>> it works fine and the file shows up when browsing HDFS. But I don't know
>>>>>>>>>> why it does not work in the program.
>>>>>>>>>>
>>>>>>>>>> Many thanks for your cooperation.
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <
>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> It would be good if I could have a look on the files. Meantime
>>>>>>>>>>> try some other directories. Also, check the directory permissions once.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <
>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I have tried through root user and made the following changes:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>>>>>>>
>>>>>>>>>>>> No result. The following is the log output. The log shows the
>>>>>>>>>>>> destination is null.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>>>>>>>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>>>>>>>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>>>>>>>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>>>>>>>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <
>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Yeah, my cluster is running. When I browse
>>>>>>>>>>>>> http://hadoop1.example.com:50070/dfshealth.jsp I get the
>>>>>>>>>>>>> main page. Then I click on Browse the filesystem and get the following:
>>>>>>>>>>>>>
>>>>>>>>>>>>> hbase
>>>>>>>>>>>>> tmp
>>>>>>>>>>>>> user
>>>>>>>>>>>>>
>>>>>>>>>>>>> And when click on user getting:
>>>>>>>>>>>>>
>>>>>>>>>>>>> beeswax
>>>>>>>>>>>>> huuser (I have created)
>>>>>>>>>>>>> root (I have created)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Would you like to see my configuration file? I did not change
>>>>>>>>>>>>> anything; it is all by default. I have installed CDH4.1, running on VMs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <
>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is your cluster running fine? Are you able to browse Hdfs
>>>>>>>>>>>>>> through the Hdfs Web Console at 50070?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <
>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Many thanks.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have changed the program accordingly. It does not show any
>>>>>>>>>>>>>>> error, only one warning, but when I browse the HDFS folder the file is not
>>>>>>>>>>>>>>> copied.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>>> public static void main(String[] args) throws IOException{
>>>>>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>>>>>         //Configuration configuration = new Configuration();
>>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>         conf.addResource(new
>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>         conf.addResource(new Path
>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>          FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>>>>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>>         Path outputFile = new
>>>>>>>>>>>>>>> Path("/user/hduser/Output1.csv");
>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader
>>>>>>>>>>>>>>> <clinit>
>>>>>>>>>>>>>>> WARNING: Unable to load native-hadoop library for your
>>>>>>>>>>>>>>> platform... using builtin-java classes where applicable
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Have any idea?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <
>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If it is just copying the files without any processing or
>>>>>>>>>>>>>>>> change, you can use something like this :
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     public static void main(String[] args) throws
>>>>>>>>>>>>>>>> IOException{
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         Configuration configuration = new Configuration();
>>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>>>> Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Obviously you have to modify it as per your requirements
>>>>>>>>>>>>>>>> like continuously polling the targeted directory for new files.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <
>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks M  Tariq
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As I am new to Java and Hadoop and do not have much
>>>>>>>>>>>>>>>>> experience, I am trying first to write a simple program to upload data into
>>>>>>>>>>>>>>>>> HDFS and gradually move forward. I have written the following simple
>>>>>>>>>>>>>>>>> program to upload a file into HDFS; I don't know why it does not work.
>>>>>>>>>>>>>>>>> Could you please check it if you have time.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>>>>>>>> import java.io.BufferedOutputStream;
>>>>>>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>>>>>>>> import java.io.FileOutputStream;
>>>>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>>>>>>> import java.io.OutputStream;
>>>>>>>>>>>>>>>>> import java.nio.*;
>>>>>>>>>>>>>>>>> //import java.nio.file.Path;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> public static void main(String [] args) throws IOException
>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>     try{
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>     conf.addResource(new
>>>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>     conf.addResource(new Path
>>>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>>>>>>>     String dest = "/user/hduser/input/";
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     //String fileName =
>>>>>>>>>>>>>>>>> source.substring(source.lastIndexOf('/') + source.length());
>>>>>>>>>>>>>>>>>     String fileName = "Output1.csv";
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>         dest = dest + "/" +fileName;
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>     else
>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>         dest = dest + fileName;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>     Path path = new Path(dest);
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     if(fileSystem.exists(path))
>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>         System.out.println("File" + dest + " already
>>>>>>>>>>>>>>>>> exists");
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>>>>>>>    InputStream in = new BufferedInputStream(new
>>>>>>>>>>>>>>>>> FileInputStream(new File(source)));
>>>>>>>>>>>>>>>>>    File myfile = new File(source);
>>>>>>>>>>>>>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>>>>>>>>>>>>>    int numbytes = 0;
>>>>>>>>>>>>>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    {
>>>>>>>>>>>>>>>>>        out.write(b,0,numbytes);
>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>    in.close();
>>>>>>>>>>>>>>>>>    out.close();
>>>>>>>>>>>>>>>>>    //bos.close();
>>>>>>>>>>>>>>>>>    fileSystem.close();
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>     catch(Exception e)
>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>         System.out.println(e.toString());
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> KK
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> You can set your cronjob to execute the program after
>>>>>>>>>>>>>>>>>> every 5 sec.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <
>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Well, I want to upload the files automatically, as the
>>>>>>>>>>>>>>>>>>> files are generated about every 3-5 sec and each file is about 3 MB in size.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Is it possible to automate the system using the put or cp
>>>>>>>>>>>>>>>>>>> command?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I read about the flume and webHDFS but I am not sure it
>>>>>>>>>>>>>>>>>>> will work or not.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz
>>>>>>>>>>>>>>>>>>> <wg...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Why don't you use HDFS-related tools like put or cp?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <
>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> > HI,
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > I am generating files continuously in local folder of
>>>>>>>>>>>>>>>>>>>> my base machine. How
>>>>>>>>>>>>>>>>>>>> > I can now use the flume to stream the generated files
>>>>>>>>>>>>>>>>>>>> from local folder to
>>>>>>>>>>>>>>>>>>>> > HDFS.
>>>>>>>>>>>>>>>>>>>> > I don't know how exactly to configure the sources, sinks
>>>>>>>>>>>>>>>>>>>> and HDFS.
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > 1) location of folder where files are generating:
>>>>>>>>>>>>>>>>>>>> /usr/datastorage/
>>>>>>>>>>>>>>>>>>>> > 2) name node address: hdfs://
>>>>>>>>>>>>>>>>>>>> hadoop1.example.com:8020
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > Please help me.
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>

Re: Automatically upload files into HDFS

Posted by Mohammad Tariq <do...@gmail.com>.
You can download the jar here :
http://search.maven.org/remotecontent?filepath=com/google/guava/guava/13.0.1/guava-13.0.1.jar

Regards,
    Mohammad Tariq



On Tue, Nov 20, 2012 at 8:06 PM, kashif khan <dr...@gmail.com> wrote:

> Could you please let me know the name of the jar file and its location.
>
> Many thanks
>
> Best regards
>
>
> On Tue, Nov 20, 2012 at 2:33 PM, Mohammad Tariq <do...@gmail.com>wrote:
>
>> Download the required jar and include it in your project.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Tue, Nov 20, 2012 at 7:57 PM, kashif khan <dr...@gmail.com>wrote:
>>
>>> Dear Tariq Thanks
>>>
>>> I have added the jar files from CDH and downloaded the CDH4 Eclipse plugin
>>> and copied it into the Eclipse plugin folder. The previous error is, I think,
>>> sorted out, but now I am getting another strange error.
>>>
>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>> com/google/common/collect/Maps
>>>     at
>>> org.apache.hadoop.metrics2.lib.MetricsRegistry.<init>(MetricsRegistry.java:42)
>>>     at
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:87)
>>>     at
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:133)
>>>     at
>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
>>>     at
>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
>>>     at
>>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
>>>     at
>>> org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
>>>     at
>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2373)
>>>     at
>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2365)
>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2233)
>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>     at CopyFile.main(CopyFile.java:14)
>>> Caused by: java.lang.ClassNotFoundException:
>>> com.google.common.collect.Maps
>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>>>     ... 13 more
>>>
>>> Have any idea about this error.
>>>
>>> Many thanks
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Nov 20, 2012 at 2:19 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>
>>>> Hello Kashif,
>>>>
>>>>      You are correct. This is because of some version mismatch. I am not
>>>> using CDH personally but AFAIK, CDH4 uses Hadoop-2.x.
>>>>
>>>> Regards,
>>>>     Mohammad Tariq
>>>>
>>>>
>>>>
>>>> On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <dr...@gmail.com>wrote:
>>>>
>>>>> HI M Tariq
>>>>>
>>>>>
>>>>> I am trying the following program to create a directory and copy a
>>>>> file to HDFS, but I am getting the following errors:
>>>>>
>>>>>
>>>>>
>>>>> Program:
>>>>>
>>>>> import org.apache.hadoop.conf.Configuration;
>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>> import org.apache.hadoop.fs.Path;
>>>>> import java.io.IOException;
>>>>>
>>>>> public class CopyFile {
>>>>>
>>>>>
>>>>>         public static void main(String[] args) throws IOException{
>>>>>         Configuration conf = new Configuration();
>>>>>          conf.set("fs.default.name", "hdfs://hadoop1.example.com:8020");
>>>>>         FileSystem dfs = FileSystem.get(conf);
>>>>>         String dirName = "Test1";
>>>>>         Path src = new Path(dfs.getWorkingDirectory() + "/" + dirName);
>>>>>         dfs.mkdirs(src);
>>>>>         Path scr1 = new Path("/usr/Eclipse/Output.csv");
>>>>>         Path dst = new Path(dfs.getWorkingDirectory() + "/Test1/");
>>>>>         dfs.copyFromLocalFile(scr1, dst); // copy the local file scr1, not src
>>>>>
>>>>>         }
>>>>>         }
>>>>>
>>>>>
>>>>>     Exception in thread "main" org.apache.hadoop.ipc.RemoteException:
>>>>> Server IPC version 7 cannot communicate with client version 4
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>     at $Proxy1.getProtocolVersion(Unknown Source)
>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>>>>>     at
>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>>>>>     at
>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>>>>>     at
>>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>>>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>>>>>     at CopyFile.main(CopyFile.java:11)
>>>>>
>>>>>
>>>>>
>>>>> I am using CDH4.1. I have downloaded the source of hadoop-1.0.4 and
>>>>> imported the jar files into Eclipse. I think it is due to a version problem.
>>>>> Could you please let me know what the correct version for CDH4.1 would be?
>>>>>
>>>>> Many thanks
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>
>>>>>> It should work. Same code is working fine for me. Try to create some
>>>>>> other directory in your Hdfs and use it as your output path. Also see if
>>>>>> you find something in datanode logs.
>>>>>>
>>>>>> Regards,
>>>>>>     Mohammad Tariq
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <dr...@gmail.com>wrote:
>>>>>>
>>>>>>> The input path is fine; the problem is in the output path. I just
>>>>>>> wonder why it copies the data onto the local disk (/user/root/) and not
>>>>>>> into HDFS. I don't know why. Did we give the correct statement to point
>>>>>>> to HDFS?
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>>>
>>>>>>>> Try this as your input file path
>>>>>>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>     Mohammad Tariq
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <
>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> When I apply the command
>>>>>>>>>
>>>>>>>>> $ hadoop fs -put /usr/Eclipse/Output.csv /user/root/Output.csv
>>>>>>>>>
>>>>>>>>> it works fine and the file shows up when browsing HDFS. But I don't know
>>>>>>>>> why it does not work in the program.
>>>>>>>>>
>>>>>>>>> Many thanks for your cooperation.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <
>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> It would be good if I could have a look on the files. Meantime
>>>>>>>>>> try some other directories. Also, check the directory permissions once.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <
>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I have tried through root user and made the following changes:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>>>>>>
>>>>>>>>>>> No result. The following is the log output. The log shows the
>>>>>>>>>>> destination is null.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>>>>>>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>>>>>>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>>>>>>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>>>>>>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <
>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Yeah, my cluster is running. When I browse
>>>>>>>>>>>> http://hadoop1.example.com:50070/dfshealth.jsp I get the main page.
>>>>>>>>>>>> Then I click on Browse the filesystem and get the following:
>>>>>>>>>>>>
>>>>>>>>>>>> hbase
>>>>>>>>>>>> tmp
>>>>>>>>>>>> user
>>>>>>>>>>>>
>>>>>>>>>>>> And when click on user getting:
>>>>>>>>>>>>
>>>>>>>>>>>> beeswax
>>>>>>>>>>>> huuser (I have created)
>>>>>>>>>>>> root (I have created)
>>>>>>>>>>>>
>>>>>>>>>>>> Would you like to see my configuration file? I did not change
>>>>>>>>>>>> anything; it is all by default. I have installed CDH4.1, running on VMs.
>>>>>>>>>>>>
>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <
>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Is your cluster running fine? Are you able to browse Hdfs
>>>>>>>>>>>>> through the Hdfs Web Console at 50070?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <
>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Many thanks.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have changed the program accordingly. It does not show any
>>>>>>>>>>>>>> error, only one warning, but when I browse the HDFS folder the file is not
>>>>>>>>>>>>>> copied.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>> public static void main(String[] args) throws IOException{
>>>>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>>>>         //Configuration configuration = new Configuration();
>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         conf.addResource(new
>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>         conf.addResource(new Path
>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>          FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>>>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>         Path outputFile = new
>>>>>>>>>>>>>> Path("/user/hduser/Output1.csv");
>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader
>>>>>>>>>>>>>> <clinit>
>>>>>>>>>>>>>> WARNING: Unable to load native-hadoop library for your
>>>>>>>>>>>>>> platform... using builtin-java classes where applicable
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Have any idea?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <
>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If it is just copying the files without any processing or
>>>>>>>>>>>>>>> change, you can use something like this :
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     public static void main(String[] args) throws
>>>>>>>>>>>>>>> IOException{
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>         Configuration configuration = new Configuration();
>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>>> Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Obviously you have to modify it as per your requirements
>>>>>>>>>>>>>>> like continuously polling the targeted directory for new files.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <
>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks M  Tariq
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As I am new to Java and Hadoop and do not have much
>>>>>>>>>>>>>>>> experience, I am trying first to write a simple program to upload data into
>>>>>>>>>>>>>>>> HDFS and gradually move forward. I have written the following simple
>>>>>>>>>>>>>>>> program to upload a file into HDFS; I don't know why it does not work.
>>>>>>>>>>>>>>>> Could you please check it if you have time.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>>>>>>> import java.io.BufferedOutputStream;
>>>>>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>>>>>>> import java.io.FileOutputStream;
>>>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>>>>>> import java.io.OutputStream;
>>>>>>>>>>>>>>>> import java.nio.*;
>>>>>>>>>>>>>>>> //import java.nio.file.Path;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> public static void main(String [] args) throws IOException
>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>     try{
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>     conf.addResource(new
>>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>     conf.addResource(new Path
>>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>>>>>>     String dest = "/user/hduser/input/";
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     //String fileName =
>>>>>>>>>>>>>>>> source.substring(source.lastIndexOf('/') + source.length());
>>>>>>>>>>>>>>>>     String fileName = "Output1.csv";
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>         dest = dest + "/" +fileName;
>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>     else
>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>         dest = dest + fileName;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>     Path path = new Path(dest);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     if(fileSystem.exists(path))
>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>         System.out.println("File" + dest + " already
>>>>>>>>>>>>>>>> exists");
>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>>>>>>    InputStream in = new BufferedInputStream(new
>>>>>>>>>>>>>>>> FileInputStream(new File(source)));
>>>>>>>>>>>>>>>>    File myfile = new File(source);
>>>>>>>>>>>>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>>>>>>>>>>>>    int numbytes = 0;
>>>>>>>>>>>>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>    {
>>>>>>>>>>>>>>>>        out.write(b,0,numbytes);
>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>    in.close();
>>>>>>>>>>>>>>>>    out.close();
>>>>>>>>>>>>>>>>    //bos.close();
>>>>>>>>>>>>>>>>    fileSystem.close();
>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>     catch(Exception e)
>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         System.out.println(e.toString());
>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> KK
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> You can set your cronjob to execute the program every
>>>>>>>>>>>>>>>>> 5 seconds.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <
>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Well, I want to upload the files automatically, as the
>>>>>>>>>>>>>>>>>> files are generated about every 3-5 sec and each file is about 3 MB.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is it possible to automate this using the put or cp
>>>>>>>>>>>>>>>>>> command?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I have read about Flume and WebHDFS, but I am not sure
>>>>>>>>>>>>>>>>>> whether they will work.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz
>>>>>>>>>>>>>>>>>> <wg...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Why don't you use HDFS-related tools like put or cp?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <
>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> > HI,
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > I am generating files continuously in local folder of
>>>>>>>>>>>>>>>>>>> my base machine. How
>>>>>>>>>>>>>>>>>>> > I can now use the flume to stream the generated files
>>>>>>>>>>>>>>>>>>> from local folder to
>>>>>>>>>>>>>>>>>>> > HDFS.
>>>>>>>>>>>>>>>>>>> > I dont know how exactly configure the sources, sinks
>>>>>>>>>>>>>>>>>>> and hdfs.
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > 1) location of folder where files are generating:
>>>>>>>>>>>>>>>>>>> /usr/datastorage/
>>>>>>>>>>>>>>>>>>> > 2) name node address: htdfs://hadoop1.example.com:8020
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > Please let me help.
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
Could you please let me know the name of the jar file and its location?

Many thanks

Best regards

On Tue, Nov 20, 2012 at 2:33 PM, Mohammad Tariq <do...@gmail.com> wrote:

> Download the required jar and include it in your project.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Tue, Nov 20, 2012 at 7:57 PM, kashif khan <dr...@gmail.com>wrote:
>
>> Dear Tariq, thanks.
>>
>> I have added the jar files from CDH, downloaded the CDH4 Eclipse plugin,
>> and copied it into the Eclipse plugins folder. The previous error is sorted
>> out, I think, but now I am getting another strange error.
>>
>> Exception in thread "main" java.lang.NoClassDefFoundError:
>> com/google/common/collect/Maps
>>     at
>> org.apache.hadoop.metrics2.lib.MetricsRegistry.<init>(MetricsRegistry.java:42)
>>     at
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:87)
>>     at
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:133)
>>     at
>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
>>     at
>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
>>     at
>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
>>     at
>> org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
>>     at
>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2373)
>>     at
>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2365)
>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2233)
>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>     at CopyFile.main(CopyFile.java:14)
>> Caused by: java.lang.ClassNotFoundException:
>> com.google.common.collect.Maps
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>>     ... 13 more
>>
>> Do you have any idea about this error?
>>
>> Many thanks
>>
>>
>>
>>
>>
>> On Tue, Nov 20, 2012 at 2:19 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>
>>> Hello Kashif,
>>>
>>>      You are correct. This is because of a version mismatch. I am not
>>> using CDH personally, but AFAIK CDH4 uses Hadoop 2.x.
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <dr...@gmail.com>wrote:
>>>
>>>> Hi M Tariq,
>>>>
>>>>
>>>> I am trying the following program to create a directory and copy a file
>>>> to HDFS, but I am getting the following errors:
>>>>
>>>>
>>>>
>>>> Program:
>>>>
>>>> import org.apache.hadoop.conf.Configuration;
>>>> import org.apache.hadoop.fs.FileSystem;
>>>> import org.apache.hadoop.fs.Path;
>>>> import java.io.IOException;
>>>>
>>>> public class CopyFile {
>>>>
>>>>
>>>>         public static void main(String[] args) throws IOException{
>>>>         Configuration conf = new Configuration();
>>>>         conf.set("fs.default.name", "hdfs://hadoop1.example.com:8020");
>>>>         FileSystem dfs = FileSystem.get(conf);
>>>>         String dirName = "Test1";
>>>>         Path src = new Path(dfs.getWorkingDirectory() + "/" + dirName);
>>>>         dfs.mkdirs(src);
>>>>         Path src1 = new Path("/usr/Eclipse/Output.csv");
>>>>         Path dst = new Path(dfs.getWorkingDirectory() + "/Test1/");
>>>>         dfs.copyFromLocalFile(src1, dst);
>>>>
>>>>         }
>>>>         }
>>>>
>>>>
>>>>     Exception in thread "main" org.apache.hadoop.ipc.RemoteException:
>>>> Server IPC version 7 cannot communicate with client version 4
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>     at $Proxy1.getProtocolVersion(Unknown Source)
>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>>>>     at
>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>>>>     at
>>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>>>>     at
>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>>>>     at CopyFile.main(CopyFile.java:11)
>>>>
>>>>
>>>>
>>>> I am using CDH4.1. I have downloaded the source of hadoop-1.0.4 and
>>>> imported its jar files into Eclipse. I think it is due to a version problem.
>>>> Could you please let me know what the correct version for CDH4.1 would be?
>>>>
>>>> Many thanks
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>
>>>>> It should work; the same code is working fine for me. Try creating some
>>>>> other directory in your HDFS and use it as your output path. Also see if
>>>>> you find anything in the datanode logs.
>>>>>
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <dr...@gmail.com>wrote:
>>>>>
>>>>>> The input path is fine; the problem is in the output path. I just wonder
>>>>>> why it copies the data onto the local disk (/user/root/) and not into HDFS.
>>>>>> Are we giving the correct statement to point to HDFS?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>>
>>>>>>> Try this as your input file path
>>>>>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>>>>>
>>>>>>> Regards,
>>>>>>>     Mohammad Tariq
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <drkashif8310@gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> When I apply the command
>>>>>>>>
>>>>>>>> $ hadoop fs -put /usr/Eclipse/Output.csv /user/root/Output.csv
>>>>>>>>
>>>>>>>> it works fine and the file shows up when I browse HDFS. But I don't
>>>>>>>> know why it does not work in the program.
>>>>>>>>
>>>>>>>> Many thanks for your cooperation.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <dontariq@gmail.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> It would be good if I could have a look at the files. Meanwhile, try
>>>>>>>>> some other directories. Also, check the directory permissions once.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>     Mohammad Tariq
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <
>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I have tried it as the root user and made the following changes:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>>>>>
>>>>>>>>>> No result. The following is the log output; the log shows the
>>>>>>>>>> destination is null.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>>>>>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>>>>>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>>>>>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>>>>>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <
>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yeah, my cluster is running. When I browse http://hadoop1.example.com:
>>>>>>>>>>> 50070/dfshealth.jsp, I get the main page. Then I click on Browse the file
>>>>>>>>>>> system and get the following:
>>>>>>>>>>>
>>>>>>>>>>> hbase
>>>>>>>>>>> tmp
>>>>>>>>>>> user
>>>>>>>>>>>
>>>>>>>>>>> And when I click on user, I get:
>>>>>>>>>>>
>>>>>>>>>>> beeswax
>>>>>>>>>>> huuser (I have created)
>>>>>>>>>>> root (I have created)
>>>>>>>>>>>
>>>>>>>>>>> Would you like to see my configuration files? I did not change
>>>>>>>>>>> anything; everything is at the defaults. I have installed CDH4.1, running on VMs.
>>>>>>>>>>>
>>>>>>>>>>> Many thanks
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <
>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Is your cluster running fine? Are you able to browse Hdfs
>>>>>>>>>>>> through the Hdfs Web Console at 50070?
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <
>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Many thanks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have changed the program accordingly. It does not show any
>>>>>>>>>>>>> error, just one warning, but when I browse the HDFS folder, the file
>>>>>>>>>>>>> has not been copied.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>> public static void main(String[] args) throws IOException{
>>>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>>>         //Configuration configuration = new Configuration();
>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>
>>>>>>>>>>>>>         conf.addResource(new
>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>         conf.addResource(new Path
>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>          FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>         Path outputFile = new Path("/user/hduser/Output1.csv");
>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>     }
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader
>>>>>>>>>>>>> <clinit>
>>>>>>>>>>>>> WARNING: Unable to load native-hadoop library for your
>>>>>>>>>>>>> platform... using builtin-java classes where applicable
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do you have any idea?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <
>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> If it is just copying the files without any processing or
>>>>>>>>>>>>>> change, you can use something like this:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     public static void main(String[] args) throws IOException{
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         Configuration configuration = new Configuration();
>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>>> Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Obviously you have to modify it as per your requirements like
>>>>>>>>>>>>>> continuously polling the targeted directory for new files.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <
>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks M  Tariq
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As I am new to Java and Hadoop and do not have much experience,
>>>>>>>>>>>>>>> I am trying to first write a simple program to upload data into HDFS and
>>>>>>>>>>>>>>> gradually move forward. I have written the following simple program to
>>>>>>>>>>>>>>> upload a file into HDFS; I don't know why it is not working. Could you
>>>>>>>>>>>>>>> please check it, if you have time.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>>>>>> import java.io.BufferedOutputStream;
>>>>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>>>>>> import java.io.FileOutputStream;
>>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>>>>> import java.io.OutputStream;
>>>>>>>>>>>>>>> import java.nio.*;
>>>>>>>>>>>>>>> //import java.nio.file.Path;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> public static void main(String [] args) throws IOException
>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>     try{
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     Configuration conf = new Configuration();
>>>>>>>>>>>>>>>     conf.addResource(new
>>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>     conf.addResource(new Path
>>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>>>>>     String dest = "/user/hduser/input/";
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     //String fileName =
>>>>>>>>>>>>>>> source.substring(source.lastIndexOf('/') + source.length());
>>>>>>>>>>>>>>>     String fileName = "Output1.csv";
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>         dest = dest + "/" +fileName;
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>     else
>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>         dest = dest + fileName;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>     Path path = new Path(dest);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     if(fileSystem.exists(path))
>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>         System.out.println("File" + dest + " already
>>>>>>>>>>>>>>> exists");
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>>>>>    InputStream in = new BufferedInputStream(new
>>>>>>>>>>>>>>> FileInputStream(new File(source)));
>>>>>>>>>>>>>>>    File myfile = new File(source);
>>>>>>>>>>>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>>>>>>>>>>>    int numbytes = 0;
>>>>>>>>>>>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    {
>>>>>>>>>>>>>>>        out.write(b,0,numbytes);
>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>    in.close();
>>>>>>>>>>>>>>>    out.close();
>>>>>>>>>>>>>>>    //bos.close();
>>>>>>>>>>>>>>>    fileSystem.close();
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>     catch(Exception e)
>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>         System.out.println(e.toString());
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> KK
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <
>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> You can set your cronjob to execute the program every
>>>>>>>>>>>>>>>> 5 seconds.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <
>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Well, I want to upload the files automatically, as the
>>>>>>>>>>>>>>>>> files are generated about every 3-5 sec and each file is about 3 MB.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Is it possible to automate this using the put or cp
>>>>>>>>>>>>>>>>> command?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have read about Flume and WebHDFS, but I am not sure
>>>>>>>>>>>>>>>>> whether they will work.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
>>>>>>>>>>>>>>>>> wget.null@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Why don't you use HDFS-related tools like put or cp?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <
>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> > HI,
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > I am generating files continuously in local folder of
>>>>>>>>>>>>>>>>>> my base machine. How
>>>>>>>>>>>>>>>>>> > I can now use the flume to stream the generated files
>>>>>>>>>>>>>>>>>> from local folder to
>>>>>>>>>>>>>>>>>> > HDFS.
>>>>>>>>>>>>>>>>>> > I dont know how exactly configure the sources, sinks
>>>>>>>>>>>>>>>>>> and hdfs.
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > 1) location of folder where files are generating:
>>>>>>>>>>>>>>>>>> /usr/datastorage/
>>>>>>>>>>>>>>>>>> > 2) name node address: htdfs://hadoop1.example.com:8020
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > Please let me help.
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by Mohammad Tariq <do...@gmail.com>.
Download the required jar and include it in your project.
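
In this case the class the JVM cannot find, com.google.common.collect.Maps,
comes from Google's Guava library, which Hadoop itself depends on, so the
jar you need is the Guava jar that ships with your Hadoop installation. As
a rough sketch (the jar name and location below are only examples and vary
between versions and layouts):

$ find /usr/lib/hadoop* -name 'guava*.jar'
$ javac -cp "$(hadoop classpath)" CopyFile.java
$ java -cp ".:$(hadoop classpath)" CopyFile

'hadoop classpath' prints the full jar list the hadoop command itself uses,
so compiling and running against it saves you from chasing individual jars.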

Regards,
    Mohammad Tariq



On Tue, Nov 20, 2012 at 7:57 PM, kashif khan <dr...@gmail.com> wrote:

> Dear Tariq, thanks.
>
> I have added the jar files from CDH, downloaded the CDH4 Eclipse plugin,
> and copied it into the Eclipse plugins folder. The previous error is sorted
> out, I think, but now I am getting another strange error.
>
> Exception in thread "main" java.lang.NoClassDefFoundError:
> com/google/common/collect/Maps
>     at
> org.apache.hadoop.metrics2.lib.MetricsRegistry.<init>(MetricsRegistry.java:42)
>     at
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:87)
>     at
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:133)
>     at
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
>     at
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
>     at
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
>     at
> org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
>     at
> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2373)
>     at
> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2365)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2233)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>     at CopyFile.main(CopyFile.java:14)
> Caused by: java.lang.ClassNotFoundException: com.google.common.collect.Maps
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>     ... 13 more
>
> Do you have any idea about this error?
>
> Many thanks
>
>
>
>
>
> On Tue, Nov 20, 2012 at 2:19 PM, Mohammad Tariq <do...@gmail.com>wrote:
>
>> Hello Kashif,
>>
>>      You are correct. This is because of a version mismatch. I am not
>> using CDH personally, but AFAIK CDH4 uses Hadoop 2.x.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <dr...@gmail.com>wrote:
>>
>>> Hi M Tariq,
>>>
>>>
>>> I am trying the following program to create a directory and copy a file
>>> to HDFS, but I am getting the following errors:
>>>
>>>
>>>
>>> Program:
>>>
>>> import org.apache.hadoop.conf.Configuration;
>>> import org.apache.hadoop.fs.FileSystem;
>>> import org.apache.hadoop.fs.Path;
>>> import java.io.IOException;
>>>
>>> public class CopyFile {
>>>
>>>
>>>         public static void main(String[] args) throws IOException{
>>>         Configuration conf = new Configuration();
>>>         conf.set("fs.default.name", "hdfs://hadoop1.example.com:8020");
>>>         FileSystem dfs = FileSystem.get(conf);
>>>         String dirName = "Test1";
>>>         Path src = new Path(dfs.getWorkingDirectory() + "/" + dirName);
>>>         dfs.mkdirs(src);
>>>         Path src1 = new Path("/usr/Eclipse/Output.csv");
>>>         Path dst = new Path(dfs.getWorkingDirectory() + "/Test1/");
>>>         dfs.copyFromLocalFile(src1, dst);
>>>
>>>         }
>>>         }
>>>
>>>
>>>     Exception in thread "main" org.apache.hadoop.ipc.RemoteException:
>>> Server IPC version 7 cannot communicate with client version 4
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at $Proxy1.getProtocolVersion(Unknown Source)
>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>>>     at
>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>>>     at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>>>     at
>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>>>     at CopyFile.main(CopyFile.java:11)
>>>
>>>
>>>
>>> I am using CDH4.1. I have downloaded the source of hadoop-1.0.4 and
>>> imported its jar files into Eclipse. I think it is due to a version problem.
>>> Could you please let me know what the correct version for CDH4.1 would be?
>>>
>>> Many thanks
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>
>>>> It should work; the same code is working fine for me. Try creating some
>>>> other directory in your HDFS and use it as your output path. Also see if
>>>> you find anything in the datanode logs.
>>>>
>>>> Regards,
>>>>     Mohammad Tariq
>>>>
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <dr...@gmail.com>wrote:
>>>>
>>>>> The input path is fine; the problem is in the output path. I just wonder
>>>>> why it copies the data onto the local disk (/user/root/) and not into HDFS.
>>>>> Are we giving the correct statement to point to HDFS?
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>
>>>>>> Try this as your input file path
>>>>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>>>>
>>>>>> Regards,
>>>>>>     Mohammad Tariq
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <dr...@gmail.com>wrote:
>>>>>>
>>>>>>> When I apply the command
>>>>>>>
>>>>>>> $ hadoop fs -put /usr/Eclipse/Output.csv /user/root/Output.csv
>>>>>>>
>>>>>>> it works fine and the file shows up when I browse HDFS. But I don't
>>>>>>> know why it does not work in the program.
>>>>>>>
>>>>>>> Many thanks for your cooperation.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>>>
>>>>>>>> It would be good if I could have a look at the files. Meanwhile, try
>>>>>>>> some other directories. Also, check the directory permissions once.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>     Mohammad Tariq
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <
>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I have tried it as the root user and made the following changes:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>>>>
>>>>>>>>> No result. The following is the log output; the log shows the
>>>>>>>>> destination is null.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>>>>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>>>>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>>>>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>>>>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <
>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Yeah, my cluster is running. When I browse http://hadoop1.example.com:
>>>>>>>>>> 50070/dfshealth.jsp, I get the main page. Then I click on Browse the file
>>>>>>>>>> system and get the following:
>>>>>>>>>>
>>>>>>>>>> hbase
>>>>>>>>>> tmp
>>>>>>>>>> user
>>>>>>>>>>
>>>>>>>>>> And when I click on user, I get:
>>>>>>>>>>
>>>>>>>>>> beeswax
>>>>>>>>>> huuser (I have created)
>>>>>>>>>> root (I have created)
>>>>>>>>>>
>>>>>>>>>> Would you like to see my configuration files? I did not change
>>>>>>>>>> anything; everything is at the defaults. I have installed CDH4.1, running on VMs.
>>>>>>>>>>
>>>>>>>>>> Many thanks
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <
>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Is your cluster running fine? Are you able to browse Hdfs
>>>>>>>>>>> through the Hdfs Web Console at 50070?
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <
>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Many thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> I have changed the program accordingly. It does not show any
>>>>>>>>>>>> error, just one warning, but when I browse the HDFS folder, the file
>>>>>>>>>>>> has not been copied.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>> public static void main(String[] args) throws IOException{
>>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>>         //Configuration configuration = new Configuration();
>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>
>>>>>>>>>>>>         conf.addResource(new
>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>         conf.addResource(new Path
>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>          FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>         Path outputFile = new Path("/user/hduser/Output1.csv");
>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>     }
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader
>>>>>>>>>>>> <clinit>
>>>>>>>>>>>> WARNING: Unable to load native-hadoop library for your
>>>>>>>>>>>> platform... using builtin-java classes where applicable
>>>>>>>>>>>>
>>>>>>>>>>>> Do you have any idea?
>>>>>>>>>>>>
>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <
>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> If it is just copying the files without any processing or
>>>>>>>>>>>>> change, you can use something like this:
>>>>>>>>>>>>>
>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>
>>>>>>>>>>>>>     public static void main(String[] args) throws IOException{
>>>>>>>>>>>>>
>>>>>>>>>>>>>         Configuration configuration = new Configuration();
>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>>> Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>     }
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> Obviously you have to modify it as per your requirements like
>>>>>>>>>>>>> continuously polling the targeted directory for new files.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <
>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks M  Tariq
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As I am new to Java and Hadoop and do not have much experience,
>>>>>>>>>>>>>> I am trying to first write a simple program to upload data into HDFS and
>>>>>>>>>>>>>> gradually move forward. I have written the following simple program to
>>>>>>>>>>>>>> upload a file into HDFS; I don't know why it is not working. Could you
>>>>>>>>>>>>>> please check it, if you have time.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>>>>> import java.io.BufferedOutputStream;
>>>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>>>>> import java.io.FileOutputStream;
>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>>>> import java.io.OutputStream;
>>>>>>>>>>>>>> import java.nio.*;
>>>>>>>>>>>>>> //import java.nio.file.Path;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> public static void main(String [] args) throws IOException
>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>     try{
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     Configuration conf = new Configuration();
>>>>>>>>>>>>>>     conf.addResource(new
>>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>     conf.addResource(new Path
>>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>>>>     String dest = "/user/hduser/input/";
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     //String fileName =
>>>>>>>>>>>>>> source.substring(source.lastIndexOf('/') + source.length());
>>>>>>>>>>>>>>     String fileName = "Output1.csv";
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>         dest = dest + "/" +fileName;
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>     else
>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>         dest = dest + fileName;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>     Path path = new Path(dest);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     if(fileSystem.exists(path))
>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>         System.out.println("File" + dest + " already exists");
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>>>>    InputStream in = new BufferedInputStream(new
>>>>>>>>>>>>>> FileInputStream(new File(source)));
>>>>>>>>>>>>>>    File myfile = new File(source);
>>>>>>>>>>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>>>>>>>>>>    int numbytes = 0;
>>>>>>>>>>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    {
>>>>>>>>>>>>>>        out.write(b,0,numbytes);
>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>    in.close();
>>>>>>>>>>>>>>    out.close();
>>>>>>>>>>>>>>    //bos.close();
>>>>>>>>>>>>>>    fileSystem.close();
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>     catch(Exception e)
>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         System.out.println(e.toString());
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> KK
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <
>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> You can set your cronjob to execute the program every
>>>>>>>>>>>>>>> 5 seconds.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <
>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Well, I want to upload the files automatically, as the
>>>>>>>>>>>>>>>> files are generated about every 3-5 sec and each file is about 3 MB.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is it possible to automate this using the put or cp
>>>>>>>>>>>>>>>> command?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have read about Flume and WebHDFS, but I am not sure
>>>>>>>>>>>>>>>> whether they will work.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
>>>>>>>>>>>>>>>> wget.null@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Why don't you use HDFS-related tools like put or cp?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <
>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> > HI,
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > I am generating files continuously in local folder of my
>>>>>>>>>>>>>>>>> base machine. How
>>>>>>>>>>>>>>>>> > I can now use the flume to stream the generated files
>>>>>>>>>>>>>>>>> from local folder to
>>>>>>>>>>>>>>>>> > HDFS.
>>>>>>>>>>>>>>>>> > I dont know how exactly configure the sources, sinks and
>>>>>>>>>>>>>>>>> hdfs.
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > 1) location of folder where files are generating:
>>>>>>>>>>>>>>>>> /usr/datastorage/
>>>>>>>>>>>>>>>>> > 2) name node address: htdfs://hadoop1.example.com:8020
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > Please let me help.
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
Dear Tariq, thanks.

I have added the jar files from CDH, downloaded the CDH4 Eclipse plugin,
and copied it into the Eclipse plugins folder. The previous error is sorted
out, I think, but now I am getting another strange error.

Exception in thread "main" java.lang.NoClassDefFoundError:
com/google/common/collect/Maps
    at
org.apache.hadoop.metrics2.lib.MetricsRegistry.<init>(MetricsRegistry.java:42)
    at
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:87)
    at
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:133)
    at
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
    at
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
    at
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
    at
org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
    at
org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2373)
    at
org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2365)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2233)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
    at CopyFile.main(CopyFile.java:14)
Caused by: java.lang.ClassNotFoundException: com.google.common.collect.Maps
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
    ... 13 more
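
From the trace, the class that cannot be found is
com.google.common.collect.Maps, which is not a Hadoop class itself, so I
guess one of Hadoop's own dependency jars is still missing from my build
path. As a quick check (just a sketch based on the package name in the
trace), a throwaway class that touches the same package fails with the same
NoClassDefFoundError when that jar is absent:

public class GuavaCheck {
    public static void main(String[] args) {
        // Fails with the same NoClassDefFoundError if the jar providing
        // com.google.common.collect is not on the classpath.
        System.out.println(com.google.common.collect.Maps.newHashMap());
    }
}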

Do you have any idea about this error?

Many thanks




On Tue, Nov 20, 2012 at 2:19 PM, Mohammad Tariq <do...@gmail.com> wrote:

> Hello Kashif,
>
>      You are correct. This is because of a version mismatch. I am not
> using CDH personally, but AFAIK CDH4 uses Hadoop 2.x.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <dr...@gmail.com>wrote:
>
>> Hi M Tariq,
>>
>>
>> I am trying the following program to create a directory and copy a file
>> to HDFS, but I am getting the following errors:
>>
>>
>>
>> Program:
>>
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.fs.FileSystem;
>> import org.apache.hadoop.fs.Path;
>> import java.io.IOException;
>>
>> public class CopyFile {
>>
>>
>>         public static void main(String[] args) throws IOException{
>>         Configuration conf = new Configuration();
>>         conf.set("fs.default.name", "hdfs://hadoop1.example.com:8020");
>>         FileSystem dfs = FileSystem.get(conf);
>>         String dirName = "Test1";
>>         Path src = new Path(dfs.getWorkingDirectory() + "/" + dirName);
>>         dfs.mkdirs(src);
>>         Path src1 = new Path("/usr/Eclipse/Output.csv");
>>         Path dst = new Path(dfs.getWorkingDirectory() + "/Test1/");
>>         dfs.copyFromLocalFile(src1, dst);
>>
>>         }
>>         }
>>
>>
>>     Exception in thread "main" org.apache.hadoop.ipc.RemoteException:
>> Server IPC version 7 cannot communicate with client version 4
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>     at $Proxy1.getProtocolVersion(Unknown Source)
>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>>     at
>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>>     at
>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>>     at
>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>>     at CopyFile.main(CopyFile.java:11)
>>
>>
>>
>> I am using CDH4.1. I have downloaded the source of hadoop-1.0.4 and
>> imported its jar files into Eclipse. I think it is due to a version problem.
>> Could you please let me know what the correct version for CDH4.1 would be?
>>
>> Many thanks
>>
>>
>>
>>
>>
>>
>> On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>
>>> It should work; the same code is working fine for me. Try creating some
>>> other directory in your HDFS and use it as your output path. Also see if
>>> you find anything in the datanode logs.
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <dr...@gmail.com>wrote:
>>>
>>>> The input path is fine; the problem is in the output path. I just wonder
>>>> why it copies the data onto the local disk (/user/root/) and not into HDFS.
>>>> Are we giving the correct statement to point to HDFS?
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>
>>>>> Try this as your input file path
>>>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>>>
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <dr...@gmail.com>wrote:
>>>>>
>>>>>> When I apply the command
>>>>>>
>>>>>> $ hadoop fs -put /usr/Eclipse/Output.csv /user/root/Output.csv
>>>>>>
>>>>>> it works fine and the file shows up when I browse HDFS. But I don't
>>>>>> know why it does not work in the program.
>>>>>>
>>>>>> Many thanks for your cooperation.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>>
>>>>>>> It would be good if I could have a look at the files. Meanwhile, try
>>>>>>> some other directories. Also, check the directory permissions once.
>>>>>>>
>>>>>>> Regards,
>>>>>>>     Mohammad Tariq
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <drkashif8310@gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> I have tried it as the root user and made the following changes:
>>>>>>>>
>>>>>>>>
>>>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>>>
>>>>>>>> No result. The following is the log output; the log shows the
>>>>>>>> destination is null.
>>>>>>>>
>>>>>>>>
>>>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>>>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>>>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>>>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>>>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <
>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Yeah, my cluster is running. When I browse http://hadoop1.example.com:
>>>>>>>>> 50070/dfshealth.jsp, I get the main page. Then I click on Browse the file
>>>>>>>>> system and get the following:
>>>>>>>>>
>>>>>>>>> hbase
>>>>>>>>> tmp
>>>>>>>>> user
>>>>>>>>>
>>>>>>>>> And when I click on user, I get:
>>>>>>>>>
>>>>>>>>> beeswax
>>>>>>>>> huuser (I have created)
>>>>>>>>> root (I have created)
>>>>>>>>>
>>>>>>>>> Would you like to see my configuration files? I did not change
>>>>>>>>> anything; everything is at the defaults. I have installed CDH4.1, running on VMs.
>>>>>>>>>
>>>>>>>>> Many thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <
>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Is your cluster running fine? Are you able to browse Hdfs through
>>>>>>>>>> the Hdfs Web Console at 50070?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <
>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Many thanks.
>>>>>>>>>>>
>>>>>>>>>>> I have changed the program accordingly. It does not show any
>>>>>>>>>>> error, just one warning, but when I browse the HDFS folder, the file
>>>>>>>>>>> has not been copied.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> public class CopyData {
>>>>>>>>>>> public static void main(String[] args) throws IOException{
>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>         //Configuration configuration = new Configuration();
>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>         //configuration.addResource(new
>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>
>>>>>>>>>>>         conf.addResource(new
>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>         conf.addResource(new Path
>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>          FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>         Path outputFile = new Path("/user/hduser/Output1.csv");
>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>         fs.close();
>>>>>>>>>>>     }
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader
>>>>>>>>>>> <clinit>
>>>>>>>>>>> WARNING: Unable to load native-hadoop library for your
>>>>>>>>>>> platform... using builtin-java classes where applicable
>>>>>>>>>>>
>>>>>>>>>>> Do you have any idea?
>>>>>>>>>>>
>>>>>>>>>>> Many thanks
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <
>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> If it is just copying the files without any processing or
>>>>>>>>>>>> change, you can use something like this:
>>>>>>>>>>>>
>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>
>>>>>>>>>>>>     public static void main(String[] args) throws IOException{
>>>>>>>>>>>>
>>>>>>>>>>>>         Configuration configuration = new Configuration();
>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>         configuration.addResource(new
>>>>>>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>>>         Path inputFile = new
>>>>>>>>>>>> Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>     }
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> Obviously you have to modify it as per your requirements like
>>>>>>>>>>>> continuously polling the targeted directory for new files.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <
>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks M  Tariq
>>>>>>>>>>>>>
>>>>>>>>>>>>> As I am new to Java and Hadoop and do not have much experience,
>>>>>>>>>>>>> I am trying to first write a simple program to upload data into HDFS and
>>>>>>>>>>>>> gradually move forward. I have written the following simple program to
>>>>>>>>>>>>> upload a file into HDFS; I don't know why it is not working. Could you
>>>>>>>>>>>>> please check it, if you have time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>>>> import java.io.BufferedOutputStream;
>>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>>>> import java.io.FileOutputStream;
>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>>> import java.io.OutputStream;
>>>>>>>>>>>>> import java.nio.*;
>>>>>>>>>>>>> //import java.nio.file.Path;
>>>>>>>>>>>>>
>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> public static void main(String [] args) throws IOException
>>>>>>>>>>>>> {
>>>>>>>>>>>>>     try{
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>     Configuration conf = new Configuration();
>>>>>>>>>>>>>     conf.addResource(new
>>>>>>>>>>>>> Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>     conf.addResource(new Path
>>>>>>>>>>>>> ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>>>     String dest = "/user/hduser/input/";
>>>>>>>>>>>>>
>>>>>>>>>>>>>     //String fileName =
>>>>>>>>>>>>> source.substring(source.lastIndexOf('/') + source.length());
>>>>>>>>>>>>>     String fileName = "Output1.csv";
>>>>>>>>>>>>>
>>>>>>>>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>>>>>>>>     {
>>>>>>>>>>>>>         dest = dest + "/" +fileName;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     else
>>>>>>>>>>>>>     {
>>>>>>>>>>>>>         dest = dest + fileName;
>>>>>>>>>>>>>
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     Path path = new Path(dest);
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>     if(fileSystem.exists(path))
>>>>>>>>>>>>>     {
>>>>>>>>>>>>>         System.out.println("File" + dest + " already exists");
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>>>    InputStream in = new BufferedInputStream(new
>>>>>>>>>>>>> FileInputStream(new File(source)));
>>>>>>>>>>>>>    File myfile = new File(source);
>>>>>>>>>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>>>>>>>>>    int numbytes = 0;
>>>>>>>>>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>>>>>>>>>
>>>>>>>>>>>>>    {
>>>>>>>>>>>>>        out.write(b,0,numbytes);
>>>>>>>>>>>>>    }
>>>>>>>>>>>>>    in.close();
>>>>>>>>>>>>>    out.close();
>>>>>>>>>>>>>    //bos.close();
>>>>>>>>>>>>>    fileSystem.close();
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     catch(Exception e)
>>>>>>>>>>>>>     {
>>>>>>>>>>>>>
>>>>>>>>>>>>>         System.out.println(e.toString());
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>> KK
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <
>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> You can set your cronjob to execute the program after every 5
>>>>>>>>>>>>>> sec.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <
>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Well, I want to automatically upload the files as  the files
>>>>>>>>>>>>>>> are generating about every 3-5 sec and each file has size about 3MB.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  Is it possible to automate the system using put or cp
>>>>>>>>>>>>>>> command?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I read about the flume and webHDFS but I am not sure it will
>>>>>>>>>>>>>>> work or not.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
>>>>>>>>>>>>>>> wget.null@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Why do you don't use HDFS related tools like put or cp?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <
>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> > HI,
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > I am generating files continuously in local folder of my
>>>>>>>>>>>>>>>> base machine. How
>>>>>>>>>>>>>>>> > I can now use the flume to stream the generated files
>>>>>>>>>>>>>>>> from local folder to
>>>>>>>>>>>>>>>> > HDFS.
>>>>>>>>>>>>>>>> > I dont know how exactly configure the sources, sinks and
>>>>>>>>>>>>>>>> hdfs.
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > 1) location of folder where files are generating:
>>>>>>>>>>>>>>>> /usr/datastorage/
>>>>>>>>>>>>>>>> > 2) name node address: htdfs://hadoop1.example.com:8020
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Please let me help.
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by Mohammad Tariq <do...@gmail.com>.
Hello Kashif,

     You are correct, this is because of a version mismatch. I am not using
CDH personally, but AFAIK CDH4 uses Hadoop-2.x, so Hadoop-1.0.4 client jars
cannot talk to its RPC server.
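
A quick way to confirm which Hadoop version the client jars on the classpath
belong to is to print it at runtime. This is a minimal sketch using the stock
VersionInfo utility; nothing in it is specific to CDH:

import org.apache.hadoop.util.VersionInfo;

public class ClientVersionCheck {

    public static void main(String[] args) {
        // Prints the version of the Hadoop jars actually on the classpath.
        // If this prints 1.0.4 while the cluster runs CDH4 (Hadoop-2.x),
        // the "Server IPC version 7 cannot communicate with client
        // version 4" error is expected.
        System.out.println("Hadoop client version: " + VersionInfo.getVersion());
        System.out.println("Built from revision:   " + VersionInfo.getRevision());
    }
}

If the versions differ, swapping the hadoop-1.0.4 jars for the client jars
that ship with the CDH4.1 installation itself (on a typical CDH install they
live under /usr/lib/hadoop, though that path is an assumption about your
layout) should make the IPC error go away.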

Regards,
    Mohammad Tariq



On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <dr...@gmail.com> wrote:

> Hi M Tariq,
>
>
> I am trying the following program to create a directory and copy a file to
> HDFS, but I am getting the following errors:
>
>
>
> Program:
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import java.io.IOException;
>
> public class CopyFile {
>
>     public static void main(String[] args) throws IOException {
>         Configuration conf = new Configuration();
>         // The value needs the hdfs:// scheme to be a valid filesystem URI.
>         conf.set("fs.default.name", "hdfs://hadoop1.example.com:8020");
>         FileSystem dfs = FileSystem.get(conf);
>         String dirName = "Test1";
>         Path dst = new Path(dfs.getWorkingDirectory() + "/" + dirName);
>         dfs.mkdirs(dst);
>         // Copy the local source file into the newly created HDFS directory.
>         Path src = new Path("/usr/Eclipse/Output.csv");
>         dfs.copyFromLocalFile(src, dst);
>     }
> }
>
>
>     Exception in thread "main" org.apache.hadoop.ipc.RemoteException:
> Server IPC version 7 cannot communicate with client version 4
>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at $Proxy1.getProtocolVersion(Unknown Source)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>     at
> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>     at
> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>     at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>     at CopyFile.main(CopyFile.java:11)
>
>
>
> I am using CDH4.1. I have downloaded the source of hadoop-1.0.4 and
> imported the jar files into Eclipse. I think it is due to a version problem.
> Could you please let me know what the correct version for CDH4.1 would be?
>
> Many thanks
>

Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
Hi M Tariq,


I am trying the following program to create a directory and copy a file to
HDFS, but I am getting the following errors:



Program:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.IOException;

public class CopyFile {

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // The value needs the hdfs:// scheme to be a valid filesystem URI.
        conf.set("fs.default.name", "hdfs://hadoop1.example.com:8020");
        FileSystem dfs = FileSystem.get(conf);
        String dirName = "Test1";
        Path dst = new Path(dfs.getWorkingDirectory() + "/" + dirName);
        dfs.mkdirs(dst);
        // Copy the local source file into the newly created HDFS directory.
        Path src = new Path("/usr/Eclipse/Output.csv");
        dfs.copyFromLocalFile(src, dst);
    }
}


    Exception in thread "main" org.apache.hadoop.ipc.RemoteException:
Server IPC version 7 cannot communicate with client version 4
    at org.apache.hadoop.ipc.Client.call(Client.java:1070)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at $Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at
org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
    at
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
    at CopyFile.main(CopyFile.java:11)



I am using CDH4.1. I have downloaded the source of hadoop-1.0.4 and
imported the jar files into Eclipse. I think it is due to a version problem.
Could you please let me know what the correct version for CDH4.1 would be?

Many thanks





On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <do...@gmail.com> wrote:

> It should work; the same code is working fine for me. Try creating some
> other directory in your HDFS and using it as your output path. Also see if
> you can find anything in the datanode logs.
>
> Regards,
>     Mohammad Tariq

Re: Automatically upload files into HDFS

Posted by Mohammad Tariq <do...@gmail.com>.
It should work; the same code is working fine for me. Try creating some
other directory in your HDFS and using it as your output path. Also see if
you can find anything in the datanode logs.
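
If the logs are quiet, another option is to ask the FileSystem API directly
what it sees for the target directory. Here is a minimal sketch along those
lines; /user/root is just the directory from this thread, and only stock
FileSystem/FileStatus calls are used:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckTargetDir {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);

        Path dir = new Path("/user/root");
        if (!fs.exists(dir)) {
            System.out.println(dir + " does not exist on " + fs.getUri());
        } else {
            // Print owner, group and permissions of the target directory.
            FileStatus status = fs.getFileStatus(dir);
            System.out.println(dir + " owner=" + status.getOwner()
                    + " group=" + status.getGroup()
                    + " perms=" + status.getPermission());
        }
        fs.close();
    }
}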

Regards,
    Mohammad Tariq



On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <dr...@gmail.com> wrote:

> The input path is fine; the problem is the output path. I am just wondering
> why it copies the data onto the local disk (/user/root/) and not into HDFS.
> Are we giving the correct statement to point to HDFS?
>
> Thanks

Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
The input path is fine; the problem is the output path. I am just wondering
why it copies the data onto the local disk (/user/root/) and not into HDFS.
Are we giving the correct statement to point to HDFS?
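
One way to see which filesystem the code is actually writing to is to print
what FileSystem.get() resolved from the Configuration. A minimal sketch,
assuming only the config paths used earlier in this thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class WhichFileSystem {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);

        // If the config files were not picked up, this prints file:/// and
        // LocalFileSystem, which would explain writes landing on the local
        // disk under /user/root/ instead of in HDFS.
        System.out.println("URI:   " + fs.getUri());
        System.out.println("Class: " + fs.getClass().getName());
        System.out.println("HDFS?  " + (fs instanceof DistributedFileSystem));
    }
}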

Thanks


On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <do...@gmail.com> wrote:

> Try this as your input file path:
> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>
> Regards,
>     Mohammad Tariq

Re: Automatically upload files into HDFS

Posted by Mohammad Tariq <do...@gmail.com>.
Try this as your input file path, so the local-filesystem scheme is explicit:
Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
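
For context, a minimal end-to-end sketch of the same copy with both schemes spelled out. The NameNode address is taken from earlier in the thread and is an assumption; adjust it to whatever your core-site.xml actually says:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ExplicitSchemeCopy {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Assumption: this matches fs.defaultFS in your core-site.xml.
        conf.set("fs.defaultFS", "hdfs://hadoop1.example.com:8020");
        FileSystem fs = FileSystem.get(conf);
        // file:// marks the source as local disk; the destination path is
        // resolved against fs.defaultFS, i.e. HDFS.
        Path src = new Path("file:///usr/Eclipse/Output.csv");
        Path dst = new Path("/user/root/Output1.csv");
        fs.copyFromLocalFile(src, dst);
        fs.close();
    }
}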

Regards,
    Mohammad Tariq



On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <dr...@gmail.com> wrote:

> when I am applying the command as
>
> $ hadoop fs -put /usr/Eclipse/Output.csv /user/root/Output.csv.
>
> its work fine and file browsing in the hdfs. But i dont know why its not
> work in program.
>
> Many thanks for your cooperation.
>
> Best regards,
>
>
>
>
> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <do...@gmail.com>wrote:
>
>> It would be good if I could have a look on the files. Meantime try some
>> other directories. Also, check the directory permissions once.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <dr...@gmail.com>wrote:
>>
>>>
>>> I have tried through root user and made the following changes:
>>>
>>>
>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>
>>> No result. The following is the log output. The log shows the
>>> destination is null.
>>>
>>>
>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <dr...@gmail.com>wrote:
>>>
>>>> Yeah, My cluster running. When brows http://hadoop1.example.com:
>>>> 50070/dfshealth.jsp. I am getting the main page. Then click on Brows file
>>>> system. I am getting the following:
>>>>
>>>> hbase
>>>> tmp
>>>> user
>>>>
>>>> And when click on user getting:
>>>>
>>>> beeswax
>>>> huuser (I have created)
>>>> root (I have created)
>>>>
>>>> Would you like to see my configuration file. As did not change any
>>>> things, all by default. I have installed CDH4.1 and running on VMs.
>>>>
>>>> Many thanks
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>
>>>>> Is your cluster running fine? Are you able to browse Hdfs through the
>>>>> Hdfs Web Console at 50070?
>>>>>
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <dr...@gmail.com>wrote:
>>>>>
>>>>>> Many thanks.
>>>>>>
>>>>>> I have changed the program accordingly. It does not show any error
>>>>>> but one warring , but when I am browsing the HDFS folder, file is not
>>>>>> copied.
>>>>>>
>>>>>>
>>>>>> public class CopyData {
>>>>>> public static void main(String[] args) throws IOException{
>>>>>>         Configuration conf = new Configuration();
>>>>>>         //Configuration configuration = new Configuration();
>>>>>>         //configuration.addResource(new
>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>         //configuration.addResource(new
>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>
>>>>>>         conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>         conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>          FileSystem fs = FileSystem.get(conf);
>>>>>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>         Path outputFile = new Path("/user/hduser/Output1.csv");
>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>         fs.close();
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader <clinit>
>>>>>> WARNING: Unable to load native-hadoop library for your platform...
>>>>>> using builtin-java classes where applicable
>>>>>>
>>>>>> Have any idea?
>>>>>>
>>>>>> Many thanks
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>>
>>>>>>> If it is just copying the files without any processing or change,
>>>>>>> you can use something like this :
>>>>>>>
>>>>>>> public class CopyData {
>>>>>>>
>>>>>>>     public static void main(String[] args) throws IOException{
>>>>>>>
>>>>>>>         Configuration configuration = new Configuration();
>>>>>>>         configuration.addResource(new
>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>         configuration.addResource(new
>>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>         Path inputFile = new Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>         fs.close();
>>>>>>>     }
>>>>>>> }
>>>>>>>
>>>>>>> Obviously you have to modify it as per your requirements like
>>>>>>> continuously polling the targeted directory for new files.
>>>>>>>
>>>>>>> Regards,
>>>>>>>     Mohammad Tariq
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <drkashif8310@gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Thanks M  Tariq
>>>>>>>>
>>>>>>>> As I am new in  Java and Hadoop and have no much experience. I am
>>>>>>>> trying to first write a simple program to upload data into HDFS and
>>>>>>>> gradually move forward. I have written the following simple program to
>>>>>>>> upload the file into HDFS, I dont know why it does not working.  could you
>>>>>>>> please check it, if have time.
>>>>>>>>
>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>> import java.io.BufferedOutputStream;
>>>>>>>> import java.io.File;
>>>>>>>> import java.io.FileInputStream;
>>>>>>>> import java.io.FileOutputStream;
>>>>>>>> import java.io.IOException;
>>>>>>>> import java.io.InputStream;
>>>>>>>> import java.io.OutputStream;
>>>>>>>> import java.nio.*;
>>>>>>>> //import java.nio.file.Path;
>>>>>>>>
>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>> public class hdfsdata {
>>>>>>>>
>>>>>>>>
>>>>>>>> public static void main(String [] args) throws IOException
>>>>>>>> {
>>>>>>>>     try{
>>>>>>>>
>>>>>>>>
>>>>>>>>     Configuration conf = new Configuration();
>>>>>>>>     conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>     conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>>>     String dest = "/user/hduser/input/";
>>>>>>>>
>>>>>>>>     //String fileName = source.substring(source.lastIndexOf('/') +
>>>>>>>> source.length());
>>>>>>>>     String fileName = "Output1.csv";
>>>>>>>>
>>>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>>>     {
>>>>>>>>         dest = dest + "/" +fileName;
>>>>>>>>     }
>>>>>>>>     else
>>>>>>>>     {
>>>>>>>>         dest = dest + fileName;
>>>>>>>>
>>>>>>>>     }
>>>>>>>>     Path path = new Path(dest);
>>>>>>>>
>>>>>>>>
>>>>>>>>     if(fileSystem.exists(path))
>>>>>>>>     {
>>>>>>>>         System.out.println("File" + dest + " already exists");
>>>>>>>>     }
>>>>>>>>
>>>>>>>>
>>>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>    InputStream in = new BufferedInputStream(new FileInputStream(new
>>>>>>>> File(source)));
>>>>>>>>    File myfile = new File(source);
>>>>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>>>>    int numbytes = 0;
>>>>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>>>>
>>>>>>>>    {
>>>>>>>>        out.write(b,0,numbytes);
>>>>>>>>    }
>>>>>>>>    in.close();
>>>>>>>>    out.close();
>>>>>>>>    //bos.close();
>>>>>>>>    fileSystem.close();
>>>>>>>>     }
>>>>>>>>     catch(Exception e)
>>>>>>>>     {
>>>>>>>>
>>>>>>>>         System.out.println(e.toString());
>>>>>>>>     }
>>>>>>>>     }
>>>>>>>>
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks again,
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> KK
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <
>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> You can set your cronjob to execute the program after every 5 sec.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>     Mohammad Tariq
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <
>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Well, I want to automatically upload the files as  the files are
>>>>>>>>>> generating about every 3-5 sec and each file has size about 3MB.
>>>>>>>>>>
>>>>>>>>>>  Is it possible to automate the system using put or cp command?
>>>>>>>>>>
>>>>>>>>>> I read about the flume and webHDFS but I am not sure it will work
>>>>>>>>>> or not.
>>>>>>>>>>
>>>>>>>>>> Many thanks
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
>>>>>>>>>> wget.null@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Why do you don't use HDFS related tools like put or cp?
>>>>>>>>>>>
>>>>>>>>>>> - Alex
>>>>>>>>>>>
>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <
>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> > HI,
>>>>>>>>>>> >
>>>>>>>>>>> > I am generating files continuously in local folder of my base
>>>>>>>>>>> machine. How
>>>>>>>>>>> > I can now use the flume to stream the generated files from
>>>>>>>>>>> local folder to
>>>>>>>>>>> > HDFS.
>>>>>>>>>>> > I dont know how exactly configure the sources, sinks and hdfs.
>>>>>>>>>>> >
>>>>>>>>>>> > 1) location of folder where files are generating:
>>>>>>>>>>> /usr/datastorage/
>>>>>>>>>>> > 2) name node address: htdfs://hadoop1.example.com:8020
>>>>>>>>>>> >
>>>>>>>>>>> > Please let me help.
>>>>>>>>>>> >
>>>>>>>>>>> > Many thanks
>>>>>>>>>>> >
>>>>>>>>>>> > Best regards,
>>>>>>>>>>> > KK
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
When I apply the command

$ hadoop fs -put /usr/Eclipse/Output.csv /user/root/Output.csv

it works fine and the file shows up when I browse HDFS. But I don't know why it does not work in the program.
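
A quick way to see what the program is actually talking to is to print the filesystem URI: if core-site.xml is not found at the given path, FileSystem.get(conf) silently falls back to the local filesystem and the "copy" lands in a local /user/... directory instead of HDFS. A minimal check, reusing the same config paths as the program above:

Configuration conf = new Configuration();
conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
FileSystem fs = FileSystem.get(conf);
// Expect hdfs://hadoop1.example.com:8020 here; file:/// means the
// client never reached HDFS.
System.out.println("Filesystem in use: " + fs.getUri());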

Many thanks for your cooperation.

Best regards,



On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <do...@gmail.com> wrote:

> It would be good if I could have a look on the files. Meantime try some
> other directories. Also, check the directory permissions once.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <dr...@gmail.com>wrote:
>
>>
>> I have tried through root user and made the following changes:
>>
>>
>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>> Path outputFile = new Path("/user/root/Output1.csv");
>>
>> No result. The following is the log output. The log shows the destination
>> is null.
>>
>>
>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>
>>
>> Thanks
>>
>>
>>
>>
>>
>>
>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <dr...@gmail.com>wrote:
>>
>>> Yeah, My cluster running. When brows http://hadoop1.example.com:
>>> 50070/dfshealth.jsp. I am getting the main page. Then click on Brows file
>>> system. I am getting the following:
>>>
>>> hbase
>>> tmp
>>> user
>>>
>>> And when click on user getting:
>>>
>>> beeswax
>>> huuser (I have created)
>>> root (I have created)
>>>
>>> Would you like to see my configuration file. As did not change any
>>> things, all by default. I have installed CDH4.1 and running on VMs.
>>>
>>> Many thanks
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>
>>>> Is your cluster running fine? Are you able to browse Hdfs through the
>>>> Hdfs Web Console at 50070?
>>>>
>>>> Regards,
>>>>     Mohammad Tariq
>>>>
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <dr...@gmail.com>wrote:
>>>>
>>>>> Many thanks.
>>>>>
>>>>> I have changed the program accordingly. It does not show any error but
>>>>> one warring , but when I am browsing the HDFS folder, file is not copied.
>>>>>
>>>>>
>>>>> public class CopyData {
>>>>> public static void main(String[] args) throws IOException{
>>>>>         Configuration conf = new Configuration();
>>>>>         //Configuration configuration = new Configuration();
>>>>>         //configuration.addResource(new
>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>         //configuration.addResource(new
>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>
>>>>>         conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>         conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>          FileSystem fs = FileSystem.get(conf);
>>>>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>         Path outputFile = new Path("/user/hduser/Output1.csv");
>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>         fs.close();
>>>>>     }
>>>>> }
>>>>>
>>>>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader <clinit>
>>>>> WARNING: Unable to load native-hadoop library for your platform...
>>>>> using builtin-java classes where applicable
>>>>>
>>>>> Have any idea?
>>>>>
>>>>> Many thanks
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>
>>>>>> If it is just copying the files without any processing or change, you
>>>>>> can use something like this :
>>>>>>
>>>>>> public class CopyData {
>>>>>>
>>>>>>     public static void main(String[] args) throws IOException{
>>>>>>
>>>>>>         Configuration configuration = new Configuration();
>>>>>>         configuration.addResource(new
>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>         configuration.addResource(new
>>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>         Path inputFile = new Path("/home/mohammad/pc/work/FFT.java");
>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>         fs.close();
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> Obviously you have to modify it as per your requirements like
>>>>>> continuously polling the targeted directory for new files.
>>>>>>
>>>>>> Regards,
>>>>>>     Mohammad Tariq
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <dr...@gmail.com>wrote:
>>>>>>
>>>>>>> Thanks M  Tariq
>>>>>>>
>>>>>>> As I am new in  Java and Hadoop and have no much experience. I am
>>>>>>> trying to first write a simple program to upload data into HDFS and
>>>>>>> gradually move forward. I have written the following simple program to
>>>>>>> upload the file into HDFS, I dont know why it does not working.  could you
>>>>>>> please check it, if have time.
>>>>>>>
>>>>>>> import java.io.BufferedInputStream;
>>>>>>> import java.io.BufferedOutputStream;
>>>>>>> import java.io.File;
>>>>>>> import java.io.FileInputStream;
>>>>>>> import java.io.FileOutputStream;
>>>>>>> import java.io.IOException;
>>>>>>> import java.io.InputStream;
>>>>>>> import java.io.OutputStream;
>>>>>>> import java.nio.*;
>>>>>>> //import java.nio.file.Path;
>>>>>>>
>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>> public class hdfsdata {
>>>>>>>
>>>>>>>
>>>>>>> public static void main(String [] args) throws IOException
>>>>>>> {
>>>>>>>     try{
>>>>>>>
>>>>>>>
>>>>>>>     Configuration conf = new Configuration();
>>>>>>>     conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>     conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>>     String dest = "/user/hduser/input/";
>>>>>>>
>>>>>>>     //String fileName = source.substring(source.lastIndexOf('/') +
>>>>>>> source.length());
>>>>>>>     String fileName = "Output1.csv";
>>>>>>>
>>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>>     {
>>>>>>>         dest = dest + "/" +fileName;
>>>>>>>     }
>>>>>>>     else
>>>>>>>     {
>>>>>>>         dest = dest + fileName;
>>>>>>>
>>>>>>>     }
>>>>>>>     Path path = new Path(dest);
>>>>>>>
>>>>>>>
>>>>>>>     if(fileSystem.exists(path))
>>>>>>>     {
>>>>>>>         System.out.println("File" + dest + " already exists");
>>>>>>>     }
>>>>>>>
>>>>>>>
>>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>>    InputStream in = new BufferedInputStream(new FileInputStream(new
>>>>>>> File(source)));
>>>>>>>    File myfile = new File(source);
>>>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>>>    int numbytes = 0;
>>>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>>>
>>>>>>>    {
>>>>>>>        out.write(b,0,numbytes);
>>>>>>>    }
>>>>>>>    in.close();
>>>>>>>    out.close();
>>>>>>>    //bos.close();
>>>>>>>    fileSystem.close();
>>>>>>>     }
>>>>>>>     catch(Exception e)
>>>>>>>     {
>>>>>>>
>>>>>>>         System.out.println(e.toString());
>>>>>>>     }
>>>>>>>     }
>>>>>>>
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> Thanks again,
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> KK
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <dontariq@gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> You can set your cronjob to execute the program after every 5 sec.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>     Mohammad Tariq
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <
>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Well, I want to automatically upload the files as  the files are
>>>>>>>>> generating about every 3-5 sec and each file has size about 3MB.
>>>>>>>>>
>>>>>>>>>  Is it possible to automate the system using put or cp command?
>>>>>>>>>
>>>>>>>>> I read about the flume and webHDFS but I am not sure it will work
>>>>>>>>> or not.
>>>>>>>>>
>>>>>>>>> Many thanks
>>>>>>>>>
>>>>>>>>> Best regards
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
>>>>>>>>> wget.null@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Why do you don't use HDFS related tools like put or cp?
>>>>>>>>>>
>>>>>>>>>> - Alex
>>>>>>>>>>
>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <dr...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> > HI,
>>>>>>>>>> >
>>>>>>>>>> > I am generating files continuously in local folder of my base
>>>>>>>>>> machine. How
>>>>>>>>>> > I can now use the flume to stream the generated files from
>>>>>>>>>> local folder to
>>>>>>>>>> > HDFS.
>>>>>>>>>> > I dont know how exactly configure the sources, sinks and hdfs.
>>>>>>>>>> >
>>>>>>>>>> > 1) location of folder where files are generating:
>>>>>>>>>> /usr/datastorage/
>>>>>>>>>> > 2) name node address: htdfs://hadoop1.example.com:8020
>>>>>>>>>> >
>>>>>>>>>> > Please let me help.
>>>>>>>>>> >
>>>>>>>>>> > Many thanks
>>>>>>>>>> >
>>>>>>>>>> > Best regards,
>>>>>>>>>> > KK
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by Mohammad Tariq <do...@gmail.com>.
It would be good if I could have a look at the files. Meanwhile, try some
other directories. Also, check the directory permissions once.
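
Permissions can also be checked from the client code itself, which avoids guessing. A short sketch, assuming an already-open FileSystem handle fs as in the earlier programs (FileStatus is in org.apache.hadoop.fs):

FileStatus st = fs.getFileStatus(new Path("/user/root"));
// Prints e.g. "root supergroup rwxr-xr-x"; a client running as another
// OS user cannot create files here unless the mode allows it.
System.out.println(st.getOwner() + " " + st.getGroup() + " " + st.getPermission());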

Regards,
    Mohammad Tariq



On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <dr...@gmail.com> wrote:

>
> I have tried through root user and made the following changes:
>
>
> Path inputFile = new Path("/usr/Eclipse/Output.csv");
> Path outputFile = new Path("/user/root/Output1.csv");
>
> No result. The following is the log output. The log shows the destination
> is null.
>
>
> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>
>
> Thanks
>
>
>
>
>
>
> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <dr...@gmail.com>wrote:
>
>> Yeah, My cluster running. When brows http://hadoop1.example.com:
>> 50070/dfshealth.jsp. I am getting the main page. Then click on Brows file
>> system. I am getting the following:
>>
>> hbase
>> tmp
>> user
>>
>> And when click on user getting:
>>
>> beeswax
>> huuser (I have created)
>> root (I have created)
>>
>> Would you like to see my configuration file. As did not change any
>> things, all by default. I have installed CDH4.1 and running on VMs.
>>
>> Many thanks
>>
>>
>>
>>
>>
>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>
>>> Is your cluster running fine? Are you able to browse Hdfs through the
>>> Hdfs Web Console at 50070?
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <dr...@gmail.com>wrote:
>>>
>>>> Many thanks.
>>>>
>>>> I have changed the program accordingly. It does not show any error but
>>>> one warring , but when I am browsing the HDFS folder, file is not copied.
>>>>
>>>>
>>>> public class CopyData {
>>>> public static void main(String[] args) throws IOException{
>>>>         Configuration conf = new Configuration();
>>>>         //Configuration configuration = new Configuration();
>>>>         //configuration.addResource(new
>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>         //configuration.addResource(new
>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>
>>>>         conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>         conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>          FileSystem fs = FileSystem.get(conf);
>>>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>         Path outputFile = new Path("/user/hduser/Output1.csv");
>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>         fs.close();
>>>>     }
>>>> }
>>>>
>>>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader <clinit>
>>>> WARNING: Unable to load native-hadoop library for your platform...
>>>> using builtin-java classes where applicable
>>>>
>>>> Have any idea?
>>>>
>>>> Many thanks
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>
>>>>> If it is just copying the files without any processing or change, you
>>>>> can use something like this :
>>>>>
>>>>> public class CopyData {
>>>>>
>>>>>     public static void main(String[] args) throws IOException{
>>>>>
>>>>>         Configuration configuration = new Configuration();
>>>>>         configuration.addResource(new
>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>         configuration.addResource(new
>>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>         Path inputFile = new Path("/home/mohammad/pc/work/FFT.java");
>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>         fs.close();
>>>>>     }
>>>>> }
>>>>>
>>>>> Obviously you have to modify it as per your requirements like
>>>>> continuously polling the targeted directory for new files.
>>>>>
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <dr...@gmail.com>wrote:
>>>>>
>>>>>> Thanks M  Tariq
>>>>>>
>>>>>> As I am new in  Java and Hadoop and have no much experience. I am
>>>>>> trying to first write a simple program to upload data into HDFS and
>>>>>> gradually move forward. I have written the following simple program to
>>>>>> upload the file into HDFS, I dont know why it does not working.  could you
>>>>>> please check it, if have time.
>>>>>>
>>>>>> import java.io.BufferedInputStream;
>>>>>> import java.io.BufferedOutputStream;
>>>>>> import java.io.File;
>>>>>> import java.io.FileInputStream;
>>>>>> import java.io.FileOutputStream;
>>>>>> import java.io.IOException;
>>>>>> import java.io.InputStream;
>>>>>> import java.io.OutputStream;
>>>>>> import java.nio.*;
>>>>>> //import java.nio.file.Path;
>>>>>>
>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>> import org.apache.hadoop.fs.Path;
>>>>>> public class hdfsdata {
>>>>>>
>>>>>>
>>>>>> public static void main(String [] args) throws IOException
>>>>>> {
>>>>>>     try{
>>>>>>
>>>>>>
>>>>>>     Configuration conf = new Configuration();
>>>>>>     conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>     conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>     String dest = "/user/hduser/input/";
>>>>>>
>>>>>>     //String fileName = source.substring(source.lastIndexOf('/') +
>>>>>> source.length());
>>>>>>     String fileName = "Output1.csv";
>>>>>>
>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>     {
>>>>>>         dest = dest + "/" +fileName;
>>>>>>     }
>>>>>>     else
>>>>>>     {
>>>>>>         dest = dest + fileName;
>>>>>>
>>>>>>     }
>>>>>>     Path path = new Path(dest);
>>>>>>
>>>>>>
>>>>>>     if(fileSystem.exists(path))
>>>>>>     {
>>>>>>         System.out.println("File" + dest + " already exists");
>>>>>>     }
>>>>>>
>>>>>>
>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>    InputStream in = new BufferedInputStream(new FileInputStream(new
>>>>>> File(source)));
>>>>>>    File myfile = new File(source);
>>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>>    int numbytes = 0;
>>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>>
>>>>>>    {
>>>>>>        out.write(b,0,numbytes);
>>>>>>    }
>>>>>>    in.close();
>>>>>>    out.close();
>>>>>>    //bos.close();
>>>>>>    fileSystem.close();
>>>>>>     }
>>>>>>     catch(Exception e)
>>>>>>     {
>>>>>>
>>>>>>         System.out.println(e.toString());
>>>>>>     }
>>>>>>     }
>>>>>>
>>>>>> }
>>>>>>
>>>>>>
>>>>>> Thanks again,
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> KK
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>>
>>>>>>> You can set your cronjob to execute the program after every 5 sec.
>>>>>>>
>>>>>>> Regards,
>>>>>>>     Mohammad Tariq
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <drkashif8310@gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Well, I want to automatically upload the files as  the files are
>>>>>>>> generating about every 3-5 sec and each file has size about 3MB.
>>>>>>>>
>>>>>>>>  Is it possible to automate the system using put or cp command?
>>>>>>>>
>>>>>>>> I read about the flume and webHDFS but I am not sure it will work
>>>>>>>> or not.
>>>>>>>>
>>>>>>>> Many thanks
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
>>>>>>>> wget.null@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Why do you don't use HDFS related tools like put or cp?
>>>>>>>>>
>>>>>>>>> - Alex
>>>>>>>>>
>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <dr...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> > HI,
>>>>>>>>> >
>>>>>>>>> > I am generating files continuously in local folder of my base
>>>>>>>>> machine. How
>>>>>>>>> > I can now use the flume to stream the generated files from local
>>>>>>>>> folder to
>>>>>>>>> > HDFS.
>>>>>>>>> > I dont know how exactly configure the sources, sinks and hdfs.
>>>>>>>>> >
>>>>>>>>> > 1) location of folder where files are generating:
>>>>>>>>> /usr/datastorage/
>>>>>>>>> > 2) name node address: htdfs://hadoop1.example.com:8020
>>>>>>>>> >
>>>>>>>>> > Please let me help.
>>>>>>>>> >
>>>>>>>>> > Many thanks
>>>>>>>>> >
>>>>>>>>> > Best regards,
>>>>>>>>> > KK
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
I have tried it as the root user and made the following changes:

Path inputFile = new Path("/usr/Eclipse/Output.csv");
Path outputFile = new Path("/user/root/Output1.csv");

No result. The following is the NameNode audit log output; it shows dst=null
for every request.


2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
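
Two things stand out in that log, for what it's worth: ugi=dr.who is the default user the HDFS web UI browses as, and getfileinfo/listStatus are read operations, so these entries come from browsing the UI rather than from the program's write (dst=null is normal for them). If the program runs as an OS user that HDFS does not recognize, one option is to open the connection as an explicit user. A sketch, assuming the NameNode address from earlier in the thread:

import java.net.URI;

// FileSystem.get(uri, conf, user) performs the requests as the given HDFS user.
// Note this overload throws IOException and InterruptedException.
FileSystem fs = FileSystem.get(
        URI.create("hdfs://hadoop1.example.com:8020"), conf, "root");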


Thanks





On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <dr...@gmail.com> wrote:

> Yeah, My cluster running. When brows http://hadoop1.example.com:
> 50070/dfshealth.jsp. I am getting the main page. Then click on Brows file
> system. I am getting the following:
>
> hbase
> tmp
> user
>
> And when click on user getting:
>
> beeswax
> huuser (I have created)
> root (I have created)
>
> Would you like to see my configuration file. As did not change any things,
> all by default. I have installed CDH4.1 and running on VMs.
>
> Many thanks
>
>
>
>
>
> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <do...@gmail.com>wrote:
>
>> Is your cluster running fine? Are you able to browse Hdfs through the
>> Hdfs Web Console at 50070?
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <dr...@gmail.com>wrote:
>>
>>> Many thanks.
>>>
>>> I have changed the program accordingly. It does not show any error but
>>> one warring , but when I am browsing the HDFS folder, file is not copied.
>>>
>>>
>>> public class CopyData {
>>> public static void main(String[] args) throws IOException{
>>>         Configuration conf = new Configuration();
>>>         //Configuration configuration = new Configuration();
>>>         //configuration.addResource(new
>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>         //configuration.addResource(new
>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>
>>>         conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>         conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>>>          FileSystem fs = FileSystem.get(conf);
>>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>         Path outputFile = new Path("/user/hduser/Output1.csv");
>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>         fs.close();
>>>     }
>>> }
>>>
>>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader <clinit>
>>> WARNING: Unable to load native-hadoop library for your platform... using
>>> builtin-java classes where applicable
>>>
>>> Have any idea?
>>>
>>> Many thanks
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>
>>>> If it is just copying the files without any processing or change, you
>>>> can use something like this :
>>>>
>>>> public class CopyData {
>>>>
>>>>     public static void main(String[] args) throws IOException{
>>>>
>>>>         Configuration configuration = new Configuration();
>>>>         configuration.addResource(new
>>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>         configuration.addResource(new
>>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>         Path inputFile = new Path("/home/mohammad/pc/work/FFT.java");
>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>         fs.close();
>>>>     }
>>>> }
>>>>
>>>> Obviously you have to modify it as per your requirements like
>>>> continuously polling the targeted directory for new files.
>>>>
>>>> Regards,
>>>>     Mohammad Tariq
>>>>
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <dr...@gmail.com>wrote:
>>>>
>>>>> Thanks M  Tariq
>>>>>
>>>>> As I am new in  Java and Hadoop and have no much experience. I am
>>>>> trying to first write a simple program to upload data into HDFS and
>>>>> gradually move forward. I have written the following simple program to
>>>>> upload the file into HDFS, I dont know why it does not working.  could you
>>>>> please check it, if have time.
>>>>>
>>>>> import java.io.BufferedInputStream;
>>>>> import java.io.BufferedOutputStream;
>>>>> import java.io.File;
>>>>> import java.io.FileInputStream;
>>>>> import java.io.FileOutputStream;
>>>>> import java.io.IOException;
>>>>> import java.io.InputStream;
>>>>> import java.io.OutputStream;
>>>>> import java.nio.*;
>>>>> //import java.nio.file.Path;
>>>>>
>>>>> import org.apache.hadoop.conf.Configuration;
>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>> import org.apache.hadoop.fs.Path;
>>>>> public class hdfsdata {
>>>>>
>>>>>
>>>>> public static void main(String [] args) throws IOException
>>>>> {
>>>>>     try{
>>>>>
>>>>>
>>>>>     Configuration conf = new Configuration();
>>>>>     conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>     conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>     String dest = "/user/hduser/input/";
>>>>>
>>>>>     //String fileName = source.substring(source.lastIndexOf('/') +
>>>>> source.length());
>>>>>     String fileName = "Output1.csv";
>>>>>
>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>     {
>>>>>         dest = dest + "/" +fileName;
>>>>>     }
>>>>>     else
>>>>>     {
>>>>>         dest = dest + fileName;
>>>>>
>>>>>     }
>>>>>     Path path = new Path(dest);
>>>>>
>>>>>
>>>>>     if(fileSystem.exists(path))
>>>>>     {
>>>>>         System.out.println("File" + dest + " already exists");
>>>>>     }
>>>>>
>>>>>
>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>    InputStream in = new BufferedInputStream(new FileInputStream(new
>>>>> File(source)));
>>>>>    File myfile = new File(source);
>>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>>    int numbytes = 0;
>>>>>    while((numbytes = in.read(b)) >= 0)
>>>>>
>>>>>    {
>>>>>        out.write(b,0,numbytes);
>>>>>    }
>>>>>    in.close();
>>>>>    out.close();
>>>>>    //bos.close();
>>>>>    fileSystem.close();
>>>>>     }
>>>>>     catch(Exception e)
>>>>>     {
>>>>>
>>>>>         System.out.println(e.toString());
>>>>>     }
>>>>>     }
>>>>>
>>>>> }
>>>>>
>>>>>
>>>>> Thanks again,
>>>>>
>>>>> Best regards,
>>>>>
>>>>> KK
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>>
>>>>>> You can set your cronjob to execute the program after every 5 sec.
>>>>>>
>>>>>> Regards,
>>>>>>     Mohammad Tariq
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <dr...@gmail.com>wrote:
>>>>>>
>>>>>>> Well, I want to automatically upload the files as  the files are
>>>>>>> generating about every 3-5 sec and each file has size about 3MB.
>>>>>>>
>>>>>>>  Is it possible to automate the system using put or cp command?
>>>>>>>
>>>>>>> I read about the flume and webHDFS but I am not sure it will work or
>>>>>>> not.
>>>>>>>
>>>>>>> Many thanks
>>>>>>>
>>>>>>> Best regards
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
>>>>>>> wget.null@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Why do you don't use HDFS related tools like put or cp?
>>>>>>>>
>>>>>>>> - Alex
>>>>>>>>
>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <dr...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> > HI,
>>>>>>>> >
>>>>>>>> > I am generating files continuously in local folder of my base
>>>>>>>> machine. How
>>>>>>>> > I can now use the flume to stream the generated files from local
>>>>>>>> folder to
>>>>>>>> > HDFS.
>>>>>>>> > I dont know how exactly configure the sources, sinks and hdfs.
>>>>>>>> >
>>>>>>>> > 1) location of folder where files are generating:
>>>>>>>> /usr/datastorage/
>>>>>>>> > 2) name node address: htdfs://hadoop1.example.com:8020
>>>>>>>> >
>>>>>>>> > Please let me help.
>>>>>>>> >
>>>>>>>> > Many thanks
>>>>>>>> >
>>>>>>>> > Best regards,
>>>>>>>> > KK
>>>>>>>>
>>>>>>>> --
>>>>>>>> Alexander Alten-Lorenz
>>>>>>>> http://mapredit.blogspot.com
>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
Yeah, my cluster is running. When I browse
http://hadoop1.example.com:50070/dfshealth.jsp I get the main page. Then I
click on "Browse the filesystem" and see the following:

hbase
tmp
user

And when I click on user I see:

beeswax
huuser (I have created)
root (I have created)

Would you like to see my configuration files? I did not change anything;
everything is at the defaults. I have installed CDH4.1 and it is running on VMs.

Many thanks




On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <do...@gmail.com> wrote:

> Is your cluster running fine? Are you able to browse Hdfs through the Hdfs
> Web Console at 50070?
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <dr...@gmail.com>wrote:
>
>> Many thanks.
>>
>> I have changed the program accordingly. It does not show any error but
>> one warring , but when I am browsing the HDFS folder, file is not copied.
>>
>>
>> public class CopyData {
>> public static void main(String[] args) throws IOException{
>>         Configuration conf = new Configuration();
>>         //Configuration configuration = new Configuration();
>>         //configuration.addResource(new
>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>         //configuration.addResource(new
>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>
>>         conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>         conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>>          FileSystem fs = FileSystem.get(conf);
>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>         Path outputFile = new Path("/user/hduser/Output1.csv");
>>         fs.copyFromLocalFile(inputFile, outputFile);
>>         fs.close();
>>     }
>> }
>>
>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader <clinit>
>> WARNING: Unable to load native-hadoop library for your platform... using
>> builtin-java classes where applicable
>>
>> Have any idea?
>>
>> Many thanks
>>
>>
>>
>>
>>
>>
>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>
>>> If it is just copying the files without any processing or change, you
>>> can use something like this :
>>>
>>> public class CopyData {
>>>
>>>     public static void main(String[] args) throws IOException{
>>>
>>>         Configuration configuration = new Configuration();
>>>         configuration.addResource(new
>>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>         configuration.addResource(new
>>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>         FileSystem fs = FileSystem.get(configuration);
>>>         Path inputFile = new Path("/home/mohammad/pc/work/FFT.java");
>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>         fs.close();
>>>     }
>>> }
>>>
>>> Obviously you have to modify it as per your requirements like
>>> continuously polling the targeted directory for new files.
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <dr...@gmail.com>wrote:
>>>
>>>> Thanks M  Tariq
>>>>
>>>> As I am new in  Java and Hadoop and have no much experience. I am
>>>> trying to first write a simple program to upload data into HDFS and
>>>> gradually move forward. I have written the following simple program to
>>>> upload the file into HDFS, I dont know why it does not working.  could you
>>>> please check it, if have time.
>>>>
>>>> import java.io.BufferedInputStream;
>>>> import java.io.BufferedOutputStream;
>>>> import java.io.File;
>>>> import java.io.FileInputStream;
>>>> import java.io.FileOutputStream;
>>>> import java.io.IOException;
>>>> import java.io.InputStream;
>>>> import java.io.OutputStream;
>>>> import java.nio.*;
>>>> //import java.nio.file.Path;
>>>>
>>>> import org.apache.hadoop.conf.Configuration;
>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>> import org.apache.hadoop.fs.FileSystem;
>>>> import org.apache.hadoop.fs.Path;
>>>> public class hdfsdata {
>>>>
>>>>
>>>> public static void main(String [] args) throws IOException
>>>> {
>>>>     try{
>>>>
>>>>
>>>>     Configuration conf = new Configuration();
>>>>     conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>     conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>     String dest = "/user/hduser/input/";
>>>>
>>>>     //String fileName = source.substring(source.lastIndexOf('/') +
>>>> source.length());
>>>>     String fileName = "Output1.csv";
>>>>
>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>     {
>>>>         dest = dest + "/" +fileName;
>>>>     }
>>>>     else
>>>>     {
>>>>         dest = dest + fileName;
>>>>
>>>>     }
>>>>     Path path = new Path(dest);
>>>>
>>>>
>>>>     if(fileSystem.exists(path))
>>>>     {
>>>>         System.out.println("File" + dest + " already exists");
>>>>     }
>>>>
>>>>
>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>    InputStream in = new BufferedInputStream(new FileInputStream(new
>>>> File(source)));
>>>>    File myfile = new File(source);
>>>>    byte [] b = new byte [(int) myfile.length() ];
>>>>    int numbytes = 0;
>>>>    while((numbytes = in.read(b)) >= 0)
>>>>
>>>>    {
>>>>        out.write(b,0,numbytes);
>>>>    }
>>>>    in.close();
>>>>    out.close();
>>>>    //bos.close();
>>>>    fileSystem.close();
>>>>     }
>>>>     catch(Exception e)
>>>>     {
>>>>
>>>>         System.out.println(e.toString());
>>>>     }
>>>>     }
>>>>
>>>> }
>>>>
>>>>
>>>> Thanks again,
>>>>
>>>> Best regards,
>>>>
>>>> KK
>>>>
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>>
>>>>> You can set your cronjob to execute the program after every 5 sec.
>>>>>
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <dr...@gmail.com>wrote:
>>>>>
>>>>>> Well, I want to automatically upload the files as  the files are
>>>>>> generating about every 3-5 sec and each file has size about 3MB.
>>>>>>
>>>>>>  Is it possible to automate the system using put or cp command?
>>>>>>
>>>>>> I read about the flume and webHDFS but I am not sure it will work or
>>>>>> not.
>>>>>>
>>>>>> Many thanks
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
>>>>>> wget.null@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Why do you don't use HDFS related tools like put or cp?
>>>>>>>
>>>>>>> - Alex
>>>>>>>
>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <dr...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> > HI,
>>>>>>> >
>>>>>>> > I am generating files continuously in local folder of my base
>>>>>>> machine. How
>>>>>>> > I can now use the flume to stream the generated files from local
>>>>>>> folder to
>>>>>>> > HDFS.
>>>>>>> > I dont know how exactly configure the sources, sinks and hdfs.
>>>>>>> >
>>>>>>> > 1) location of folder where files are generating: /usr/datastorage/
>>>>>>> > 2) name node address: htdfs://hadoop1.example.com:8020
>>>>>>> >
>>>>>>> > Please let me help.
>>>>>>> >
>>>>>>> > Many thanks
>>>>>>> >
>>>>>>> > Best regards,
>>>>>>> > KK
>>>>>>>
>>>>>>> --
>>>>>>> Alexander Alten-Lorenz
>>>>>>> http://mapredit.blogspot.com
>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by Mohammad Tariq <do...@gmail.com>.
Is your cluster running fine? Are you able to browse HDFS through the HDFS
web console on port 50070?

Regards,
    Mohammad Tariq



On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <dr...@gmail.com> wrote:

> Many thanks.
>
> I have changed the program accordingly. It does not show any error but one
> warring , but when I am browsing the HDFS folder, file is not copied.
>
>
> public class CopyData {
> public static void main(String[] args) throws IOException{
>         Configuration conf = new Configuration();
>         //Configuration configuration = new Configuration();
>         //configuration.addResource(new
> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>         //configuration.addResource(new
> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>
>         conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>         conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>          FileSystem fs = FileSystem.get(conf);
>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>         Path outputFile = new Path("/user/hduser/Output1.csv");
>         fs.copyFromLocalFile(inputFile, outputFile);
>         fs.close();
>     }
> }
>
> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader <clinit>
> WARNING: Unable to load native-hadoop library for your platform... using
> builtin-java classes where applicable
>
> Have any idea?
>
> Many thanks
>
>
>
>
>
>
> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <do...@gmail.com>wrote:
>
>> If it is just copying the files without any processing or change, you can
>> use something like this :
>>
>> public class CopyData {
>>
>>     public static void main(String[] args) throws IOException{
>>
>>         Configuration configuration = new Configuration();
>>         configuration.addResource(new
>> Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>         configuration.addResource(new
>> Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>         FileSystem fs = FileSystem.get(configuration);
>>         Path inputFile = new Path("/home/mohammad/pc/work/FFT.java");
>>         Path outputFile = new Path("/mapout/FFT.java");
>>         fs.copyFromLocalFile(inputFile, outputFile);
>>         fs.close();
>>     }
>> }
>>
>> Obviously you have to modify it as per your requirements like
>> continuously polling the targeted directory for new files.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <dr...@gmail.com>wrote:
>>
>>> Thanks M  Tariq
>>>
>>> As I am new in  Java and Hadoop and have no much experience. I am trying
>>> to first write a simple program to upload data into HDFS and gradually move
>>> forward. I have written the following simple program to upload the file
>>> into HDFS, I dont know why it does not working.  could you please check it,
>>> if have time.
>>>
>>> import java.io.BufferedInputStream;
>>> import java.io.BufferedOutputStream;
>>> import java.io.File;
>>> import java.io.FileInputStream;
>>> import java.io.FileOutputStream;
>>> import java.io.IOException;
>>> import java.io.InputStream;
>>> import java.io.OutputStream;
>>> import java.nio.*;
>>> //import java.nio.file.Path;
>>>
>>> import org.apache.hadoop.conf.Configuration;
>>> import org.apache.hadoop.fs.FSDataInputStream;
>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>> import org.apache.hadoop.fs.FileSystem;
>>> import org.apache.hadoop.fs.Path;
>>> public class hdfsdata {
>>>
>>>
>>> public static void main(String [] args) throws IOException
>>> {
>>>     try{
>>>
>>>
>>>     Configuration conf = new Configuration();
>>>     conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>     conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>     String source = "/usr/Eclipse/Output.csv";
>>>     String dest = "/user/hduser/input/";
>>>
>>>     //String fileName = source.substring(source.lastIndexOf('/') +
>>> source.length());
>>>     String fileName = "Output1.csv";
>>>
>>>     if (dest.charAt(dest.length() -1) != '/')
>>>     {
>>>         dest = dest + "/" +fileName;
>>>     }
>>>     else
>>>     {
>>>         dest = dest + fileName;
>>>
>>>     }
>>>     Path path = new Path(dest);
>>>
>>>
>>>     if(fileSystem.exists(path))
>>>     {
>>>         System.out.println("File" + dest + " already exists");
>>>     }
>>>
>>>
>>>    FSDataOutputStream out = fileSystem.create(path);
>>>    InputStream in = new BufferedInputStream(new FileInputStream(new
>>> File(source)));
>>>    File myfile = new File(source);
>>>    byte [] b = new byte [(int) myfile.length() ];
>>>    int numbytes = 0;
>>>    while((numbytes = in.read(b)) >= 0)
>>>
>>>    {
>>>        out.write(b,0,numbytes);
>>>    }
>>>    in.close();
>>>    out.close();
>>>    //bos.close();
>>>    fileSystem.close();
>>>     }
>>>     catch(Exception e)
>>>     {
>>>
>>>         System.out.println(e.toString());
>>>     }
>>>     }
>>>
>>> }
>>>
>>>
>>> Thanks again,
>>>
>>> Best regards,
>>>
>>> KK
>>>
>>>
>>>
>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <do...@gmail.com>wrote:
>>>
>>>> You can set your cronjob to execute the program after every 5 sec.
>>>>
>>>> Regards,
>>>>     Mohammad Tariq
>>>>
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <dr...@gmail.com>wrote:
>>>>
>>>>> Well, I want to automatically upload the files as  the files are
>>>>> generating about every 3-5 sec and each file has size about 3MB.
>>>>>
>>>>>  Is it possible to automate the system using put or cp command?
>>>>>
>>>>> I read about the flume and webHDFS but I am not sure it will work or
>>>>> not.
>>>>>
>>>>> Many thanks
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
>>>>> wget.null@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Why do you don't use HDFS related tools like put or cp?
>>>>>>
>>>>>> - Alex
>>>>>>
>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <dr...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> > HI,
>>>>>> >
>>>>>> > I am generating files continuously in local folder of my base
>>>>>> machine. How
>>>>>> > I can now use the flume to stream the generated files from local
>>>>>> folder to
>>>>>> > HDFS.
>>>>>> > I dont know how exactly configure the sources, sinks and hdfs.
>>>>>> >
>>>>>> > 1) location of folder where files are generating: /usr/datastorage/
>>>>>> > 2) name node address: htdfs://hadoop1.example.com:8020
>>>>>> >
>>>>>> > Please let me help.
>>>>>> >
>>>>>> > Many thanks
>>>>>> >
>>>>>> > Best regards,
>>>>>> > KK
>>>>>>
>>>>>> --
>>>>>> Alexander Alten-Lorenz
>>>>>> http://mapredit.blogspot.com
>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
Many thanks.

I have changed the program accordingly. It does not show any error, only the
one warning below, but when I browse the HDFS folder the file has not been
copied.


public class CopyData {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);
        Path inputFile = new Path("/usr/Eclipse/Output.csv");
        Path outputFile = new Path("/user/hduser/Output1.csv");
        fs.copyFromLocalFile(inputFile, outputFile);
        fs.close();
    }
}

19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader <clinit>
WARNING: Unable to load native-hadoop library for your platform... using
builtin-java classes where applicable

Do you have any idea what is going wrong?

Many thanks
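
The NativeCodeLoader message is only a warning: the native compression
libraries were not found, so the built-in Java implementations are used,
which does not affect the copy. If the file never appears in HDFS, the usual
cause is that the two addResource() calls did not take effect, in which case
FileSystem.get(conf) returns the local file system and the file lands on the
local disk instead. A quick way to check, as a minimal sketch (CheckFs is
just a placeholder name):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckFs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);
        // Hadoop 1.x reads fs.default.name, Hadoop 2.x reads fs.defaultFS
        System.out.println("fs.default.name = " + conf.get("fs.default.name"));
        System.out.println("fs.defaultFS    = " + conf.get("fs.defaultFS"));
        System.out.println("FileSystem URI  = " + fs.getUri());
    }
}

If the printed URI is file:/// rather than hdfs://hadoop1.example.com:8020,
the XML paths were wrong or unreadable and the copy went to the local file
system.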





On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <do...@gmail.com> wrote:

> If it is just copying the files without any processing or change, you can
> use something like this :
>
> public class CopyData {
>
>     public static void main(String[] args) throws IOException{
>
>         Configuration configuration = new Configuration();
>         configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>         configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>         FileSystem fs = FileSystem.get(configuration);
>         Path inputFile = new Path("/home/mohammad/pc/work/FFT.java");
>         Path outputFile = new Path("/mapout/FFT.java");
>         fs.copyFromLocalFile(inputFile, outputFile);
>         fs.close();
>     }
> }
>
> Obviously you have to modify it as per your requirements like continuously
> polling the targeted directory for new files.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <dr...@gmail.com> wrote:
>
>> Thanks M Tariq
>>
>> As I am new to Java and Hadoop and do not have much experience, I am trying
>> to first write a simple program to upload data into HDFS and gradually move
>> forward. I have written the following simple program to upload a file into
>> HDFS, but I don't know why it is not working. Could you please check it if
>> you have time?
>>
>> import java.io.BufferedInputStream;
>> import java.io.BufferedOutputStream;
>> import java.io.File;
>> import java.io.FileInputStream;
>> import java.io.FileOutputStream;
>> import java.io.IOException;
>> import java.io.InputStream;
>> import java.io.OutputStream;
>> import java.nio.*;
>> //import java.nio.file.Path;
>>
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.fs.FSDataInputStream;
>> import org.apache.hadoop.fs.FSDataOutputStream;
>> import org.apache.hadoop.fs.FileSystem;
>> import org.apache.hadoop.fs.Path;
>> public class hdfsdata {
>>
>>
>> public static void main(String [] args) throws IOException
>> {
>>     try{
>>
>>
>>     Configuration conf = new Configuration();
>>     conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>     conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>>     FileSystem fileSystem = FileSystem.get(conf);
>>     String source = "/usr/Eclipse/Output.csv";
>>     String dest = "/user/hduser/input/";
>>
>>     //String fileName = source.substring(source.lastIndexOf('/') + 1);
>>     String fileName = "Output1.csv";
>>
>>     if (dest.charAt(dest.length() -1) != '/')
>>     {
>>         dest = dest + "/" +fileName;
>>     }
>>     else
>>     {
>>         dest = dest + fileName;
>>
>>     }
>>     Path path = new Path(dest);
>>
>>
>>     if (fileSystem.exists(path))
>>     {
>>         System.out.println("File " + dest + " already exists");
>>         return;   // stop instead of overwriting the existing file
>>     }
>>
>>
>>    FSDataOutputStream out = fileSystem.create(path);
>>    InputStream in = new BufferedInputStream(new FileInputStream(new
>> File(source)));
>>    File myfile = new File(source);
>>    byte [] b = new byte [(int) myfile.length() ];
>>    int numbytes = 0;
>>    while((numbytes = in.read(b)) >= 0)
>>
>>    {
>>        out.write(b,0,numbytes);
>>    }
>>    in.close();
>>    out.close();
>>    //bos.close();
>>    fileSystem.close();
>>     }
>>     catch(Exception e)
>>     {
>>
>>         System.out.println(e.toString());
>>     }
>>     }
>>
>> }
>>
>>
>> Thanks again,
>>
>> Best regards,
>>
>> KK
>>
>>
>>
>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <do...@gmail.com> wrote:
>>
>>> You can set up a cron job to run the program periodically. Note that
>>> cron cannot fire more often than once a minute, so a 5-second interval
>>> needs a loop inside the program itself.
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <dr...@gmail.com> wrote:
>>>
>>>> Well, I want to upload the files automatically: they are generated
>>>> about every 3-5 seconds and each file is about 3 MB.
>>>>
>>>> Is it possible to automate this with the put or cp command?
>>>>
>>>> I have read about Flume and WebHDFS, but I am not sure whether they
>>>> would work.
>>>>
>>>> Many thanks
>>>>
>>>> Best regards
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
>>>> wget.null@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Why don't you use HDFS-related tools like put or cp?
>>>>>
>>>>> - Alex
>>>>>
>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <dr...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Hi,
>>>>> >
>>>>> > I am generating files continuously in a local folder of my base
>>>>> > machine. How can I now use Flume to stream the generated files from
>>>>> > the local folder to HDFS? I don't know how exactly to configure the
>>>>> > sources, sinks, and HDFS.
>>>>> >
>>>>> > 1) location of the folder where files are generated: /usr/datastorage/
>>>>> > 2) name node address: hdfs://hadoop1.example.com:8020
>>>>> >
>>>>> > Please help me.
>>>>> >
>>>>> > Many thanks
>>>>> >
>>>>> > Best regards,
>>>>> > KK
>>>>>
>>>>> --
>>>>> Alexander Alten-Lorenz
>>>>> http://mapredit.blogspot.com
>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by Mohammad Tariq <do...@gmail.com>.
If it is just copying the files without any processing or change, you can
use something like this :

public class CopyData {

    public static void main(String[] args) throws IOException{

        Configuration configuration = new Configuration();
        configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
        configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(configuration);
        Path inputFile = new Path("/home/mohammad/pc/work/FFT.java");
        Path outputFile = new Path("/mapout/FFT.java");
        fs.copyFromLocalFile(inputFile, outputFile);
        fs.close();
    }
}

Obviously you have to modify it as per your requirements, for example by
continuously polling the target directory for new files, as in the sketch
below.
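
One way the polling could look, as a rough sketch rather than a drop-in
solution (DirectoryWatcher, the 5-second interval, and the directory paths
are placeholders taken from earlier in the thread; it also assumes a file is
complete by the time it appears, which holds if the producer writes it
elsewhere and renames the finished file into the watched folder):

import java.io.File;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DirectoryWatcher {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);

        File watchDir = new File("/usr/datastorage");
        Set<String> seen = new HashSet<String>();   // names already copied

        while (true) {
            File[] files = watchDir.listFiles();
            if (files != null) {
                for (File f : files) {
                    // add() returns false when the name was already copied
                    if (f.isFile() && seen.add(f.getName())) {
                        fs.copyFromLocalFile(new Path(f.getAbsolutePath()),
                                new Path("/user/hduser/input/" + f.getName()));
                    }
                }
            }
            Thread.sleep(5000);   // poll every 5 seconds
        }
    }
}

A HashSet is fine for a short-lived process; a long-running one would want
to evict old names or move copied files out of the watched directory.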

Regards,
    Mohammad Tariq



On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <dr...@gmail.com> wrote:

> Thanks M Tariq
>
> As I am new to Java and Hadoop and do not have much experience, I am trying
> to first write a simple program to upload data into HDFS and gradually move
> forward. I have written the following simple program to upload a file into
> HDFS, but I don't know why it is not working. Could you please check it if
> you have time?
>
> import java.io.BufferedInputStream;
> import java.io.BufferedOutputStream;
> import java.io.File;
> import java.io.FileInputStream;
> import java.io.FileOutputStream;
> import java.io.IOException;
> import java.io.InputStream;
> import java.io.OutputStream;
> import java.nio.*;
> //import java.nio.file.Path;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FSDataInputStream;
> import org.apache.hadoop.fs.FSDataOutputStream;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> public class hdfsdata {
>
>
> public static void main(String [] args) throws IOException
> {
>     try{
>
>
>     Configuration conf = new Configuration();
>     conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>     conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>     FileSystem fileSystem = FileSystem.get(conf);
>     String source = "/usr/Eclipse/Output.csv";
>     String dest = "/user/hduser/input/";
>
>     //String fileName = source.substring(source.lastIndexOf('/') + 1);
>     String fileName = "Output1.csv";
>
>     if (dest.charAt(dest.length() -1) != '/')
>     {
>         dest = dest + "/" +fileName;
>     }
>     else
>     {
>         dest = dest + fileName;
>
>     }
>     Path path = new Path(dest);
>
>
>     if (fileSystem.exists(path))
>     {
>         System.out.println("File " + dest + " already exists");
>         return;   // stop instead of overwriting the existing file
>     }
>
>
>    FSDataOutputStream out = fileSystem.create(path);
>    InputStream in = new BufferedInputStream(new FileInputStream(new
> File(source)));
>    File myfile = new File(source);
>    byte [] b = new byte [(int) myfile.length() ];
>    int numbytes = 0;
>    while((numbytes = in.read(b)) >= 0)
>
>    {
>        out.write(b,0,numbytes);
>    }
>    in.close();
>    out.close();
>    //bos.close();
>    fileSystem.close();
>     }
>     catch(Exception e)
>     {
>
>         System.out.println(e.toString());
>     }
>     }
>
> }
>
>
> Thanks again,
>
> Best regards,
>
> KK
>
>
>
> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <do...@gmail.com> wrote:
>
>> You can set up a cron job to run the program periodically. Note that
>> cron cannot fire more often than once a minute, so a 5-second interval
>> needs a loop inside the program itself.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <dr...@gmail.com> wrote:
>>
>>> Well, I want to upload the files automatically: they are generated
>>> about every 3-5 seconds and each file is about 3 MB.
>>>
>>> Is it possible to automate this with the put or cp command?
>>>
>>> I have read about Flume and WebHDFS, but I am not sure whether they
>>> would work.
>>>
>>> Many thanks
>>>
>>> Best regards
>>>
>>>
>>>
>>>
>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
>>> wget.null@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Why don't you use HDFS-related tools like put or cp?
>>>>
>>>> - Alex
>>>>
>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <dr...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi,
>>>> >
>>>> > I am generating files continuously in a local folder of my base
>>>> > machine. How can I now use Flume to stream the generated files from
>>>> > the local folder to HDFS? I don't know how exactly to configure the
>>>> > sources, sinks, and HDFS.
>>>> >
>>>> > 1) location of the folder where files are generated: /usr/datastorage/
>>>> > 2) name node address: hdfs://hadoop1.example.com:8020
>>>> >
>>>> > Please help me.
>>>> >
>>>> > Many thanks
>>>> >
>>>> > Best regards,
>>>> > KK
>>>>
>>>> --
>>>> Alexander Alten-Lorenz
>>>> http://mapredit.blogspot.com
>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>
>>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
Thanks M Tariq

As I am new to Java and Hadoop and do not have much experience, I am trying
to first write a simple program to upload data into HDFS and gradually move
forward. I have written the following simple program to upload a file into
HDFS, but I don't know why it is not working. Could you please check it if
you have time?

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class hdfsdata {

    public static void main(String[] args) throws IOException
    {
        try
        {
            // Load the cluster configuration; without these resources
            // FileSystem.get() silently falls back to the local file system.
            Configuration conf = new Configuration();
            conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
            conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
            FileSystem fileSystem = FileSystem.get(conf);

            String source = "/usr/Eclipse/Output.csv";
            String dest = "/user/hduser/input/";

            // To derive the name from the source path instead:
            //String fileName = source.substring(source.lastIndexOf('/') + 1);
            String fileName = "Output1.csv";

            if (dest.charAt(dest.length() - 1) != '/')
            {
                dest = dest + "/" + fileName;
            }
            else
            {
                dest = dest + fileName;
            }
            Path path = new Path(dest);

            if (fileSystem.exists(path))
            {
                System.out.println("File " + dest + " already exists");
                return;   // stop instead of overwriting the existing file
            }

            FSDataOutputStream out = fileSystem.create(path);
            InputStream in = new BufferedInputStream(
                    new FileInputStream(new File(source)));

            // Copy with a fixed-size buffer; read() returns -1 at end of file.
            byte[] b = new byte[4096];
            int numbytes;
            while ((numbytes = in.read(b)) != -1)
            {
                out.write(b, 0, numbytes);
            }
            in.close();
            out.close();
            fileSystem.close();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}
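
As an aside, the manual byte loop can be replaced by a helper that ships
with Hadoop. A minimal sketch of just the copy step, with the rest of the
program unchanged:

// add to the imports:
import org.apache.hadoop.io.IOUtils;

// ...and replace the buffer loop with:
FSDataOutputStream out = fileSystem.create(path);
InputStream in = new BufferedInputStream(new FileInputStream(new File(source)));
// copyBytes(in, out, bufferSize, close): copies the whole stream and,
// because close is true, closes both streams when it finishes
IOUtils.copyBytes(in, out, 4096, true);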


Thanks again,

Best regards,

KK


On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <do...@gmail.com> wrote:

> You can set up a cron job to run the program periodically. Note that
> cron cannot fire more often than once a minute, so a 5-second interval
> needs a loop inside the program itself.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <dr...@gmail.com> wrote:
>
>> Well, I want to upload the files automatically: they are generated
>> about every 3-5 seconds and each file is about 3 MB.
>>
>> Is it possible to automate this with the put or cp command?
>>
>> I have read about Flume and WebHDFS, but I am not sure whether they
>> would work.
>>
>> Many thanks
>>
>> Best regards
>>
>>
>>
>>
>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
>> wget.null@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Why don't you use HDFS-related tools like put or cp?
>>>
>>> - Alex
>>>
>>> On Nov 19, 2012, at 11:44 AM, kashif khan <dr...@gmail.com>
>>> wrote:
>>>
>>> > Hi,
>>> >
>>> > I am generating files continuously in a local folder of my base
>>> > machine. How can I now use Flume to stream the generated files from
>>> > the local folder to HDFS? I don't know how exactly to configure the
>>> > sources, sinks, and HDFS.
>>> >
>>> > 1) location of the folder where files are generated: /usr/datastorage/
>>> > 2) name node address: hdfs://hadoop1.example.com:8020
>>> >
>>> > Please help me.
>>> >
>>> > Many thanks
>>> >
>>> > Best regards,
>>> > KK
>>>
>>> --
>>> Alexander Alten-Lorenz
>>> http://mapredit.blogspot.com
>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>
>>>
>>
>

Re: Automatically upload files into HDFS

Posted by Mohammad Tariq <do...@gmail.com>.
You can set up a cron job to run the program periodically. Note that cron
cannot fire more often than once a minute, so a 5-second interval needs a
loop inside the program itself (see the sketch below).

Regards,
    Mohammad Tariq
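
Since cron cannot fire more often than once a minute, the 5-second schedule
can instead live inside the JVM. A minimal sketch with java.util.concurrent,
where copyNewFiles() is a hypothetical stand-in for the copy logic from the
earlier messages:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class CopyScheduler {

    public static void main(String[] args) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        // Fire every 5 seconds, starting immediately.
        scheduler.scheduleAtFixedRate(new Runnable() {
            public void run() {
                copyNewFiles();
            }
        }, 0, 5, TimeUnit.SECONDS);
    }

    static void copyNewFiles() {
        // placeholder: copy any new files from /usr/datastorage to HDFS,
        // e.g. with FileSystem.copyFromLocalFile() as in the earlier messages
    }
}

scheduleAtFixedRate keeps a fixed 5-second cadence; scheduleWithFixedDelay
would instead wait 5 seconds after each run finishes.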



On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <dr...@gmail.com> wrote:

> Well, I want to upload the files automatically: they are generated
> about every 3-5 seconds and each file is about 3 MB.
>
> Is it possible to automate this with the put or cp command?
>
> I have read about Flume and WebHDFS, but I am not sure whether they
> would work.
>
> Many thanks
>
> Best regards
>
>
>
>
> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
> wget.null@gmail.com> wrote:
>
>> Hi,
>>
>> Why don't you use HDFS-related tools like put or cp?
>>
>> - Alex
>>
>> On Nov 19, 2012, at 11:44 AM, kashif khan <dr...@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > I am generating files continuously in a local folder of my base
>> > machine. How can I now use Flume to stream the generated files from
>> > the local folder to HDFS? I don't know how exactly to configure the
>> > sources, sinks, and HDFS.
>> >
>> > 1) location of the folder where files are generated: /usr/datastorage/
>> > 2) name node address: hdfs://hadoop1.example.com:8020
>> >
>> > Please help me.
>> >
>> > Many thanks
>> >
>> > Best regards,
>> > KK
>>
>> --
>> Alexander Alten-Lorenz
>> http://mapredit.blogspot.com
>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>
>>
>

Re: Automatically upload files into HDFS

Posted by kashif khan <dr...@gmail.com>.
Well, I want to upload the files automatically: they are generated about
every 3-5 seconds and each file is about 3 MB.

Is it possible to automate this with the put or cp command?

I have read about Flume and WebHDFS, but I am not sure whether they would
work.

Many thanks

Best regards



On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <
wget.null@gmail.com> wrote:

> Hi,
>
> Why don't you use HDFS-related tools like put or cp?
>
> - Alex
>
> On Nov 19, 2012, at 11:44 AM, kashif khan <dr...@gmail.com> wrote:
>
> > Hi,
> >
> > I am generating files continuously in a local folder of my base
> > machine. How can I now use Flume to stream the generated files from
> > the local folder to HDFS? I don't know how exactly to configure the
> > sources, sinks, and HDFS.
> >
> > 1) location of the folder where files are generated: /usr/datastorage/
> > 2) name node address: hdfs://hadoop1.example.com:8020
> >
> > Please help me.
> >
> > Many thanks
> >
> > Best regards,
> > KK
>
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>
>

Re: Automatically upload files into HDFS

Posted by Alexander Alten-Lorenz <wg...@gmail.com>.
Hi,

Why don't you use HDFS-related tools like put or cp?

- Alex

On Nov 19, 2012, at 11:44 AM, kashif khan <dr...@gmail.com> wrote:

> Hi,
> 
> I am generating files continuously in a local folder of my base machine.
> How can I now use Flume to stream the generated files from the local
> folder to HDFS? I don't know how exactly to configure the sources, sinks,
> and HDFS.
> 
> 1) location of the folder where files are generated: /usr/datastorage/
> 2) name node address: hdfs://hadoop1.example.com:8020
> 
> Please help me.
> 
> Many thanks
> 
> Best regards,
> KK

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF