Posted to hdfs-user@hadoop.apache.org by Alan Miller <Al...@synopsys.com> on 2010/04/07 15:24:56 UTC

append operations

Hi,

I'm just getting started with HDFS and I'm using the Java API.
What's the best way to synchronize files between my local and HDFS file systems?
I.e., if a 10G file only grows by 1G, I'd rather just append the 1G instead of copying the full 11G.

From my understanding, append operations haven't been implemented in HDFS yet.
Is there any news on the status of this?

I was hoping to implement something myself (using a FileInputStream and
FSDataOutputStream), but FSDataOutputStream is not seekable.

Any suggestions?

Regards,
Alan
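
[Editor's note: until append support is available, the "ship only the new tail" idea above can be sketched with standard tools. The HDFS write side is left abstract here (a working append sink would consume stdout); OFFSET and the sample file are made-up stand-ins, not anything from the thread.]

```shell
# Sketch: emit only the bytes the remote copy does not have yet.
LOCAL=$(mktemp)
printf 'helloworld' > "$LOCAL"    # stand-in for the 10G local file
OFFSET=5                          # bytes the remote copy already has
tail -c +$((OFFSET + 1)) "$LOCAL" # emit only the new tail: "world"
rm -f "$LOCAL"
```

Against HDFS, OFFSET would come from the remote file's length (e.g. FileStatus.getLen() in the Java API), and the emitted tail would be fed to whatever append-capable sink the deployed version provides.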

Re: append operations

Posted by Todd Lipcon <to...@cloudera.com>.
On Wed, Apr 7, 2010 at 11:21 AM, Thanh Do <th...@cs.wisc.edu> wrote:

> Regarding append support, I know you've done a lot of work fixing append
> in 0.22, but on the download page I can only find 0.20.2, in which append
> is optional and still buggy.
>
> Can you show me how to get the 0.22 version?
>
>
0.21 and 0.22 have not been released yet. You should only attempt to use
them if you're comfortable building from source control and are OK working
through some complications of running a trunk codebase.

-Todd





-- 
Todd Lipcon
Software Engineer, Cloudera

Re: append operations

Posted by Thanh Do <th...@cs.wisc.edu>.
Regarding append support, I know you've done a lot of work fixing append
in 0.22, but on the download page I can only find 0.20.2, in which append
is optional and still buggy.

Can you show me how to get the 0.22 version?

best,

Thanh




-- 
thanh

Re: append operations

Posted by Konstantin Boudnik <co...@yahoo-inc.com>.
Check HDFS-265. Append has been implemented in the trunk (i.e. 0.22), so you
can grab it, deploy, and start using it.

Cos

On Wed, Apr 07, 2010 at 06:24AM, Alan Miller wrote:
>    I'm just getting started with HDFS and I'm using the java API.
>    What's the best way to synchronize files between my local and hdfs file
>    systems. I.e. if a 10G file only grows by 1G, I'd rather just append the
>    1G instead of copying the full 11G.
>    [...]

Re: append operations

Posted by Jitendra Nath Pandey <ji...@yahoo-inc.com>.
You might have to export HADOOP_HOME and HADOOP_CONF_DIR as well, e.g.:

  export HADOOP_HOME=/opt/hadoop/0.22-SNAPSHOT/
  export HADOOP_CONF_DIR=/opt/hadoop/0.22-SNAPSHOT/conf

if that is where your Hadoop home and conf directories are.
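
[Editor's note: a minimal sketch of the suggestion above, using the install path assumed in the earlier build steps; adjust if your build landed elsewhere. The start scripts resolve paths from these variables, so it is worth confirming the expansion before running start-dfs.sh.]

```shell
# Assumed install location -- not confirmed anywhere in this thread.
export HADOOP_HOME=/opt/hadoop/0.22-SNAPSHOT
export HADOOP_CONF_DIR="$HADOOP_HOME/conf"
# Confirm the expansion the start scripts will see:
echo "$HADOOP_CONF_DIR"   # prints /opt/hadoop/0.22-SNAPSHOT/conf
```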


On 4/8/10 4:03 AM, "Alan Miller" <Al...@synopsys.com> wrote:

> Ok, great, I'd like to try the trunk. I got and compiled "common" and "hdfs"
> as follows, but I can't get it to start up ("NoClassDefFoundError", see below).
> Am I missing something?
> [...]
>   localhost: Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/hadoop/hdfs/server/datanode/DataNode


RE: append operations

Posted by Alan Miller <Al...@synopsys.com>.
Ok, great, I’d like to try the trunk. I got and compiled “common” and “hdfs”
as follows, but I can’t get it to start up ("NoClassDefFoundError", see below).
Am I missing something?

  svn co http://svn.apache.org/repos/asf/hadoop/hdfs/trunk
  svn co http://svn.apache.org/repos/asf/hadoop/common/trunk
  cd ….
  export ANT_HOME=/usr/share/ant
  export JAVA_HOME=/opt/java/jdk1.6.0_10-i586
  export PATH=$JAVA_HOME/bin:$PATH
  ant -l build-compile.log \
      -Dforrest.home=/usr/local/src/apache-forrest-0.8 \
      -Djava5.home=/opt/java/jdk1.5.0_22-i586 \
      package

Then I copied everything
  from: /usr/local/src/hadoop/common/trunk/build/hadoop-hdfs-0.22.0-SNAPSHOT/*
  and:   /usr/local/src/hadoop/hdfs/trunk/build/hadoop-core-0.22.0-SNAPSHOT/*
  to:  /opt/hadoop/0.22-SNAPSHOT/

When I run start-dfs.sh I get “java.lang.NoClassDefFoundError” errors:

  [root@amiller-e6400 ~]# start-dfs.sh
  starting namenode, logging to /opt/hadoop/0.22.0-SNAPSHOT/logs/hadoop-root-namenode-localhost.out
  localhost: starting datanode, logging to /opt/hadoop/0.22.0-SNAPSHOT/logs/hadoop-root-datanode-localhost.out
  localhost: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/server/datanode/DataNode
  localhost: Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.server.datanode.DataNode

Regards,
Alan