Posted to common-dev@hadoop.apache.org by Jeff Hammerbacher <je...@gmail.com> on 2007/07/29 07:53:37 UTC
using git for SCM?
hey all,
so we're starting to really use hadoop heavily here at facebook, and we'd
like to start contributing back to the hadoop trunk. we've been through
this process before (contributing heavily to an actively developed open
source project) with memcached, and the guy who owns the trunk of that
project is quite the SCM nut who says that his life has been made infinitely
more pleasant by switching from svn to git for version control with
memcached.
i'm not suggesting the project be moved tomorrow. i just wanted to throw
the question out there to get a sense of how others feel about git as an SCM
tool. i have asked our memcached guy to write up the specific advantages of
git versus svn so i will be happy to follow up with details.
apologies if this has been brought up before; i've been reading the mailing
list on and off now for several months and a quick search of the archives
yielded no matches for "git" so hopefully this is a relevant topic.
thanks,
jeff
Re: using git for SCM?
Posted by Raghu Angadi <ra...@yahoo-inc.com>.
Enis Soztutar wrote:
> Hi,
>
> I have switched to git just two days ago, and i am quite happy about it.
> Although i do not want hadoop to move to git before apache
> infrastructure has matured enough. As Linus addresses in his talk, many
> users have been using git+svn in a very innovative way. The central
> repository is managed by svn and the users checkout the code, build a
> local git repo and work from there.
This seems like a very useful case. When I was working on the patch for
HADOOP-1134, which depended on (changing) patches for other jiras, I was
working with my own local svn. That meant I could not update from the
central svn. This fixes it.
Raghu.
> Personally i find it *very*
> convenient to let both svn and git manage the code. What i do is make
> git ignore svn files, and make svn ignore git files. Then i just branch
> once for every issue i work on, committing the changes before switching
> to the other branch (issue). After i'm done with the issue i get the
> patch with a regular svn diff. Anyhow, i am mentioning this because it
> enabled me to work on several distinct issues simultaneously using the
> same code base. I highly suggest this workspace-model to anyone working
> this way.
Re: using git for SCM?
Posted by Enis Soztutar <en...@gmail.com>.
Hi,
I have switched to git just two days ago, and i am quite happy about it.
Although i do not want hadoop to move to git before apache
infrastructure has matured enough. As Linus addresses in his talk, many
users have been using git+svn in a very innovative way. The central
repository is managed by svn and the users checkout the code, build a
local git repo and work from there. Personally i find it *very*
convenient to let both svn and git manage the code. What i do is make
git ignore svn files, and make svn ignore git files. Then i just branch
once for every issue i work on, committing the changes before switching
to the other branch (issue). After i'm done with the issue i get the
patch with a regular svn diff. Anyhow, i am mentioning this because it
enabled me to work on several distinct issues simultaneously using the
same code base. I highly suggest this workspace-model to anyone working
this way.
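
A minimal sketch of the workflow described above, written as a Python dry run that only builds and prints the commands rather than executing them. The base branch name `master`, the commit message, the patch file name, and the use of `global-ignores` on the svn side are assumptions for illustration, not details from the message; HADOOP-1134 is borrowed from elsewhere in this thread as an example issue.

```python
import shlex


def setup_commands():
    """One-time setup, run at the root of an existing svn checkout."""
    return [
        "git init",
        # Make git ignore svn's metadata directories.
        "echo .svn >> .gitignore",
        # Snapshot the checkout as the base that issue branches start from.
        "git add .",
        "git commit -m 'import svn checkout'",
        # svn side (assumption): add '.git*' to global-ignores in
        # ~/.subversion/config so svn status stays quiet about git's files.
    ]


def issue_commands(issue, base="master"):
    """Commands for working on one issue in its own local git branch."""
    branch = shlex.quote(issue)
    message = shlex.quote(f"{issue}: work in progress")
    patch = shlex.quote(f"{issue}.patch")
    return [
        # Branch once per issue, starting from the unmodified svn state.
        f"git checkout -b {branch} {base}",
        # ... edit files ..., then commit before switching to another issue.
        f"git commit -a -m {message}",
        # When the issue is done, svn still compares the working copy with
        # its own base revision, so a regular svn diff yields the patch.
        f"svn diff > {patch}",
        # Return to the base branch before starting the next issue.
        f"git checkout {base}",
    ]


if __name__ == "__main__":
    for command in setup_commands() + issue_commands("HADOOP-1134"):
        print(command)
```

Files that are new in a branch would still need an `svn add` before they show up in `svn diff`; the sketch leaves that step out.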
Jeff Hammerbacher wrote:
> hey all,
>
> so we're starting to really use hadoop heavily here at facebook, and we'd
> like to start contributing back to the hadoop trunk. we've been through
> this process before (contributing heavily to an actively developed open
> source project) with memcached, and the guy who owns the trunk of that
> project is quite the SCM nut who says that his life has been made infinitely
> more pleasant by switching from svn to git for version control with
> memcached.
>
> i'm not suggesting the project be moved tomorrow. i just wanted to throw
> the question out there to get a sense of how others feel about git as an SCM
> tool. i have asked our memcached guy to write up the specific advantages of
> git versus svn so i will be happy to follow up with details.
>
> apologies if this has been brought up before; i've been reading the mailing
> list on and off now for several months and a quick search of the archives
> yielded no matches for "git" so hopefully this is a relevant topic.
>
> thanks,
> jeff
>
>
Re: using git for SCM?
Posted by Jim White <ji...@pagesmiths.com>.
Eric Baldeschwieler wrote:
> Interesting topic.
> I'd be interested in learning what would be made infinitely better.
Distributed revision control is definitely superior to the conventional
centralized revision control plus patching rigmarole.
I don't think there is anything offering infinite improvement at this
stage though.
> But SVN shortcomings aren't on my top 10 hadoop issues yet.
> I'd be interested in hearing from folks who think this would be
> very valuable.
>
> But I agree with Doug. Too much leverage in using Apache's
> infrastructure to make this appealing.
> ...
Yes, indeed.
Especially since git is nowhere near suitable for folks who aren't
Linux-based, command-line loving hackers. A key thing to remember is
that many folks use the SCM integration in their IDE, an area where even
Subversion is still working to catch up with CVS.
BitKeeper (the commercial distributed revision control tool git was
written to replace) is at least cross-platform, but is not free. Nor
are its benefits over the current system worth the total cost of
operation for ordinary projects like Hadoop.
Linus extols git's virtues (and Subversion's worthlessness) in his
inimitable style here:
http://www.youtube.com/watch?v=4XpnKHJAok8
Since we're discussing SCM, I'll mention darcs. Even though it may not
be as fast as git, it is OSS and implemented in Haskell, making it a
promising starting point for more creative applications. Links to that
and other interesting tools here:
http://www.ifcx.org/wiki/RevisionControl.html
Jim
Re: using git for SCM?
Posted by Eric Baldeschwieler <er...@yahoo-inc.com>.
Interesting topic.
I'd be interested in learning what would be made infinitely better.
But SVN shortcomings aren't on my top 10 hadoop issues yet.
I'd be interested in hearing from folks who think this would be
very valuable.
But I agree with Doug. Too much leverage in using Apache's
infrastructure to make this appealing.
On Jul 30, 2007, at 1:28 PM, Doug Cutting wrote:
> Jeff Hammerbacher wrote:
> > so we're starting to really use hadoop heavily here at facebook, and we'd
> > like to start contributing back to the hadoop trunk. we've been through
> > this process before (contributing heavily to an actively developed open
> > source project) with memcached, and the guy who owns the trunk of that
> > project is quite the SCM nut who says that his life has been made infinitely
> > more pleasant by switching from svn to git for version control with
> > memcached.
> >
> > i'm not suggesting the project be moved tomorrow. i just wanted to throw
> > the question out there to get a sense of how others feel about git as an SCM
> > tool.
>
> The project has a fair amount of culture built around subversion. Also,
> Apache's infrastructure currently supports subversion, not git, so we
> would have to either manage our own repository, which is not encouraged,
> or make git a supported part of Apache's infrastructure, which is rather
> outside the scope of Hadoop. So we could potentially move to a
> different SCM, but someone would have to make a very compelling case
> before that would become a project priority.
>
> Doug
Re: using git for SCM?
Posted by Torsten Curdt <tc...@apache.org>.
On 30.07.2007, at 22:28, Doug Cutting wrote:
> Jeff Hammerbacher wrote:
>> so we're starting to really use hadoop heavily here at facebook,
>> and we'd
>> like to start contributing back to the hadoop trunk. we've been
>> through
>> this process before (contributing heavily to an actively developed
>> open
>> source project) with memcached, and the guy who owns the trunk of that
>> project is quite the SCM nut who says that his life has been made
>> infinitely
>> more pleasant by switching from svn to git for version control with
>> memcached.
>> i'm not suggesting the project be moved tomorrow. i just wanted
>> to throw
>> the question out there to get a sense of how others feel about git
>> as an SCM
>> tool.
>
> The project has a fair amount of culture built around subversion.
> Also, Apache's infrastructure currently supports subversion, not
> git, so we would have to either manage our own repository, which is
> not encouraged, or make git a supported part of Apache's
> infrastructure, which is rather outside the scope of Hadoop. So we
> could potentially move to a different SCM, but someone would have
> to make a very compelling case before that would become a project
> priority.
Well, there is still git-svn as a two-way bridge. Having said that -
I have no clue how feasible this would be ...but Linus' talk got me
interested in git, too
http://youtube.com/watch?v=4XpnKHJAok8
cheers
--
Torsten
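
For readers unfamiliar with the bridge mentioned above, git-svn's basic round trip is usually three commands: clone, rebase, dcommit. A minimal sketch as a Python dry run that only builds and prints the commands; the repository URL and directory name are placeholders, not the real Hadoop locations, and `dcommit` assumes svn commit access.

```python
import shlex


def git_svn_round_trip(svn_url, directory):
    """Return the basic git-svn command sequence for one contribution."""
    url = shlex.quote(svn_url)
    target = shlex.quote(directory)
    return [
        # Import the svn history into a new local git repository.
        f"git svn clone {url} {target}",
        # Later: fetch new svn revisions and replay local commits on top.
        "git svn rebase",
        # Send each local git commit back to svn as its own revision.
        "git svn dcommit",
    ]


if __name__ == "__main__":
    # Placeholder URL and directory, for illustration only.
    for command in git_svn_round_trip(
        "https://svn.example.org/repos/project/trunk", "project"
    ):
        print(command)
```

Contributors without commit access would stop before `dcommit` and produce a patch from their local commits instead.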
Re: using git for SCM?
Posted by Doug Cutting <cu...@apache.org>.
Jeff Hammerbacher wrote:
> so we're starting to really use hadoop heavily here at facebook, and we'd
> like to start contributing back to the hadoop trunk. we've been through
> this process before (contributing heavily to an actively developed open
> source project) with memcached, and the guy who owns the trunk of that
> project is quite the SCM nut who says that his life has been made infinitely
> more pleasant by switching from svn to git for version control with
> memcached.
>
> i'm not suggesting the project be moved tomorrow. i just wanted to throw
> the question out there to get a sense of how others feel about git as an SCM
> tool.
The project has a fair amount of culture built around subversion. Also,
Apache's infrastructure currently supports subversion, not git, so we
would have to either manage our own repository, which is not encouraged,
or make git a supported part of Apache's infrastructure, which is rather
outside the scope of Hadoop. So we could potentially move to a
different SCM, but someone would have to make a very compelling case
before that would become a project priority.
Doug
Re: using git for SCM?
Posted by Prafulla Tekawade <pr...@gmail.com>.
Please post this comparison here when that "memcached guy" is done with it.
I would like to read it.
>
> i have asked our memcached guy to write up the specific advantages of
> git versus svn so i will be happy to follow up with details.
>
>
>