Posted to dev@hawq.apache.org by Ming Li <ml...@pivotal.io> on 2016/09/12 08:52:46 UTC

[Vote] Shrink a Git Repository

Hi all,

I have heard complaints that cloning the hawq repo with git clone is slow.
Since we have almost finished the large code modifications for the Apache
release, I think now is the optimal time to shrink the git repo.

I followed the steps in
http://stevelorek.com/how-to-shrink-a-git-repository.html and deleted some
large files that are not used in the current build process. This reduces the
repo size from 174M to 100M.

However, every existing clone will need to be re-cloned after the rewritten
repo is pushed, so I am writing this email to call a vote on it.

Thanks.
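
For reference, a history rewrite of this kind can be sketched roughly as
below. This is only a minimal sketch and not necessarily the exact procedure
from the linked article: the function name purge_paths is hypothetical, the
paths are assumed to contain no spaces or shell metacharacters, and newer Git
releases recommend the separate git-filter-repo tool over git filter-branch.

```shell
# Hypothetical helper: remove the given paths from every commit on every
# branch and tag, then drop the unreferenced objects so the repository
# actually shrinks. Assumes paths without spaces or shell metacharacters.
purge_paths() {
    paths="$*"

    # Rewrite all refs; the index filter runs once per commit and removes the
    # paths from the index only. "--tag-name-filter cat" keeps tag names and
    # points them at the rewritten commits.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
        --index-filter "git rm -r --cached --ignore-unmatch --quiet -- $paths" \
        --tag-name-filter cat -- --all

    # filter-branch leaves backups of the old refs under refs/original/.
    git for-each-ref --format='delete %(refname)' refs/original/ |
        git update-ref --stdin

    # Forget the old commits in the reflogs, then repack and prune.
    git reflog expire --expire=now --all
    git gc --prune=now --aggressive --quiet
}

# Example (run inside a throwaway clone, never the only copy):
#   purge_paths gpcc/ext/openssl-0.9.8r.tar.gz releng/ereport/ereport.txt
```

Every rewritten commit gets a new SHA, which is why the result has to be
force-pushed and why existing clones no longer line up with the new history.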

Re: [Vote] Shrink a Git Repository

Posted by Ming Li <ml...@pivotal.io>.
Even if it is not needed right now, we will need to shrink the repo after the
release: we previously changed a lot in the build process, and those
processes are now stable.

Many files under specific directories (e.g. goh/ext/*) are useless; we can
delete all of them to free up more space.


On Tue, Sep 13, 2016 at 10:48 AM, Radar Da lei <rl...@pivotal.io> wrote:

> I don't feel like the shrink is necessary, for a slow network, 177MB to
> 100MB does not help much.
>
> It will make all the users to delete their current git repo and redo the
> clone. And it might not be a simple git clone, will need to do a lot
> changes on local git settings, e.g. remotes, branches merges.  Users even
> might need to recreate their github fork.
>
> So I think  shrinking the HAWQ repo is good, but it's not worth to do it at
> this point.
>
> Thanks.
>
> Regards,
> Radar

Re: [Vote] Shrink a Git Repository

Posted by Radar Da lei <rl...@pivotal.io>.
I don't think the shrink is necessary; on a slow network, going from 177MB to
100MB does not help much.

It would force all users to delete their current git repo and redo the
clone. And it might not be a simple git clone: they would need to make many
changes to their local git settings, e.g. remotes, branches, merges. Users
might even need to recreate their GitHub forks.

So I think shrinking the HAWQ repo is a good idea, but it is not worth doing
at this point.

Thanks.

Regards,
Radar
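
As an illustration of the local work involved, here is a minimal sketch of how
a contributor might realign an existing clone with a rewritten upstream branch
instead of deleting it. The function name resync_branch and the default names
origin and master are assumptions; any local commits on that branch are
discarded, and other local branches would still need to be rebased or
recreated by hand.

```shell
# Hypothetical helper: make a local branch match the rewritten remote branch.
# WARNING: this throws away local commits on that branch.
resync_branch() {
    remote="${1:-origin}"
    branch="${2:-master}"

    # Fetch the rewritten branches and tags; --force lets moved tags update.
    git fetch --prune --tags --force "$remote" &&
        git checkout --quiet "$branch" &&
        git reset --hard --quiet "$remote/$branch"
}

# Example: resync_branch origin master
```

The old objects stay in the local clone until its reflog entries expire and
git gc runs, so the clone does not get smaller right away; a fresh clone is
the simplest way to get the reduced size.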

On Tue, Sep 13, 2016 at 10:33 AM, Paul Guo <pa...@gmail.com> wrote:

> Does this affect an existing cloned repo (i.e. do we need to re-clone a
> fresh repo if we want to check in something)?
>
> Does this affect "git checkout $tag"? (I need to rebase to an old release.)

Re: [Vote] Shrink a Git Repository

Posted by Paul Guo <pa...@gmail.com>.
Does this affect an existing cloned repo (i.e. do we need to re-clone a
fresh repo if we want to check in something)?

Does this affect "git checkout $tag"? (I need to rebase to an old release.)

2016-09-12 18:36 GMT+08:00 Ming Li <ml...@apache.org>:

> Hi Ed,
>
> Here I just delete below large files from git repo, the contents in these
> file cannot be retrieved anymore( but all these files are useless in
> current build process), however git log is still there.

Re: [Vote] Shrink a Git Repository

Posted by Ming Li <ml...@apache.org>.
Hi Ed,

I only deleted the large files listed below from the git repo. The contents
of these files can no longer be retrieved (but all of them are unused in the
current build process); the git log, however, is still there.

Below is the full list of removed files. Please review it and point out any
that are still in use. Thanks.
------------------------ Current removed  ------------------------
All sizes are in kB. The pack column is the size of the object, compressed,
inside the pack file.
size   pack   SHA                                       location
42707  8921   eb59c507535698b76e67d4965814e417fbdacde9  goh/ext/rhel5_x86_64/lib/libmadlib.so
22879  22283  bb2673f70c88573c45ea4ffa59da69e517aa2ba5  repo/Pivotal/libhdfs3/1.2.1-rc1/targzs/libhdfs3-rhel5_x86_64-1.2.1-rc1.targz
12939  12450  a33fb9101505ab2475861d50b4b1ad9ec4da811a  repo/Pivotal/libhdfs3/1.2.1-rc1/targzs/libhdfs3-osx106_x86-1.2.1-rc1.targz
11077  10890  878971186fde916bab67555bc65ac4dcf662b5f8  pxf/tools/pxfd/pxfd-1.0-1.noarch.rpm
11019  10989  ec7c1e1a8d76f78c1222230ac13dbbbe8c5acc57  pxf/tools/pxfd/rpmbuild/SOURCES/pxfd.tar.gz
7573   1722   ab532846e72d24f11066e7c7f248d03d2fbbe8fa  depends/thirdparty/orc/examples/expected/demo-12-zlib.jsn.gz
5618   1836   b4f773943dea27e443abe0ee8bec0679de989e9b  gpcc/WIN32/gpcc.ncb
5182   1016   1031f7d3f70ff48f674927cf486c95e0f9166860  goh/ext/osx105_x86/lib/libmadlib.so
5027   122    1d1d714a846259ec5b2b0471e55eec94efd7671d  depends/thirdparty/orc/examples/demo-11-none.orc
3685   3669   7c71d14c0fcca416444cda1b0673158b71011973  gpcc/ext/openssl-0.9.8r.tar.gz
1806   283    1e60f847659633dd4308a21f8d0b5c163cc519a0  releng/ereport/ereport.txt
1802   1796   97447c5120c7d2f1e738dbe6886cb58609ce739f  tools/bin/pythonSrc/epydoc-3.0.1.tar.gz
1750   314    271bb9448ee8c7baf13db19ada4d99a7c5b418eb  releng/ereport/ereport.txt
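
A listing similar to the one above can be approximated with plain git
plumbing. This is a sketch only, not necessarily the script that produced the
table: the function name largest_blobs is hypothetical, and it reports
uncompressed sizes in bytes rather than kB, with no pack-size column.

```shell
# Hypothetical helper: list the largest blobs anywhere in history.
# Output columns: <uncompressed size in bytes> <blob SHA> <path>
largest_blobs() {
    count="${1:-10}"

    # rev-list prints "<sha> <path>" for every object reachable from any ref;
    # cat-file looks up type and size and passes the path through as %(rest).
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        grep '^blob ' |
        cut -d' ' -f2- |
        sort -k1,1nr |
        head -n "$count"
}

# Example: largest_blobs 15
```

A path can show up more than once with different SHAs, as
releng/ereport/ereport.txt does in the table, because each stored version of
a file is a separate blob.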



On Mon, Sep 12, 2016 at 6:17 PM, Ed Espino <es...@apache.org> wrote:

> -1 Need more information.  What is the impact (are any useful files lost)?
> Are any files removed?
>
> -=e

Re: [Vote] Shrink a Git Repository

Posted by Ed Espino <es...@apache.org>.
-1 Need more information.  What is the impact (are any useful files lost)?
Are any files removed?

-=e

On Mon, Sep 12, 2016 at 4:52 PM, Ming Li <ml...@pivotal.io> wrote:

> Hi all,
>
> I heard from someone complain about the slowness to git clone hawq repo.
> And we have almost finished large code modification for apache release, I
> think it is opt time for us to shrink git repo size now.
>
> I followed the steps in
> http://stevelorek.com/how-to-shrink-a-git-repository.html, deleted some
> large file which is not useful in current build process,  it can reduce
> size from 174M to 100M.
>
> However it need all clone to be re-clone again after git push the changed
> repo. So I am writing this email to ask for voting it.
>
> Thanks.
>



-- 
*Ed Espino*
*espino@apache.org <es...@apache.org>*