Posted to dev@hivemall.apache.org by Makoto Yui <yu...@gmail.com> on 2016/12/01 05:29:45 UTC

Re: Importing History or Not at Initial Code Dump

Hi,

I've done the initial code dump!
https://github.com/apache/incubator-hivemall

Let's move development (Pull requests) to the ASF repository.

I'll update the project status page soon (and the December report).

Thanks,
Makoto
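
For anyone with an open pull request, the rebasing discussed in the quoted thread below can be rehearsed on a throwaway repository first. The following is only a sketch under stated assumptions: the branch name `feature`, the tag `old-base`, and the file names are hypothetical, and the history rewrite is simulated locally with `git filter-branch` rather than fetched from the ASF repository.

```shell
#!/bin/sh
# Sketch: replay a PR branch onto a rewritten base using `git rebase --onto`.
# All names (feature, old-base, file names) are made up for illustration.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1    # silence the notice printed by newer git

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.invalid"
main_branch=$(git symbolic-ref --short HEAD)   # master or main, depending on config

# Base commit: source plus a bundled binary under lib/.
mkdir lib
echo "code" > README
echo "binary" > lib/dep.jar
git add README lib/dep.jar
git commit -q -m "code and bundled jar"
git tag old-base                               # the pre-rewrite commit the PR forked from

# The "pull request": one commit on top of the old base.
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature work"

# Simulate the upstream rewrite: drop lib/ from the main branch only.
git checkout -q "$main_branch"
git filter-branch --index-filter \
    'git rm -r --cached --ignore-unmatch lib/' \
    --prune-empty -- "$main_branch" >/dev/null 2>&1

# Replay only the commits in old-base..feature onto the rewritten branch.
git rebase -q --onto "$main_branch" old-base feature

feature_commits=$(git rev-list --count feature)
feature_lib_commits=$(git rev-list --count feature -- lib)
echo "commits on feature: $feature_commits, touching lib/: $feature_lib_commits"
```

The point of `--onto` is that only the PR's own commits are replayed, so the old base (and the binaries it carried) is left behind. A real PR that touched the removed paths would still need manual conflict resolution.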


2016-11-30 21:04 GMT+09:00 Makoto Yui <yu...@gmail.com>:
> I'm considering importing https://github.com/myui/incubator-hivemall
> into the ASF repository tomorrow.
> Let me know if that's NOT okay.
>
> The GitHub tag/release issue is my concern, though:
> https://lists.apache.org/thread.html/db78e1f8fc121d8e6b016d2f61d06ccafebf9fd30b4ec00883c78557@%3Clegal-discuss.apache.org%3E
>
> I would like to retain the past git tags to keep track of changes.
>
> Thanks,
> Makoto
>
> 2016-11-30 20:35 GMT+09:00 Makoto Yui <yu...@gmail.com>:
>> I'm considering proceeding in the following way, because git push does not
>> work from a shallow clone (maybe due to the ASF git server
>> version/configuration).
>>
>> You can find the tested repository at https://github.com/myui/incubator-hivemall
>>
>> $ git clone https://github.com/myui/hivemall.git incubator-hivemall
>> $ cd incubator-hivemall
>> $ git filter-branch --index-filter \
>>     'git rm -r --cached --ignore-unmatch lib/ target/*.jar' \
>>     --tag-name-filter cat --prune-empty -- --all
>> $ rm -rf .git/refs/original/
>> $ git reflog expire --expire=now --all
>> $ git gc --aggressive --prune=now
>> $ git remote set-url origin https://github.com/myui/incubator-hivemall.git
>> $ git push -f -u origin master
>> $ git push origin --tags --force
>>
>> $ git clone https://github.com/myui/incubator-hivemall.git
>> $ cd incubator-hivemall
>> $ git_find_big.sh | head -10
>>
>> All sizes are in kB's. The pack column is the size of the object,
>> compressed, inside the pack file.
>> size  pack  SHA                                       location
>> 1391  1383  b8d432e6a3c0074951abd35caf0a777caf47afbf  xgboost/lib/xgboost4j_0.60-0.10.jar
>> 765   303   11c617713ee2ad3f847aee7627ee8639c5a79667  core/src/test/resources/hivemall/mf/ml1k.train
>> 639   613   de4e32983604238bc72fe3f6cb6beea76fde0e8d  src/site/resources/images/hivemall_overview_bg.png
>> 382   117   8b66187fe067c3aa389ce8c98108f349ceae159c  src/site/resources/fonts/fontawesome-webfont.svg
>> 220   192   04d8605fd8daaafa72a2b6dfa2a2d48c75c57a10  src/site/resources/images/asf_bg.png
>> 194   186   fb29a3d2ee04b7981463de89a77ccc7436f4ad9a  docs/gitbook/resources/images/techstack.png
>> 191   76    e00b1127f6fb4fdcc1606a20b05e16b5456acacc  core/src/test/resources/hivemall/mf/ml1k.test
>> 149   88    f221e50a2ef60738ba30932d834530cdfe55cb3e  src/site/resources/fonts/fontawesome-webfont.ttf
>>
>> 2016-11-30 14:31 GMT+09:00 Makoto Yui <yu...@gmail.com>:
>>> Hi Takeshi,
>>>
>>> I was about to perform the initial code dump (but stopped).
>>>
>>> Be aware that almost all commit hashes will change when rewriting the Git history with [1].
>>> [1] git filter-branch --index-filter 'git rm -r --cached --ignore-unmatch lib/ target/*.jar' --prune-empty -- --all
>>>
>>> So, I'm considering making a shallow copy limited to the most recent 100-300 commits or so
>>> (which does not include the large binaries).
>>>
>>> Thanks,
>>> Makoto
>>>
>>> 2016-11-30 2:44 GMT+09:00 Takeshi Yamamuro <li...@gmail.com>:
>>>> Hi, all
>>>>
>>>> I have no strong opinion either, but it seems better to keep as
>>>> much activity (that is, commit logs) as possible there.
>>>> I'm afraid that a sparse activity log might make newcomers mistakenly think that
>>>> Hivemall is inactive.
>>>>
>>>> As for the rebasing, it's not hard to rebase #285 (this is my own PR).
>>>> So, rewriting the logs sounds good to me.
>>>>
>>>> // maropu
>>>>
>>>> On Tue, Nov 29, 2016 at 11:24 PM, Makoto Yui <yu...@gmail.com> wrote:
>>>>
>>>>> Kai,
>>>>>
>>>>> 2016-11-29 22:35 GMT+09:00 Kai Sasaki <sa...@treasure-data.com>:
>>>>> > Currently we have 6 PRs and some of them (especially #285, #336 and #385)
>>>>> > are relatively large.
>>>>> > Rebasing them might be somewhat troublesome.
>>>>>
>>>>> Yes, that's my concern.
>>>>>
>>>>> But such large PRs would be better contributed through the Apache
>>>>> incubation process.
>>>>> I'm considering inviting some of their authors to become Hivemall committers.
>>>>>
>>>>> Another concern is moving GitHub stars/watchers, as seen in [1].
>>>>> [1] https://issues.apache.org/jira/browse/INFRA-12995
>>>>>
>>>>> > Do you think some of them are not ready to be merged? I think merging some
>>>>> > of them before rewriting the history
>>>>> > can make the migration work easier. But if they are not ready, it's okay. We can
>>>>> > work on rebasing after this work.
>>>>>
>>>>> I'm currently reviewing #385, but it needs to be revised in several places.
>>>>> Also, #336 requires a large refactoring.
>>>>>
>>>>> So, it's better to do the initial code dump first.
>>>>>
>>>>> A shallow-cloned repository can be pushed with git v1.9 and later
>>>>> (I'm not sure about the ASF git version, though).
>>>>> http://blogs.atlassian.com/2014/05/handle-big-repositories-git/
>>>>>
>>>>> Thanks,
>>>>> Makoto
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> ---
>>>> Takeshi Yamamuro
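
The effect of the `git filter-branch` invocation quoted in this thread (paths dropped from every commit, commit hashes changed, tags carried over by `--tag-name-filter cat`) can be checked on a scratch repository before touching a real one. This is a minimal sketch, not the procedure actually run for Hivemall: the repository contents, the tag `v0.1`, and the file names are invented for the demo, and only `lib/` is filtered.

```shell
#!/bin/sh
# Sketch: observe what an --index-filter rewrite does to hashes, tags and paths.
# Repository contents and the tag name v0.1 are hypothetical.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1    # silence the notice printed by newer git

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.invalid"

# Commit 1: a source file plus a bundled binary under lib/, then tag it.
mkdir lib
echo "code" > README
echo "binary" > lib/dep.jar
git add README lib/dep.jar
git commit -q -m "add code and bundled jar"
git tag v0.1

# Commit 2: touches only the source file.
echo "more code" >> README
git commit -q -am "more code"

old_head=$(git rev-parse HEAD)
old_tag=$(git rev-parse "v0.1^{commit}")

# Rewrite every ref, dropping lib/ from each commit and re-pointing tags.
git filter-branch --index-filter \
    'git rm -r --cached --ignore-unmatch lib/' \
    --tag-name-filter cat --prune-empty -- --all >/dev/null 2>&1

new_head=$(git rev-parse HEAD)
new_tag=$(git rev-parse "v0.1^{commit}")
lib_commits=$(git rev-list --count HEAD -- lib)

echo "HEAD: $old_head -> $new_head"
echo "v0.1: $old_tag -> $new_tag"
echo "commits touching lib/ after rewrite: $lib_commits"
```

Both printed hash pairs should differ, which is why open pull requests need rebasing after such a rewrite. The old objects stay reachable through `refs/original/` until that directory is removed and the repository is garbage-collected, as in the quoted steps.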

Re: Importing History or Not at Initial Code Dump

Posted by Takeshi Yamamuro <li...@gmail.com>.
Great work!

// maropu

-- 
---
Takeshi Yamamuro
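
The `git_find_big.sh` script used earlier in the thread is not shown there. As a rough stand-in, the largest blobs in a repository's history can be listed with stock git plumbing. The sketch below is an assumption about what such a check might look like, not that script: the function name `list_big_blobs` is made up, and it reports uncompressed sizes in bytes rather than the kB and pack sizes shown in the quoted output.

```shell
#!/bin/sh
# Sketch: list the largest blobs reachable from any ref.
# list_big_blobs is a hypothetical helper, not the script used in the thread.
set -eu

# Prints "<size in bytes> <sha> <path>" for the N largest blobs (default 10).
list_big_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        sed -n 's/^blob //p' |
        sort -k1,1 -n -r |
        head -n "${1:-10}"
}

# Demo on a scratch repository with one small and one 4 KiB file.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.invalid"
printf 'small\n' > small.txt
dd if=/dev/zero of=big.bin bs=1024 count=4 2>/dev/null
git add small.txt big.bin
git commit -q -m "two files"

biggest=$(list_big_blobs 1)
echo "$biggest"
```

Running `list_big_blobs` before and after a history rewrite gives a quick check that the intended binaries are really gone from every commit.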