Posted to dev@lucene.apache.org by Tomoko Uchida <to...@gmail.com> on 2022/07/01 07:53:41 UTC

Re: A prototype migration tool Jira to GitHub

It seems there are no major objections?

A status update - we now have a public ASF repository for the migration and
all work will be done under it, outside the main lucene repo/Jira/dev list.
https://github.com/apache/lucene-jira-archive

I'm not sure how many people are interested in this sub-project, but for
your information, there are two major blockers (in my view).
- https://github.com/apache/lucene-jira-archive/issues/1
- https://github.com/apache/lucene-jira-archive/issues/3

Once we successfully address them, the migration will be done in the
following steps.
https://github.com/apache/lucene-jira-archive/issues/7

Tomoko


On Wed, Jun 29, 2022 at 15:56 Dawid Weiss <da...@gmail.com> wrote:

> I looked at the first random issue and noticed these (perhaps known)
> issues -
>
> https://github.com/mocobeta/sandbox-lucene-10557/issues/10838
>
> 1) lists are converted into bold blocks (without the list):
>
> https://github.com/mocobeta/sandbox-lucene-10557/issues/10838#issuecomment-1166777318
>
> 2) inline images in the description point at nothing.
>
> But it's already quite impressive.
>
> Dawid
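The list-versus-bold confusion reported in 1) can be illustrated with a minimal sketch. It assumes Jira wiki markup in which a line starting with one or more `*` or `#` followed by a space is a list item, while inline `*text*` is bold. The function name and regular expressions are hypothetical and are not taken from the migration scripts.

```python
import re

# A Jira list line: optional indent, one or more '*' or '#', whitespace, text.
LIST_LINE = re.compile(r"^\s*([*#]+)\s+(.*)$")
# Jira bold: *text* where the text neither starts nor ends with whitespace.
BOLD = re.compile(r"(?<![*\w])\*(\S(?:.*?\S)?)\*(?![*\w])")


def jira_line_to_markdown(line: str) -> str:
    """Convert one line of Jira markup, checking for lists before bold."""
    match = LIST_LINE.match(line)
    if match is None:
        # Not a list line, so any *text* spans are treated as bold.
        return BOLD.sub(r"**\1**", line)
    markers, text = match.groups()
    indent = "    " * (len(markers) - 1)      # nesting depth = marker count
    bullet = "1." if markers[-1] == "#" else "-"
    return indent + bullet + " " + BOLD.sub(r"**\1**", text)
```

Testing the list pattern first is the design choice that matters here: a leading `*` followed by whitespace is never handed to the bold rule. Real Jira markup has many more constructs, so this covers only the case described above.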
>
> On Tue, Jun 28, 2022 at 6:49 PM Tomoko Uchida
> <to...@gmail.com> wrote:
> >
> > I finished the second prototype. With a few exceptions, almost all existing issues were successfully migrated into the test repo. You can browse/search them.
> > https://github.com/mocobeta/sandbox-lucene-10557/issues
> >
> > Some limitations of the first prototype have been addressed. For example, we can now preserve the original timestamps of the issues/comments.
> > I could list the improvements and current limitations, but could you try it out yourself? Any issue can be found by its Jira issue number.
> > Note that "attachments" are still not ported. We've found workarounds, so this will be addressed in the next iteration.
> >
> > I don't think we have reached a conclusion, but I fully recognize there are strong requests for an atomic switch to GitHub, and I haven't seen objections to that so far, so I'll continue to work on improving the migration quality.
> > I will finish playing around with prototypes; if there are further iterations, they will be rehearsals for the actual migration.
> >
> >
> > Tomoko
> >
> >
> > On Mon, Jun 27, 2022 at 10:27 Tomoko Uchida <to...@gmail.com> wrote:
> >>
> >> > It looks like the GitHub Danger Zone can transfer a repository?
> >>
> >> "Transferring a repository" creates another repository different from apache/lucene. It would make the migration process easy, but is it our intention to have an external repository for old issues?
> >>
> >> Tomoko
> >>
> >>
> >> On Mon, Jun 27, 2022 at 8:24 Michael McCandless <lu...@mikemccandless.com> wrote:
> >>>
> >>> It looks like the GitHub Danger Zone can transfer a repository?
> >>>
> >>> It's not clear if it can go from Personal -> Organization though. I see Personal -> Personal and Organization -> Organization.
> >>>
> >>> https://docs.github.com/en/repositories/creating-and-managing-repositories/transferring-a-repository
> >>>
> >>> Mike McCandless
> >>>
> >>> http://blog.mikemccandless.com
> >>>
> >>>
> >>> On Sun, Jun 26, 2022 at 6:40 PM Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
> >>>>
> >>>>> On Mon, Jun 27, 2022 at 5:16 Michael Sokolov <ms...@gmail.com> wrote:
> >>>>>>
> >>>>>> As for this access control/script monitoring problem, I wonder whether
> >>>>>> we could import all the issues into a new GitHub repo owned by
> >>>>>> whoever is running the script, and then transfer from there to the
> >>>>>> lucene repo? It would be an extra step involving another script (or
> >>>>>> something), but maybe(?) that one could be much simpler since it is
> >>>>>> github->github?? If this works out, we could have full control of the
> >>>>>> first step and only hand off to infra the simpler copying job.
> >>>>>>
> >>>>>
> >>>>> I don't see an API or tool that transfers all issues from one repo to another repo.
> >>>>
> >>>>
> >>>> To be exact, I don't see an API or tool that transfers all issues from one repo to another repo while keeping cross-issue links.
> >>>> If we want to preserve cross-issue links, there's no difference between "Jira to GitHub" and "GitHub to GitHub".
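The point about cross-issue links can be sketched as a second pass over already-imported issue bodies. This is only an illustration: the `LUCENE-<n>` key pattern, the function name, and the `key_to_number` mapping (Jira key to new GitHub issue number, collected during a first pass) are assumptions, not part of the actual tool.

```python
import re

# Assumed Jira key format for this project, e.g. "LUCENE-10557".
JIRA_KEY = re.compile(r"\bLUCENE-(\d+)\b")


def rewrite_cross_links(body: str, key_to_number: dict[str, int]) -> str:
    """Replace known Jira keys with '#<n>' GitHub issue references."""

    def replace(match: re.Match) -> str:
        number = key_to_number.get(match.group(0))
        # Leave keys untouched when no imported issue is known for them.
        return f"#{number}" if number is not None else match.group(0)

    return JIRA_KEY.sub(replace, body)
```

The mapping only exists once every issue has been created, which is why the rewrite has to be a separate pass regardless of whether the source is Jira or another GitHub repository.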
> >>>>
> >>>>>
> >>>>>>
> >>>>>> On Sat, Jun 25, 2022 at 7:53 AM Tomoko Uchida
> >>>>>> <to...@gmail.com> wrote:
> >>>>>> >
> >>>>>> > I should share another practical consideration about the migration that I haven't mentioned yet.
> >>>>>> >
> >>>>>> > We are not allowed to have admin access to the lucene GitHub repo, so we can't run the import job(s) ourselves.
> >>>>>> > We'll have to make a tool with clear instructions for the migration and pass it to the infra team, then support them via Jira (or Slack?) if there are any problems.
> >>>>>> > See https://issues.apache.org/jira/browse/INFRA-20118
> >>>>>> >
> >>>>>> > We can do some preparation locally (e.g. dump Jira issues and convert them to a format importable into GitHub), but the actual first- and second-pass imports will be done by the infra team.
> >>>>>> > If the migration operation is too complicated, I don't think I'll be able to stay in close contact with the infra team, due to the time difference and my communication ability - I'm not good at real-time conversation in English.
> >>>>>> > So if we need a complex migration plan, I think I'll have to find someone who is willing to take over the job.
> >>>>>> >
> >>>>>> >
> >>>>>> >
> >>>>>> > On Sat, Jun 25, 2022 at 19:19 Tomoko Uchida <to...@gmail.com> wrote:
> >>>>>> >>
> >>>>>> >> Hi Dawid,
> >>>>>> >>
> >>>>>> >> > Emm.. sorry for being slow - what is it that you want me to do? :) Unwatch->Ignore?
> >>>>>> >>
> >>>>>> >> I'm sorry for being ambiguous. Could you set your notification setting on the repository to "Participating and @mentions"?
> >>>>>> >> While testing the migration scripts, I will import many fake issues where your account is linked as the original reporter/author with real mentions, like this example.
> >>>>>> >> https://github.com/mocobeta/migration-test-1/issues/111
> >>>>>> >> If they do not flood your inbox with spam notifications, then the test is successful.
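One way to keep imported issues from notifying people is to wrap account mentions in backticks, since GitHub does not treat text inside inline code as a mention. A minimal sketch, assuming a hypothetical `neutralize_mentions` helper and an allow-list of accounts that should still receive real mentions for testing; it is not the code used in the migration scripts.

```python
import re

# '@name' not preceded by a word character or backtick (skips e-mail
# addresses and mentions that are already inside inline code).
MENTION = re.compile(r"(?<![\w`])@([A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?)")


def neutralize_mentions(text: str, allow: frozenset = frozenset()) -> str:
    """Wrap @mentions in backticks unless the account is in `allow`."""

    def replace(match: re.Match) -> str:
        if match.group(1) in allow:
            return match.group(0)          # keep as a real mention
        return "`" + match.group(0) + "`"  # render as inline code

    return MENTION.sub(replace, text)
```

Passing a tester's account name in `allow` reproduces the experiment described above: only that account is mentioned for real, so only that inbox is affected.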
> >>>>>> >>
> >>>>>> >> With regard to attachments:
> >>>>>> >>
> >>>>>> >> > 1) create a (separate?) git repository or branch with a separate root in the lucene repository with all jira attachments upon importing them.
> >>>>>> >> > 2) there are about 7k issues with attachments in Jira. We can split them into 25-issue batches and ask the crowd to port them manually
> >>>>>> >>
> >>>>>> >> Thanks for your suggestions; I can't come up with other options either. Both would need others' permission and/or extra work, so I think we can't control the process and outcome.
> >>>>>> >> For 1), we'll need to ask infra to create a repository and run another long-running batch, and it'll complicate the migration instructions - we won't be allowed to have access tokens to commit files to an ASF repo from a program.
> >>>>>> >> For 2), I'm not sure how many people want to volunteer for the manual work.
> >>>>>> >>
> >>>>>> >> I cannot promise it will eventually be done, so I would leave it as a limitation of the migration.
> >>>>>> >> If there are no solutions I can control, I may ask others whether we should migrate existing issues to GitHub "even if we can't migrate any attachments and have to keep them in Jira forever". Let me stay neutral about the idea of migrating all Jira issues, sorry... I'm working on this not to push it but to provide information and reach some agreement.
> >>>>>> >>
> >>>>>> >> Tomoko
> >>>>>> >>
> >>>>>> >>
> >>>>>> >> On Sat, Jun 25, 2022 at 16:12 Dawid Weiss <da...@gmail.com> wrote:
> >>>>>> >>>
> >>>>>> >>>
> >>>>>> >>> Hi Tomoko,
> >>>>>> >>>
> >>>>>> >>>>
> >>>>>> >>>> There are two ways to receive notifications, as you know: 1) watch all activities and 2) receive notifications only when you are mentioned (default).
> >>>>>> >>>> I excluded your GitHub account from the backtick (``) markup so that real mention hyperlinks are created. Could you unwatch the repo again and then observe your inbox for a while, so that we can also test 2)?
> >>>>>> >>>> https://github.com/mocobeta/sandbox-lucene-10557/blob/main/migration/src/jira2github_import.py#L21
> >>>>>> >>>
> >>>>>> >>>
> >>>>>> >>> Emm.. sorry for being slow - what is it that you want me to do? :) Unwatch->Ignore?
> >>>>>> >>>
> >>>>>> >>>>
> >>>>>> >>>> In this Spring issue, the "attachment" link points to the original Jira file - so they still use Jira as a file server.
> >>>>>> >>>
> >>>>>> >>>
> >>>>>> >>> Ahh... right. In that case I have two ideas:
> >>>>>> >>>
> >>>>>> >>> 1) create a (separate?) git repository or branch with a separate root in the lucene repository with all jira attachments upon importing them. This could be structured in subfolders, for example:
> >>>>>> >>>
> >>>>>> >>> jira/xyz/attachment-1.jpg
> >>>>>> >>>
> >>>>>> >>> If this repository is checked in to GitHub, the links to attachments could point at the "raw" git-serving service GitHub offers. I'm not sure it emits proper content-types (for images, etc). Alternatively, it could be github-docs, which does serve them properly for static content.
> >>>>>> >>>
> >>>>>> >>> It will not support searches, of course, but it will be a consistent copy.
> >>>>>> >>>
> >>>>>> >>> 2) there are about 7k issues with attachments in Jira. We can split them into 25-issue batches and ask the crowd to port them manually... It will take time but once the issues are ported, it can be done incrementally over a longer time stretch, no rush there.
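Both ideas can be sketched in a few lines. The repository coordinates, branch name, and helper names below are hypothetical; the `jira/<issue>/<file>` layout follows the example above, and the URL form is the one served by raw.githubusercontent.com. Whether that host returns suitable content types for images is left open here, as it is in the message.

```python
from urllib.parse import quote


def attachment_path(issue_key: str, filename: str) -> str:
    """Repository-relative path, following the jira/<issue>/<file> layout."""
    return f"jira/{issue_key}/{filename}"


def raw_url(owner: str, repo: str, branch: str, path: str) -> str:
    """URL of a file as served by GitHub's raw content host."""
    # quote() percent-encodes spaces and similar characters but keeps '/'.
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{quote(path)}"


def batches(items: list, size: int = 25):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

With roughly 7,000 issues, 25-issue batches work out to 280 batches to hand out.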
> >>>>>> >>>
> >>>>>> >>> Dawid
> >>>>>>
> >>>>>>
> ---------------------------------------------------------------------
> >>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
> >>>>>>

Re: A prototype migration tool Jira to GitHub

Posted by Tomoko Uchida <to...@gmail.com>.
Hello - this is my first and last call for volunteers for the GitHub migration.
Sorry, I know few people may be interested in this work. But I didn't want
to just complain that "no one helped me, so I had to take on all the pain on
my own" without explicitly calling for help.

These are the tasks that should be addressed, though I'm still not exactly
sure how to resolve them (it should be possible, and there are some clues).

- https://github.com/apache/lucene-jira-archive/issues/1
  - this needs knowledge of syntax parsing, and the effort of thorough
    debugging with real data.
- https://github.com/apache/lucene-jira-archive/issues/3
  - this might need infra's help.
- https://github.com/apache/lucene-jira-archive/issues/15
  - this needs knowledge of the Jira API and might need infra's help.

It's the kind of work for which you will not be credited (in terms of change
log or commit history), and it may not be so exciting to work on, so I
would really appreciate your help.

Thanks,
Tomoko

