Posted to hadoop-migrations@infra.apache.org by Zoltan Haindrich <ki...@rxd.hu> on 2020/07/13 11:37:02 UTC
Re: Migration of Hadoop labelled nodes to new dedicated Master
Hey Dmitry!
For Hive we took a path from which some parts might be useful for you; I'll write about it a bit:
For one thing - to have the PR/JIRA links - you might just need to add an .asf.yaml file as described in [1].
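As a rough sketch of what that file can look like (treat the exact option names as assumptions and verify them against [1]):

```yaml
# .asf.yaml -- lives in the repository root
notifications:
  # "link" asks the ASF tooling to link a PR to the matching JIRA ticket;
  # "label" additionally tags the ticket with a PR label
  jira_options: link label
```

Once that is merged to the default branch, the linking should happen automatically for new PRs.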
To test your project you have a few options;
* you could try to dump/load the jobs at the new Jenkins CI instance.
* you could create a new - more modern CI by rethinking it:
* option A (use jenkins):
* to enable building github PRs I think the most painless approach is to use a Jenkinsfile with a multibranch pipeline job
* the Jenkinsfile should contain the instructions to build the project - this way modifications to the CI will also come in as PRs [2]
* iirc for this to work you will need to set a user who has write access to the repo - without that, it won't be able to request merge commits from github
* for Hive we took this approach because we still struggle with all kinds of test flakiness/etc...
* this is just one way to do it...jenkins can be utilized in various ways
* option B (github actions):
* "github actions" have a lot of resources in the background (iirc: you may use 20 instances of 2-core Xeons at a time to test a project)
* configuring github actions is a bit different - you may need to get used to it - but so far I find it useful :)
* some asf projects are already utilizing "github actions", like ozone and calcite
* there are some limitations of GA: for example, publishing test reports is not really possible... if red/green is enough then that might not even be important
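To make option A a bit more concrete, a minimal declarative Jenkinsfile for a multibranch pipeline job could look like the sketch below - the agent label and the Maven commands are assumptions for illustration, not what Hive actually runs:

```groovy
// Jenkinsfile at the repository root -- a multibranch pipeline job picks
// this up automatically for every branch and every pull request
pipeline {
    agent { label 'hadoop' }   // assumed node label on the new master
    options {
        timeout(time: 2, unit: 'HOURS')
        buildDiscarder(logRotator(numToKeepStr: '30'))
    }
    stages {
        stage('Build') {
            steps {
                sh 'mvn -B clean install -DskipTests'   // assumed build command
            }
        }
        stage('Test') {
            steps {
                sh 'mvn -B test'
            }
        }
    }
    post {
        always {
            // publish JUnit results so PRs get a test summary
            junit '**/target/surefire-reports/*.xml'
        }
    }
}
```

Since the file lives in the repo, changes to the CI itself go through review like any other PR.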
[1] https://cwiki.apache.org/confluence/display/INFRA/git+-+.asf.yaml+features
[2] https://www.jenkins.io/doc/pipeline/examples/
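For option B, a workflow along the same lines might look like this - the file path and the Maven goal are assumptions; the ozone and calcite repos have real-world examples to crib from:

```yaml
# .github/workflows/ci.yml -- runs on every push and pull request
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-java@v1
        with:
          java-version: '8'
      - name: Build and test
        run: mvn -B verify   # assumed build command
```

Red/green status then shows up directly on the PR without any extra plumbing.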
cheers,
Zoltan
On 7/13/20 1:13 PM, Dmitry Grinenko wrote:
> Hi Gavin,
>
> Thanks for the answer. Sadly, it was not me who previously configured the project on CI, so I would need to do it from scratch and may be a bit slow due to that.
>
> The general idea is to:
> - use something like https://plugins.jenkins.io/ghprb/ as a trigger for a Jenkins job to verify pull requests on GitHub
> - add a link to the pull request in https://issues.apache.org/jira automatically
>
>
> What would be needed overall (let's call it a plan):
> - Create a pipeline in Jenkins for Ambari
> - Add the GitHub Pull Request Builder plugin, so we would be able to integrate Jenkins CI with GitHub for automatic checking of new pull requests
> - configure the plugin to work with the pipeline and the GitHub repository
> - add the possibility to link the GitHub pull request from ASF Jira
>
>
> I'll start my work on the first item in the list.
>
>
> ------ Original Message ------
> From: "Gavin McDonald" <gm...@apache.org>
> To: hadoop-migrations@infra.apache.org
> Sent: 7/13/2020 12:15:47 PM
> Subject: Re: Re[2]: Migration of Hadoop labelled nodes to new dedicated Master
>
>> Hi Dmitry,
>>
>> Welcome! You are not too late, no - we need to start ramping up testing so
>> we can perform the migration of all Hadoop related projects to
>> ci-hadoop.apache.org.
>>
>> Please let me know what you need to get going - you already have login and
>> build/create
>> job ability - please test that.
>>
>> Later today I'll grab two more H nodes from builds.apache.org and place
>> them in the ci-hadoop.a.o pool.
>>
>> Thanks
>>
>>
>> On Thu, Jul 9, 2020 at 3:19 PM Dmitry Grinenko
>> <dg...@cloudera.com.invalid> wrote:
>>
>>> Hello All,
>>>
>>> It seems I'm a bit late to the event due to some circumstances, but is
>>> there still an opportunity to participate in the migration?
>>>
>>> I'd like to step up on behalf of the Ambari team and would like to migrate
>>> GitHub pull request builds to the new infra.
>>>
>>>
>>> ------ Original Message ------
>>> From: "Gavin McDonald" <gm...@apache.org>
>>> To: hadoop-migrations@infra.apache.org
>>> Sent: 4/29/2020 4:05:49 PM
>>> Subject: Re: Migration of Hadoop labelled nodes to new dedicated Master
>>>
>>> >Hi All,
>>> >
>>> >Following on from the below email I sent *11 DAYS ago now*, so far we have
>>> >had *one reply* from mahout cc'd to me (thank you Trevor), and had *ONE
>>> >PERSON* sign up to the new hadoop-migrations@infra.apache.org -
>>> >that is out of a total of *OVER 7000* people signed up to the 13 mailing
>>> >lists emailed.
>>> >
>>> >To recap what I asked for:-
>>> >
>>> >"...What I would like from each community, is to decide who is going to
>>> >help with their project in performing these migrations - ideally 2 or 3
>>> >folks who use the current builds.a.o regularly. Those folks should then
>>> >subscribe to the new dedicated hadoop-migrations@infra.apache.org mailing
>>> >lists as soon as possible so we can get started..."
>>> >
>>> >This will be the last email I send to your dev list directly. I am now
>>> building
>>> >a new Jenkins Master, and as soon as it is ready I will start to migrate
>>> >the Jenkins Nodes/Agents over to the new system.
>>> >And; when I am done, the existing builds.apache.org *WILL BE TURNED OFF*.
>>> >
>>> >I am now going to continue all conversations on the
>>> >hadoop-migrations@infra.apache.org list *only.*
>>> >
>>> >Thanks
>>> >
>>> >Gavin McDonald (ASF Infra)
>>> >
>>> >
>>> >On Sat, Apr 18, 2020 at 4:21 PM Gavin McDonald <gm...@apache.org>
>>> wrote:
>>> >
>>> >> Hi All,
>>> >>
>>> >> A couple of months ago, I wrote to a few project private lists
>>> mentioning
>>> >> the need to migrate Hadoop labelled nodes (H0-H21) over to a new
>>> dedicated
>>> >> Jenkins Master [1] (a Cloudbees Client Master.).
>>> >>
>>> >> I'd like to revisit this now that I have more time to dedicate to
>>> getting
>>> >> this done. However, keeping track across multiple mailing lists,
>>> >> separate conversations that spring up in various places is cumbersome
>>> and
>>> >> not realistic. To that end, I have created a new specific mailing list
>>> >> dedicated to the migrations of these nodes, and the projects that use
>>> them,
>>> >> over to the new system.
>>> >>
>>> >> The mailing list 'hadoop-migrations@infra.apache.org' is up and
>>> running
>>> >> now (and this will be the first post to it). Previous discussions were
>>> on
>>> >> the private PMC lists, (there was some debate about that but I wanted
>>> the
>>> >> PMCs initially to be aware of the change,) this new list is public and
>>> >> archived.
>>> >>
>>> >> This email is BCC'd to 13 projects' dev lists [2] determined by the
>>> >> https://hadoop.apache.org list of Related projects, minus Cassandra, who
>>> >> already have their own dedicated client master [3], and I added Yetus as I think
>>> >> they cross collaborate with many Hadoop based projects. If anyone
>>> thinks a
>>> >> project is missing, or should not be on the list, let me know.
>>> >>
>>> >> What I would like from each community, is to decide who is going to
>>> help
>>> >> with their project in performing these migrations - ideally 2 or 3
>>> folks
>>> >> who use the current builds.a.o regularly. Those folks should then
>>> subscribe
>>> >> to the new dedicated hadoop-migrations@infra.apache.org mailing lists
>>> as
>>> >> soon as possible so we can get started.
>>> >>
>>> >> About the current setup - and I hope this answers previously asked
>>> >> questions on private lists - the new dedicated master is a Cloudbees
>>> Client
>>> >> Master 2.204.3.7-rolling. It is not the same setup as the current
>>> Jenkins
>>> >> master on builds.a.o - it is not intended to be. It is more or less a
>>> >> 'clean install' in that I have not installed over 500 plugins as is the
>>> >> case on builds.a.o , I would rather we install plugins as we find we
>>> need
>>> >> them. So yes, there may be some features missing - the point of having
>>> >> people sign up to the new list is to find out what those are, get them
>>> >> installed, and get your builds to at least the same state they are in
>>> >> currently.
>>> >>
>>> >> We have 2 nodes on there currently for testing, as things progress we
>>> can
>>> >> transfer over a couple more, projects can start to migrate their jobs
>>> over
>>> >> at any time they are happy, until done. We also need to test auth -
>>> the
>>> >> master; and its nodes will be restricted to just Hadoop + Related
>>> projects
>>> >> (which is why it is important that this list of related projects is correct). No
>>> longer
>>> >> will other projects be able to hop on to Hadoop nodes, and no longer
>>> will
>>> >> Hadoop related projects be able to hop onto other folks nodes. This is
>>> a
>>> >> good thing, and may encourage some providers to donate a few more VMs
>>> for
>>> >> dedicated use.
>>> >>
>>> >> For now then, decide who will help with this process, and sign up to
>>> the
>>> >> new mailing list, and let's get started!
>>> >>
>>> >> Note I am NOT subscribed to any of your dev lists, so replies please cc
>>> >> the new list, and I will await your presence there to get started.
>>> >>
>>> >> Thanks all.
>>> >>
>>> >> Gavin McDonald (ASF Infra)
>>> >>
>>> >> [1] - https://ci-hadoop.apache.org
>>> >> [2] -
>>> >>
>>> hadoop,chukwa,avro,ambari,hbase,hive,mahout,pig,spark,submarine,tez,zookeeper,yetus
>>> >> [3] - https://ci-cassandra.apache.org
>>> >>
>>>
>>>
>>
>> --
>>
>> *Gavin McDonald*
>> Systems Administrator
>> ASF Infrastructure Team
>
Re[2]: Migration of Hadoop labelled nodes to new dedicated Master
Posted by Dmitry Grinenko <dg...@cloudera.com.INVALID>.
Hi Zoltan,
Your information is really useful.
While I would definitely need Jenkins for such things as automatically
linking pull requests to Jira, I would like to give GitHub Actions a
shot and decide what is better.
Green/red is likely enough for us, but with the possibility to check
the test report (no need to publish it anywhere).
One question is: how do I access the configuration section for the ASF
GitHub project? Is there any doc, or do I need some special permissions?
Thanks in advance
------ Original Message ------
From: "Zoltan Haindrich" <ki...@rxd.hu>
To: hadoop-migrations@infra.apache.org; "Dmitry Grinenko"
<dg...@cloudera.com.invalid>
Sent: 7/13/2020 2:37:02 PM
Subject: Re: Migration of Hadoop labelled nodes to new dedicated Master