Posted to dev@spark.apache.org by shane knapp <sk...@berkeley.edu> on 2018/10/10 16:30:03 UTC

Re: moving the spark jenkins job builder repo from dbricks --> spark

hey everyone!

just for visibility, after some lengthy conversations w/some PMC members
(mostly sean and josh) about the location of the jenkins job builder
templates being in a private, databricks repo, we've decided to move them
into the main apache spark repo.

https://docs.openstack.org/infra/jenkins-job-builder/

On Tue, Oct 9, 2018 at 10:22 PM Sean Owen <sr...@gmail.com> wrote:

> Some responses inline -- this discussion can go to dev@ though.
>

dev@ added.


> On Tue, Oct 9, 2018 at 3:28 PM shane knapp <sk...@berkeley.edu> wrote:
> > JJB templates in spark repo:
> > * code path is currently undecided, maybe build/?  i honestly don't have
> > any strong opinions
>
> How about a subfolder of dev/? that's where many items like the
> release scripts and build style checkers live.
>

works for me.


> > * this stuff will only live in the master branch
>
> Sure, it'll get versioned with branches anyway, but only the master
> branch will matter.
>
> > * the current JJB templates include release and packaging jobs, which
> > aren't run via jenkins anymore.  this means we can remove the job builder
> > configs for these two, as well as the encrypted secrets.
>
> Sure, if it's not used, remove it. I suppose the concerns below lessen
> if none of the jobs in question here create release artifacts.
>

exactly.


> > * the JJB templates are able to be run by anyone w/jenkins login access
> > without the need to commit changes to the repo.  this means there's a
> > non-zero potential for bad actors to change the build configs.  since we
> > will only be managing test and compile jobs through this, the chance of
> > Real Bad Stuff[tm] is minimized.  i will also have a local server, not on
> > the jenkins network, run a nightly cron job that grabs the latest configs
> > from github and syncs them to jenkins.
>
> You mean anyone with access to amplab Jenkins? I think this is an
> acceptable risk, especially as it's never been an issue to date. The
> worst case is deleting or sabotaging CI jobs, right? not great, but,
> nobody would be able to commit code or anything. That's the worst
> case, right?
>

re: access to jenkins -- correct.
re: worst case, deleting a job -- correct (but a cron sync of the jobs from
the tip of master will repair any damage done by nefarious folks).
re: committing code -- correct
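
for illustration, the nightly sync could be sketched roughly as follows. this
is only a sketch under assumptions, not the actual setup: the checkout
location, the dev/jjb subfolder and the config file path are hypothetical
placeholders (the thread only settles on "a subfolder of dev/"), while
`jenkins-jobs update` is the standard Jenkins Job Builder command for pushing
job definitions to a server.

```python
import subprocess
from pathlib import Path
from typing import List


def sync_commands(checkout: Path, jobs_dir: str, jjb_conf: Path) -> List[List[str]]:
    """Return the commands a nightly sync would run, in order.

    All three arguments are hypothetical placeholders for this sketch.
    """
    return [
        # Bring the local checkout to the tip of master, discarding local edits.
        ["git", "-C", str(checkout), "fetch", "origin", "master"],
        ["git", "-C", str(checkout), "reset", "--hard", "origin/master"],
        # Re-apply every job definition to jenkins from the fresh checkout.
        ["jenkins-jobs", "--conf", str(jjb_conf), "update", str(checkout / jobs_dir)],
    ]


def run_sync(checkout: Path, jobs_dir: str, jjb_conf: Path, runner=subprocess.run) -> None:
    """Run each sync command, stopping at the first failure."""
    for cmd in sync_commands(checkout, jobs_dir, jjb_conf):
        runner(cmd, check=True)
```

a cron entry on the sync host would call run_sync once a night.  since the job
definitions are re-applied from the tip of master on every run, changes made
by hand on the jenkins side would be overwritten at the next sync.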


> This might be a good time to ask whether we want to use a different CI
> system. I don't see a need to. I don't see any problem that's surfaced
> by publishing the configs as part of the project. If riselab is OK
> continuing to subsidize the build infra, and it continues to work for
> the project, it seems fine. As long as the PMC is meaningfully in
> control of it, I can't see any issue.
>

i have confirmation from RISELab's PI (ion stoica) that we are committed to
continuing to host the apache spark build system here.

if that situation changes (which i cannot foresee), it will of course be
brought up w/the community and the PMC immediately.  i would also like some
heads up from the PMC if the situation changes on their end.  ;)

btw, work will not begin on this until next week.  once i get a jira and PR
opened, i'll respond to this thread w/those links.

shane
-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu

Re: moving the spark jenkins job builder repo from dbricks --> spark

Posted by shane knapp <sk...@berkeley.edu>.
looking here:
https://dist.apache.org/repos/dist/dev/spark/3.0.0-SNAPSHOT-2019_01_24_10_34-69dab94-docs/

and here:
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/job/spark-master-docs/5312/console

this does confirm that these artifacts are indeed created by the packaging
docs builds.

i will disable them manually on jenkins, and make a note when moving the
JJB configs to not create them moving forward.

thanks sean!

shane



On Thu, Jan 24, 2019 at 4:48 PM Sean Owen <sr...@gmail.com> wrote:

> Are these docs builds creating the SNAPSHOT docs builds at
> https://dist.apache.org/repos/dist/dev/spark/ ? I think from a thread
> last month, these aren't used and should probably just be stopped.
>
> On Thu, Jan 24, 2019 at 3:34 PM shane knapp <sk...@berkeley.edu> wrote:
> >
> > revisiting this thread from october...  sorry for the delay in getting
> around to this until now, but the jenkins job builder configs (and
> associated apache credentials stored in there) are *directly* related to
> the work i'm doing here:
> > https://issues.apache.org/jira/browse/SPARK-26565
> > https://github.com/apache/spark/pull/23492
> >
> > anyways, for each branch, we currently have three packaging builds (
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/):  docs,
> maven snapshot and release.
> >
> > i'm currently working on the release builds to test the release process
> w/o pushing artifacts (see above issue/PR).
> >
> > the maven snapshot builds are green, and working as intended (and use
> the ASF creds).
> >
> > my question is:  are we currently relying on any of these doc builds?
> >
> > thanks in advance,
> >
> > shane
> >
> > On Wed, Oct 17, 2018 at 10:48 AM shane knapp <sk...@berkeley.edu>
> wrote:
> >>
> >> On Wed, Oct 17, 2018 at 10:25 AM Yin Huai <yh...@databricks.com> wrote:
> >>>
> >>> Shane, Thank you for initiating this work! Can we do an audit of
> jenkins users and trim down the list?
> >>>
> >> re pruning external (spark-specific) users w/shell and jenkins login
> access:  we can absolutely do this.
> >>
> >> limiting logins for EECS students/faculty/staff is possible, but i will
> need to do some experiments.  we're using SSSD to manage our LDAP logins,
> and it is supposed to handle group filtering but i haven't had much luck
> actually getting it working.
> >>
> >>>
> >>> Also, for packaging jobs, those branch snapshot jobs are active (for
> example,
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/job/spark-master-maven-snapshots/
> for publishing snapshot builds from master branch). They still need
> credentials. After we remove the encrypted credential file, are we planning
> to use jenkins as the single place to manage those credentials and we just
> refer to them in jenkins job config?
> >>>
> >> well, since the creds in the repo are actually encrypted, i think that
> keeping them in there is actually fine.  since i wasn't the one who set any
> of this up, however, i will defer to josh about this.
> >>
> >> shane
> >>
> >>>
> >>> On Wed, Oct 10, 2018 at 12:06 PM shane knapp <sk...@berkeley.edu>
> wrote:
> >>>>>
> >>>>> Not sure if that's what you meant; but it should be ok for the
> jenkins
> >>>>> servers to manually sync with master after you (or someone else) have
> >>>>> verified the changes. That should prevent inadvertent breakages since
> >>>>> I don't expect it to be easy to test those scripts without access to
> >>>>> some test jenkins server.
> >>>>>
> >>>> JJB has some built-in lint and testing, so that'll be the first step
> in verifying the build configs.
> >>>>
> >>>> i still have a dream where i have a fully functioning jenkins staging
> deployment...  one day i will make that happen.  :)
> >>>>
> >>>> shane
> >>>>
> >>>> --
> >>>> Shane Knapp
> >>>> UC Berkeley EECS Research / RISELab Staff Technical Lead
> >>>> https://rise.cs.berkeley.edu
> >>
> >>
> >>
> >> --
> >> Shane Knapp
> >> UC Berkeley EECS Research / RISELab Staff Technical Lead
> >> https://rise.cs.berkeley.edu
> >
> >
> >
> > --
> > Shane Knapp
> > UC Berkeley EECS Research / RISELab Staff Technical Lead
> > https://rise.cs.berkeley.edu
>


-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu

Re: moving the spark jenkins job builder repo from dbricks --> spark

Posted by Sean Owen <sr...@gmail.com>.
Are these docs builds creating the SNAPSHOT docs builds at
https://dist.apache.org/repos/dist/dev/spark/ ? I think from a thread
last month, these aren't used and should probably just be stopped.

On Thu, Jan 24, 2019 at 3:34 PM shane knapp <sk...@berkeley.edu> wrote:
>
> revisiting this thread from october...  sorry for the delay in getting around to this until now, but the jenkins job builder configs (and associated apache credentials stored in there) are *directly* related to the work i'm doing here:
> https://issues.apache.org/jira/browse/SPARK-26565
> https://github.com/apache/spark/pull/23492
>
> anyways, for each branch, we currently have three packaging builds (https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/):  docs, maven snapshot and release.
>
> i'm currently working on the release builds to test the release process w/o pushing artifacts (see above issue/PR).
>
> the maven snapshot builds are green, and working as intended (and use the ASF creds).
>
> my question is:  are we currently relying on any of these doc builds?
>
> thanks in advance,
>
> shane
>
> On Wed, Oct 17, 2018 at 10:48 AM shane knapp <sk...@berkeley.edu> wrote:
>>
>> On Wed, Oct 17, 2018 at 10:25 AM Yin Huai <yh...@databricks.com> wrote:
>>>
>>> Shane, Thank you for initiating this work! Can we do an audit of jenkins users and trim down the list?
>>>
>> re pruning external (spark-specific) users w/shell and jenkins login access:  we can absolutely do this.
>>
>> limiting logins for EECS students/faculty/staff is possible, but i will need to do some experiments.  we're using SSSD to manage our LDAP logins, and it is supposed to handle group filtering but i haven't had much luck actually getting it working.
>>
>>>
>>> Also, for packaging jobs, those branch snapshot jobs are active (for example, https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/job/spark-master-maven-snapshots/ for publishing snapshot builds from master branch). They still need credentials. After we remove the encrypted credential file, are we planning to use jenkins as the single place to manage those credentials and we just refer to them in jenkins job config?
>>>
>> well, since the creds in the repo are actually encrypted, i think that keeping them in there is actually fine.  since i wasn't the one who set any of this up, however, i will defer to josh about this.
>>
>> shane
>>
>>>
>>> On Wed, Oct 10, 2018 at 12:06 PM shane knapp <sk...@berkeley.edu> wrote:
>>>>>
>>>>> Not sure if that's what you meant; but it should be ok for the jenkins
>>>>> servers to manually sync with master after you (or someone else) have
>>>>> verified the changes. That should prevent inadvertent breakages since
>>>>> I don't expect it to be easy to test those scripts without access to
>>>>> some test jenkins server.
>>>>>
>>>> JJB has some built-in lint and testing, so that'll be the first step in verifying the build configs.
>>>>
>>>> i still have a dream where i have a fully functioning jenkins staging deployment...  one day i will make that happen.  :)
>>>>
>>>> shane
>>>>
>>>> --
>>>> Shane Knapp
>>>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>>>> https://rise.cs.berkeley.edu
>>
>>
>>
>> --
>> Shane Knapp
>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>> https://rise.cs.berkeley.edu
>
>
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org


Re: moving the spark jenkins job builder repo from dbricks --> spark

Posted by shane knapp <sk...@berkeley.edu>.
revisiting this thread from october...  sorry for the delay in getting
around to this until now, but the jenkins job builder configs (and
associated apache credentials stored in there) are *directly* related to
the work i'm doing here:
https://issues.apache.org/jira/browse/SPARK-26565
https://github.com/apache/spark/pull/23492

anyways, for each branch, we currently have three packaging builds (
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/):  docs,
maven snapshot and release.

i'm currently working on the release builds to test the release process w/o
pushing artifacts (see above issue/PR).

the maven snapshot builds are green, and working as intended (and use the
ASF creds).

my question is:  are we currently relying on any of these doc builds?

thanks in advance,

shane

On Wed, Oct 17, 2018 at 10:48 AM shane knapp <sk...@berkeley.edu> wrote:

> On Wed, Oct 17, 2018 at 10:25 AM Yin Huai <yh...@databricks.com> wrote:
>
>> Shane, Thank you for initiating this work! Can we do an audit of jenkins
>> users and trim down the list?
>>
> re pruning external (spark-specific) users w/shell and jenkins login
> access:  we can absolutely do this.
>
> limiting logins for EECS students/faculty/staff is possible, but i will
> need to do some experiments.  we're using SSSD to manage our LDAP logins,
> and it is supposed to handle group filtering but i haven't had much luck
> actually getting it working.
>
>
>> Also, for packaging jobs, those branch snapshot jobs are active (for
>> example,
>> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/job/spark-master-maven-snapshots/
>> for publishing snapshot builds from master branch). They still need
>> credentials. After we remove the encrypted credential file, are we planning
>> to use jenkins as the single place to manage those credentials and we just
>> refer to them in jenkins job config?
>>
> well, since the creds in the repo are actually encrypted, i think that
> keeping them in there is actually fine.  since i wasn't the one who set any
> of this up, however, i will defer to josh about this.
>
> shane
>
>
>> On Wed, Oct 10, 2018 at 12:06 PM shane knapp <sk...@berkeley.edu> wrote:
>>
>>>> Not sure if that's what you meant; but it should be ok for the jenkins
>>>> servers to manually sync with master after you (or someone else) have
>>>> verified the changes. That should prevent inadvertent breakages since
>>>> I don't expect it to be easy to test those scripts without access to
>>>> some test jenkins server.
>>>>
>>> JJB has some built-in lint and testing, so that'll be the first step in
>>> verifying the build configs.
>>>
>>> i still have a dream where i have a fully functioning jenkins staging
>>> deployment...  one day i will make that happen.  :)
>>>
>>> shane
>>>
>>> --
>>> Shane Knapp
>>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>>> https://rise.cs.berkeley.edu
>>>
>>
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>


-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu

Re: moving the spark jenkins job builder repo from dbricks --> spark

Posted by shane knapp <sk...@berkeley.edu>.
On Wed, Oct 17, 2018 at 10:25 AM Yin Huai <yh...@databricks.com> wrote:

> Shane, Thank you for initiating this work! Can we do an audit of jenkins
> users and trim down the list?
>

re pruning external (spark-specific) users w/shell and jenkins login
access:  we can absolutely do this.

limiting logins for EECS students/faculty/staff is possible, but i will
need to do some experiments.  we're using SSSD to manage our LDAP logins,
and it is supposed to handle group filtering but i haven't had much luck
actually getting it working.


> Also, for packaging jobs, those branch snapshot jobs are active (for
> example,
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/job/spark-master-maven-snapshots/
> for publishing snapshot builds from master branch). They still need
> credentials. After we remove the encrypted credential file, are we planning
> to use jenkins as the single place to manage those credentials and we just
> refer to them in jenkins job config?
>

well, since the creds in the repo are actually encrypted, i think that
keeping them in there is actually fine.  since i wasn't the one who set any
of this up, however, i will defer to josh about this.

shane


> On Wed, Oct 10, 2018 at 12:06 PM shane knapp <sk...@berkeley.edu> wrote:
>
>>> Not sure if that's what you meant; but it should be ok for the jenkins
>>> servers to manually sync with master after you (or someone else) have
>>> verified the changes. That should prevent inadvertent breakages since
>>> I don't expect it to be easy to test those scripts without access to
>>> some test jenkins server.
>>>
>> JJB has some built-in lint and testing, so that'll be the first step in
>> verifying the build configs.
>>
>> i still have a dream where i have a fully functioning jenkins staging
>> deployment...  one day i will make that happen.  :)
>>
>> shane
>>
>> --
>> Shane Knapp
>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>> https://rise.cs.berkeley.edu
>>
>

-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu

Re: moving the spark jenkins job builder repo from dbricks --> spark

Posted by Yin Huai <yh...@databricks.com>.
Shane, Thank you for initiating this work! Can we do an audit of jenkins
users and trim down the list?

Also, for packaging jobs, those branch snapshot jobs are active (for
example,
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/job/spark-master-maven-snapshots/
for publishing snapshot builds from master branch). They still need
credentials. After we remove the encrypted credential file, are we planning
to use jenkins as the single place to manage those credentials and we just
refer to them in jenkins job config?

On Wed, Oct 10, 2018 at 12:06 PM shane knapp <sk...@berkeley.edu> wrote:

>> Not sure if that's what you meant; but it should be ok for the jenkins
>> servers to manually sync with master after you (or someone else) have
>> verified the changes. That should prevent inadvertent breakages since
>> I don't expect it to be easy to test those scripts without access to
>> some test jenkins server.
>>
> JJB has some built-in lint and testing, so that'll be the first step in
> verifying the build configs.
>
> i still have a dream where i have a fully functioning jenkins staging
> deployment...  one day i will make that happen.  :)
>
> shane
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>

Re: moving the spark jenkins job builder repo from dbricks --> spark

Posted by shane knapp <sk...@berkeley.edu>.
>
> Not sure if that's what you meant; but it should be ok for the jenkins
> servers to manually sync with master after you (or someone else) have
> verified the changes. That should prevent inadvertent breakages since
> I don't expect it to be easy to test those scripts without access to
> some test jenkins server.
>

JJB has some built-in lint and testing, so that'll be the first step in
verifying the build configs.
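
a minimal sketch of that first verification step, assuming the job definitions
live in a hypothetical dev/jjb directory: `jenkins-jobs test` renders the
definitions to jenkins XML in a local output directory without updating any
jobs on the server, so a definition that fails to render shows up as a
non-zero exit code.

```python
import subprocess
from typing import List


def lint_command(jobs_path: str, output_dir: str) -> List[str]:
    """Build the JJB test-mode command; both paths are hypothetical."""
    return ["jenkins-jobs", "test", jobs_path, "-o", output_dir]


def configs_render(jobs_path: str, output_dir: str, runner=subprocess.run) -> bool:
    """Return True if every job definition renders without error."""
    result = runner(lint_command(jobs_path, output_dir))
    return result.returncode == 0
```

a check like this could run before any sync to jenkins, so that a broken
template is caught before it reaches the live server.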

i still have a dream where i have a fully functioning jenkins staging
deployment...  one day i will make that happen.  :)

shane

-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu

Re: moving the spark jenkins job builder repo from dbricks --> spark

Posted by Marcelo Vanzin <va...@cloudera.com.INVALID>.
Thanks for doing this. The more things we have accessible to the
project members in general the better!

(Now there's that hive fork repo somewhere, but let's not talk about that.)

On Wed, Oct 10, 2018 at 9:30 AM shane knapp <sk...@berkeley.edu> wrote:
>> > * the JJB templates are able to be run by anyone w/jenkins login access without the need to commit changes to the repo.  this means there's a non-zero potential for bad actors to change the build configs.  since we will only be managing test and compile jobs through this, the chances for Real Bad Stuff[tm] is minimized.  i will also have a local server, not on the jenkins network, run a nightly cron job that grabs the latest configs from github and syncs them to jenkins.

Not sure if that's what you meant; but it should be ok for the jenkins
servers to manually sync with master after you (or someone else) have
verified the changes. That should prevent inadvertent breakages since
I don't expect it to be easy to test those scripts without access to
some test jenkins server.

-- 
Marcelo
