Posted to dev@bigtop.apache.org by MrAsanjar <af...@gmail.com> on 2021/04/01 21:08:18 UTC

Re: PPC CI server failure

Hi lads
I just got an email that IBM has reinstated the ppc64le VM.


On Mon, Mar 29, 2021 at 12:05 PM Evans Ye <ev...@apache.org> wrote:

> Great news and thanks, Amir!
>
> On Mon, Mar 29, 2021 at 1:54 PM Jun HE <ju...@apache.org> wrote:
>
> > Awesome! Looking forward to having it back in CI.
> > Thanks a lot for helping on this, Asanjar!
> >
> > Regards,
> >
> > Jun
> >
> > On Mon, Mar 29, 2021 at 10:18 AM MrAsanjar <af...@gmail.com> wrote:
> >
> > > Hi old friends :)
> > > We should have a ppc64le VM back online sometime this week. I'll keep you
> > > all posted.
> > >
> > > On Thu, Nov 19, 2020 at 9:05 PM Evans Ye <ev...@apache.org> wrote:
> > >
> > > > Hi rbkrishn,
> > > >
> > > > Would you mind commenting on whether those PPC servers for Bigtop CI can be
> > > > brought up to unblock our release process?
> > > > Thanks!
> > > >
> > > > Best,
> > > > Evans
> > > >
> > > > On Wed, Nov 18, 2020 at 7:26 AM Kengo Seki <se...@apache.org> wrote:
> > > >
> > > > > Thank you for checking, Evans and Amir!
> > > > >
> > > > > Kengo Seki <se...@apache.org>
> > > > >
> > > > > On Wed, Nov 18, 2020 at 2:09 AM Evans Ye <ev...@apache.org> wrote:
> > > > > >
> > > > > > Thank you, Amir.
> > > > > >
> > > > > > On Wed, Nov 18, 2020 at 00:39 MrAsanjar <af...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Evans, let me check with IBM again.
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Nov 16, 2020 at 9:08 PM Evans Ye <ev...@apache.org> wrote:
> > > > > > >
> > > > > > > > Hi Amir,
> > > > > > > >
> > > > > > > > We're planning the Bigtop 1.5 release, and if we don't have the CI nodes for
> > > > > > > > PPC, we won't be able to release 1.5 with PPC support.
> > > > > > > > Could you help confirm again? Thanks!
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Evans Ye
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Sep 17, 2020 at 8:56 PM MrAsanjar <af...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > I have informed IBM management of the situation and am waiting for a
> > > > > > > > > reply.
> > > > > > > > >
> > > > > > > > > On Thu, Sep 17, 2020 at 3:47 AM Evans Ye <evansye@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > > Ok. Thanks for doing this to get the ball rolling.
> > > > > > > > > >
> > > > > > > > > > On Thu, Sep 17, 2020 at 10:29 Kengo Seki <se...@apache.org> wrote:
> > > > > > > > > >
> > > > > > > > > > > Thank you for your help, Amir!
> > > > > > > > > > > Just a heads-up: I temporarily disabled builds for ppc in the
> > > > > > > > > > > following Jenkins jobs so that they can finish.
> > > > > > > > > > >
> > > > > > > > > > > * Docker-Puppet-Trunk
> > > > > > > > > > > * Docker-Puppet-Trunk-pull
> > > > > > > > > > > * Docker-Toolchain-Trunk
> > > > > > > > > > > * Docker-Toolchain-Trunk-pull
> > > > > > > > > > >
> > > > > > > > > > > * Bigtop-trunk-packages
> > > > > > > > > > > * Bigtop-trunk-repos
> > > > > > > > > > >
> > > > > > > > > > > * Remove-All-Docker-Containers-Except-Nexus
> > > > > > > > > > > * Remove-Dangling-Docker-Images
> > > > > > > > > > > * Remove-Inactive-Containers
> > > > > > > > > > >
> > > > > > > > > > > Kengo Seki <se...@apache.org>
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Sep 16, 2020 at 7:35 PM Evans Ye <evansye@apache.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Awesome! Nice to hear from you, buddy!
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Sep 16, 2020 at 3:54 AM MrAsanjar <af...@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Evans,
> > > > > > > > > > > > > Let me see what I can do. Give me 24 hr :)
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, Sep 15, 2020 at 10:51 AM Evans Ye <evansye@apache.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Yes. I think the action is correct. However, [2] might be a different
> > > > > > > > > > > > > > thing, related to PPC integration in Hadoop.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Amir,
> > > > > > > > > > > > > > Could you confirm?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, Sep 14, 2020 at 9:56 PM Kengo Seki <se...@apache.org> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >> Thank you for the advice, Evans!
> > > > > > > > > > > > > > >> Let me confirm about "PPC machine owners". According to Amir's JIRA
> > > > > > > > > > > > > > >> issues [1][2] and the powered-by list on the OSU site [3], we're using
> > > > > > > > > > > > > > >> a VM hosted by OSU OSL, right?
> > > > > > > > > > > > > > >> If that's correct, I'm going to ask them for help via
> > > > > > > > > > > > > > >> powerdev-request@osuosl.org.
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> [1]: https://issues.apache.org/jira/browse/INFRA-11467?focusedCommentId=15300982&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15300982
> > > > > > > > > > > > > > >> [2]: https://issues.apache.org/jira/browse/INFRA-12014
> > > > > > > > > > > > > > >> [3]: https://osuosl.org/services/powerdev/current-projects/#foss-projects
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> Kengo Seki <se...@apache.org>
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> On Mon, Sep 14, 2020 at 2:06 PM Evans Ye <evansye@apache.org> wrote:
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > I'd suggest reaching out to the PPC machine owners. Worst case, we can
> > > > > > > > > > > > > > >> > temporarily drop PPC support to move the release forward.
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > On Mon, Sep 14, 2020 at 12:44 Kengo Seki <se...@apache.org> wrote:
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > > Hi everyone,
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > Let me share information about the CI environment.
> > > > > > > > > > > > > > >> > > The worker node for ppc64le is currently offline, so I just killed all
> > > > > > > > > > > > > > >> > > the jobs in the queue that were waiting for it to come back. Its status
> > > > > > > > > > > > > > >> > > is as follows.
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > - According to the result of `who -b`, that machine seems to have been
> > > > > > > > > > > > > > >> > >   rebooted on 2020-09-11 for some reason (probably unexpectedly).
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > - According to the result of dmesg, the root volume was mounted
> > > > > > > > > > > > > > >> > >   in read-only mode because of a fsck failure.
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > >   [   34.840681] EXT4-fs (vda1): Couldn't remount RDWR because of unprocessed orphan inode list.  Please umount/remount instead
> > > > > > > > > > > > > > >> > >   [   60.714110] cgroup: new mount options do not match the existing superblock, will be ignored
> > > > > > > > > > > > > > >> > >   [  316.385805] EXT4-fs (vda1): error count since last fsck: 9459
> > > > > > > > > > > > > > >> > >   [  316.385824] EXT4-fs (vda1): initial error at time 1540294049: ext4_validate_inode_bitmap:134
> > > > > > > > > > > > > > >> > >   [  316.385826] EXT4-fs (vda1): last error at time 1596881526: ext4_free_inode:383
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > It looks like some fsck work (and replacing the volume, if it fails)
> > > > > > > > > > > > > > >> > > is required, but I'm not sure whether I can run something like
> > > > > > > > > > > > > > >> > > `e2fsck -p`, because I'm also not sure where that machine is located
> > > > > > > > > > > > > > >> > > or who's managing it.
> > > > > > > > > > > > > > >> > > (I vaguely thought it was running as a VM with QEMU on some EC2
> > > > > > > > > > > > > > >> > > instance, but I couldn't find it.)
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > > Cos, Evans, Olaf
> > > > > > > > > > > > > >> > > Would you provide any suggestions?
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > Kengo Seki <se...@apache.org>
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> >
>
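
A minimal sketch for reading the EXT4 lines in the dmesg output quoted above. It assumes the "at time" values are Unix epoch seconds and that the wrapped log lines are re-joined as shown; the helper name decode_ext4_errors is hypothetical and only for illustration.

```python
import re
from datetime import datetime, timezone

# The three EXT4 lines from the dmesg output quoted in this thread,
# re-joined after mail line wrapping.
DMESG = """\
[  316.385805] EXT4-fs (vda1): error count since last fsck: 9459
[  316.385824] EXT4-fs (vda1): initial error at time 1540294049: ext4_validate_inode_bitmap:134
[  316.385826] EXT4-fs (vda1): last error at time 1596881526: ext4_free_inode:383
"""

# Matches "initial error at time <epoch>: <function>:<line>" and the "last" variant.
ERROR_RE = re.compile(r"(initial|last) error at time (\d+): (\w+):(\d+)")


def decode_ext4_errors(log: str) -> dict:
    """Map 'initial'/'last' to (UTC datetime, kernel function, source line)."""
    found = {}
    for kind, epoch, func, line in ERROR_RE.findall(log):
        # Assumption: the value is seconds since the Unix epoch.
        when = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
        found[kind] = (when, func, int(line))
    return found


if __name__ == "__main__":
    for kind, (when, func, line) in decode_ext4_errors(DMESG).items():
        print(f"{kind:>7} error: {when:%Y-%m-%d %H:%M:%S} UTC in {func}:{line}")
```

Under that assumption, the first recorded error falls on 2018-10-23 and the most recent on 2020-08-08, which would mean the filesystem was already reporting errors before the reboot on 2020-09-11.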

Re: PPC CI server failure

Posted by Kengo Seki <se...@apache.org>.
Hi Amir,

The problem I previously reported was resolved in BIGTOP-3537, so I've
enabled ppc64le builds on CI.
Then I came across some failures. See the following comment for details.
https://issues.apache.org/jira/browse/BIGTOP-3533?focusedCommentId=17352207&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17352207

Kengo Seki <se...@apache.org>

On Mon, May 10, 2021 at 9:36 AM Kengo Seki <se...@apache.org> wrote:
>
> > Let me know when the ppc64le CI/CD is going to get enabled to help us identify the failing components.
>
> Sure! As a first step, I added the ppc64le configuration back to the
> Docker-related jobs.
>
> https://ci.bigtop.apache.org/view/Docker/job/Docker-Puppet-Trunk/
> https://ci.bigtop.apache.org/view/Docker/job/Docker-Puppet-Trunk-pull/
> https://ci.bigtop.apache.org/view/Docker/job/Docker-Toolchain-Trunk/
> https://ci.bigtop.apache.org/view/Docker/job/Docker-Toolchain-Trunk-pull/
>
> However, Docker-Puppet-Trunk failed, and only on CentOS 8, apparently because
> of the PowerTools repository setting.
>
> https://ci.bigtop.apache.org/view/Docker/job/Docker-Puppet-Trunk/24/
>
> I'll keep investigating and let you know when there's any progress.
>
> Kengo Seki <se...@apache.org>
>
> On Mon, May 3, 2021 at 10:36 PM MrAsanjar . <as...@apache.org> wrote:
> >
> > Let me know when the ppc64le CI/CD is going to get enabled to help us
> > identify the failing components.
> >
> > On Fri, Apr 16, 2021 at 8:00 PM Kengo Seki <se...@apache.org> wrote:
> >
> > > Sorry for my late response, I was quite busy this week...
> > > Amir, thank you for recovering the ppc64le server! I've just enabled
> > > it on Jenkins and it seems to be healthy. I'm going to work on
> > > BIGTOP-3533.
> > > Also thanks to Evans and Olaf for helping him.
> > >
> > > Kengo Seki <se...@apache.org>
> > >
> > > On Sat, Apr 17, 2021 at 3:50 AM Olaf Flebbe <of...@oflebbe.de> wrote:
> > > >
> > > > I already gave the public key to asanjar.
> > > >
> > > > Olaf
> > > >
> > > > > On 16.04.2021 at 10:49, Evans Ye <ev...@apache.org> wrote:
> > > > >
> > > > > Let me help. I was busy with something.
> > > > >
> > > > >
> > > > > On Thu, Apr 15, 2021 at 10:30 PM MrAsanjar . <as...@apache.org> wrote:
> > > > >
> > > > >> In order to set up the new Jenkins slave for ppc64le
> > > > >> (https://issues.apache.org/jira/browse/BIGTOP-3534), we need the Jenkins
> > > > >> master's public ssh key. Who can help me here?
> > > > >>
> > > > >> On Fri, Apr 2, 2021 at 4:00 PM MrAsanjar <af...@gmail.com> wrote:
> > > > >>
> > > > >>> I have verified the state of the ppc64le VM; it is operational. Could we
> > > > >>> enable the ppc64le build before OpenStack flags the VM as idle again?
> > > > >>>
> > > > >>> On Thu, Apr 1, 2021 at 4:08 PM MrAsanjar <af...@gmail.com> wrote:
> > > > >>>
> > > > >>>> Hi lads
> > > > >>>> I just got an email that IBM has reinstated the ppc64le VM.
> > > > >>>>
> > > > >>>>
> > > > >>>> On Mon, Mar 29, 2021 at 12:05 PM Evans Ye <ev...@apache.org>
> > > wrote:
> > > > >>>>
> > > > >>>>> Great news and thanks, Amir!
> > > > >>>>>
> > > > >>>>> Jun HE <ju...@apache.org> 於 2021年3月29日 週一 下午1:54寫道:
> > > > >>>>>
> > > > >>>>>> Awesome! Looking forward to its back to CI.
> > > > >>>>>> Thanks a lot for helping on this, Asanjar!
> > > > >>>>>>
> > > > >>>>>> Regards,
> > > > >>>>>>
> > > > >>>>>> Jun
> > > > >>>>>>
> > > > >>>>>> MrAsanjar <af...@gmail.com> 于2021年3月29日周一 上午10:18写道:
> > > > >>>>>>
> > > > >>>>>>> Hi old friends :)
> > > > >>>>>>> We should have a ppc64le VM back online sometime this week. I'll
> > > > >>>>> keep you
> > > > >>>>>>> all posted.
> > > > >>>>>>>
> > > > >>>>>>> On Thu, Nov 19, 2020 at 9:05 PM Evans Ye <ev...@apache.org>
> > > > >> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Hi rbkrishn,
> > > > >>>>>>>>
> > > > >>>>>>>> Would you mind to comment whether those PPC servers for Bigtop
> > > CI
> > > > >>>>> can
> > > > >>>>>> be
> > > > >>>>>>>> brought up and unlock our release process?
> > > > >>>>>>>> Thanks!
> > > > >>>>>>>>
> > > > >>>>>>>> Best,
> > > > >>>>>>>> Evans
> > > > >>>>>>>>
> > > > >>>>>>>> Kengo Seki <se...@apache.org> 於 2020年11月18日 週三 上午7:26寫道:
> > > > >>>>>>>>
> > > > >>>>>>>>> Thank you for checking, Evans and Amir!
> > > > >>>>>>>>>
> > > > >>>>>>>>> Kengo Seki <se...@apache.org>
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Wed, Nov 18, 2020 at 2:09 AM Evans Ye <ev...@apache.org>
> > > > >>>>> wrote:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Thank you, Amir.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> MrAsanjar <af...@gmail.com> 於 2020年11月18日 週三 00:39 寫道:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> Hi Evans, let me check with IBM again.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On Mon, Nov 16, 2020 at 9:08 PM Evans Ye <
> > > > >> evansye@apache.org
> > > > >>>>>>
> > > > >>>>>>> wrote:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> Hi Amir,
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> We're planning Bigtop 1.5 release and if we don't have
> > > > >> the
> > > > >>>>> CI
> > > > >>>>>>> nodes
> > > > >>>>>>>>> for
> > > > >>>>>>>>>>>> PPC, we're not able to release 1.5 with PPC supported.
> > > > >>>>>>>>>>>> Could you help to confirm again? Thanks!
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Best,
> > > > >>>>>>>>>>>> Evans Ye
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> MrAsanjar <af...@gmail.com> 於 2020年9月17日 週四 下午8:56寫道:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> I have informed IBM management regarding the situation,
> > > > >>>>>> waiting
> > > > >>>>>>>>> for a
> > > > >>>>>>>>>>>>> reply.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> On Thu, Sep 17, 2020 at 3:47 AM Evans Ye <
> > > > >>>>> evansye@apache.org
> > > > >>>>>>>
> > > > >>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Ok. Thanks for doing this to get the ball rolling.
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Kengo Seki <se...@apache.org> 於 2020年9月17日 週四 10:29
> > > > >>>>> 寫道:
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Thank you for your help, Amir!
> > > > >>>>>>>>>>>>>>> It's just a heads-up, I temporarily disabled builds
> > > > >>>>> for
> > > > >>>>>> ppc
> > > > >>>>>>>> in
> > > > >>>>>>>>> the
> > > > >>>>>>>>>>>>>>> following Jenkins jobs so that they can finish.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> * Docker-Puppet-Trunk
> > > > >>>>>>>>>>>>>>> * Docker-Puppet-Trunk-pull
> > > > >>>>>>>>>>>>>>> * Docker-Toolchain-Trunk
> > > > >>>>>>>>>>>>>>> * Docker-Toolchain-Trunk-pull
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> * Bigtop-trunk-packages
> > > > >>>>>>>>>>>>>>> * Bigtop-trunk-repos
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> * Remove-All-Docker-Containers-Except-Nexus
> > > > >>>>>>>>>>>>>>> * Remove-Dangling-Docker-Images
> > > > >>>>>>>>>>>>>>> * Remove-Inactive-Containers
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> On Wed, Sep 16, 2020 at 7:35 PM Evans Ye <
> > > > >>>>>>> evansye@apache.org
> > > > >>>>>>>>>
> > > > >>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Awesome! Nice to hear from you, buddy!
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> MrAsanjar <af...@gmail.com> 於 2020年9月16日 週三
> > > > >>>>>> 上午3:54寫道:
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>> Hi Evans,
> > > > >>>>>>>>>>>>>>>>> Let me see what I can do. Give me 24 hr :)
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>> On Tue, Sep 15, 2020 at 10:51 AM Evans Ye <
> > > > >>>>>>>>> evansye@apache.org>
> > > > >>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> Yes. I think the action is correct. However
> > > > >> [2]
> > > > >>>>>> might
> > > > >>>>>>>> be
> > > > >>>>>>>>> a
> > > > >>>>>>>>>>>>>> different
> > > > >>>>>>>>>>>>>>>>> thing
> > > > >>>>>>>>>>>>>>>>>> for PPC integration in Hadoop.
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> Amir,
> > > > >>>>>>>>>>>>>>>>>> Could you confirm?
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org> 於 2020年9月14日
> > > > >> 週一
> > > > >>>>>>>> 下午9:56寫道:
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> Thank you for the advice, Evans!
> > > > >>>>>>>>>>>>>>>>>>> Let me confirm about "PPC machine owners".
> > > > >>>>>> According
> > > > >>>>>>>> to
> > > > >>>>>>>>>>> Amir's
> > > > >>>>>>>>>>>>>> JIRA
> > > > >>>>>>>>>>>>>>>>>>> issues [1][2] and the powered-by list in the
> > > > >>>>> OSU
> > > > >>>>>>> site
> > > > >>>>>>>>> [3],
> > > > >>>>>>>>>>>> we're
> > > > >>>>>>>>>>>>>>> using
> > > > >>>>>>>>>>>>>>>>>>> a VM hosted by OSU OSL, right?
> > > > >>>>>>>>>>>>>>>>>>> If it's correct, I'm going to ask them for
> > > > >>>>> help
> > > > >>>>>> via
> > > > >>>>>>>>>>>>>>>>>>> powerdev-request@osuosl.org.
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> [1]:
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>
> > > https://issues.apache.org/jira/browse/INFRA-11467?focusedCommentId=15300982&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15300982
> > > > >>>>>>>>>>>>>>>>>>> [2]:
> > > > >>>>>>>> https://issues.apache.org/jira/browse/INFRA-12014
> > > > >>>>>>>>>>>>>>>>>>> [3]:
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>
> > > https://osuosl.org/services/powerdev/current-projects/#foss-projects
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> On Mon, Sep 14, 2020 at 2:06 PM Evans Ye <
> > > > >>>>>>>>>>> evansye@apache.org>
> > > > >>>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> I'd suggest to reach out to PPC machine
> > > > >>>>> owners.
> > > > >>>>>>>> Worst
> > > > >>>>>>>>> case
> > > > >>>>>>>>>>>> Is
> > > > >>>>>>>>>>>>> we
> > > > >>>>>>>>>>>>>>> can
> > > > >>>>>>>>>>>>>>>>>>>> temporary  drop the PPC support to move
> > > > >> the
> > > > >>>>>>> release
> > > > >>>>>>>>>>> forward.
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org> 於
> > > > >>>>> 2020年9月14日 週一
> > > > >>>>>>>> 12:44
> > > > >>>>>>>>> 寫道:
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> Hi everyone,
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> Let me share information about the CI
> > > > >>>>>>> environment.
> > > > >>>>>>>>>>>>>>>>>>>>> The worker node for ppc64le is currently
> > > > >>>>>>> offlined,
> > > > >>>>>>>>> so I
> > > > >>>>>>>>>>>> just
> > > > >>>>>>>>>>>>>>> killed
> > > > >>>>>>>>>>>>>>>>>>> all
> > > > >>>>>>>>>>>>>>>>>>>>> jobs
> > > > >>>>>>>>>>>>>>>>>>>>> in the queue waiting for it gets back.
> > > > >> Its
> > > > >>>>>>> status
> > > > >>>>>>>>> is as
> > > > >>>>>>>>>>>>>> follows.
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> - According to the result of `who -b`,
> > > > >>>>> that
> > > > >>>>>>>> machine
> > > > >>>>>>>>>>> seems
> > > > >>>>>>>>>>>> to
> > > > >>>>>>>>>>>>>> be
> > > > >>>>>>>>>>>>>>>>>>> rebooted
> > > > >>>>>>>>>>>>>>>>>>>>>  on 2020-09-11 for some reason
> > > > >> (probably
> > > > >>>>>>>>> unexpectedly).
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> - According to the result of dmesg, the
> > > > >>>>> root
> > > > >>>>>>>> volume
> > > > >>>>>>>>> was
> > > > >>>>>>>>>>>>>> mounted
> > > > >>>>>>>>>>>>>>>>>>>>>  in read-only mode because of a fsck
> > > > >>>>> failure.
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>  [   34.840681] EXT4-fs (vda1):
> > > > >> Couldn't
> > > > >>>>>>> remount
> > > > >>>>>>>>> RDWR
> > > > >>>>>>>>>>>>> because
> > > > >>>>>>>>>>>>>>> of
> > > > >>>>>>>>>>>>>>>>>>>>> unprocessed orphan inode list.  Please
> > > > >>>>>>>>> umount/remount
> > > > >>>>>>>>>>>>> instead
> > > > >>>>>>>>>>>>>>>>>>>>>  [   60.714110] cgroup: new mount
> > > > >>>>> options do
> > > > >>>>>>> not
> > > > >>>>>>>>> match
> > > > >>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>> existing
> > > > >>>>>>>>>>>>>>>>>>>>> superblock, will be ignored
> > > > >>>>>>>>>>>>>>>>>>>>>  [  316.385805] EXT4-fs (vda1): error
> > > > >>>>> count
> > > > >>>>>>> since
> > > > >>>>>>>>> last
> > > > >>>>>>>>>>>>> fsck:
> > > > >>>>>>>>>>>>>>> 9459
> > > > >>>>>>>>>>>>>>>>>>>>>  [  316.385824] EXT4-fs (vda1): initial
> > > > >>>>> error
> > > > >>>>>>> at
> > > > >>>>>>>>> time
> > > > >>>>>>>>>>>>>>> 1540294049:
> > > > >>>>>>>>>>>>>>>>>>>>> ext4_validate_inode_bitmap:134
> > > > >>>>>>>>>>>>>>>>>>>>>  [  316.385826] EXT4-fs (vda1): last
> > > > >>>>> error at
> > > > >>>>>>>> time
> > > > >>>>>>>>>>>>>> 1596881526:
> > > > >>>>>>>>>>>>>>>>>>>>> ext4_free_inode:383
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> It looks like some fsck work (and
> > > > >>>>> replacing
> > > > >>>>>> the
> > > > >>>>>>>>> volume,
> > > > >>>>>>>>>>> if
> > > > >>>>>>>>>>>>> it
> > > > >>>>>>>>>>>>>>> fails)
> > > > >>>>>>>>>>>>>>>>>>>>> are required,
> > > > >>>>>>>>>>>>>>>>>>>>> but I'm not sure if I could run
> > > > >> something
> > > > >>>>> like
> > > > >>>>>>>>> `e2fsck
> > > > >>>>>>>>>>>> -p`,
> > > > >>>>>>>>>>>>>>> because
> > > > >>>>>>>>>>>>>>>>>>>>> I'm also not sure
> > > > >>>>>>>>>>>>>>>>>>>>> where does that machine exist or who's
> > > > >>>>>> managing
> > > > >>>>>>>> it.
> > > > >>>>>>>>>>>>>>>>>>>>> (I slightly thought it was running as a
> > > > >> VM
> > > > >>>>>> with
> > > > >>>>>>>>> QEMU on
> > > > >>>>>>>>>>>> some
> > > > >>>>>>>>>>>>>> EC2
> > > > >>>>>>>>>>>>>>>>>>>>> instance, but I couldn't find it)
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> Cos, Evans, Olaf
> > > > >>>>>>>>>>>>>>>>>>>>> Would you provide any suggestions?
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>
> > > >
> > >

Re: PPC CI server failure

Posted by Kengo Seki <se...@apache.org>.
> Let me know when the ppc64le CI/CD is going to get enabled to help us identify the failing components.

Sure! As a first step, I added the ppc64le configuration back to the
Docker-related jobs.

https://ci.bigtop.apache.org/view/Docker/job/Docker-Puppet-Trunk/
https://ci.bigtop.apache.org/view/Docker/job/Docker-Puppet-Trunk-pull/
https://ci.bigtop.apache.org/view/Docker/job/Docker-Toolchain-Trunk/
https://ci.bigtop.apache.org/view/Docker/job/Docker-Toolchain-Trunk-pull/

However, Docker-Puppet-Trunk failed, and only on CentOS 8, apparently because
of the PowerTools repository setting.

https://ci.bigtop.apache.org/view/Docker/job/Docker-Puppet-Trunk/24/

I'll keep investigating and let you know when there's any progress.

Kengo Seki <se...@apache.org>

On Mon, May 3, 2021 at 10:36 PM MrAsanjar . <as...@apache.org> wrote:
>
> Let me know when the ppc64le CI/CD is going to get enabled to help us
> identify the failing components.
>
> On Fri, Apr 16, 2021 at 8:00 PM Kengo Seki <se...@apache.org> wrote:
>
> > Sorry for my late response, I was quite busy this week...
> > Amir, thank you for recovering the ppc64le server! I've just enabled
> > it on Jenkins and it seems to be healthy. I'm going to work on
> > BIGTOP-3533.
> > Also thanks to Evans and Olaf for helping him.
> >
> > Kengo Seki <se...@apache.org>
> >
> > On Sat, Apr 17, 2021 at 3:50 AM Olaf Flebbe <of...@oflebbe.de> wrote:
> > >
> > > I already gave the public key to asanjar.
> > >
> > > Olaf
> > >
> > > > Am 16.04.2021 um 10:49 schrieb Evans Ye <ev...@apache.org>:
> > > >
> > > > Let me help. I was busy on a thing.
> > > >
> > > >
> > > > MrAsanjar . <as...@apache.org> 於 2021年4月15日 週四 下午10:30寫道:
> > > >
> > > >> In order to set up the new Jenkins slave for ppc64le (
> > > >> https://issues.apache.org/jira/browse/BIGTOP-3534) we need Jenkins
> > > >> master's
> > > >> public ssh key. Who can help me here?
> > > >>
> > > >> On Fri, Apr 2, 2021 at 4:00 PM MrAsanjar <af...@gmail.com> wrote:
> > > >>
> > > >>> I have verified the state of ppc64le VM, it is operational. Could we
> > > >>> enable the ppc64le build before OpenStack flag the VM as ideal again.
> > > >>>
> > > >>> On Thu, Apr 1, 2021 at 4:08 PM MrAsanjar <af...@gmail.com> wrote:
> > > >>>
> > > >>>> Hi lads
> > > >>>> I just got an email that IBM has reinstated the ppc64le VM.
> > > >>>>
> > > >>>>
> > > >>>> On Mon, Mar 29, 2021 at 12:05 PM Evans Ye <ev...@apache.org>
> > wrote:
> > > >>>>
> > > >>>>> Great news and thanks, Amir!
> > > >>>>>
> > > >>>>> Jun HE <ju...@apache.org> 於 2021年3月29日 週一 下午1:54寫道:
> > > >>>>>
> > > >>>>>> Awesome! Looking forward to its back to CI.
> > > >>>>>> Thanks a lot for helping on this, Asanjar!
> > > >>>>>>
> > > >>>>>> Regards,
> > > >>>>>>
> > > >>>>>> Jun
> > > >>>>>>
> > > >>>>>> MrAsanjar <af...@gmail.com> 于2021年3月29日周一 上午10:18写道:
> > > >>>>>>
> > > >>>>>>> Hi old friends :)
> > > >>>>>>> We should have a ppc64le VM back online sometime this week. I'll
> > > >>>>> keep you
> > > >>>>>>> all posted.
> > > >>>>>>>
> > > >>>>>>> On Thu, Nov 19, 2020 at 9:05 PM Evans Ye <ev...@apache.org>
> > > >> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi rbkrishn,
> > > >>>>>>>>
> > > >>>>>>>> Would you mind to comment whether those PPC servers for Bigtop
> > CI
> > > >>>>> can
> > > >>>>>> be
> > > >>>>>>>> brought up and unlock our release process?
> > > >>>>>>>> Thanks!
> > > >>>>>>>>
> > > >>>>>>>> Best,
> > > >>>>>>>> Evans
> > > >>>>>>>>
> > > >>>>>>>> Kengo Seki <se...@apache.org> 於 2020年11月18日 週三 上午7:26寫道:
> > > >>>>>>>>
> > > >>>>>>>>> Thank you for checking, Evans and Amir!
> > > >>>>>>>>>
> > > >>>>>>>>> Kengo Seki <se...@apache.org>
> > > >>>>>>>>>
> > > >>>>>>>>> On Wed, Nov 18, 2020 at 2:09 AM Evans Ye <ev...@apache.org>
> > > >>>>> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Thank you, Amir.
> > > >>>>>>>>>>
> > > >>>>>>>>>> MrAsanjar <af...@gmail.com> 於 2020年11月18日 週三 00:39 寫道:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Hi Evans, let me check with IBM again.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Mon, Nov 16, 2020 at 9:08 PM Evans Ye <
> > > >> evansye@apache.org
> > > >>>>>>
> > > >>>>>>> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Hi Amir,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> We're planning Bigtop 1.5 release and if we don't have
> > > >> the
> > > >>>>> CI
> > > >>>>>>> nodes
> > > >>>>>>>>> for
> > > >>>>>>>>>>>> PPC, we're not able to release 1.5 with PPC supported.
> > > >>>>>>>>>>>> Could you help to confirm again? Thanks!
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Best,
> > > >>>>>>>>>>>> Evans Ye
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> MrAsanjar <af...@gmail.com> 於 2020年9月17日 週四 下午8:56寫道:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> I have informed IBM management regarding the situation,
> > > >>>>>> waiting
> > > >>>>>>>>> for a
> > > >>>>>>>>>>>>> reply.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> On Thu, Sep 17, 2020 at 3:47 AM Evans Ye <
> > > >>>>> evansye@apache.org
> > > >>>>>>>
> > > >>>>>>>>> wrote:
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Ok. Thanks for doing this to get the ball rolling.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Kengo Seki <se...@apache.org> 於 2020年9月17日 週四 10:29
> > > >>>>> 寫道:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Thank you for your help, Amir!
> > > >>>>>>>>>>>>>>> It's just a heads-up, I temporarily disabled builds
> > > >>>>> for
> > > >>>>>> ppc
> > > >>>>>>>> in
> > > >>>>>>>>> the
> > > >>>>>>>>>>>>>>> following Jenkins jobs so that they can finish.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> * Docker-Puppet-Trunk
> > > >>>>>>>>>>>>>>> * Docker-Puppet-Trunk-pull
> > > >>>>>>>>>>>>>>> * Docker-Toolchain-Trunk
> > > >>>>>>>>>>>>>>> * Docker-Toolchain-Trunk-pull
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> * Bigtop-trunk-packages
> > > >>>>>>>>>>>>>>> * Bigtop-trunk-repos
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> * Remove-All-Docker-Containers-Except-Nexus
> > > >>>>>>>>>>>>>>> * Remove-Dangling-Docker-Images
> > > >>>>>>>>>>>>>>> * Remove-Inactive-Containers
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On Wed, Sep 16, 2020 at 7:35 PM Evans Ye <
> > > >>>>>>> evansye@apache.org
> > > >>>>>>>>>
> > > >>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Awesome! Nice to hear from you, buddy!
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> MrAsanjar <af...@gmail.com> 於 2020年9月16日 週三
> > > >>>>>> 上午3:54寫道:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Hi Evans,
> > > >>>>>>>>>>>>>>>>> Let me see what I can do. Give me 24 hr :)
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> On Tue, Sep 15, 2020 at 10:51 AM Evans Ye <
> > > >>>>>>>>> evansye@apache.org>
> > > >>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Yes. I think the action is correct. However
> > > >> [2]
> > > >>>>>> might
> > > >>>>>>>> be
> > > >>>>>>>>> a
> > > >>>>>>>>>>>>>> different
> > > >>>>>>>>>>>>>>>>> thing
> > > >>>>>>>>>>>>>>>>>> for PPC integration in Hadoop.
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Amir,
> > > >>>>>>>>>>>>>>>>>> Could you confirm?
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org> wrote on Mon, Sep 14, 2020 at 9:56 PM:
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> Thank you for the advice, Evans!
> > > >>>>>>>>>>>>>>>>>>> Let me confirm about "PPC machine owners". According to Amir's JIRA
> > > >>>>>>>>>>>>>>>>>>> issues [1][2] and the powered-by list on the OSU site [3], we're using
> > > >>>>>>>>>>>>>>>>>>> a VM hosted by OSU OSL, right?
> > > >>>>>>>>>>>>>>>>>>> If that's correct, I'm going to ask them for help via
> > > >>>>>>>>>>>>>>>>>>> powerdev-request@osuosl.org.
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> [1]: https://issues.apache.org/jira/browse/INFRA-11467?focusedCommentId=15300982&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15300982
> > > >>>>>>>>>>>>>>>>>>> [2]: https://issues.apache.org/jira/browse/INFRA-12014
> > > >>>>>>>>>>>>>>>>>>> [3]: https://osuosl.org/services/powerdev/current-projects/#foss-projects
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> On Mon, Sep 14, 2020 at 2:06 PM Evans Ye <evansye@apache.org> wrote:
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> I'd suggest reaching out to the PPC machine owners. Worst case is we can
> > > >>>>>>>>>>>>>>>>>>>> temporarily drop the PPC support to move the release forward.
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org> wrote on Mon, Sep 14, 2020 at 12:44:
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> Hi everyone,
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> Let me share information about the CI environment.
> > > >>>>>>>>>>>>>>>>>>>>> The worker node for ppc64le is currently offline, so I just killed all jobs
> > > >>>>>>>>>>>>>>>>>>>>> in the queue that were waiting for it to come back. Its status is as follows.
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> - According to the result of `who -b`, that machine seems to have been rebooted
> > > >>>>>>>>>>>>>>>>>>>>>  on 2020-09-11 for some reason (probably unexpectedly).
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> - According to the result of dmesg, the root volume was mounted
> > > >>>>>>>>>>>>>>>>>>>>>  in read-only mode because of a fsck failure.
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>  [   34.840681] EXT4-fs (vda1): Couldn't remount RDWR because of unprocessed orphan inode list.  Please umount/remount instead
> > > >>>>>>>>>>>>>>>>>>>>>  [   60.714110] cgroup: new mount options do not match the existing superblock, will be ignored
> > > >>>>>>>>>>>>>>>>>>>>>  [  316.385805] EXT4-fs (vda1): error count since last fsck: 9459
> > > >>>>>>>>>>>>>>>>>>>>>  [  316.385824] EXT4-fs (vda1): initial error at time 1540294049: ext4_validate_inode_bitmap:134
> > > >>>>>>>>>>>>>>>>>>>>>  [  316.385826] EXT4-fs (vda1): last error at time 1596881526: ext4_free_inode:383
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> It looks like some fsck work (and replacing the volume, if it fails) is required,
> > > >>>>>>>>>>>>>>>>>>>>> but I'm not sure if I could run something like `e2fsck -p`, because I'm also not sure
> > > >>>>>>>>>>>>>>>>>>>>> where that machine is or who's managing it.
> > > >>>>>>>>>>>>>>>>>>>>> (I vaguely thought it was running as a VM with QEMU on some EC2 instance, but I couldn't find it)
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>> Cos, Evans, Olaf
> > > >>>>>>>>>>>>>>>>>>>>> Would you provide any suggestions?
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>
> > >
> >

Re: PPC CI server failure

Posted by "MrAsanjar ." <as...@apache.org>.
Let me know when the ppc64le CI/CD is going to get enabled to help us
identify the failing components.

On Fri, Apr 16, 2021 at 8:00 PM Kengo Seki <se...@apache.org> wrote:

> Sorry for my late response, I was quite busy this week...
> Amir, thank you for recovering the ppc64le server! I've just enabled
> it on Jenkins and it seems to be healthy. I'm going to work on
> BIGTOP-3533.
> Also thanks to Evans and Olaf for helping him.
>
> Kengo Seki <se...@apache.org>
>
> On Sat, Apr 17, 2021 at 3:50 AM Olaf Flebbe <of...@oflebbe.de> wrote:
> >
> > I already gave the public key to asanjar.
> >
> > Olaf
> >
> > > > On 16.04.2021 at 10:49, Evans Ye <ev...@apache.org> wrote:
> > >
> > > Let me help. I was busy on a thing.
> > >
> > >
> > > > MrAsanjar . <as...@apache.org> wrote on Thu, Apr 15, 2021 at 10:30 PM:
> > > >
> > > >> In order to set up the new Jenkins slave for ppc64le
> > > >> (https://issues.apache.org/jira/browse/BIGTOP-3534), we need the
> > > >> Jenkins master's public SSH key. Who can help me here?
> > >>
> > >> On Fri, Apr 2, 2021 at 4:00 PM MrAsanjar <af...@gmail.com> wrote:
> > >>
> > > >>> I have verified the state of the ppc64le VM; it is operational. Could we
> > > >>> enable the ppc64le build before OpenStack flags the VM as idle again?
> > >>>
> > >>> On Thu, Apr 1, 2021 at 4:08 PM MrAsanjar <af...@gmail.com> wrote:
> > >>>
> > >>>> Hi lads
> > >>>> I just got an email that IBM has reinstated the ppc64le VM.
> > >>>>
> > >>>>
> > > >>>> On Mon, Mar 29, 2021 at 12:05 PM Evans Ye <ev...@apache.org> wrote:
> > > >>>>
> > > >>>>> Great news and thanks, Amir!
> > > >>>>>
> > > >>>>> Jun HE <ju...@apache.org> wrote on Mon, Mar 29, 2021 at 1:54 PM:
> > > >>>>>
> > > >>>>>> Awesome! Looking forward to having it back in CI.
> > > >>>>>> Thanks a lot for helping with this, Asanjar!
> > >>>>>>
> > >>>>>> Regards,
> > >>>>>>
> > >>>>>> Jun
> > >>>>>>
> > > >>>>>> MrAsanjar <af...@gmail.com> wrote on Mon, Mar 29, 2021 at 10:18 AM:
> > > >>>>>>
> > > >>>>>>> Hi old friends :)
> > > >>>>>>> We should have a ppc64le VM back online sometime this week. I'll keep you
> > > >>>>>>> all posted.
> > > >>>>>>>
> > > >>>>>>> On Thu, Nov 19, 2020 at 9:05 PM Evans Ye <ev...@apache.org> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi rbkrishn,
> > > >>>>>>>>
> > > >>>>>>>> Would you mind commenting on whether those PPC servers for Bigtop CI can be
> > > >>>>>>>> brought up to unblock our release process?
> > >>>>>>>> Thanks!
> > >>>>>>>>
> > >>>>>>>> Best,
> > >>>>>>>> Evans
> > >>>>>>>>
> > >>>>>>>> Kengo Seki <se...@apache.org> 於 2020年11月18日 週三 上午7:26寫道:
> > >>>>>>>>
> > >>>>>>>>> Thank you for checking, Evans and Amir!
> > >>>>>>>>>
> > >>>>>>>>> Kengo Seki <se...@apache.org>
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Nov 18, 2020 at 2:09 AM Evans Ye <ev...@apache.org>
> > >>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> Thank you, Amir.
> > >>>>>>>>>>
> > >>>>>>>>>> MrAsanjar <af...@gmail.com> 於 2020年11月18日 週三 00:39 寫道:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi Evans, let me check with IBM again.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Mon, Nov 16, 2020 at 9:08 PM Evans Ye <
> > >> evansye@apache.org
> > >>>>>>
> > >>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Hi Amir,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> We're planning Bigtop 1.5 release and if we don't have
> > >> the
> > >>>>> CI
> > >>>>>>> nodes
> > >>>>>>>>> for
> > >>>>>>>>>>>> PPC, we're not able to release 1.5 with PPC supported.
> > >>>>>>>>>>>> Could you help to confirm again? Thanks!
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best,
> > >>>>>>>>>>>> Evans Ye
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> MrAsanjar <af...@gmail.com> 於 2020年9月17日 週四 下午8:56寫道:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> I have informed IBM management regarding the situation,
> > >>>>>> waiting
> > >>>>>>>>> for a
> > >>>>>>>>>>>>> reply.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Thu, Sep 17, 2020 at 3:47 AM Evans Ye <
> > >>>>> evansye@apache.org
> > >>>>>>>
> > >>>>>>>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Ok. Thanks for doing this to get the ball rolling.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Kengo Seki <se...@apache.org> 於 2020年9月17日 週四 10:29
> > >>>>> 寫道:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Thank you for your help, Amir!
> > >>>>>>>>>>>>>>> It's just a heads-up, I temporarily disabled builds
> > >>>>> for
> > >>>>>> ppc
> > >>>>>>>> in
> > >>>>>>>>> the
> > >>>>>>>>>>>>>>> following Jenkins jobs so that they can finish.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> * Docker-Puppet-Trunk
> > >>>>>>>>>>>>>>> * Docker-Puppet-Trunk-pull
> > >>>>>>>>>>>>>>> * Docker-Toolchain-Trunk
> > >>>>>>>>>>>>>>> * Docker-Toolchain-Trunk-pull
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> * Bigtop-trunk-packages
> > >>>>>>>>>>>>>>> * Bigtop-trunk-repos
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> * Remove-All-Docker-Containers-Except-Nexus
> > >>>>>>>>>>>>>>> * Remove-Dangling-Docker-Images
> > >>>>>>>>>>>>>>> * Remove-Inactive-Containers
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Wed, Sep 16, 2020 at 7:35 PM Evans Ye <
> > >>>>>>> evansye@apache.org
> > >>>>>>>>>
> > >>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Awesome! Nice to hear from you, buddy!
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> MrAsanjar <af...@gmail.com> 於 2020年9月16日 週三
> > >>>>>> 上午3:54寫道:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Hi Evans,
> > >>>>>>>>>>>>>>>>> Let me see what I can do. Give me 24 hr :)
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Tue, Sep 15, 2020 at 10:51 AM Evans Ye <
> > >>>>>>>>> evansye@apache.org>
> > >>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Yes. I think the action is correct. However
> > >> [2]
> > >>>>>> might
> > >>>>>>>> be
> > >>>>>>>>> a
> > >>>>>>>>>>>>>> different
> > >>>>>>>>>>>>>>>>> thing
> > >>>>>>>>>>>>>>>>>> for PPC integration in Hadoop.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Amir,
> > >>>>>>>>>>>>>>>>>> Could you confirm?
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org> 於 2020年9月14日
> > >> 週一
> > >>>>>>>> 下午9:56寫道:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Thank you for the advice, Evans!
> > >>>>>>>>>>>>>>>>>>> Let me confirm about "PPC machine owners".
> > >>>>>> According
> > >>>>>>>> to
> > >>>>>>>>>>> Amir's
> > >>>>>>>>>>>>>> JIRA
> > >>>>>>>>>>>>>>>>>>> issues [1][2] and the powered-by list in the
> > >>>>> OSU
> > >>>>>>> site
> > >>>>>>>>> [3],
> > >>>>>>>>>>>> we're
> > >>>>>>>>>>>>>>> using
> > >>>>>>>>>>>>>>>>>>> a VM hosted by OSU OSL, right?
> > >>>>>>>>>>>>>>>>>>> If it's correct, I'm going to ask them for
> > >>>>> help
> > >>>>>> via
> > >>>>>>>>>>>>>>>>>>> powerdev-request@osuosl.org.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> [1]:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>
> https://issues.apache.org/jira/browse/INFRA-11467?focusedCommentId=15300982&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15300982
> > >>>>>>>>>>>>>>>>>>> [2]:
> > >>>>>>>> https://issues.apache.org/jira/browse/INFRA-12014
> > >>>>>>>>>>>>>>>>>>> [3]:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>
> > >>>>>
> https://osuosl.org/services/powerdev/current-projects/#foss-projects
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On Mon, Sep 14, 2020 at 2:06 PM Evans Ye <
> > >>>>>>>>>>> evansye@apache.org>
> > >>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> I'd suggest to reach out to PPC machine
> > >>>>> owners.
> > >>>>>>>> Worst
> > >>>>>>>>> case
> > >>>>>>>>>>>> Is
> > >>>>>>>>>>>>> we
> > >>>>>>>>>>>>>>> can
> > >>>>>>>>>>>>>>>>>>>> temporary  drop the PPC support to move
> > >> the
> > >>>>>>> release
> > >>>>>>>>>>> forward.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org> 於
> > >>>>> 2020年9月14日 週一
> > >>>>>>>> 12:44
> > >>>>>>>>> 寫道:
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Hi everyone,
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Let me share information about the CI
> > >>>>>>> environment.
> > >>>>>>>>>>>>>>>>>>>>> The worker node for ppc64le is currently
> > >>>>>>> offlined,
> > >>>>>>>>> so I
> > >>>>>>>>>>>> just
> > >>>>>>>>>>>>>>> killed
> > >>>>>>>>>>>>>>>>>>> all
> > >>>>>>>>>>>>>>>>>>>>> jobs
> > >>>>>>>>>>>>>>>>>>>>> in the queue waiting for it gets back.
> > >> Its
> > >>>>>>> status
> > >>>>>>>>> is as
> > >>>>>>>>>>>>>> follows.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> - According to the result of `who -b`,
> > >>>>> that
> > >>>>>>>> machine
> > >>>>>>>>>>> seems
> > >>>>>>>>>>>> to
> > >>>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>>> rebooted
> > >>>>>>>>>>>>>>>>>>>>>  on 2020-09-11 for some reason
> > >> (probably
> > >>>>>>>>> unexpectedly).
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> - According to the result of dmesg, the
> > >>>>> root
> > >>>>>>>> volume
> > >>>>>>>>> was
> > >>>>>>>>>>>>>> mounted
> > >>>>>>>>>>>>>>>>>>>>>  in read-only mode because of a fsck
> > >>>>> failure.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>  [   34.840681] EXT4-fs (vda1):
> > >> Couldn't
> > >>>>>>> remount
> > >>>>>>>>> RDWR
> > >>>>>>>>>>>>> because
> > >>>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>>>>>>> unprocessed orphan inode list.  Please
> > >>>>>>>>> umount/remount
> > >>>>>>>>>>>>> instead
> > >>>>>>>>>>>>>>>>>>>>>  [   60.714110] cgroup: new mount
> > >>>>> options do
> > >>>>>>> not
> > >>>>>>>>> match
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>>>> existing
> > >>>>>>>>>>>>>>>>>>>>> superblock, will be ignored
> > >>>>>>>>>>>>>>>>>>>>>  [  316.385805] EXT4-fs (vda1): error
> > >>>>> count
> > >>>>>>> since
> > >>>>>>>>> last
> > >>>>>>>>>>>>> fsck:
> > >>>>>>>>>>>>>>> 9459
> > >>>>>>>>>>>>>>>>>>>>>  [  316.385824] EXT4-fs (vda1): initial
> > >>>>> error
> > >>>>>>> at
> > >>>>>>>>> time
> > >>>>>>>>>>>>>>> 1540294049:
> > >>>>>>>>>>>>>>>>>>>>> ext4_validate_inode_bitmap:134
> > >>>>>>>>>>>>>>>>>>>>>  [  316.385826] EXT4-fs (vda1): last
> > >>>>> error at
> > >>>>>>>> time
> > >>>>>>>>>>>>>> 1596881526:
> > >>>>>>>>>>>>>>>>>>>>> ext4_free_inode:383
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> It looks like some fsck work (and
> > >>>>> replacing
> > >>>>>> the
> > >>>>>>>>> volume,
> > >>>>>>>>>>> if
> > >>>>>>>>>>>>> it
> > >>>>>>>>>>>>>>> fails)
> > >>>>>>>>>>>>>>>>>>>>> are required,
> > >>>>>>>>>>>>>>>>>>>>> but I'm not sure if I could run
> > >> something
> > >>>>> like
> > >>>>>>>>> `e2fsck
> > >>>>>>>>>>>> -p`,
> > >>>>>>>>>>>>>>> because
> > >>>>>>>>>>>>>>>>>>>>> I'm also not sure
> > >>>>>>>>>>>>>>>>>>>>> where does that machine exist or who's
> > >>>>>> managing
> > >>>>>>>> it.
> > >>>>>>>>>>>>>>>>>>>>> (I slightly thought it was running as a
> > >> VM
> > >>>>>> with
> > >>>>>>>>> QEMU on
> > >>>>>>>>>>>> some
> > >>>>>>>>>>>>>> EC2
> > >>>>>>>>>>>>>>>>>>>>> instance, but I couldn't find it)
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Cos, Evans, Olaf
> > >>>>>>>>>>>>>>>>>>>>> Would you provide any suggestions?
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>
> >
>

Re: PPC CI server failure

Posted by Kengo Seki <se...@apache.org>.
Sorry for my late response, I was quite busy this week...
Amir, thank you for recovering the ppc64le server! I've just enabled
it on Jenkins and it seems to be healthy. I'm going to work on
BIGTOP-3533.
Also thanks to Evans and Olaf for helping him.

Kengo Seki <se...@apache.org>

On Sat, Apr 17, 2021 at 3:50 AM Olaf Flebbe <of...@oflebbe.de> wrote:
>
> I already gave the public key to asanjar.
>
> Olaf
>
> > On 16.04.2021 at 10:49, Evans Ye <ev...@apache.org> wrote:
> >
> > Let me help. I was busy on a thing.
> >
> >
> > > MrAsanjar . <as...@apache.org> wrote on Thu, Apr 15, 2021 at 10:30 PM:
> > >
> >> In order to set up the new Jenkins slave for ppc64le
> >> (https://issues.apache.org/jira/browse/BIGTOP-3534), we need the
> >> Jenkins master's public SSH key. Who can help me here?
> >>
> >> On Fri, Apr 2, 2021 at 4:00 PM MrAsanjar <af...@gmail.com> wrote:
> >>
> >>> I have verified the state of the ppc64le VM; it is operational. Could we
> >>> enable the ppc64le build before OpenStack flags the VM as idle again?
> >>>
> >>> On Thu, Apr 1, 2021 at 4:08 PM MrAsanjar <af...@gmail.com> wrote:
> >>>
> >>>> Hi lads
> >>>> I just got an email that IBM has reinstated the ppc64le VM.
> >>>>
> >>>>
> >>>> On Mon, Mar 29, 2021 at 12:05 PM Evans Ye <ev...@apache.org> wrote:
> >>>>
> >>>>> Great news and thanks, Amir!
> >>>>>
> >>>>> Jun HE <ju...@apache.org> wrote on Mon, Mar 29, 2021 at 1:54 PM:
> >>>>>
> >>>>>> Awesome! Looking forward to having it back in CI.
> >>>>>> Thanks a lot for helping with this, Asanjar!
> >>>>>>
> >>>>>> Regards,
> >>>>>>
> >>>>>> Jun
> >>>>>>
> >>>>>> MrAsanjar <af...@gmail.com> wrote on Mon, Mar 29, 2021 at 10:18 AM:
> >>>>>>
> >>>>>>> Hi old friends :)
> >>>>>>> We should have a ppc64le VM back online sometime this week. I'll keep you
> >>>>>>> all posted.
> >>>>>>>
> >>>>>>> On Thu, Nov 19, 2020 at 9:05 PM Evans Ye <ev...@apache.org> wrote:
> >>>>>>>
> >>>>>>>> Hi rbkrishn,
> >>>>>>>>
> >>>>>>>> Would you mind commenting on whether those PPC servers for Bigtop CI can be
> >>>>>>>> brought up to unblock our release process?
> >>>>>>>> Thanks!
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Evans
> >>>>>>>>
> >>>>>>>> Kengo Seki <se...@apache.org> wrote on Wed, Nov 18, 2020 at 7:26 AM:
> >>>>>>>>
> >>>>>>>>> Thank you for checking, Evans and Amir!
> >>>>>>>>>
> >>>>>>>>> Kengo Seki <se...@apache.org>
> >>>>>>>>>
> >>>>>>>>> On Wed, Nov 18, 2020 at 2:09 AM Evans Ye <ev...@apache.org> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Thank you, Amir.
> >>>>>>>>>>
> >>>>>>>>>> MrAsanjar <af...@gmail.com> wrote on Wed, Nov 18, 2020 at 00:39:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Evans, let me check with IBM again.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Nov 16, 2020 at 9:08 PM Evans Ye <evansye@apache.org> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Amir,
> >>>>>>>>>>>>
> >>>>>>>>>>>> We're planning the Bigtop 1.5 release, and if we don't have the CI nodes for
> >>>>>>>>>>>> PPC, we won't be able to release 1.5 with PPC support.
> >>>>>>>>>>>> Could you help confirm again? Thanks!
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best,
> >>>>>>>>>>>> Evans Ye
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> MrAsanjar <af...@gmail.com> wrote on Thu, Sep 17, 2020 at 8:56 PM:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> I have informed IBM management of the situation and am waiting for a
> >>>>>>>>>>>>> reply.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Thu, Sep 17, 2020 at 3:47 AM Evans Ye <evansye@apache.org> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Ok. Thanks for doing this to get the ball rolling.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Kengo Seki <se...@apache.org> wrote on Thu, Sep 17, 2020 at 10:29:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thank you for your help, Amir!
> >>>>>>>>>>>>>>> Just a heads-up: I temporarily disabled builds for ppc in the
> >>>>>>>>>>>>>>> following Jenkins jobs so that they can finish.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> * Docker-Puppet-Trunk
> >>>>>>>>>>>>>>> * Docker-Puppet-Trunk-pull
> >>>>>>>>>>>>>>> * Docker-Toolchain-Trunk
> >>>>>>>>>>>>>>> * Docker-Toolchain-Trunk-pull
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> * Bigtop-trunk-packages
> >>>>>>>>>>>>>>> * Bigtop-trunk-repos
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> * Remove-All-Docker-Containers-Except-Nexus
> >>>>>>>>>>>>>>> * Remove-Dangling-Docker-Images
> >>>>>>>>>>>>>>> * Remove-Inactive-Containers
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Wed, Sep 16, 2020 at 7:35 PM Evans Ye <evansye@apache.org> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Awesome! Nice to hear from you, buddy!
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> MrAsanjar <af...@gmail.com> wrote on Wed, Sep 16, 2020 at 3:54 AM:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hi Evans,
> >>>>>>>>>>>>>>>>> Let me see what I can do. Give me 24 hr :)
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Tue, Sep 15, 2020 at 10:51 AM Evans Ye <evansye@apache.org> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Yes. I think the action is correct. However, [2] might be a
> >>>>>>>>>>>>>>>>>> different thing for PPC integration in Hadoop.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Amir,
> >>>>>>>>>>>>>>>>>> Could you confirm?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org> wrote on Mon, Sep 14, 2020 at 9:56 PM:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Thank you for the advice, Evans!
> >>>>>>>>>>>>>>>>>>> Let me confirm about "PPC machine owners". According to Amir's JIRA
> >>>>>>>>>>>>>>>>>>> issues [1][2] and the powered-by list on the OSU site [3], we're using
> >>>>>>>>>>>>>>>>>>> a VM hosted by OSU OSL, right?
> >>>>>>>>>>>>>>>>>>> If that's correct, I'm going to ask them for help via
> >>>>>>>>>>>>>>>>>>> powerdev-request@osuosl.org.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> [1]: https://issues.apache.org/jira/browse/INFRA-11467?focusedCommentId=15300982&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15300982
> >>>>>>>>>>>>>>>>>>> [2]: https://issues.apache.org/jira/browse/INFRA-12014
> >>>>>>>>>>>>>>>>>>> [3]: https://osuosl.org/services/powerdev/current-projects/#foss-projects
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Mon, Sep 14, 2020 at 2:06 PM Evans Ye <evansye@apache.org> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I'd suggest reaching out to the PPC machine owners. Worst case is we can
> >>>>>>>>>>>>>>>>>>>> temporarily drop the PPC support to move the release forward.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org> wrote on Mon, Sep 14, 2020 at 12:44:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Let me share information about the CI environment.
> >>>>>>>>>>>>>>>>>>>>> The worker node for ppc64le is currently offline, so I just killed all jobs
> >>>>>>>>>>>>>>>>>>>>> in the queue that were waiting for it to come back. Its status is as follows.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> - According to the result of `who -b`, that machine seems to have been rebooted
> >>>>>>>>>>>>>>>>>>>>>  on 2020-09-11 for some reason (probably unexpectedly).
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> - According to the result of dmesg, the root volume was mounted
> >>>>>>>>>>>>>>>>>>>>>  in read-only mode because of a fsck failure.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>  [   34.840681] EXT4-fs (vda1): Couldn't remount RDWR because of unprocessed orphan inode list.  Please umount/remount instead
> >>>>>>>>>>>>>>>>>>>>>  [   60.714110] cgroup: new mount options do not match the existing superblock, will be ignored
> >>>>>>>>>>>>>>>>>>>>>  [  316.385805] EXT4-fs (vda1): error count since last fsck: 9459
> >>>>>>>>>>>>>>>>>>>>>  [  316.385824] EXT4-fs (vda1): initial error at time 1540294049: ext4_validate_inode_bitmap:134
> >>>>>>>>>>>>>>>>>>>>>  [  316.385826] EXT4-fs (vda1): last error at time 1596881526: ext4_free_inode:383
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> It looks like some fsck work (and replacing the volume, if it fails) is required,
> >>>>>>>>>>>>>>>>>>>>> but I'm not sure if I could run something like `e2fsck -p`, because I'm also not sure
> >>>>>>>>>>>>>>>>>>>>> where that machine is or who's managing it.
> >>>>>>>>>>>>>>>>>>>>> (I vaguely thought it was running as a VM with QEMU on some EC2 instance, but I couldn't find it)
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Cos, Evans, Olaf
> >>>>>>>>>>>>>>>>>>>>> Would you provide any suggestions?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>
>

Re: PPC CI server failure

Posted by Olaf Flebbe <of...@oflebbe.de>.
I already gave the public key to asanjar. 

Olaf

> On 16.04.2021 at 10:49, Evans Ye <ev...@apache.org> wrote:
> 
> Let me help. I was busy on a thing.
> 
> 
> MrAsanjar . <as...@apache.org> wrote on Thu, Apr 15, 2021 at 10:30 PM:
>
>> In order to set up the new Jenkins slave for ppc64le
>> (https://issues.apache.org/jira/browse/BIGTOP-3534), we need the
>> Jenkins master's public SSH key. Who can help me here?
>> 
>> On Fri, Apr 2, 2021 at 4:00 PM MrAsanjar <af...@gmail.com> wrote:
>> 
>>> I have verified the state of the ppc64le VM; it is operational. Could we
>>> enable the ppc64le build before OpenStack flags the VM as idle again?
>>> 
>>> On Thu, Apr 1, 2021 at 4:08 PM MrAsanjar <af...@gmail.com> wrote:
>>> 
>>>> Hi lads
>>>> I just got an email that IBM has reinstated the ppc64le VM.
>>>> 
>>>> 
>>>> On Mon, Mar 29, 2021 at 12:05 PM Evans Ye <ev...@apache.org> wrote:
>>>> 
>>>>> Great news and thanks, Amir!
>>>>> 
>>>>> Jun HE <ju...@apache.org> wrote on Mon, Mar 29, 2021 at 1:54 PM:
>>>>>
>>>>>> Awesome! Looking forward to having it back in CI.
>>>>>> Thanks a lot for helping with this, Asanjar!
>>>>>> 
>>>>>> Regards,
>>>>>> 
>>>>>> Jun
>>>>>> 
>>>>>> MrAsanjar <af...@gmail.com> wrote on Mon, Mar 29, 2021 at 10:18 AM:
>>>>>>
>>>>>>> Hi old friends :)
>>>>>>> We should have a ppc64le VM back online sometime this week. I'll keep you
>>>>>>> all posted.
>>>>>>>
>>>>>>> On Thu, Nov 19, 2020 at 9:05 PM Evans Ye <ev...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi rbkrishn,
>>>>>>>>
>>>>>>>> Would you mind commenting on whether those PPC servers for Bigtop CI can be
>>>>>>>> brought up to unblock our release process?
>>>>>>>> Thanks!
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Evans
>>>>>>>> 
>>>>>>>> Kengo Seki <se...@apache.org> wrote on Wed, Nov 18, 2020 at 7:26 AM:
>>>>>>>>
>>>>>>>>> Thank you for checking, Evans and Amir!
>>>>>>>>>
>>>>>>>>> Kengo Seki <se...@apache.org>
>>>>>>>>>
>>>>>>>>> On Wed, Nov 18, 2020 at 2:09 AM Evans Ye <ev...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>> Thank you, Amir.
>>>>>>>>>>
>>>>>>>>>> MrAsanjar <af...@gmail.com> wrote on Wed, Nov 18, 2020 at 00:39:
>>>>>>>>>>
>>>>>>>>>>> Hi Evans, let me check with IBM again.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 16, 2020 at 9:08 PM Evans Ye <evansye@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Amir,
>>>>>>>>>>>>
>>>>>>>>>>>> We're planning the Bigtop 1.5 release, and if we don't have the CI nodes for
>>>>>>>>>>>> PPC, we won't be able to release 1.5 with PPC support.
>>>>>>>>>>>> Could you help confirm again? Thanks!
>>>>>>>>>>>> 
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Evans Ye
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> MrAsanjar <af...@gmail.com> wrote on Thu, Sep 17, 2020 at 8:56 PM:
>>>>>>>>>>>>
>>>>>>>>>>>>> I have informed IBM management of the situation and am waiting for a
>>>>>>>>>>>>> reply.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Sep 17, 2020 at 3:47 AM Evans Ye <evansye@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ok. Thanks for doing this to get the ball rolling.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kengo Seki <se...@apache.org> wrote on Thu, Sep 17, 2020 at 10:29:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank you for your help, Amir!
>>>>>>>>>>>>>>> Just a heads-up: I temporarily disabled builds for ppc in the
>>>>>>>>>>>>>>> following Jenkins jobs so that they can finish.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> * Docker-Puppet-Trunk
>>>>>>>>>>>>>>> * Docker-Puppet-Trunk-pull
>>>>>>>>>>>>>>> * Docker-Toolchain-Trunk
>>>>>>>>>>>>>>> * Docker-Toolchain-Trunk-pull
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> * Bigtop-trunk-packages
>>>>>>>>>>>>>>> * Bigtop-trunk-repos
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> * Remove-All-Docker-Containers-Except-Nexus
>>>>>>>>>>>>>>> * Remove-Dangling-Docker-Images
>>>>>>>>>>>>>>> * Remove-Inactive-Containers
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Wed, Sep 16, 2020 at 7:35 PM Evans Ye <evansye@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Awesome! Nice to hear from you, buddy!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> MrAsanjar <af...@gmail.com> wrote on Wed, Sep 16, 2020 at 3:54 AM:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Evans,
>>>>>>>>>>>>>>>>> Let me see what I can do. Give me 24 hr :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Sep 15, 2020 at 10:51 AM Evans Ye <evansye@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yes. I think the action is correct. However, [2] might be a
>>>>>>>>>>>>>>>>>> different thing for PPC integration in Hadoop.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Amir,
>>>>>>>>>>>>>>>>>> Could you confirm?
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Mon, Sep 14, 2020 at 9:56 PM Kengo Seki <se...@apache.org> wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Thank you for the advice, Evans!
>>>>>>>>>>>>>>>>>>> Let me confirm about "PPC machine owners".
>>>>>> According
>>>>>>>> to
>>>>>>>>>>> Amir's
>>>>>>>>>>>>>> JIRA
>>>>>>>>>>>>>>>>>>> issues [1][2] and the powered-by list in the
>>>>> OSU
>>>>>>> site
>>>>>>>>> [3],
>>>>>>>>>>>> we're
>>>>>>>>>>>>>>> using
>>>>>>>>>>>>>>>>>>> a VM hosted by OSU OSL, right?
>>>>>>>>>>>>>>>>>>> If it's correct, I'm going to ask them for
>>>>> help
>>>>>> via
>>>>>>>>>>>>>>>>>>> powerdev-request@osuosl.org.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> [1]:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>> https://issues.apache.org/jira/browse/INFRA-11467?focusedCommentId=15300982&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15300982
>>>>>>>>>>>>>>>>>>> [2]:
>>>>>>>> https://issues.apache.org/jira/browse/INFRA-12014
>>>>>>>>>>>>>>>>>>> [3]:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>> 
>>>>> https://osuosl.org/services/powerdev/current-projects/#foss-projects
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On Mon, Sep 14, 2020 at 2:06 PM Evans Ye <
>>>>>>>>>>> evansye@apache.org>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> I'd suggest to reach out to PPC machine
>>>>> owners.
>>>>>>>> Worst
>>>>>>>>> case
>>>>>>>>>>>> Is
>>>>>>>>>>>>> we
>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>>>> temporary  drop the PPC support to move
>> the
>>>>>>> release
>>>>>>>>>>> forward.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On Mon, Sep 14, 2020 at 12:44 PM Kengo Seki <se...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Let me share information about the CI
>>>>>>> environment.
>>>>>>>>>>>>>>>>>>>>> The worker node for ppc64le is currently
>>>>>>> offlined,
>>>>>>>>> so I
>>>>>>>>>>>> just
>>>>>>>>>>>>>>> killed
>>>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>>>>> jobs
>>>>>>>>>>>>>>>>>>>>> in the queue waiting for it gets back.
>> Its
>>>>>>> status
>>>>>>>>> is as
>>>>>>>>>>>>>> follows.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> - According to the result of `who -b`,
>>>>> that
>>>>>>>> machine
>>>>>>>>>>> seems
>>>>>>>>>>>> to
>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>> rebooted
>>>>>>>>>>>>>>>>>>>>>  on 2020-09-11 for some reason
>> (probably
>>>>>>>>> unexpectedly).
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> - According to the result of dmesg, the
>>>>> root
>>>>>>>> volume
>>>>>>>>> was
>>>>>>>>>>>>>> mounted
>>>>>>>>>>>>>>>>>>>>>  in read-only mode because of a fsck
>>>>> failure.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>  [   34.840681] EXT4-fs (vda1):
>> Couldn't
>>>>>>> remount
>>>>>>>>> RDWR
>>>>>>>>>>>>> because
>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>> unprocessed orphan inode list.  Please
>>>>>>>>> umount/remount
>>>>>>>>>>>>> instead
>>>>>>>>>>>>>>>>>>>>>  [   60.714110] cgroup: new mount
>>>>> options do
>>>>>>> not
>>>>>>>>> match
>>>>>>>>>>>> the
>>>>>>>>>>>>>>> existing
>>>>>>>>>>>>>>>>>>>>> superblock, will be ignored
>>>>>>>>>>>>>>>>>>>>>  [  316.385805] EXT4-fs (vda1): error
>>>>> count
>>>>>>> since
>>>>>>>>> last
>>>>>>>>>>>>> fsck:
>>>>>>>>>>>>>>> 9459
>>>>>>>>>>>>>>>>>>>>>  [  316.385824] EXT4-fs (vda1): initial
>>>>> error
>>>>>>> at
>>>>>>>>> time
>>>>>>>>>>>>>>> 1540294049:
>>>>>>>>>>>>>>>>>>>>> ext4_validate_inode_bitmap:134
>>>>>>>>>>>>>>>>>>>>>  [  316.385826] EXT4-fs (vda1): last
>>>>> error at
>>>>>>>> time
>>>>>>>>>>>>>> 1596881526:
>>>>>>>>>>>>>>>>>>>>> ext4_free_inode:383
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> It looks like some fsck work (and
>>>>> replacing
>>>>>> the
>>>>>>>>> volume,
>>>>>>>>>>> if
>>>>>>>>>>>>> it
>>>>>>>>>>>>>>> fails)
>>>>>>>>>>>>>>>>>>>>> are required,
>>>>>>>>>>>>>>>>>>>>> but I'm not sure if I could run
>> something
>>>>> like
>>>>>>>>> `e2fsck
>>>>>>>>>>>> -p`,
>>>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>>>> I'm also not sure
>>>>>>>>>>>>>>>>>>>>> where does that machine exist or who's
>>>>>> managing
>>>>>>>> it.
>>>>>>>>>>>>>>>>>>>>> (I slightly thought it was running as a
>> VM
>>>>>> with
>>>>>>>>> QEMU on
>>>>>>>>>>>> some
>>>>>>>>>>>>>> EC2
>>>>>>>>>>>>>>>>>>>>> instance, but I couldn't find it)
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Cos, Evans, Olaf
>>>>>>>>>>>>>>>>>>>>> Would you provide any suggestions?
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Kengo Seki <se...@apache.org>
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> 


Re: PPC CI server failure

Posted by Evans Ye <ev...@apache.org>.
Let me help. I was busy with something else.


On Thu, Apr 15, 2021 at 10:30 PM MrAsanjar . <as...@apache.org> wrote:

> To set up the new Jenkins slave for ppc64le
> (https://issues.apache.org/jira/browse/BIGTOP-3534), we need the Jenkins
> master's public SSH key. Who can help me here?

Re: PPC CI server failure

Posted by "MrAsanjar ." <as...@apache.org>.
To set up the new Jenkins slave for ppc64le
(https://issues.apache.org/jira/browse/BIGTOP-3534), we need the Jenkins master's
public SSH key. Who can help me here?
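
For context, here is a minimal sketch of what the agent-side step could look like once the key is available, assuming the master launches the agent over SSH. The thread does not say which account or path is used on the ppc64le VM, so the `authorize_key` helper, the key string, and the directory below are hypothetical placeholders, not Bigtop's actual setup.

```python
import tempfile
from pathlib import Path


def authorize_key(public_key: str, ssh_dir: Path) -> bool:
    """Append public_key to ssh_dir/authorized_keys unless it is already listed.

    Returns True if the key was added, False if it was already present.
    """
    key = public_key.strip()
    # sshd typically expects ~/.ssh to be accessible by its owner only.
    ssh_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    auth_file = ssh_dir / "authorized_keys"

    text = auth_file.read_text() if auth_file.exists() else ""
    if key in (line.strip() for line in text.splitlines()):
        return False

    with auth_file.open("a") as f:
        # Start on a fresh line if the existing file lacks a trailing newline.
        if text and not text.endswith("\n"):
            f.write("\n")
        f.write(key + "\n")
    auth_file.chmod(0o600)
    return True


if __name__ == "__main__":
    # Demo against a throwaway directory; the key below is a placeholder,
    # not a real Jenkins master key.
    with tempfile.TemporaryDirectory() as tmp:
        demo_dir = Path(tmp) / ".ssh"
        placeholder = "ssh-ed25519 AAAAexample jenkins@master"
        print(authorize_key(placeholder, demo_dir))
        print(authorize_key(placeholder, demo_dir))
```

The helper is idempotent, so re-running it during agent provisioning would not duplicate the entry.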

On Fri, Apr 2, 2021 at 4:00 PM MrAsanjar <af...@gmail.com> wrote:

> I have verified the state of the ppc64le VM; it is operational. Could we
> enable the ppc64le build before OpenStack flags the VM as idle again?
>
> On Thu, Apr 1, 2021 at 4:08 PM MrAsanjar <af...@gmail.com> wrote:
>
>> Hi lads
>> I just got an email that IBM has reinstated the ppc64le VM.
>>
>>
>> On Mon, Mar 29, 2021 at 12:05 PM Evans Ye <ev...@apache.org> wrote:
>>
>>> Great news and thanks, Amir!
>>>
>>> On Mon, Mar 29, 2021 at 1:54 PM Jun HE <ju...@apache.org> wrote:
>>>
>>> > Awesome! Looking forward to having it back in CI.
>>> > Thanks a lot for helping on this, Asanjar!
>>> >
>>> > Regards,
>>> >
>>> > Jun
>>> >
>>> > On Mon, Mar 29, 2021 at 10:18 AM MrAsanjar <af...@gmail.com> wrote:
>>> >
>>> > > Hi old friends :)
>>> > > We should have a ppc64le VM back online sometime this week. I'll keep you
>>> > > all posted.
>>> > >
>>> > > On Thu, Nov 19, 2020 at 9:05 PM Evans Ye <ev...@apache.org> wrote:
>>> > >
>>> > > > Hi rbkrishn,
>>> > > >
>>> > > > Would you mind commenting on whether those PPC servers for Bigtop CI can be
>>> > > > brought up to unblock our release process?
>>> > > > Thanks!
>>> > > >
>>> > > > Best,
>>> > > > Evans
>>> > > >
>>> > > > On Wed, Nov 18, 2020 at 7:26 AM Kengo Seki <se...@apache.org> wrote:
>>> > > >
>>> > > > > Thank you for checking, Evans and Amir!
>>> > > > >
>>> > > > > Kengo Seki <se...@apache.org>
>>> > > > >
>>> > > > > On Wed, Nov 18, 2020 at 2:09 AM Evans Ye <ev...@apache.org> wrote:
>>> > > > > >
>>> > > > > > Thank you, Amir.
>>> > > > > >
>>> > > > > > On Wed, Nov 18, 2020 at 12:39 AM MrAsanjar <af...@gmail.com> wrote:
>>> > > > > >
>>> > > > > > > Hi Evans, let me check with IBM again.
>>> > > > > > >
>>> > > > > > >
>>> > > > > > > On Mon, Nov 16, 2020 at 9:08 PM Evans Ye <evansye@apache.org> wrote:
>>> > > > > > >
>>> > > > > > > > Hi Amir,
>>> > > > > > > >
>>> > > > > > > > We're planning the Bigtop 1.5 release, and if we don't have the CI nodes for
>>> > > > > > > > PPC, we won't be able to release 1.5 with PPC support.
>>> > > > > > > > Could you help confirm again? Thanks!
>>> > > > > > > >
>>> > > > > > > > Best,
>>> > > > > > > > Evans Ye
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > > On Thu, Sep 17, 2020 at 8:56 PM MrAsanjar <af...@gmail.com> wrote:
>>> > > > > > > >
>>> > > > > > > > > I have informed IBM management about the situation and am waiting for a
>>> > > > > > > > > reply.
>>> > > > > > > > >
>>> > > > > > > > > On Thu, Sep 17, 2020 at 3:47 AM Evans Ye <evansye@apache.org> wrote:
>>> > > > > > > > >
>>> > > > > > > > > > Ok. Thanks for doing this to get the ball rolling.
>>> > > > > > > > > >
>>> > > > > > > > > > On Thu, Sep 17, 2020 at 10:29 AM Kengo Seki <se...@apache.org> wrote:
>>> > > > > > > > > >
>>> > > > > > > > > > > Thank you for your help, Amir!
>>> > > > > > > > > > > Just a heads-up: I temporarily disabled the ppc builds in the
>>> > > > > > > > > > > following Jenkins jobs so that they can finish.
>>> > > > > > > > > > >
>>> > > > > > > > > > > * Docker-Puppet-Trunk
>>> > > > > > > > > > > * Docker-Puppet-Trunk-pull
>>> > > > > > > > > > > * Docker-Toolchain-Trunk
>>> > > > > > > > > > > * Docker-Toolchain-Trunk-pull
>>> > > > > > > > > > >
>>> > > > > > > > > > > * Bigtop-trunk-packages
>>> > > > > > > > > > > * Bigtop-trunk-repos
>>> > > > > > > > > > >
>>> > > > > > > > > > > * Remove-All-Docker-Containers-Except-Nexus
>>> > > > > > > > > > > * Remove-Dangling-Docker-Images
>>> > > > > > > > > > > * Remove-Inactive-Containers
>>> > > > > > > > > > >
>>> > > > > > > > > > > Kengo Seki <se...@apache.org>
>>> > > > > > > > > > >
>>> > > > > > > > > > > On Wed, Sep 16, 2020 at 7:35 PM Evans Ye <evansye@apache.org> wrote:
>>> > > > > > > > > > > >
>>> > > > > > > > > > > > Awesome! Nice to hear from you, buddy!
>>> > > > > > > > > > > >
>>> > > > > > > > > > > > On Wed, Sep 16, 2020 at 3:54 AM MrAsanjar <af...@gmail.com> wrote:
>>> > > > > > > > > > > >
>>> > > > > > > > > > > > > Hi Evans,
>>> > > > > > > > > > > > > Let me see what I can do. Give me 24 hr :)
>>> > > > > > > > > > > > >
>>> > > > > > > > > > > > > On Tue, Sep 15, 2020 at 10:51 AM Evans Ye <evansye@apache.org> wrote:
>>> > > > > > > > > > > > >
>>> > > > > > > > > > > > > > Yes. I think the action is correct. However, [2] might be a different
>>> > > > > > > > > > > > > > thing, related to PPC integration in Hadoop.
>>> > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > Amir,
>>> > > > > > > > > > > > > > Could you confirm?
>>> > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > On Mon, Sep 14, 2020 at 9:56 PM Kengo Seki <se...@apache.org> wrote:
>>> > > > > > > > > > > > > >
>>> > > > > > > > > > > > > >> Thank you for the advice, Evans!
>>> > > > > > > > > > > > > >> Let me confirm about "PPC machine owners". According to Amir's JIRA
>>> > > > > > > > > > > > > >> issues [1][2] and the powered-by list on the OSU site [3], we're using
>>> > > > > > > > > > > > > >> a VM hosted by OSU OSL, right?
>>> > > > > > > > > > > > > >> If that's correct, I'm going to ask them for help via
>>> > > > > > > > > > > > > >> powerdev-request@osuosl.org.
>>> > > > > > > > > > > > > >>
>>> > > > > > > > > > > > > >> [1]: https://issues.apache.org/jira/browse/INFRA-11467?focusedCommentId=15300982&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15300982
>>> > > > > > > > > > > > > >> [2]: https://issues.apache.org/jira/browse/INFRA-12014
>>> > > > > > > > > > > > > >> [3]: https://osuosl.org/services/powerdev/current-projects/#foss-projects
>>> > > > > > > > > > > > > >>
>>> > > > > > > > > > > > > >> Kengo Seki <se...@apache.org>
>>> > > > > > > > > > > > > >>
>>> > > > > > > > > > > > > >>
>>> > > > > > > > > > > > > >> On Mon, Sep 14, 2020 at 2:06 PM Evans Ye <evansye@apache.org> wrote:
>>> > > > > > > > > > > > > >> >
>>> > > > > > > > > > > > > >> > I'd suggest reaching out to the PPC machine owners. In the worst case, we can
>>> > > > > > > > > > > > > >> > temporarily drop PPC support to move the release forward.
>>> > > > > > > > > > > > > >> >
>>> > > > > > > > > > > > > >> > On Mon, Sep 14, 2020 at 12:44 PM Kengo Seki <se...@apache.org> wrote:
>>> > > > > > > > > > > > > >> >
>>> > > > > > > > > > > > > >> > > Hi everyone,
>>> > > > > > > > > > > > > >> > >
>>> > > > > > > > > > > > > >> > > Let me share information about the CI environment.
>>> > > > > > > > > > > > > >> > > The worker node for ppc64le is currently offline, so I just killed all jobs
>>> > > > > > > > > > > > > >> > > in the queue that were waiting for it to come back. Its status is as follows.
>>> > > > > > > > > > > > > >> > >
>>> > > > > > > > > > > > > >> > > - According to the result of `who -b`, that machine seems to have been rebooted
>>> > > > > > > > > > > > > >> > >   on 2020-09-11 for some reason (probably unexpectedly).
>>> > > > > > > > > > > > > >> > >
>>> > > > > > > > > > > > > >> > > - According to the result of dmesg, the root volume was mounted
>>> > > > > > > > > > > > > >> > >   in read-only mode because of a fsck failure.
>>> > > > > > > > > > > > > >> > >
>>> > > > > > > > > > > > > >> > >   [   34.840681] EXT4-fs (vda1): Couldn't remount RDWR because of unprocessed orphan inode list.  Please umount/remount instead
>>> > > > > > > > > > > > > >> > >   [   60.714110] cgroup: new mount options do not match the existing superblock, will be ignored
>>> > > > > > > > > > > > > >> > >   [  316.385805] EXT4-fs (vda1): error count since last fsck: 9459
>>> > > > > > > > > > > > > >> > >   [  316.385824] EXT4-fs (vda1): initial error at time 1540294049: ext4_validate_inode_bitmap:134
>>> > > > > > > > > > > > > >> > >   [  316.385826] EXT4-fs (vda1): last error at time 1596881526: ext4_free_inode:383
>>> > > > > > > > > > > > > >> > >
>>> > > > > > > > > > > > > >> > > It looks like some fsck work (and replacing the volume, if it fails) is required,
>>> > > > > > > > > > > > > >> > > but I'm not sure if I could run something like `e2fsck -p`, because
>>> > > > > > > > > > > > > >> > > I'm also not sure where that machine lives or who's managing it.
>>> > > > > > > > > > > > > >> > > (I vaguely thought it was running as a VM with QEMU on some EC2
>>> > > > > > > > > > > > > >> > > instance, but I couldn't find it)
>>> > > > > > > > > > > > > >> > >
>>> > > > > > > > > > > > > >> > > > Cos, Evans, Olaf
>>> > > > > > > > > > > > > >> > > Would you provide any suggestions?
>>> > > > > > > > > > > > > >> > >
>>> > > > > > > > > > > > > >> > > Kengo Seki <se...@apache.org>
>>> > > > > > > > > > > > > >> > >
>>> > > > > > > > > > > > > >>
>>> > > > > > > > > > > > > >
>>> > > > > > > > > > > > >
>>> > > > > > > > > > >
>>> > > > > > > > > >
>>> > > > > > > > >
>>> > > > > > > >
>>> > > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>>
>>
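
A side note on the dmesg output quoted above: the ext4 "initial error" and "last error" values look like Unix epoch seconds, so they can be decoded into calendar dates. The following is a small sketch under that assumption; the `epoch_to_utc` helper name is made up for illustration.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds: int) -> str:
    """Format a Unix epoch timestamp (in seconds) as a UTC date and time."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


if __name__ == "__main__":
    # Timestamps copied from the EXT4-fs lines in the quoted dmesg output.
    for label, ts in (("initial error", 1540294049), ("last error", 1596881526)):
        print(f"{label}: {epoch_to_utc(ts)}")
```

If the assumption holds, the first recorded error dates from October 2018 and the last one from August 2020, which would suggest the filesystem problems predate the 2020-09-11 reboot.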

Re: PPC CI server failure

Posted by MrAsanjar <af...@gmail.com>.
I have verified the state of the ppc64le VM; it is operational. Could we enable
the ppc64le build before OpenStack flags the VM as idle again?

On Thu, Apr 1, 2021 at 4:08 PM MrAsanjar <af...@gmail.com> wrote:

> Hi lads
> I just got an email that IBM has reinstated the ppc64le VM.
>
>
> On Mon, Mar 29, 2021 at 12:05 PM Evans Ye <ev...@apache.org> wrote:
>
>> Great news and thanks, Amir!
>>
>> On Mon, Mar 29, 2021 at 1:54 PM Jun HE <ju...@apache.org> wrote:
>>
>> > Awesome! Looking forward to its back to CI.
>> > Thanks a lot for helping on this, Asanjar!
>> >
>> > Regards,
>> >
>> > Jun
>> >
>> > On Mon, Mar 29, 2021 at 10:18 AM MrAsanjar <af...@gmail.com> wrote:
>> >
>> > > Hi old friends :)
>> > > We should have a ppc64le VM back online sometime this week. I'll keep
>> you
>> > > all posted.
>> > >
>> > > On Thu, Nov 19, 2020 at 9:05 PM Evans Ye <ev...@apache.org> wrote:
>> > >
>> > > > Hi rbkrishn,
>> > > >
>> > > > Would you mind to comment whether those PPC servers for Bigtop CI
>> can
>> > be
>> > > > brought up and unlock our release process?
>> > > > Thanks!
>> > > >
>> > > > Best,
>> > > > Evans
>> > > >
>> > > > On Wed, Nov 18, 2020 at 7:26 AM Kengo Seki <se...@apache.org> wrote:
>> > > >
>> > > > > Thank you for checking, Evans and Amir!
>> > > > >
>> > > > > Kengo Seki <se...@apache.org>
>> > > > >
>> > > > > On Wed, Nov 18, 2020 at 2:09 AM Evans Ye <ev...@apache.org>
>> wrote:
>> > > > > >
>> > > > > > Thank you, Amir.
>> > > > > >
>> > > > > > On Wed, Nov 18, 2020 at 12:39 AM MrAsanjar <af...@gmail.com> wrote:
>> > > > > >
>> > > > > > > Hi Evans, let me check with IBM again.
>> > > > > > >
>> > > > > > >
>> > > > > > > On Mon, Nov 16, 2020 at 9:08 PM Evans Ye <ev...@apache.org>
>> > > wrote:
>> > > > > > >
>> > > > > > > > Hi Amir,
>> > > > > > > >
>> > > > > > > > We're planning Bigtop 1.5 release and if we don't have the
>> CI
>> > > nodes
>> > > > > for
>> > > > > > > > PPC, we're not able to release 1.5 with PPC supported.
>> > > > > > > > Could you help to confirm again? Thanks!
>> > > > > > > >
>> > > > > > > > Best,
>> > > > > > > > Evans Ye
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > On Thu, Sep 17, 2020 at 8:56 PM MrAsanjar <af...@gmail.com> wrote:
>> > > > > > > >
>> > > > > > > > > I have informed IBM management regarding the situation,
>> > waiting
>> > > > > for a
>> > > > > > > > > reply.
>> > > > > > > > >
>> > > > > > > > > On Thu, Sep 17, 2020 at 3:47 AM Evans Ye <
>> evansye@apache.org
>> > >
>> > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > > Ok. Thanks for doing this to get the ball rolling.
>> > > > > > > > > >
>> > > > > > > > > > On Thu, Sep 17, 2020 at 10:29 AM Kengo Seki <se...@apache.org> wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Thank you for your help, Amir!
>> > > > > > > > > > > It's just a heads-up, I temporarily disabled builds
>> for
>> > ppc
>> > > > in
>> > > > > the
>> > > > > > > > > > > following Jenkins jobs so that they can finish.
>> > > > > > > > > > >
>> > > > > > > > > > > * Docker-Puppet-Trunk
>> > > > > > > > > > > * Docker-Puppet-Trunk-pull
>> > > > > > > > > > > * Docker-Toolchain-Trunk
>> > > > > > > > > > > * Docker-Toolchain-Trunk-pull
>> > > > > > > > > > >
>> > > > > > > > > > > * Bigtop-trunk-packages
>> > > > > > > > > > > * Bigtop-trunk-repos
>> > > > > > > > > > >
>> > > > > > > > > > > * Remove-All-Docker-Containers-Except-Nexus
>> > > > > > > > > > > * Remove-Dangling-Docker-Images
>> > > > > > > > > > > * Remove-Inactive-Containers
>> > > > > > > > > > >
>> > > > > > > > > > > Kengo Seki <se...@apache.org>
>> > > > > > > > > > >
>> > > > > > > > > > > On Wed, Sep 16, 2020 at 7:35 PM Evans Ye <
>> > > evansye@apache.org
>> > > > >
>> > > > > > > wrote:
>> > > > > > > > > > > >
>> > > > > > > > > > > > Awesome! Nice to hear from you, buddy!
>> > > > > > > > > > > >
>> > > > > > > > > > > > On Wed, Sep 16, 2020 at 3:54 AM MrAsanjar <af...@gmail.com> wrote:
>> > > > > > > > > > > >
>> > > > > > > > > > > > > Hi Evans,
>> > > > > > > > > > > > > Let me see what I can do. Give me 24 hr :)
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > On Tue, Sep 15, 2020 at 10:51 AM Evans Ye <
>> > > > > evansye@apache.org>
>> > > > > > > > > > wrote:
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > > Yes. I think the action is correct. However [2] might be a
>> > > > > > > > > > > > > > different thing for PPC integration in Hadoop.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Amir,
>> > > > > > > > > > > > > > Could you confirm?
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > On Mon, Sep 14, 2020 at 9:56 PM Kengo Seki <se...@apache.org> wrote:
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >> Thank you for the advice, Evans!
>> > > > > > > > > > > > > >> Let me confirm about "PPC machine owners". According to Amir's
>> > > > > > > > > > > > > >> JIRA issues [1][2] and the powered-by list on the OSU site [3],
>> > > > > > > > > > > > > >> we're using a VM hosted by OSU OSL, right?
>> > > > > > > > > > > > > >> If that's correct, I'm going to ask them for help via
>> > > > > > > > > > > > > >> powerdev-request@osuosl.org.
>> > > > > > > > > > > > > >>
>> > > > > > > > > > > > > >> [1]: https://issues.apache.org/jira/browse/INFRA-11467?focusedCommentId=15300982&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15300982
>> > > > > > > > > > > > > >> [2]: https://issues.apache.org/jira/browse/INFRA-12014
>> > > > > > > > > > > > > >> [3]: https://osuosl.org/services/powerdev/current-projects/#foss-projects
>> > > > > > > > > > > > > >>
>> > > > > > > > > > > > > >> Kengo Seki <se...@apache.org>
>> > > > > > > > > > > > > >>
>> > > > > > > > > > > > > >>
>> > > > > > > > > > > > > >> On Mon, Sep 14, 2020 at 2:06 PM Evans Ye <evansye@apache.org> wrote:
>> > > > > > > > > > > > > >> >
>> > > > > > > > > > > > > >> > I'd suggest reaching out to the PPC machine owners. Worst case
>> > > > > > > > > > > > > >> > is we can temporarily drop the PPC support to move the release
>> > > > > > > > > > > > > >> > forward.
>> > > > > > > > > > > > > >> >
>> > > > > > > > > > > > > >> > On Mon, Sep 14, 2020 at 12:44 Kengo Seki <se...@apache.org> wrote:
>> > > > > > > > > > > > > >> >
>> > > > > > > > > > > > > >> > > Hi everyone,
>> > > > > > > > > > > > > >> > >
>> > > > > > > > > > > > > >> > > Let me share information about the CI environment.
>> > > > > > > > > > > > > >> > > The worker node for ppc64le is currently offline, so I just
>> > > > > > > > > > > > > >> > > killed all jobs in the queue that were waiting for it to come
>> > > > > > > > > > > > > >> > > back. Its status is as follows.
>> > > > > > > > > > > > > >> > >
>> > > > > > > > > > > > > >> > > - According to the result of `who -b`, that machine seems to
>> > > > > > > > > > > > > >> > >   have been rebooted on 2020-09-11 for some reason (probably
>> > > > > > > > > > > > > >> > >   unexpectedly).
>> > > > > > > > > > > > > >> > >
>> > > > > > > > > > > > > >> > > - According to the result of dmesg, the root volume was mounted
>> > > > > > > > > > > > > >> > >   in read-only mode because of a fsck failure.
>> > > > > > > > > > > > > >> > >
>> > > > > > > > > > > > > >> > >   [   34.840681] EXT4-fs (vda1): Couldn't remount RDWR because of unprocessed orphan inode list.  Please umount/remount instead
>> > > > > > > > > > > > > >> > >   [   60.714110] cgroup: new mount options do not match the existing superblock, will be ignored
>> > > > > > > > > > > > > >> > >   [  316.385805] EXT4-fs (vda1): error count since last fsck: 9459
>> > > > > > > > > > > > > >> > >   [  316.385824] EXT4-fs (vda1): initial error at time 1540294049: ext4_validate_inode_bitmap:134
>> > > > > > > > > > > > > >> > >   [  316.385826] EXT4-fs (vda1): last error at time 1596881526: ext4_free_inode:383
>> > > > > > > > > > > > > >> > >
>> > > > > > > > > > > > > >> > > It looks like some fsck work (and replacing the volume, if it
>> > > > > > > > > > > > > >> > > fails) is required, but I'm not sure whether I can run
>> > > > > > > > > > > > > >> > > something like `e2fsck -p`, because I also don't know where
>> > > > > > > > > > > > > >> > > that machine is located or who's managing it.
>> > > > > > > > > > > > > >> > > (I vaguely thought it was running as a VM with QEMU on some
>> > > > > > > > > > > > > >> > > EC2 instance, but I couldn't find it.)
>> > > > > > > > > > > > > >> > >
>> > > > > > > > > > > > > >> > > > Cos, Evans, Olaf
>> > > > > > > > > > > > > >> > > Would you provide any suggestions?
>> > > > > > > > > > > > > >> > >
>> > > > > > > > > > > > > >> > > Kengo Seki <se...@apache.org>
>> > > > > > > > > > > > > >> > >
>