Posted to dev@arrow.apache.org by Antoine Pitrou <an...@python.org> on 2020/04/20 17:28:33 UTC

[C++] Big-endian support

Hello,

Recently some issues have been opened for big-endian support (i.e.
support for big-endian *hosts*), and a couple patches submitted, thanks
to Kazuaki Ishizaki.  See e.g.:

https://issues.apache.org/jira/browse/ARROW-8457
https://issues.apache.org/jira/browse/ARROW-8467
https://issues.apache.org/jira/browse/ARROW-8486
https://issues.apache.org/jira/browse/ARROW-8506
https://issues.apache.org/jira/browse/PARQUET-1845

Achieving big-endian support across the C++ Arrow and Parquet
codebases is likely to be a very significant effort, potentially
requiring cooperation between multiple developers.  An additional
problem is that, without any Continuous Integration set up, it will be
impossible to ensure progress and be notified of regressions.
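
As a rough illustration of the kind of change involved (a minimal sketch,
not code from the Arrow or Parquet codebases): Arrow's columnar format is
little-endian by default, so code that copies raw buffer bytes straight
into an integer only works on little-endian hosts. The helper names below
(IsLittleEndianHost, LoadLittleEndian32, LoadHostOrder32, CheckDecoding)
are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an Arrow API): true when the host stores the
// least significant byte of an integer first.
inline bool IsLittleEndianHost() {
  const std::uint16_t probe = 1;
  unsigned char first_byte = 0;
  std::memcpy(&first_byte, &probe, 1);
  return first_byte == 1;
}

// Hypothetical helper: decode a 32-bit little-endian value from a byte
// buffer. The shifts operate on values, not memory layout, so the result
// is the same on little-endian and big-endian hosts.
inline std::uint32_t LoadLittleEndian32(const unsigned char* bytes) {
  return static_cast<std::uint32_t>(bytes[0]) |
         (static_cast<std::uint32_t>(bytes[1]) << 8) |
         (static_cast<std::uint32_t>(bytes[2]) << 16) |
         (static_cast<std::uint32_t>(bytes[3]) << 24);
}

// The shortcut that little-endian-only code tends to take: copy the raw
// bytes into an integer. This agrees with LoadLittleEndian32 only when the
// host itself is little-endian.
inline std::uint32_t LoadHostOrder32(const unsigned char* bytes) {
  std::uint32_t value = 0;
  std::memcpy(&value, bytes, sizeof(value));
  return value;
}

// Small self-check that holds on either kind of host.
inline void CheckDecoding() {
  const unsigned char bytes[4] = {0x01, 0x02, 0x03, 0x04};
  assert(LoadLittleEndian32(bytes) == 0x04030201u);
  // The raw copy matches the portable decode exactly when the host is
  // little-endian; on a big-endian host the two results differ.
  assert((LoadHostOrder32(bytes) == LoadLittleEndian32(bytes)) ==
         IsLittleEndianHost());
}
```

Calling CheckDecoding() passes on both kinds of host, but only because it
encodes the difference explicitly. Finding every place where a codebase
relies on the raw-copy shortcut is what makes the effort large.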

If other people are seriously interested in the desired outcome, they
should probably team up with Kazuaki Ishizaki and discuss a practical
plan to avoid drowning in the difficulties.

Regards

Antoine.

Re: [C++] Big-endian support

Posted by Wes McKinney <we...@gmail.com>.
On Wed, Apr 22, 2020 at 12:05 AM Micah Kornfield <em...@gmail.com> wrote:
>
> >
> > That said, if big-endian developers would assist with
> > other parts of the C++ project as a sort of "quid-pro-quo" to balance
> > the time spent on code review relating to big-endian that would be
> > helpful.
>
> I think setting up (and, when needed, resetting) CI would need to be
> included in this; otherwise, even with in-depth reviews, it will be easy
> to forget about big-endian architectures.
>
> > An additional
> > problem is that, without any Continuous Integration set up, it will be
> > impossible to ensure progress and be notified of regressions.
>
> This might be hijacking the thread, but I think we might have similar
> issues for AVX-512 specific code?

Yes, this is true, but AVX-512-capable machines are significantly less
exotic (I develop on one -- i9-9960X -- for example)

> Thanks,
> Micah
>
>
> On Tue, Apr 21, 2020 at 5:10 AM Wes McKinney <we...@gmail.com> wrote:
>
> > I will add that I think big-endian support would be valuable so that
> > the library can be used everywhere, including more exotic mainframe
> > type systems like IBM Z.
> >
> > That said, the code review burden to other C++ developers is likely to
> > become significant, so a solo developer with access to big-endian
> > hardware submitting pull requests could be problematic since no one
> > else with close knowledge of the codebase has a need to support
> > big-endian. That said, if big-endian developers would assist with
> > other parts of the C++ project as a sort of "quid-pro-quo" to balance
> > the time spent on code review relating to big-endian that would be
> > helpful.
> >
> > On Mon, Apr 20, 2020 at 12:38 PM Antoine Pitrou <an...@python.org>
> > wrote:
> > >
> > >
> > > Hello,
> > >
> > > Recently some issues have been opened for big-endian support (i.e.
> > > support for big-endian *hosts*), and a couple patches submitted, thanks
> > > to Kazuaki Ishizaki.  See e.g.:
> > >
> > > https://issues.apache.org/jira/browse/ARROW-8457
> > > https://issues.apache.org/jira/browse/ARROW-8467
> > > https://issues.apache.org/jira/browse/ARROW-8486
> > > https://issues.apache.org/jira/browse/ARROW-8506
> > > https://issues.apache.org/jira/browse/PARQUET-1845
> > >
> > > Achieving big-endian support across the C++ Arrow and Parquet
> > > codebases is likely to be a very significant effort, potentially
> > > requiring cooperation between multiple developers.  An additional
> > > problem is that, without any Continuous Integration set up, it will be
> > > impossible to ensure progress and be notified of regressions.
> > >
> > > If other people are seriously interested in the desired outcome, they
> > > should probably team up with Kazuaki Ishizaki and discuss a practical
> > > plan to avoid drowning in the difficulties.
> > >
> > > Regards
> > >
> > > Antoine.
> >

Re: [C++] Big-endian support

Posted by Micah Kornfield <em...@gmail.com>.
>
> That said, if big-endian developers would assist with
> other parts of the C++ project as a sort of "quid-pro-quo" to balance
> the time spent on code review relating to big-endian that would be
> helpful.

I think setting up (and, when needed, resetting) CI would need to be
included in this; otherwise, even with in-depth reviews, it will be easy
to forget about big-endian architectures.

> An additional
> problem is that, without any Continuous Integration set up, it will be
> impossible to ensure progress and be notified of regressions.

This might be hijacking the thread, but I think we might have similar
issues for AVX-512 specific code?

Thanks,
Micah


On Tue, Apr 21, 2020 at 5:10 AM Wes McKinney <we...@gmail.com> wrote:

> I will add that I think big-endian support would be valuable so that
> the library can be used everywhere, including more exotic mainframe
> type systems like IBM Z.
>
> That said, the code review burden to other C++ developers is likely to
> become significant, so a solo developer with access to big-endian
> hardware submitting pull requests could be problematic since no one
> else with close knowledge of the codebase has a need to support
> big-endian. That said, if big-endian developers would assist with
> other parts of the C++ project as a sort of "quid-pro-quo" to balance
> the time spent on code review relating to big-endian that would be
> helpful.
>
> On Mon, Apr 20, 2020 at 12:38 PM Antoine Pitrou <an...@python.org>
> wrote:
> >
> >
> > Hello,
> >
> > Recently some issues have been opened for big-endian support (i.e.
> > support for big-endian *hosts*), and a couple patches submitted, thanks
> > to Kazuaki Ishizaki.  See e.g.:
> >
> > https://issues.apache.org/jira/browse/ARROW-8457
> > https://issues.apache.org/jira/browse/ARROW-8467
> > https://issues.apache.org/jira/browse/ARROW-8486
> > https://issues.apache.org/jira/browse/ARROW-8506
> > https://issues.apache.org/jira/browse/PARQUET-1845
> >
> > Achieving big-endian support across the C++ Arrow and Parquet
> > codebases is likely to be a very significant effort, potentially
> > requiring cooperation between multiple developers.  An additional
> > problem is that, without any Continuous Integration set up, it will be
> > impossible to ensure progress and be notified of regressions.
> >
> > If other people are seriously interested in the desired outcome, they
> > should probably team up with Kazuaki Ishizaki and discuss a practical
> > plan to avoid drowning in the difficulties.
> >
> > Regards
> >
> > Antoine.
>

Re: [C++] Big-endian support

Posted by Wes McKinney <we...@gmail.com>.
hi Kazuaki

On Wed, Apr 22, 2020 at 12:41 AM Kazuaki Ishizaki <IS...@jp.ibm.com> wrote:
>
> Thank you for your comments. I see that the developers would assist with
> other parts, too.
>
> For developing OSS on big-endian, here are resources for a development
> environment and CI. They would be helpful for code review, too.
> A trial zLinux VM for OSS development is available. Once a VM is created
> with RHEL or SLES, it is available for up to 120 days. The procedure to
> create a VM is described at
> https://github.com/linuxone-community-cloud/technical-resources/blob/master/deploy-virtual-server.md
> .
> Regarding CI, Travis CI on zLinux is available. An article about it is at
> https://blog.travis-ci.com/2019-11-12-multi-cpu-architecture-ibm-power-ibm-z

This is good to know. I think we will need you or one of your
colleagues to contribute to the setup and maintenance of this in the
project's CI infrastructure.

>
> Kazuaki Ishizaki,
>
>
>
> From:   Wes McKinney <we...@gmail.com>
> To:     dev <de...@arrow.apache.org>
> Date:   2020/04/21 21:11
> Subject:        [EXTERNAL] Re: [C++] Big-endian support
>
>
>
> I will add that I think big-endian support would be valuable so that
> the library can be used everywhere, including more exotic mainframe
> type systems like IBM Z.
>
> That said, the code review burden to other C++ developers is likely to
> become significant, so a solo developer with access to big-endian
> hardware submitting pull requests could be problematic since no one
> else with close knowledge of the codebase has a need to support
> big-endian. That said, if big-endian developers would assist with
> other parts of the C++ project as a sort of "quid-pro-quo" to balance
> the time spent on code review relating to big-endian that would be
> helpful.
>
> On Mon, Apr 20, 2020 at 12:38 PM Antoine Pitrou <an...@python.org>
> wrote:
> >
> >
> > Hello,
> >
> > Recently some issues have been opened for big-endian support (i.e.
> > support for big-endian *hosts*), and a couple patches submitted, thanks
> > to Kazuaki Ishizaki.  See e.g.:
> >
> > https://issues.apache.org/jira/browse/ARROW-8457
> > https://issues.apache.org/jira/browse/ARROW-8467
> > https://issues.apache.org/jira/browse/ARROW-8486
> > https://issues.apache.org/jira/browse/ARROW-8506
> > https://issues.apache.org/jira/browse/PARQUET-1845
> >
> > Achieving big-endian support across the C++ Arrow and Parquet
> > codebases is likely to be a very significant effort, potentially
> > requiring cooperation between multiple developers.  An additional
> > problem is that, without any Continuous Integration set up, it will be
> > impossible to ensure progress and be notified of regressions.
> >
> > If other people are seriously interested in the desired outcome, they
> > should probably team up with Kazuaki Ishizaki and discuss a practical
> > plan to avoid drowning in the difficulties.
> >
> > Regards
> >
> > Antoine.
>
>
>
>

Re: [C++] Big-endian support

Posted by Kazuaki Ishizaki <IS...@jp.ibm.com>.
Thank you for your comments. I see that the developers would assist with
other parts, too.

For developing OSS on big-endian, here are resources for a development
environment and CI. They would be helpful for code review, too.
A trial zLinux VM for OSS development is available. Once a VM is created
with RHEL or SLES, it is available for up to 120 days. The procedure to
create a VM is described at
https://github.com/linuxone-community-cloud/technical-resources/blob/master/deploy-virtual-server.md
.
Regarding CI, Travis CI on zLinux is available. An article about it is at
https://blog.travis-ci.com/2019-11-12-multi-cpu-architecture-ibm-power-ibm-z
.

Kazuaki Ishizaki,



From:   Wes McKinney <we...@gmail.com>
To:     dev <de...@arrow.apache.org>
Date:   2020/04/21 21:11
Subject:        [EXTERNAL] Re: [C++] Big-endian support



I will add that I think big-endian support would be valuable so that
the library can be used everywhere, including more exotic mainframe
type systems like IBM Z.

That said, the code review burden to other C++ developers is likely to
become significant, so a solo developer with access to big-endian
hardware submitting pull requests could be problematic since no one
else with close knowledge of the codebase has a need to support
big-endian. That said, if big-endian developers would assist with
other parts of the C++ project as a sort of "quid-pro-quo" to balance
the time spent on code review relating to big-endian that would be
helpful.

On Mon, Apr 20, 2020 at 12:38 PM Antoine Pitrou <an...@python.org> 
wrote:
>
>
> Hello,
>
> Recently some issues have been opened for big-endian support (i.e.
> support for big-endian *hosts*), and a couple patches submitted, thanks
> to Kazuaki Ishizaki.  See e.g.:
>
> https://issues.apache.org/jira/browse/ARROW-8457
> https://issues.apache.org/jira/browse/ARROW-8467
> https://issues.apache.org/jira/browse/ARROW-8486
> https://issues.apache.org/jira/browse/ARROW-8506
> https://issues.apache.org/jira/browse/PARQUET-1845
>
> Achieving big-endian support across the C++ Arrow and Parquet
> codebases is likely to be a very significant effort, potentially
> requiring cooperation between multiple developers.  An additional
> problem is that, without any Continuous Integration set up, it will be
> impossible to ensure progress and be notified of regressions.
>
> If other people are seriously interested in the desired outcome, they
> should probably team up with Kazuaki Ishizaki and discuss a practical
> plan to avoid drowning in the difficulties.
>
> Regards
>
> Antoine.





Re: [C++] Big-endian support

Posted by Wes McKinney <we...@gmail.com>.
I will add that I think big-endian support would be valuable so that
the library can be used everywhere, including more exotic mainframe
type systems like IBM Z.

That said, the code review burden to other C++ developers is likely to
become significant, so a solo developer with access to big-endian
hardware submitting pull requests could be problematic since no one
else with close knowledge of the codebase has a need to support
big-endian. That said, if big-endian developers would assist with
other parts of the C++ project as a sort of "quid-pro-quo" to balance
the time spent on code review relating to big-endian that would be
helpful.

On Mon, Apr 20, 2020 at 12:38 PM Antoine Pitrou <an...@python.org> wrote:
>
>
> Hello,
>
> Recently some issues have been opened for big-endian support (i.e.
> support for big-endian *hosts*), and a couple patches submitted, thanks
> to Kazuaki Ishizaki.  See e.g.:
>
> https://issues.apache.org/jira/browse/ARROW-8457
> https://issues.apache.org/jira/browse/ARROW-8467
> https://issues.apache.org/jira/browse/ARROW-8486
> https://issues.apache.org/jira/browse/ARROW-8506
> https://issues.apache.org/jira/browse/PARQUET-1845
>
> Achieving big-endian support across the C++ Arrow and Parquet
> codebases is likely to be a very significant effort, potentially
> requiring cooperation between multiple developers.  An additional
> problem is that, without any Continuous Integration set up, it will be
> impossible to ensure progress and be notified of regressions.
>
> If other people are seriously interested in the desired outcome, they
> should probably team up with Kazuaki Ishizaki and discuss a practical
> plan to avoid drowning in the difficulties.
>
> Regards
>
> Antoine.