You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@arrow.apache.org by Micah Kornfield <em...@gmail.com> on 2019/11/21 04:50:12 UTC
[DISCUSS][C++] Pointer name aliasing
A recent PR for datasets [1] seems to have introduced the convention of
aliasing "std::shared_ptr<Type>" with "TypePtr" for some type. I think in
the past we had decided not to use a convention like this but I can't find
the thread. IMO, I think this generally makes the code less understandable
but this is a matter of taste.
Before the pattern gets too ingrained in the code base I just wanted to
make sure we discussed using this on the mailing list and made sure there
was a consensus on when to use the pattern.
Thanks,
Micah
[1] https://github.com/apache/arrow/pull/5857/files
Re: [DISCUSS][C++] Pointer name aliasing
Posted by Wes McKinney <we...@gmail.com>.
Hi Liya,
In general I think when a development guideline promotes readability
it is a good thing. Issues like these may be addressed on a case by
case basis, though.
best
Wes
On Sun, Dec 22, 2019 at 8:54 PM Fan Liya <li...@gmail.com> wrote:
>
> IMO, this question relates to something general and fundamental.
>
> Generally, name alias leads to two results:
> 1) It makes writing code easier
> 2) It makes reading code more difficult
>
> Personally, I prefer readability to writability.
> However, I am wrondering if we have some general principles regarding this?
>
> Best,
> Liya Fan
>
>
> On Fri, Dec 20, 2019 at 12:17 AM Francois Saint-Jacques <
> fsaintjacques@gmail.com> wrote:
>
> > I created the following ticket (and sub-tasks) [1] to track
> >
> > François
> >
> > [1] https://jira.apache.org/jira/browse/ARROW-7438
> >
> > On Tue, Nov 26, 2019 at 12:09 AM Micah Kornfield <em...@gmail.com>
> > wrote:
> > >
> > > I would need to look at the other instances as well. I will try to so by
> > > next week, but I think we can probably take an incremental approach of:
> > > 1. Eliminate *Ptr in src/arrow code (discuss similar changes in
> > > parquet/gandiva).
> > > 2. Decide on the Iterator/Vector.
> > >
> > > On Fri, Nov 22, 2019 at 10:47 AM Wes McKinney <we...@gmail.com>
> > wrote:
> > >
> > > > hi Francois
> > > >
> > > > On Fri, Nov 22, 2019 at 11:17 AM Francois Saint-Jacques
> > > > <fs...@gmail.com> wrote:
> > > > >
> > > > > I'll revert, some questions:
> > > > >
> > > > > 1. Should we revert only the pointer aliases, or also the
> > > > Vector/Iterator.
> > > >
> > > > Could you clarify what Vector/Iterator aliases you are referring to,
> > > > like RecordBatchIterator?
> > > >
> > > > I think we may need to distinguish between endogenous aliases versus
> > > > aliases involving STL types.
> > > >
> > > > IMHO
> > > >
> > > > using RecordBatchIterator = Iterator<shared_ptr<RecordBatch>>;
> > > >
> > > > is less problematic from a readability standpoint than
> > > >
> > > > using RecordBatchPtr = shared_ptr<RecordBatch>;
> > > >
> > > > > 2. Should we revert all modules, i.e. gandiva and compute.
> > > >
> > > > I would say one step at a time -- in the case of Gandiva I would say
> > > > that we should open a JIRA issue and discuss further to ensure that we
> > > > do not cause disruption for public API consumers. Since this software
> > > > has already been released multiple times, a different approach may be
> > > > needed
> > > >
> > > > >
> > > > > François
> > > >
> >
Re: [DISCUSS][C++] Pointer name aliasing
Posted by Fan Liya <li...@gmail.com>.
IMO, this question relates to something general and fundamental.
Generally, name alias leads to two results:
1) It makes writing code easier
2) It makes reading code more difficult
Personally, I prefer readability to writability.
However, I am wrondering if we have some general principles regarding this?
Best,
Liya Fan
On Fri, Dec 20, 2019 at 12:17 AM Francois Saint-Jacques <
fsaintjacques@gmail.com> wrote:
> I created the following ticket (and sub-tasks) [1] to track
>
> François
>
> [1] https://jira.apache.org/jira/browse/ARROW-7438
>
> On Tue, Nov 26, 2019 at 12:09 AM Micah Kornfield <em...@gmail.com>
> wrote:
> >
> > I would need to look at the other instances as well. I will try to so by
> > next week, but I think we can probably take an incremental approach of:
> > 1. Eliminate *Ptr in src/arrow code (discuss similar changes in
> > parquet/gandiva).
> > 2. Decide on the Iterator/Vector.
> >
> > On Fri, Nov 22, 2019 at 10:47 AM Wes McKinney <we...@gmail.com>
> wrote:
> >
> > > hi Francois
> > >
> > > On Fri, Nov 22, 2019 at 11:17 AM Francois Saint-Jacques
> > > <fs...@gmail.com> wrote:
> > > >
> > > > I'll revert, some questions:
> > > >
> > > > 1. Should we revert only the pointer aliases, or also the
> > > Vector/Iterator.
> > >
> > > Could you clarify what Vector/Iterator aliases you are referring to,
> > > like RecordBatchIterator?
> > >
> > > I think we may need to distinguish between endogenous aliases versus
> > > aliases involving STL types.
> > >
> > > IMHO
> > >
> > > using RecordBatchIterator = Iterator<shared_ptr<RecordBatch>>;
> > >
> > > is less problematic from a readability standpoint than
> > >
> > > using RecordBatchPtr = shared_ptr<RecordBatch>;
> > >
> > > > 2. Should we revert all modules, i.e. gandiva and compute.
> > >
> > > I would say one step at a time -- in the case of Gandiva I would say
> > > that we should open a JIRA issue and discuss further to ensure that we
> > > do not cause disruption for public API consumers. Since this software
> > > has already been released multiple times, a different approach may be
> > > needed
> > >
> > > >
> > > > François
> > >
>
Re: [DISCUSS][C++] Pointer name aliasing
Posted by Francois Saint-Jacques <fs...@gmail.com>.
I created the following ticket (and sub-tasks) [1] to track
François
[1] https://jira.apache.org/jira/browse/ARROW-7438
On Tue, Nov 26, 2019 at 12:09 AM Micah Kornfield <em...@gmail.com> wrote:
>
> I would need to look at the other instances as well. I will try to so by
> next week, but I think we can probably take an incremental approach of:
> 1. Eliminate *Ptr in src/arrow code (discuss similar changes in
> parquet/gandiva).
> 2. Decide on the Iterator/Vector.
>
> On Fri, Nov 22, 2019 at 10:47 AM Wes McKinney <we...@gmail.com> wrote:
>
> > hi Francois
> >
> > On Fri, Nov 22, 2019 at 11:17 AM Francois Saint-Jacques
> > <fs...@gmail.com> wrote:
> > >
> > > I'll revert, some questions:
> > >
> > > 1. Should we revert only the pointer aliases, or also the
> > Vector/Iterator.
> >
> > Could you clarify what Vector/Iterator aliases you are referring to,
> > like RecordBatchIterator?
> >
> > I think we may need to distinguish between endogenous aliases versus
> > aliases involving STL types.
> >
> > IMHO
> >
> > using RecordBatchIterator = Iterator<shared_ptr<RecordBatch>>;
> >
> > is less problematic from a readability standpoint than
> >
> > using RecordBatchPtr = shared_ptr<RecordBatch>;
> >
> > > 2. Should we revert all modules, i.e. gandiva and compute.
> >
> > I would say one step at a time -- in the case of Gandiva I would say
> > that we should open a JIRA issue and discuss further to ensure that we
> > do not cause disruption for public API consumers. Since this software
> > has already been released multiple times, a different approach may be
> > needed
> >
> > >
> > > François
> >
Re: [DISCUSS][C++] Pointer name aliasing
Posted by Micah Kornfield <em...@gmail.com>.
I would need to look at the other instances as well. I will try to so by
next week, but I think we can probably take an incremental approach of:
1. Eliminate *Ptr in src/arrow code (discuss similar changes in
parquet/gandiva).
2. Decide on the Iterator/Vector.
On Fri, Nov 22, 2019 at 10:47 AM Wes McKinney <we...@gmail.com> wrote:
> hi Francois
>
> On Fri, Nov 22, 2019 at 11:17 AM Francois Saint-Jacques
> <fs...@gmail.com> wrote:
> >
> > I'll revert, some questions:
> >
> > 1. Should we revert only the pointer aliases, or also the
> Vector/Iterator.
>
> Could you clarify what Vector/Iterator aliases you are referring to,
> like RecordBatchIterator?
>
> I think we may need to distinguish between endogenous aliases versus
> aliases involving STL types.
>
> IMHO
>
> using RecordBatchIterator = Iterator<shared_ptr<RecordBatch>>;
>
> is less problematic from a readability standpoint than
>
> using RecordBatchPtr = shared_ptr<RecordBatch>;
>
> > 2. Should we revert all modules, i.e. gandiva and compute.
>
> I would say one step at a time -- in the case of Gandiva I would say
> that we should open a JIRA issue and discuss further to ensure that we
> do not cause disruption for public API consumers. Since this software
> has already been released multiple times, a different approach may be
> needed
>
> >
> > François
>
Re: [DISCUSS][C++] Pointer name aliasing
Posted by Wes McKinney <we...@gmail.com>.
hi Francois
On Fri, Nov 22, 2019 at 11:17 AM Francois Saint-Jacques
<fs...@gmail.com> wrote:
>
> I'll revert, some questions:
>
> 1. Should we revert only the pointer aliases, or also the Vector/Iterator.
Could you clarify what Vector/Iterator aliases you are referring to,
like RecordBatchIterator?
I think we may need to distinguish between endogenous aliases versus
aliases involving STL types.
IMHO
using RecordBatchIterator = Iterator<shared_ptr<RecordBatch>>;
is less problematic from a readability standpoint than
using RecordBatchPtr = shared_ptr<RecordBatch>;
> 2. Should we revert all modules, i.e. gandiva and compute.
I would say one step at a time -- in the case of Gandiva I would say
that we should open a JIRA issue and discuss further to ensure that we
do not cause disruption for public API consumers. Since this software
has already been released multiple times, a different approach may be
needed
>
> François
Re: [DISCUSS][C++] Pointer name aliasing
Posted by Francois Saint-Jacques <fs...@gmail.com>.
I'll revert, some questions:
1. Should we revert only the pointer aliases, or also the Vector/Iterator.
2. Should we revert all modules, i.e. gandiva and compute.
François
Re: [DISCUSS][C++] Pointer name aliasing
Posted by Wes McKinney <we...@gmail.com>.
On Thu, Nov 21, 2019 at 11:27 PM Micah Kornfield <em...@gmail.com> wrote:
>
> >
> > I think we should mostly be careful about public APIs. With public
> > APIs we should write out the types and avoid aliases. With
> > implementation details and private/protected class members, I think it
> > is fine to use aliases.
>
> My concern with this is that in general if the types are in the header
> files they have a way of leaking out (whether intentional or not).
Yes, after looking at the PR that motivated this e-mail thread, I am
concerned about having aliases in public header files
>
>
> On Thu, Nov 21, 2019 at 12:06 PM Wes McKinney <we...@gmail.com> wrote:
>
> > I think we should mostly be careful about public APIs. With public
> > APIs we should write out the types and avoid aliases. With
> > implementation details and private/protected class members, I think it
> > is fine to use aliases.
> >
> > On Thu, Nov 21, 2019 at 11:06 AM Antoine Pitrou <so...@pitrou.net>
> > wrote:
> > >
> > > On Thu, 21 Nov 2019 08:40:10 -0500
> > > Francois Saint-Jacques <fs...@gmail.com> wrote:
> > > > This notation is already used in some parts of the codebase [1]. I
> > > > think it was introduced when absorbing gandiva and then in a draft of
> > > > the logical operations in the compute module. I have no strong opinion
> > > > for/against. I find it convenient to reduce typing, but the style
> > > > guide argue against this.
> > > >
> > > > What about other aliases (Vector & Iterator)? If we revert this
> > > > change, we should do it uniformly, e.g. in gandiva and compute.
> > >
> > > Vector and Iterator sound ok to me (though Iterator could yield some
> > > confusion with STL iterators, and Iterator<T> isn't really longer to
> > > type than TIterator).
> > >
> > > Regards
> > >
> > > Antoine.
> > >
> > >
> >
Re: [DISCUSS][C++] Pointer name aliasing
Posted by Micah Kornfield <em...@gmail.com>.
>
> I think we should mostly be careful about public APIs. With public
> APIs we should write out the types and avoid aliases. With
> implementation details and private/protected class members, I think it
> is fine to use aliases.
My concern with this is that in general if the types are in the header
files they have a way of leaking out (whether intentional or not).
On Thu, Nov 21, 2019 at 12:06 PM Wes McKinney <we...@gmail.com> wrote:
> I think we should mostly be careful about public APIs. With public
> APIs we should write out the types and avoid aliases. With
> implementation details and private/protected class members, I think it
> is fine to use aliases.
>
> On Thu, Nov 21, 2019 at 11:06 AM Antoine Pitrou <so...@pitrou.net>
> wrote:
> >
> > On Thu, 21 Nov 2019 08:40:10 -0500
> > Francois Saint-Jacques <fs...@gmail.com> wrote:
> > > This notation is already used in some parts of the codebase [1]. I
> > > think it was introduced when absorbing gandiva and then in a draft of
> > > the logical operations in the compute module. I have no strong opinion
> > > for/against. I find it convenient to reduce typing, but the style
> > > guide argue against this.
> > >
> > > What about other aliases (Vector & Iterator)? If we revert this
> > > change, we should do it uniformly, e.g. in gandiva and compute.
> >
> > Vector and Iterator sound ok to me (though Iterator could yield some
> > confusion with STL iterators, and Iterator<T> isn't really longer to
> > type than TIterator).
> >
> > Regards
> >
> > Antoine.
> >
> >
>
Re: [DISCUSS][C++] Pointer name aliasing
Posted by Wes McKinney <we...@gmail.com>.
I think we should mostly be careful about public APIs. With public
APIs we should write out the types and avoid aliases. With
implementation details and private/protected class members, I think it
is fine to use aliases.
On Thu, Nov 21, 2019 at 11:06 AM Antoine Pitrou <so...@pitrou.net> wrote:
>
> On Thu, 21 Nov 2019 08:40:10 -0500
> Francois Saint-Jacques <fs...@gmail.com> wrote:
> > This notation is already used in some parts of the codebase [1]. I
> > think it was introduced when absorbing gandiva and then in a draft of
> > the logical operations in the compute module. I have no strong opinion
> > for/against. I find it convenient to reduce typing, but the style
> > guide argue against this.
> >
> > What about other aliases (Vector & Iterator)? If we revert this
> > change, we should do it uniformly, e.g. in gandiva and compute.
>
> Vector and Iterator sound ok to me (though Iterator could yield some
> confusion with STL iterators, and Iterator<T> isn't really longer to
> type than TIterator).
>
> Regards
>
> Antoine.
>
>
Re: [DISCUSS][C++] Pointer name aliasing
Posted by Antoine Pitrou <so...@pitrou.net>.
On Thu, 21 Nov 2019 08:40:10 -0500
Francois Saint-Jacques <fs...@gmail.com> wrote:
> This notation is already used in some parts of the codebase [1]. I
> think it was introduced when absorbing gandiva and then in a draft of
> the logical operations in the compute module. I have no strong opinion
> for/against. I find it convenient to reduce typing, but the style
> guide argue against this.
>
> What about other aliases (Vector & Iterator)? If we revert this
> change, we should do it uniformly, e.g. in gandiva and compute.
Vector and Iterator sound ok to me (though Iterator could yield some
confusion with STL iterators, and Iterator<T> isn't really longer to
type than TIterator).
Regards
Antoine.
Re: [DISCUSS][C++] Pointer name aliasing
Posted by Francois Saint-Jacques <fs...@gmail.com>.
This notation is already used in some parts of the codebase [1]. I
think it was introduced when absorbing gandiva and then in a draft of
the logical operations in the compute module. I have no strong opinion
for/against. I find it convenient to reduce typing, but the style
guide argue against this.
What about other aliases (Vector & Iterator)? If we revert this
change, we should do it uniformly, e.g. in gandiva and compute.
François
[1] https://gist.github.com/fsaintjacques/18720eebd9de3bb7770586ed8ec0ef6f
On Thu, Nov 21, 2019 at 6:10 AM Antoine Pitrou <so...@pitrou.net> wrote:
>
> On Wed, 20 Nov 2019 20:50:12 -0800
> Micah Kornfield <em...@gmail.com> wrote:
> > A recent PR for datasets [1] seems to have introduced the convention of
> > aliasing "std::shared_ptr<Type>" with "TypePtr" for some type. I think in
> > the past we had decided not to use a convention like this but I can't find
> > the thread. IMO, I think this generally makes the code less understandable
> > but this is a matter of taste.
>
> I agree this introduces ambiguity for casual readers of the code. The
> question is whether the savings in typing are worth it. Personally,
> I've become used to writing "shared_ptr" a lot :-)
>
> Regards
>
> Antoine.
>
>
Re: [DISCUSS][C++] Pointer name aliasing
Posted by Antoine Pitrou <so...@pitrou.net>.
On Wed, 20 Nov 2019 20:50:12 -0800
Micah Kornfield <em...@gmail.com> wrote:
> A recent PR for datasets [1] seems to have introduced the convention of
> aliasing "std::shared_ptr<Type>" with "TypePtr" for some type. I think in
> the past we had decided not to use a convention like this but I can't find
> the thread. IMO, I think this generally makes the code less understandable
> but this is a matter of taste.
I agree this introduces ambiguity for casual readers of the code. The
question is whether the savings in typing are worth it. Personally,
I've become used to writing "shared_ptr" a lot :-)
Regards
Antoine.