Posted to dev@parquet.apache.org by Aliaksei Sandryhaila <as...@apache.org> on 2016/03/07 17:22:44 UTC

Parquet-cpp dependency on C++11

Hi Wes and Julien,

At this point, parquet-cpp is heavily reliant on C++11 features and semantics. Believe it or not :), there are plenty of companies still
running older versions of Linux that do not support C++11. Removing this dependency will make parquet-cpp usable (and much more appealing) to them.

We would like to make parquet-cpp C++09 compatible. The end goal is a library that compiles both with and without the -std=c++11 flag. There are two parts to this process. The first is to redefine or remove C++11 language and library features, such as auto, unique_ptr, std::move, or range-based for loops. The second is to evaluate our use of C++11 features that are harder to replace, such as shared_ptr and make_shared(), and either write our own implementations or modify the code where appropriate (for example, replacing shared_ptr with unique_ptr where possible).
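
To make the first part concrete, here is a minimal sketch of what such a rewrite could look like. It is not taken from parquet-cpp: ScopedPtr and SumValues are hypothetical names used only for illustration. The code avoids auto, range-based for, unique_ptr and std::move, so it should build both with and without the C++11 flag:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for std::unique_ptr that builds without C++11.
// Copying is disabled the pre-C++11 way: declared private, never defined.
template <typename T>
class ScopedPtr {
 public:
  explicit ScopedPtr(T* ptr = NULL) : ptr_(ptr) {}
  ~ScopedPtr() { delete ptr_; }

  T* get() const { return ptr_; }
  T& operator*() const { return *ptr_; }
  T* operator->() const { return ptr_; }

  // Without std::move, ownership is handed over explicitly.
  T* release() {
    T* released = ptr_;
    ptr_ = NULL;
    return released;
  }

 private:
  ScopedPtr(const ScopedPtr&);
  ScopedPtr& operator=(const ScopedPtr&);
  T* ptr_;
};

// A C++11 version would read: for (auto v : values) total += v;
inline long SumValues(const std::vector<int>& values) {
  long total = 0;
  // The spelled-out iterator type replaces auto, and the classic loop
  // replaces the range-based for.
  for (std::vector<int>::const_iterator it = values.begin();
       it != values.end(); ++it) {
    total += *it;
  }
  return total;
}
```

Even in this small case the cost is visible: transferring ownership goes through an explicit release() call instead of a move.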

We can do this either by maintaining a separate feature branch and periodically pulling new code from parquet-cpp; or by implementing the compatibility functionality directly in parquet-cpp (all future PRs will be tested for c++09 compatibility during CI builds).

What are your thoughts on this?

Thank you,
Aliaksei.


Re: Parquet-cpp dependency on C++11

Posted by Uwe Korn <uw...@xhochy.com>.
Hello,

Ubuntu 12.04 (whose default compiler is GCC 4.6) does have C++11 support. It
is only partial, but it covers the most common features. It is called C++0x
there because the standard had not been finalized when GCC 4.6 was released.
A good overview is https://gcc.gnu.org/projects/cxx0x.html and the linked
status subpages.
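
One practical consequence is that a header cannot rely on __cplusplus alone to detect this mode. The following is a minimal sketch, not parquet-cpp code: EXAMPLE_HAS_CXX11 and LanguageMode() are made-up names. It assumes that, as far as I know, GCC releases before 4.7 do not yet report 201103L in __cplusplus but do define __GXX_EXPERIMENTAL_CXX0X__ when invoked with -std=c++0x:

```cpp
// Treat the build as C++11-capable if either the standard macro says so or
// GCC's C++0x mode is active (assumption: older GCC sets only the latter).
#if __cplusplus >= 201103L || defined(__GXX_EXPERIMENTAL_CXX0X__)
#define EXAMPLE_HAS_CXX11 1
#else
#define EXAMPLE_HAS_CXX11 0
#endif

// Returns a short description of the mode this file was compiled in.
inline const char* LanguageMode() {
  return EXAMPLE_HAS_CXX11 ? "C++11 or C++0x mode" : "pre-C++11 mode";
}
```

Because the support in 4.6 is partial, a check like this only says that the mode is on, not that every C++11 feature is available.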

Cheers,
Uwe

On 07.03.16 18:24, Ryan Blue wrote:
> From some quick searching, it looks like C++11 is supported on Ubuntu 14.04
> LTS but not on 12.04 LTS. Considering that 14.04 is already nearly 2 years
> old (and 16.04 comes out soon), I think it is fairly reasonable to depend
> on C++11 even though 12.04 still has another 2 years of life. Everyone has
> had 2 years to update to the current LTS.
>
> I only looked into Ubuntu, but I'm guessing that this is about the same for
> redhat or centos. I think we should stay with C++11 and expect anyone on
> the old releases to install newer C++ libs if they want to use Parquet-CPP,
> unless there's some reason I'm missing why this is a more wide-spread
> problem than it looks like.
>
> rb


Re: Parquet-cpp dependency on C++11

Posted by Aliaksei Sandryhaila <as...@apache.org>.
Hi Ryan,

Yes, this is a reasonable expectation.

Something I forgot to mention is Visual C++. AFAIK, it still has only
partial support for C++11 features.

From your and Wes's feedback, it looks like we'll keep C++09
compatibility separate from the master branch.

Thank you,
Aliaksei.


On 03/07/2016 12:24 PM, Ryan Blue wrote:
> From some quick searching, it looks like C++11 is supported on Ubuntu 14.04
> LTS but not on 12.04 LTS. Considering that 14.04 is already nearly 2 years
> old (and 16.04 comes out soon), I think it is fairly reasonable to depend
> on C++11 even though 12.04 still has another 2 years of life. Everyone has
> had 2 years to update to the current LTS.
>
> I only looked into Ubuntu, but I'm guessing that this is about the same for
> redhat or centos. I think we should stay with C++11 and expect anyone on
> the old releases to install newer C++ libs if they want to use Parquet-CPP,
> unless there's some reason I'm missing why this is a more wide-spread
> problem than it looks like.
>
> rb


Re: Parquet-cpp dependency on C++11

Posted by Ryan Blue <rb...@netflix.com.INVALID>.
From some quick searching, it looks like C++11 is supported on Ubuntu 14.04
LTS but not on 12.04 LTS. Considering that 14.04 is already nearly 2 years
old (and 16.04 comes out soon), I think it is fairly reasonable to depend
on C++11 even though 12.04 still has another 2 years of life. Everyone has
had 2 years to update to the current LTS.

I only looked into Ubuntu, but I'm guessing that this is about the same for
redhat or centos. I think we should stay with C++11 and expect anyone on
the old releases to install newer C++ libs if they want to use Parquet-CPP,
unless there's some reason I'm missing why this is a more wide-spread
problem than it looks like.

rb

On Mon, Mar 7, 2016 at 9:02 AM, Wes McKinney <we...@cloudera.com> wrote:

> hello,
>
> responses inline
>
> On Mon, Mar 7, 2016 at 8:22 AM, Aliaksei Sandryhaila
> <as...@apache.org> wrote:
> > Hi Wes and Julien,
> >
> > At this point, parquet-cpp is heavily reliant on C++11 features and
> > semantics. Believe it or not :), there are plenty of companies still
> > running older versions of Linux that do not support C++11. Removing this
> > dependency will make parquet-cpp usable (and much more appealing) to
> > them.
> >
>
> Just to be clear -- is this a problem for you specifically? Any other
> context would be helpful.
>
> It is not especially difficult to set up a portable C++11 build
> toolchain even on Linux distributions that do not have a new enough
> gcc in their package repository. Both Impala and Kudu have recently
> developed isolated 3rd-party toolchains to facilitate development and
> packaging for these systems. See for example
> https://github.com/cloudera/native-toolchain
>
> > We would like to make parquet-cpp C++09 compatible. The end goal is a
> > library that compiles both with and without the -std=c++11 flag. There
> > are two parts to this process. The first is to redefine or remove C++11
> > language and library features, such as auto, unique_ptr, std::move, or
> > range-based for loops. The second is to evaluate our use of C++11
> > features that are harder to replace, such as shared_ptr and
> > make_shared(), and either write our own implementations or modify the
> > code where appropriate (for example, replacing shared_ptr with
> > unique_ptr where possible).
> >
> > We can do this either by maintaining a separate feature branch and
> > periodically pulling new code from parquet-cpp; or by implementing the
> > compatibility functionality directly in parquet-cpp (all future PRs will
> > be tested for c++09 compatibility during CI builds).
> >
>
> I'm fairly negative on dropping C++11 in trunk / main library
> development -- it would be a hardship for me personally, and
> additionally deter software engineers who are increasingly coming back
> to C++ development because of C++11/14.
>
> This leaves legacy C++<11 projects that wish to use parquet-cpp as a
> 3rd-party dependency somewhat out in the cold. One approach is to
> provide a wrapper API for projects that cannot interact with APIs that
> use C++11 facilities (like std::unique_ptr). The same approach could
> be used to provide a C API for the project. A wrapper API would be
> much easier to maintain and test without having a separate branch to
> keep in sync -- there might be some pitfalls here that I'm not aware
> of so let me know what you think.
>
> Thanks,
> Wes
>
> > What are your thoughts on this?
> >
> > Thank you,
> > Aliaksei.
> >
>



-- 
Ryan Blue
Software Engineer
Netflix

Re: Parquet-cpp dependency on C++11

Posted by Aliaksei Sandryhaila <as...@gmail.com>.
Hi Todd,

We are aware of it, but so far we haven't had a strong enough case to 
use it. Maybe this will change with parquet-cpp.

Regards,
Aliaksei.


On 03/07/2016 03:04 PM, Todd Lipcon wrote:
> Have you looked into using the RHEL "devtoolset"? It allows you to use gcc
> 4.9 and a newer libstdcxx, and automatically static-links the necessary
> portions of the library into your application so that it can continue to
> run on earlier RHEL systems.
>
> This is what we're doing now with Kudu.
>
> -Todd


Re: Parquet-cpp dependency on C++11

Posted by Todd Lipcon <to...@cloudera.com>.
Have you looked into using the RHEL "devtoolset"? It allows you to use gcc
4.9 and a newer libstdcxx, and automatically static-links the necessary
portions of the library into your application so that it can continue to
run on earlier RHEL systems.

This is what we're doing now with Kudu.

-Todd

On Mon, Mar 7, 2016 at 11:59 AM, Aliaksei Sandryhaila <as...@apache.org>
wrote:

> Wes,
>
> We do have customers that use older Linux versions and use Vertica
> compiled without c++11. Since we would like to integrate parquet-cpp into
> our product, we need to deal with the c++11 dependency.
>
> We can maintain this as a separate branch, if you and others don't feel
> it's worthwhile incorporating this functionality into the master.
>
> Thank you,
> Aliaksei.


-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Parquet-cpp dependency on C++11

Posted by Aliaksei Sandryhaila <as...@apache.org>.
Wes,

We do have customers that use older Linux versions and use Vertica 
compiled without c++11. Since we would like to integrate parquet-cpp 
into our product, we need to deal with the c++11 dependency.

We can maintain this as a separate branch, if you and others don't feel 
it's worthwhile incorporating this functionality into the master.

Thank you,
Aliaksei.


On 03/07/2016 12:02 PM, Wes McKinney wrote:
> hello,
>
> responses inline
>
> On Mon, Mar 7, 2016 at 8:22 AM, Aliaksei Sandryhaila
> <as...@apache.org> wrote:
>> Hi Wes and Julien,
>>
>> At this point, parquet-cpp is heavily reliant on C++11 features and
>> semantics. Believe it or not :), there are plenty of companies still
>> running older versions of Linux that do not support C++11. Removing this
>> dependency will make parquet-cpp usable (and much more appealing) to them.
>>
> Just to be clear -- is this a problem for you specifically? Any other
> context would be helpful.
>
> It is not especially difficult to set up a portable C++11 build
> toolchain even on Linux distributions that do not have a new enough
> gcc in their package repository. Both Impala and Kudu have recently
> developed isolated 3rd-party toolchains to facilitate development and
> packaging for these systems. See for example
> https://github.com/cloudera/native-toolchain
>
>> We would like to make parquet-cpp C++09 compatible. The end goal is a
>> library that compiles both with and without the -std=c++11 flag. There are
>> two parts to this process. The first is to redefine or remove C++11
>> language and library features, such as auto, unique_ptr, std::move, or
>> range-based for loops. The second is to evaluate our use of C++11 features
>> that are harder to replace, such as shared_ptr and make_shared(), and
>> either write our own implementations or modify the code where appropriate
>> (for example, replacing shared_ptr with unique_ptr where possible).
>>
>> We can do this either by maintaining a separate feature branch and
>> periodically pulling new code from parquet-cpp; or by implementing the
>> compatibility functionality directly in parquet-cpp (all future PRs will be
>> tested for c++09 compatibility during CI builds).
>>
> I'm fairly negative on dropping C++11 in trunk / main library
> development -- it would be a hardship for me personally, and
> additionally deter software engineers who are increasingly coming back
> to C++ development because of C++11/14.
>
> This leaves legacy C++<11 projects that wish to use parquet-cpp as a
> 3rd-party dependency somewhat out in the cold. One approach is to
> provide a wrapper API for projects that cannot interact with APIs that
> use C++11 facilities (like std::unique_ptr). The same approach could
> be used to provide a C API for the project. A wrapper API would be
> much easier to maintain and test without having a separate branch to
> keep in sync -- there might be some pitfalls here that I'm not aware
> of so let me know what you think.
>
> Thanks,
> Wes
>
>> What are your thoughts on this?
>>
>> Thank you,
>> Aliaksei.
>>


Re: Parquet-cpp dependency on C++11

Posted by Wes McKinney <we...@cloudera.com>.
hello,

responses inline

On Mon, Mar 7, 2016 at 8:22 AM, Aliaksei Sandryhaila
<as...@apache.org> wrote:
> Hi Wes and Julien,
>
> At this point, parquet-cpp is heavily reliant on C++11 features and
> semantics. Believe it or not :), there are plenty of companies still
> running older versions of Linux that do not support C++11. Removing this
> dependency will make parquet-cpp usable (and much more appealing) to them.
>

Just to be clear -- is this a problem for you specifically? Any other
context would be helpful.

It is not especially difficult to set up a portable C++11 build
toolchain even on Linux distributions that do not have a new enough
gcc in their package repository. Both Impala and Kudu have recently
developed isolated 3rd-party toolchains to facilitate development and
packaging for these systems. See for example
https://github.com/cloudera/native-toolchain

> We would like to make parquet-cpp C++09 compatible. The end goal is a
> library that compiles both with and without the -std=c++11 flag. There are
> two parts to this process. The first is to redefine or remove C++11
> language and library features, such as auto, unique_ptr, std::move, or
> range-based for loops. The second is to evaluate our use of C++11 features
> that are harder to replace, such as shared_ptr and make_shared(), and
> either write our own implementations or modify the code where appropriate
> (for example, replacing shared_ptr with unique_ptr where possible).
>
> We can do this either by maintaining a separate feature branch and
> periodically pulling new code from parquet-cpp; or by implementing the
> compatibility functionality directly in parquet-cpp (all future PRs will be
> tested for c++09 compatibility during CI builds).
>

I'm fairly negative on dropping C++11 in trunk / main library
development -- it would be a hardship for me personally, and
additionally deter software engineers who are increasingly coming back
to C++ development because of C++11/14.

This leaves legacy C++<11 projects that wish to use parquet-cpp as a
3rd-party dependency somewhat out in the cold. One approach is to
provide a wrapper API for projects that cannot interact with APIs that
use C++11 facilities (like std::unique_ptr). The same approach could
be used to provide a C API for the project. A wrapper API would be
much easier to maintain and test without having a separate branch to
keep in sync -- there might be some pitfalls here that I'm not aware
of so let me know what you think.
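
As a rough sketch of the wrapper idea (all names here, example::FileReader and the example_reader_* functions, are hypothetical and are not existing parquet-cpp APIs), the C++11 types stay inside the implementation and callers only see an opaque handle plus plain functions:

```cpp
#include <memory>
#include <string>

// ---- Implementation side, compiled as C++11 ----
// Hypothetical stand-in for a library class whose API uses std::unique_ptr.
namespace example {
class FileReader {
 public:
  static std::unique_ptr<FileReader> Open(const std::string& path) {
    return std::unique_ptr<FileReader>(new FileReader(path));
  }
  const std::string& path() const { return path_; }

 private:
  explicit FileReader(const std::string& path) : path_(path) {}
  std::string path_;
};
}  // namespace example

// ---- Wrapper declarations ----
// In a real project these would live in a header that contains no C++11,
// so that C or pre-C++11 callers can include it.
extern "C" {
typedef struct example_reader example_reader;  // opaque handle
example_reader* example_reader_open(const char* path);
const char* example_reader_path(const example_reader* reader);
void example_reader_close(example_reader* reader);
}

// ---- Wrapper definitions, compiled as C++11 ----
struct example_reader {
  std::unique_ptr<example::FileReader> impl;
};

extern "C" {
example_reader* example_reader_open(const char* path) {
  if (path == nullptr) return nullptr;
  try {
    std::unique_ptr<example_reader> handle(new example_reader);
    handle->impl = example::FileReader::Open(path);
    return handle.release();
  } catch (...) {
    // Exceptions must not cross the C boundary; report failure as NULL.
    return nullptr;
  }
}

const char* example_reader_path(const example_reader* reader) {
  return reader != nullptr ? reader->impl->path().c_str() : nullptr;
}

void example_reader_close(example_reader* reader) {
  delete reader;  // deleting a null pointer is a no-op
}
}
```

A legacy caller would only ever include the declarations and link against the wrapper, for example: open a handle with example_reader_open("data.parquet"), query it, then release it with example_reader_close(). The library binary itself still needs a C++11 toolchain to build.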

Thanks,
Wes

> What are your thoughts on this?
>
> Thank you,
> Aliaksei.
>