Posted to dev@kudu.apache.org by Todd Lipcon <to...@cloudera.com> on 2017/01/20 00:16:47 UTC

Use of boost non-header-only functionality

Hey folks,

I'm starting to look at implementing a 96-bit timestamp type that would be
compatible with Impala and Parquet's TIMESTAMP. Internally, the 96 bits are
a 32-bit day number (Julian day) and a 64-bit time on that day
(nanoseconds).
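
A minimal sketch of that layout, for illustration only. The names Timestamp96, EncodeInt96 and DecodeInt96 are hypothetical, not existing Kudu types or functions. The byte order (8 little-endian bytes of nanoseconds followed by 4 little-endian bytes of Julian day) is an assumption based on how Impala and Hive are commonly described as writing Parquet INT96 values; this thread itself only states the two components.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical in-memory form of the 96-bit value (not an existing Kudu type).
struct Timestamp96 {
  int64_t nanos_of_day;  // nanoseconds since midnight, in [0, 86400 * 10^9)
  int32_t julian_day;    // Julian day number
};

constexpr int64_t kNanosPerDay = 86400LL * 1000000000LL;

// Writes 12 bytes to 'out'.
// ASSUMPTION: nanos_of_day comes first (8 bytes, little-endian), followed by
// julian_day (4 bytes, little-endian).
inline void EncodeInt96(const Timestamp96& ts, uint8_t* out) {
  assert(ts.nanos_of_day >= 0 && ts.nanos_of_day < kNanosPerDay);
  const uint64_t nanos = static_cast<uint64_t>(ts.nanos_of_day);
  const uint32_t day = static_cast<uint32_t>(ts.julian_day);
  for (int i = 0; i < 8; ++i) out[i] = static_cast<uint8_t>(nanos >> (8 * i));
  for (int i = 0; i < 4; ++i) out[8 + i] = static_cast<uint8_t>(day >> (8 * i));
}

// Reads the 12 bytes written by EncodeInt96 (same layout assumption).
inline Timestamp96 DecodeInt96(const uint8_t* in) {
  uint64_t nanos = 0;
  uint32_t day = 0;
  for (int i = 0; i < 8; ++i) nanos |= static_cast<uint64_t>(in[i]) << (8 * i);
  for (int i = 0; i < 4; ++i) day |= static_cast<uint32_t>(in[8 + i]) << (8 * i);
  return Timestamp96{static_cast<int64_t>(nanos), static_cast<int32_t>(day)};
}
```

Encoding byte by byte keeps the sketch independent of host endianness and struct padding.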

This maps nicely to the boost::posix_time::ptime data type for the purposes
of stringification, etc. Doing manual stringification without using a
library is a bit messy, since it has to account for leap years, etc., when
mapping the day number to a year/month/day.
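
For a sense of what the manual route involves, here is a rough sketch using only the standard library. It is not the boost approach discussed here and not Kudu code: CivilDate, CivilFromJulianDay and FormatTimestamp are hypothetical names, and the day arithmetic is the widely used "civil from days" era-based method for the proleptic Gregorian calendar. It assumes Julian day 2440588 corresponds to 1970-01-01 and that the day starts at midnight.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

struct CivilDate {
  int64_t year;
  int month;  // 1..12
  int day;    // 1..31
};

// Maps a Julian day number to a proleptic Gregorian year/month/day.
// ASSUMPTION: Julian day 2440588 is 1970-01-01.
inline CivilDate CivilFromJulianDay(int32_t julian_day) {
  // Shift so that day 0 is 0000-03-01; starting the year in March puts the
  // leap day at the end of the year, which simplifies the arithmetic.
  const int64_t z = static_cast<int64_t>(julian_day) - 2440588 + 719468;
  const int64_t era = (z >= 0 ? z : z - 146096) / 146097;  // 400-year cycles
  const int64_t doe = z - era * 146097;                    // day of era
  const int64_t yoe =
      (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;  // year of era
  const int64_t doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
  const int64_t mp = (5 * doy + 2) / 153;  // month index, 0 = March
  const int day = static_cast<int>(doy - (153 * mp + 2) / 5 + 1);
  const int month = static_cast<int>(mp < 10 ? mp + 3 : mp - 9);
  const int64_t year = yoe + era * 400 + (month <= 2 ? 1 : 0);
  return CivilDate{year, month, day};
}

// Formats as "YYYY-MM-DD HH:MM:SS.nnnnnnnnn".
inline std::string FormatTimestamp(int32_t julian_day, int64_t nanos_of_day) {
  const int64_t kNanosPerSec = 1000000000LL;
  assert(nanos_of_day >= 0 && nanos_of_day < 86400 * kNanosPerSec);
  const CivilDate d = CivilFromJulianDay(julian_day);
  const int64_t secs = nanos_of_day / kNanosPerSec;
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%04lld-%02d-%02d %02d:%02d:%02d.%09lld",
                static_cast<long long>(d.year), d.month, d.day,
                static_cast<int>(secs / 3600),
                static_cast<int>((secs / 60) % 60),
                static_cast<int>(secs % 60),
                static_cast<long long>(nanos_of_day % kNanosPerSec));
  return std::string(buf);
}
```

Under those assumptions, FormatTimestamp(2451604, 0) yields "2000-02-29 00:00:00.000000000". Time zones, parsing, and input validation are all left out, which is part of why a library is attractive.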

Unfortunately, we currently only use boost in a header-only fashion, and it
appears the necessary functionality is not header-only. I can't quite
recall why we have the header-only restriction - it seems we should be able
to static-link boost like any other library. Does anyone have a problem
with making that change? Was there some issue that I'm forgetting about?

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Use of boost non-header-only functionality

Posted by Todd Lipcon <to...@cloudera.com>.
It seems like the Java 8 time library supports it:
http://stackoverflow.com/questions/14988459/how-do-i-use-julian-day-numbers-with-the-java-calendar-api
Joda-Time may support it as well.

Hive seems to use 'jodd':
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java
http://jodd.org/doc/jdatetime.html



On Thu, Jan 19, 2017 at 4:34 PM, Dan Burkert <da...@apache.org> wrote:

> Is there an equivalent library for the JVM?
>
> - Dan

Re: Use of boost non-header-only functionality

Posted by Dan Burkert <da...@apache.org>.
Is there an equivalent library for the JVM?

- Dan


Re: Use of boost non-header-only functionality

Posted by Todd Lipcon <to...@cloudera.com>.
On Thu, Jan 19, 2017 at 5:25 PM, Adar Dembo <ad...@cloudera.com> wrote:

> At one point some of us thought to (eventually) eradicate boost from
> Kudu altogether, hence drawing the line at not building boost
> libraries. I think that culminated with Mike's reimplementation of
> boost::optional. Given that it was eventually abandoned after
> pushback, I guess the idea of no boost at all is a dead one.
>

Yeah, while I think a lot of boost is somewhat questionable, a blanket "no
boost" rule seems like a puritanical restriction.


>
> Technically there's no issue with reintroducing boost libraries (now
> that it's been vendored, as Dan said). But if we're going to do that,
> perhaps we should spell out in more detail how one should answer the
> question of "I need X and I see it in at least two of std::,
> boost::, or kudu::. Which do I use?".
>

I just put up a candidate change to the style guide:
https://gerrit.cloudera.org/#/c/5752/
Please take a look.

-Todd


>
> Git commit 5b83695 has more color.

Re: Use of boost non-header-only functionality

Posted by Adar Dembo <ad...@cloudera.com>.
At one point some of us thought to (eventually) eradicate boost from
Kudu altogether, hence drawing the line at not building boost
libraries. I think that culminated with Mike's reimplementation of
boost::optional. Given that it was eventually abandoned after
pushback, I guess the idea of no boost at all is a dead one.

Technically there's no issue with reintroducing boost libraries (now
that it's been vendored, as Dan said). But if we're going to do that,
perhaps we should spell out in more detail how one should answer the
question of "I need X and I see it in at least two of std::,
boost::, or kudu::. Which do I use?".

Git commit 5b83695 has more color.


Re: Use of boost non-header-only functionality

Posted by Dan Burkert <da...@apache.org>.
I think for a long time we resisted vendoring boost, so staying away from
the dynamically linked parts of boost helped us avoid incompatibilities
between system boost versions. I recall we are vendoring boost now, so
that should be less of a concern, I would think.

- Dan
