Posted to issues@orc.apache.org by "Gang Wu (JIRA)" <ji...@apache.org> on 2017/03/22 23:37:41 UTC
[jira] [Commented] (ORC-37) Represent the in memory timestamps using UTC rather than the local timezone.
[ https://issues.apache.org/jira/browse/ORC-37?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937369#comment-15937369 ]
Gang Wu commented on ORC-37:
----------------------------
Looks like this is already fixed in the following code:
void TimestampColumnReader::next(ColumnVectorBatch& rowBatch,
                                 uint64_t numValues,
                                 char *notNull) {
  ColumnReader::next(rowBatch, numValues, notNull);
  notNull = rowBatch.hasNulls ? rowBatch.notNull.data() : nullptr;
  TimestampVectorBatch& timestampBatch =
    dynamic_cast<TimestampVectorBatch&>(rowBatch);
  int64_t *secsBuffer = timestampBatch.data.data();
  secondsRle->next(secsBuffer, numValues, notNull);
  int64_t *nanoBuffer = timestampBatch.nanoseconds.data();
  nanoRle->next(nanoBuffer, numValues, notNull);
  // Construct the values
  for(uint64_t i=0; i < numValues; i++) {
    if (notNull == nullptr || notNull[i]) {
      uint64_t zeros = nanoBuffer[i] & 0x7;
      nanoBuffer[i] >>= 3;
      if (zeros != 0) {
        for(uint64_t j = 0; j <= zeros; ++j) {
          nanoBuffer[i] *= 10;
        }
      }
      int64_t writerTime = secsBuffer[i] + epochOffset;
      secsBuffer[i] = writerTime +
        writerTimezone.getVariant(writerTime).gmtOffset;
      if (secsBuffer[i] < 0 && nanoBuffer[i] != 0) {
        secsBuffer[i] -= 1;
      }
    }
  }
}
Correct me if I'm wrong. Thanks!
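
To make the arithmetic in the loop above easier to check by hand, here is a minimal standalone sketch of the same per-value steps. The helper names decodeNanos and toUtcSeconds are hypothetical and do not exist in the ORC code base, and gmtOffset is passed in as a plain number instead of being looked up through writerTimezone.getVariant(). It only mirrors the arithmetic of the quoted loop and is not the reader itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: mirrors the nanosecond decoding in the quoted loop.
// The low 3 bits are read as `zeros`; when nonzero, the remaining bits
// are multiplied by 10 a total of (zeros + 1) times.
int64_t decodeNanos(int64_t encoded) {
  assert(encoded >= 0);  // assumption: encoded nanoseconds are non-negative
  uint64_t zeros = static_cast<uint64_t>(encoded) & 0x7;
  int64_t nanos = encoded >> 3;
  if (zeros != 0) {
    for (uint64_t j = 0; j <= zeros; ++j) {
      nanos *= 10;
    }
  }
  return nanos;
}

// Hypothetical helper: mirrors the seconds adjustment in the quoted loop.
// `gmtOffset` stands in for writerTimezone.getVariant(writerTime).gmtOffset.
int64_t toUtcSeconds(int64_t storedSecs, int64_t epochOffset,
                     int64_t gmtOffset, int64_t nanos) {
  int64_t writerTime = storedSecs + epochOffset;
  int64_t secs = writerTime + gmtOffset;
  // Same correction as the quoted code: negative seconds with a nonzero
  // nanosecond part are moved down by one second.
  if (secs < 0 && nanos != 0) {
    secs -= 1;
  }
  return secs;
}
```

For example, decodeNanos(47) reads zeros = 7 and a remaining value of 5, then multiplies by 10 eight times, giving 500000000. A value whose low three bits are zero, such as 123456789 << 3, is returned as 123456789 without any scaling.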
> Represent the in memory timestamps using UTC rather than the local timezone.
> ----------------------------------------------------------------------------
>
> Key: ORC-37
> URL: https://issues.apache.org/jira/browse/ORC-37
> Project: ORC
> Issue Type: Improvement
> Components: C++
> Reporter: Owen O'Malley
> Assignee: Owen O'Malley
>
> Change the representation of TimestampVectorBatch to be in UTC rather than local time.
> The advantages are:
> * More closely matches the SQL semantics of timestamp without timezone.
> * Allows accurate representation of all values including the ones that occur
> during the local leap forward/back for daylight savings.
> * One less timezone conversion.
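
The second advantage in the list above can be illustrated with a small sketch. It assumes a made-up zone (ToyZone, not an ORC type) that switches from UTC-7 to UTC-8 at a single transition instant, which is roughly what a "fall back" looks like. Two different UTC instants then map to the same local wall-clock value, so a batch stored in local time cannot tell them apart, while a batch stored in UTC can.

```cpp
#include <cstdint>

// Hypothetical zone used only for illustration: one offset before the
// transition instant (in UTC seconds) and another from it onward.
struct ToyZone {
  int64_t transition;    // UTC seconds at which the offset changes
  int64_t offsetBefore;  // seconds added to UTC before the transition
  int64_t offsetAfter;   // seconds added to UTC from the transition onward

  int64_t offsetAt(int64_t utc) const {
    return utc < transition ? offsetBefore : offsetAfter;
  }

  // Local wall-clock seconds for a given UTC instant.
  int64_t toLocal(int64_t utc) const {
    return utc + offsetAt(utc);
  }
};

// Returns true when two distinct UTC instants collapse onto the same
// local value, i.e. the local representation is ambiguous for them.
bool collidesInLocalTime(const ToyZone& zone, int64_t utcA, int64_t utcB) {
  return utcA != utcB && zone.toLocal(utcA) == zone.toLocal(utcB);
}
```

With ToyZone{1000000, -25200, -28800}, the instants 998200 (30 minutes before the transition) and 1001800 (one hour later) both map to local value 973000, so collidesInLocalTime returns true for that pair.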