Posted to dev@parquet.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2016/11/06 19:18:59 UTC
[jira] [Resolved] (PARQUET-764) [CPP] Parquet Writer does not write Boolean values correctly
[ https://issues.apache.org/jira/browse/PARQUET-764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wes McKinney resolved PARQUET-764.
----------------------------------
Resolution: Fixed
Fix Version/s: cpp-0.1
Issue resolved by pull request 185
[https://github.com/apache/parquet-cpp/pull/185]
> [CPP] Parquet Writer does not write Boolean values correctly
> ------------------------------------------------------------
>
> Key: PARQUET-764
> URL: https://issues.apache.org/jira/browse/PARQUET-764
> Project: Parquet
> Issue Type: Bug
> Reporter: Deepak Majeti
> Assignee: Uwe L. Korn
> Fix For: cpp-0.1
>
>
> The core of the problem is in https://github.com/apache/parquet-cpp/blob/master/src/parquet/encodings/plain-encoding.h#L203
> Bit packing happens on every Write(), but the packing is done at the byte level. If the number of (1-bit) values in a call is not a multiple of 8, the remaining bits of the last byte are padded with incorrect values (false for boolean).
> To reproduce, add the following test to src/parquet/column/column-writer-test.cc:
> {code}
> using TestBooleanValuesWriter = TestPrimitiveWriter<BooleanType>;
>
> TEST_F(TestBooleanValuesWriter, AlternateBooleanValues) {
>   this->SetUpSchema(Repetition::REQUIRED);
>   auto writer = this->BuildWriter();
>   // Write one value per batch so that every Write() sees fewer than 8 values.
>   for (int i = 0; i < SMALL_SIZE; i++) {
>     bool value = (i % 2 == 0);
>     writer->WriteBatch(1, nullptr, nullptr, &value);
>   }
>   writer->Close();
>   this->ReadColumn();
>   // Every even-indexed value should read back as true.
>   for (int i = 0; i < SMALL_SIZE; i++) {
>     ASSERT_EQ((i % 2 == 0), this->values_out_[i]) << i;
>   }
> }
> {code}
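
The padding effect described in the issue can be illustrated with a small standalone sketch. This is not the parquet-cpp encoder and not the change made in pull request 185; the names PerCallPacker, StreamingPacker and ReadBit are hypothetical, and the sketch assumes values are packed least-significant bit first. It contrasts a packer that rounds every Write() up to whole bytes with one that carries the bit position across calls.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration only; these classes are not part of parquet-cpp.

// Flawed approach: every call packs its values into whole bytes, so a call
// with fewer than 8 values leaves zero (false) padding bits in the stream.
class PerCallPacker {
 public:
  void Write(const bool* values, std::size_t n) {
    const std::size_t base = bytes_.size();
    bytes_.resize(base + (n + 7) / 8, 0);  // round this call up to whole bytes
    for (std::size_t i = 0; i < n; i++) {
      if (values[i]) {
        bytes_[base + i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
      }
    }
  }
  const std::vector<std::uint8_t>& bytes() const { return bytes_; }

 private:
  std::vector<std::uint8_t> bytes_;
};

// One possible remedy: remember how many bits were written so far, so that
// consecutive calls continue filling the same byte.
class StreamingPacker {
 public:
  void Write(const bool* values, std::size_t n) {
    for (std::size_t i = 0; i < n; i++) {
      if (num_bits_ % 8 == 0) bytes_.push_back(0);  // start a new byte
      if (values[i]) {
        bytes_.back() |= static_cast<std::uint8_t>(1u << (num_bits_ % 8));
      }
      ++num_bits_;
    }
  }
  const std::vector<std::uint8_t>& bytes() const { return bytes_; }

 private:
  std::vector<std::uint8_t> bytes_;
  std::size_t num_bits_ = 0;
};

// Reads value i, assuming a densely packed, LSB-first bit stream.
inline bool ReadBit(const std::vector<std::uint8_t>& bytes, std::size_t i) {
  return ((bytes[i / 8] >> (i % 8)) & 1u) != 0;
}
```

Writing eight alternating values one at a time, as the test above does, makes PerCallPacker emit eight bytes where a reader expects one, so the value at index 2 reads back as false. StreamingPacker emits the single byte 0x55 and all eight values read back correctly.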
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)