- [GitHub] [parquet-mr] wgtmac commented on pull request #985: PARQUET-2173. Fix parquet build against hadoop 3.3.3+ - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/01 02:00:32 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2173) Fix parquet build against hadoop 3.3.3+ - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/01 02:01:00 UTC, 2 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/01 02:02:52 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2149) Implement async IO for Parquet file reader - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/01 02:03:00 UTC, 6 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/01 07:26:33 UTC, 9 replies.
- [jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/01 07:27:00 UTC, 65 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/01 07:29:48 UTC, 0 replies.
- Parquet array schema incompatibilities - posted by Laurynas Katkus <la...@mambu.com> on 2023/02/01 16:09:31 UTC, 0 replies.
- [GitHub] [parquet-format] emkornfield commented on pull request #184: PARQUET-758: Add Float16/Half-float logical type - posted by "emkornfield (via GitHub)" <gi...@apache.org> on 2023/02/01 17:48:04 UTC, 1 replies.
- [jira] [Commented] (PARQUET-758) [Format] HALF precision FLOAT Logical type - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/01 17:49:00 UTC, 5 replies.
- [GitHub] [parquet-format] pitrou commented on pull request #184: PARQUET-758: Add Float16/Half-float logical type - posted by "pitrou (via GitHub)" <gi...@apache.org> on 2023/02/01 17:50:38 UTC, 1 replies.
- [GitHub] [parquet-format] shangxinli commented on pull request #184: PARQUET-758: Add Float16/Half-float logical type - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/02/01 18:27:42 UTC, 0 replies.
- [C++] Parquet and Arrow overlap - posted by Will Jones <wi...@gmail.com> on 2023/02/01 19:27:22 UTC, 5 replies.
- [GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "jiangjiguang (via GitHub)" <gi...@apache.org> on 2023/02/02 02:32:29 UTC, 3 replies.
- [GitHub] [parquet-mr] jiangjiguang commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "jiangjiguang (via GitHub)" <gi...@apache.org> on 2023/02/02 03:54:28 UTC, 26 replies.
- [GitHub] [parquet-mr] gszadovszky merged pull request #985: PARQUET-2173. Fix parquet build against hadoop 3.3.3+ - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/02 07:18:48 UTC, 0 replies.
- [GitHub] [parquet-mr] steveloughran commented on pull request #985: PARQUET-2173. Fix parquet build against hadoop 3.3.3+ - posted by "steveloughran (via GitHub)" <gi...@apache.org> on 2023/02/02 11:32:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2173) Fix parquet build against hadoop 3.3.3+ - posted by "Steve Loughran (Jira)" <ji...@apache.org> on 2023/02/02 11:32:00 UTC, 0 replies.
- [GitHub] [parquet-mr] jianchun opened a new pull request, #1022: PARQUET-831: fix estimate page size check overflow corrupting parquet - posted by "jianchun (via GitHub)" <gi...@apache.org> on 2023/02/03 18:34:49 UTC, 0 replies.
- [jira] [Commented] (PARQUET-831) Corrupt Parquet Files - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/03 18:35:00 UTC, 10 replies.
- [jira] [Created] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly - posted by "Mars (Jira)" <ji...@apache.org> on 2023/02/04 09:09:00 UTC, 0 replies.
- [jira] [Updated] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly - posted by "Mars (Jira)" <ji...@apache.org> on 2023/02/04 09:14:00 UTC, 3 replies.
- [GitHub] [parquet-mr] yabola opened a new pull request, #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/02/04 09:21:46 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/04 09:22:00 UTC, 83 replies.
- [jira] [Created] (PARQUET-2238) Spec and parquet-mr disagree on DELTA_BYTE_ARRAY encoding - posted by "Jan Finis (Jira)" <ji...@apache.org> on 2023/02/04 11:31:00 UTC, 0 replies.
- [jira] [Updated] (PARQUET-2238) Spec and parquet-mr disagree on DELTA_BYTE_ARRAY encoding - posted by "Jan Finis (Jira)" <ji...@apache.org> on 2023/02/04 11:33:00 UTC, 1 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/04 16:00:37 UTC, 10 replies.
- [GitHub] [parquet-mr] jatin-bhateja commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "jatin-bhateja (via GitHub)" <gi...@apache.org> on 2023/02/04 16:01:32 UTC, 6 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1022: PARQUET-831: fix estimate page size check overflow corrupting parquet - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/04 16:14:31 UTC, 0 replies.
- [GitHub] [parquet-mr] jianchun commented on pull request #1022: PARQUET-831: fix estimate page size check overflow corrupting parquet - posted by "jianchun (via GitHub)" <gi...@apache.org> on 2023/02/04 17:09:04 UTC, 0 replies.
- [GitHub] [parquet-mr] yabola commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/02/05 08:26:14 UTC, 11 replies.
- [GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/02/05 09:24:15 UTC, 46 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/05 14:54:19 UTC, 2 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1022: PARQUET-831: fix estimate page size check overflow corrupting parquet - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/06 08:47:40 UTC, 2 replies.
- [jira] [Created] (PARQUET-2239) Replace log4j1 with reload4j - posted by "Akshat Mathur (Jira)" <ji...@apache.org> on 2023/02/06 10:51:00 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2239) Replace log4j1 with reload4j - posted by "Steve Loughran (Jira)" <ji...@apache.org> on 2023/02/06 11:27:00 UTC, 1 replies.
- [GitHub] [parquet-mr] shangxinli commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/02/06 15:20:29 UTC, 0 replies.
- [GitHub] [parquet-mr] jianchun commented on a diff in pull request #1022: PARQUET-831: fix estimate page size check overflow corrupting parquet - posted by "jianchun (via GitHub)" <gi...@apache.org> on 2023/02/06 17:36:02 UTC, 3 replies.
- [GitHub] [parquet-mr] sunchao commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/02/06 20:16:41 UTC, 2 replies.
- [GitHub] [parquet-mr] dongjoon-hyun commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/02/06 20:46:09 UTC, 0 replies.
- [jira] [Created] (PARQUET-2240) DateTimeFormatter is used in static context, but not thread safe - posted by "Shani Elharrar (Jira)" <ji...@apache.org> on 2023/02/07 09:06:00 UTC, 0 replies.
- [GitHub] [parquet-format] XinyuZeng opened a new pull request, #190: Minor: add FIXED_LEN_BYTE_ARRAY under Types in doc - posted by "XinyuZeng (via GitHub)" <gi...@apache.org> on 2023/02/07 09:11:06 UTC, 0 replies.
- [jira] [Updated] (PARQUET-2240) DateTimeFormatter is used in static context, but not thread safe - posted by "Shani Elharrar (Jira)" <ji...@apache.org> on 2023/02/07 10:37:00 UTC, 0 replies.
- [GitHub] [parquet-format] shangxinli commented on pull request #164: PARQUET-1950: Define core features - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/02/07 19:13:32 UTC, 0 replies.
- [jira] [Commented] (PARQUET-1950) Define core features / compliance level - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/07 19:14:00 UTC, 3 replies.
- [GitHub] [parquet-format] shangxinli commented on pull request #175: PARQUET-2005: Upgrade Apache Thrift to 0.14.1 - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/02/07 19:15:14 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2005) Upgrade thrift to 0.14.1 - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/07 19:16:00 UTC, 1 replies.
- [GitHub] [parquet-mr] shangxinli commented on pull request #1021: PARQUET-2229: ParquetRewriter masks and encrypts the same column - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/02/07 19:21:10 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2229) ParquetRewriter supports masking and encrypting the same column - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/07 19:22:00 UTC, 2 replies.
- [GitHub] [parquet-format] Fokko merged pull request #175: PARQUET-2005: Upgrade Apache Thrift to 0.14.1 - posted by "Fokko (via GitHub)" <gi...@apache.org> on 2023/02/07 21:17:28 UTC, 0 replies.
- [GitHub] [parquet-format] dependabot[bot] opened a new pull request, #191: Bump junit from 4.10 to 4.13.1 - posted by "dependabot[bot] (via GitHub)" <gi...@apache.org> on 2023/02/07 21:48:33 UTC, 0 replies.
- [GitHub] [parquet-mr] WangYuxing0924 commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "WangYuxing0924 (via GitHub)" <gi...@apache.org> on 2023/02/08 03:02:06 UTC, 0 replies.
- [GitHub] [parquet-mr] Fang-Xie commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "Fang-Xie (via GitHub)" <gi...@apache.org> on 2023/02/08 03:04:11 UTC, 0 replies.
- [GitHub] [parquet-format] mapleFU commented on pull request #190: Minor: add FIXED_LEN_BYTE_ARRAY under Types in doc - posted by "mapleFU (via GitHub)" <gi...@apache.org> on 2023/02/08 06:05:21 UTC, 0 replies.
- [GitHub] [parquet-format] gszadovszky commented on pull request #164: PARQUET-1950: Define core features - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/08 08:34:11 UTC, 1 replies.
- [DISCUSS] ByteStreamSplitDecoder broken in presence of nulls - posted by wish maple <ma...@gmail.com> on 2023/02/09 03:46:24 UTC, 0 replies.
- [jira] [Created] (PARQUET-2241) ByteStreamSplitDecoder broken in presence of nulls - posted by "Xuwei Fu (Jira)" <ji...@apache.org> on 2023/02/09 03:50:00 UTC, 0 replies.
- [jira] [Created] (PARQUET-2242) record count for row group size check configurable - posted by "xjlem (Jira)" <ji...@apache.org> on 2023/02/09 12:43:00 UTC, 0 replies.
- [jira] [Updated] (PARQUET-2242) record count for row group size check configurable - posted by "xjlem (Jira)" <ji...@apache.org> on 2023/02/09 12:50:00 UTC, 1 replies.
- [GitHub] [parquet-mr] xjlem opened a new pull request, #1024: Parquet 2242:record count for row group size check configurable - posted by "xjlem (via GitHub)" <gi...@apache.org> on 2023/02/09 13:07:57 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1024: Parquet 2242:record count for row group size check configurable - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/10 01:44:44 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1021: PARQUET-2229: ParquetRewriter masks and encrypts the same column - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/10 05:06:25 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2241) ByteStreamSplitDecoder broken in presence of nulls - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/02/10 08:29:00 UTC, 14 replies.
- [jira] [Updated] (PARQUET-2241) ByteStreamSplitDecoder broken in presence of nulls - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/02/10 09:02:00 UTC, 1 replies.
- [GitHub] [parquet-format] wgtmac opened a new pull request, #192: PARQUET-2241: Update wording of BYTE_STREAM_SPLIT encoding - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/10 14:54:07 UTC, 0 replies.
- [GitHub] [parquet-format] wgtmac commented on pull request #192: PARQUET-2241: Update wording of BYTE_STREAM_SPLIT encoding - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/10 14:55:52 UTC, 0 replies.
- [GitHub] [parquet-format] pitrou commented on pull request #192: PARQUET-2241: Update wording of BYTE_STREAM_SPLIT encoding - posted by "pitrou (via GitHub)" <gi...@apache.org> on 2023/02/10 15:06:27 UTC, 0 replies.
- [GitHub] [parquet-format] mapleFU commented on pull request #192: PARQUET-2241: Update wording of BYTE_STREAM_SPLIT encoding - posted by "mapleFU (via GitHub)" <gi...@apache.org> on 2023/02/10 16:43:16 UTC, 0 replies.
- [GitHub] [parquet-format] emkornfield commented on pull request #192: PARQUET-2241: Update wording of BYTE_STREAM_SPLIT encoding - posted by "emkornfield (via GitHub)" <gi...@apache.org> on 2023/02/10 18:09:47 UTC, 0 replies.
- [GitHub] [parquet-format] shangxinli merged pull request #192: PARQUET-2241: Update wording of BYTE_STREAM_SPLIT encoding - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/02/10 22:20:04 UTC, 0 replies.
- [GitHub] [parquet-mr] shangxinli merged pull request #1021: PARQUET-2229: ParquetRewriter masks and encrypts the same column - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/02/10 22:22:58 UTC, 0 replies.
- [jira] [Assigned] (PARQUET-2241) ByteStreamSplitDecoder broken in presence of nulls - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/02/11 05:22:00 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac opened a new pull request, #1025: PARQUET-2241: Fix ByteStreamSplitValuesReader with nulls - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/11 15:13:39 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2229) ParquetRewriter supports masking and encrypting the same column - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/02/11 15:26:00 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1025: PARQUET-2241: Fix ByteStreamSplitValuesReader with nulls - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/11 15:50:57 UTC, 1 replies.
- [GitHub] [parquet-format] emkornfield commented on pull request #164: PARQUET-1950: Define core features - posted by "emkornfield (via GitHub)" <gi...@apache.org> on 2023/02/13 07:05:36 UTC, 0 replies.
- [GitHub] [parquet-mr] xjlem commented on a diff in pull request #1024: PARQUET-2242:record count for row group size check configurable - posted by "xjlem (via GitHub)" <gi...@apache.org> on 2023/02/13 08:44:50 UTC, 2 replies.
- [jira] [Commented] (PARQUET-2242) record count for row group size check configurable - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/13 08:45:00 UTC, 4 replies.
- [GitHub] [parquet-mr] wgtmac opened a new pull request, #1026: PARQUET-2228: ParquetRewriter supports more than one input file - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/13 15:08:47 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2228) ParquetRewriter supports more than one input file - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/13 15:09:00 UTC, 19 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1026: PARQUET-2228: ParquetRewriter supports more than one input file - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/13 15:15:15 UTC, 4 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1024: PARQUET-2242:record count for row group size check configurable - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/13 15:18:10 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1026: PARQUET-2228: ParquetRewriter supports more than one input file - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/13 15:49:26 UTC, 4 replies.
- [GitHub] [parquet-mr] shangxinli commented on a diff in pull request #1025: PARQUET-2241: Fix ByteStreamSplitValuesReader with nulls - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/02/13 16:52:10 UTC, 0 replies.
- [GitHub] [parquet-mr] shangxinli commented on a diff in pull request #1026: PARQUET-2228: ParquetRewriter supports more than one input file - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/02/13 16:59:45 UTC, 1 replies.
- [GitHub] [parquet-format] julienledem commented on pull request #184: PARQUET-758: Add Float16/Half-float logical type - posted by "julienledem (via GitHub)" <gi...@apache.org> on 2023/02/13 17:31:05 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1025: PARQUET-2241: Fix ByteStreamSplitValuesReader with nulls - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/14 01:36:14 UTC, 0 replies.
- [GitHub] [parquet-mr] xjlem closed pull request #1024: PARQUET-2242:record count for row group size check configurable - posted by "xjlem (via GitHub)" <gi...@apache.org> on 2023/02/14 03:30:30 UTC, 0 replies.
- [jira] [Comment Edited] (PARQUET-2241) ByteStreamSplitDecoder broken in presence of nulls - posted by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2023/02/14 08:38:00 UTC, 0 replies.
- [GitHub] [parquet-mr] ggershinsky commented on a diff in pull request #1026: PARQUET-2228: ParquetRewriter supports more than one input file - posted by "ggershinsky (via GitHub)" <gi...@apache.org> on 2023/02/14 12:58:27 UTC, 0 replies.
- [jira] [Created] (PARQUET-2243) Support zstd-jni in DirectCodecFactory - posted by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2023/02/15 07:22:00 UTC, 0 replies.
- [GitHub] [parquet-mr] gszadovszky opened a new pull request, #1027: PARQUET-2243: Support zstd-jni in DirectCodecFactory - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/15 08:39:30 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2243) Support zstd-jni in DirectCodecFactory - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/15 08:40:00 UTC, 5 replies.
- [jira] [Created] (PARQUET-2244) Dictionary filter may skip row-groups incorrectly when evaluating notIn - posted by "Yujiang Zhong (Jira)" <ji...@apache.org> on 2023/02/15 10:01:00 UTC, 0 replies.
- [jira] [Updated] (PARQUET-2244) Dictionary filter may skip row-groups incorrectly when evaluating notIn - posted by "Yujiang Zhong (Jira)" <ji...@apache.org> on 2023/02/15 10:02:00 UTC, 0 replies.
- [GitHub] [parquet-mr] zhongyujiang opened a new pull request, #1028: PARQUET-2244: Fix notIn for columns with null values - posted by "zhongyujiang (via GitHub)" <gi...@apache.org> on 2023/02/15 10:05:38 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2244) Dictionary filter may skip row-groups incorrectly when evaluating notIn - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/15 10:06:00 UTC, 13 replies.
- [GitHub] [parquet-mr] zhongyujiang commented on pull request #1028: PARQUET-2244: Fix notIn for columns with null values - posted by "zhongyujiang (via GitHub)" <gi...@apache.org> on 2023/02/15 10:11:01 UTC, 5 replies.
- [GitHub] [parquet-mr] gszadovszky commented on a diff in pull request #1026: PARQUET-2228: ParquetRewriter supports more than one input file - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/15 10:21:24 UTC, 1 replies.
- [GitHub] [parquet-mr] gszadovszky commented on pull request #1026: PARQUET-2228: ParquetRewriter supports more than one input file - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/15 10:30:03 UTC, 1 replies.
- [GitHub] [parquet-mr] gszadovszky commented on pull request #1028: PARQUET-2244: Fix notIn for columns with null values - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/15 10:58:07 UTC, 1 replies.
- [GitHub] [parquet-mr] gszadovszky merged pull request #1028: PARQUET-2244: Fix notIn for columns with null values - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/15 10:58:35 UTC, 0 replies.
- [jira] [Assigned] (PARQUET-2244) Dictionary filter may skip row-groups incorrectly when evaluating notIn - posted by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2023/02/15 11:00:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2244) Dictionary filter may skip row-groups incorrectly when evaluating notIn - posted by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2023/02/15 11:00:00 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1027: PARQUET-2243: Support zstd-jni in DirectCodecFactory - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/15 11:32:54 UTC, 1 replies.
- [jira] [Created] (PARQUET-2245) Improve dictionary filter evaluating notEq - posted by "Yujiang Zhong (Jira)" <ji...@apache.org> on 2023/02/15 12:22:00 UTC, 0 replies.
- [GitHub] [parquet-mr] zhongyujiang opened a new pull request, #1029: PARQUET-2245: Improve dictionary filter evaluating notEq - posted by "zhongyujiang (via GitHub)" <gi...@apache.org> on 2023/02/15 12:30:56 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2245) Improve dictionary filter evaluating notEq - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/15 12:31:00 UTC, 1 replies.
- [GitHub] [parquet-mr] gszadovszky commented on a diff in pull request #1027: PARQUET-2243: Support zstd-jni in DirectCodecFactory - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/15 12:51:31 UTC, 0 replies.
- [jira] [Created] (PARQUET-2246) Add short circuit logic to column index filter - posted by "Yujiang Zhong (Jira)" <ji...@apache.org> on 2023/02/15 13:01:00 UTC, 0 replies.
- [GitHub] [parquet-mr] zhongyujiang opened a new pull request, #1030: PARQUET-2246: Add short circuit logic to column index filter - posted by "zhongyujiang (via GitHub)" <gi...@apache.org> on 2023/02/15 13:09:37 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2246) Add short circuit logic to column index filter - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/15 13:10:00 UTC, 2 replies.
- [GitHub] [parquet-mr] gszadovszky commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/15 13:13:43 UTC, 10 replies.
- [GitHub] [parquet-mr] huaxingao commented on pull request #1028: PARQUET-2244: Fix notIn for columns with null values - posted by "huaxingao (via GitHub)" <gi...@apache.org> on 2023/02/16 04:04:48 UTC, 1 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1028: PARQUET-2244: Fix notIn for columns with null values - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/16 05:19:03 UTC, 1 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1029: PARQUET-2245: Improve dictionary filter evaluating notEq - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/16 05:50:49 UTC, 0 replies.
- [GitHub] [parquet-mr] gszadovszky commented on pull request #1027: PARQUET-2243: Support zstd-jni in DirectCodecFactory - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/16 08:08:13 UTC, 0 replies.
- [jira] [Created] (PARQUET-2247) Fail-fast if CapacityByteArrayOutputStream write overflow - posted by "dzcxzl (Jira)" <ji...@apache.org> on 2023/02/16 13:46:00 UTC, 0 replies.
- [GitHub] [parquet-mr] cxzl25 opened a new pull request, #1031: PARQUET-2247: Fail-fast if CapacityByteArrayOutputStream write overflow - posted by "cxzl25 (via GitHub)" <gi...@apache.org> on 2023/02/16 13:50:12 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2247) Fail-fast if CapacityByteArrayOutputStream write overflow - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/16 13:51:00 UTC, 6 replies.
- [GitHub] [parquet-mr] cxzl25 commented on a diff in pull request #1031: PARQUET-2247: Fail-fast if CapacityByteArrayOutputStream write overflow - posted by "cxzl25 (via GitHub)" <gi...@apache.org> on 2023/02/16 14:14:06 UTC, 1 replies.
- [jira] [Created] (PARQUET-2248) ParquetRewriter supports merging files by record - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/02/16 15:30:00 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1031: PARQUET-2247: Fail-fast if CapacityByteArrayOutputStream write overflow - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/17 02:29:25 UTC, 2 replies.
- [jira] [Commented] (PARQUET-2240) DateTimeFormatter is used in static context, but not thread safe - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/02/17 08:51:00 UTC, 0 replies.
- [jira] [Created] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs - posted by "Jan Finis (Jira)" <ji...@apache.org> on 2023/02/19 12:10:00 UTC, 0 replies.
- [GitHub] [parquet-mr] pan3793 commented on pull request #982: PARQUET-2160: Close ZstdInputStream to free off-heap memory in time. - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/02/19 13:26:06 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2160) Close decompression stream to free off-heap memory in time - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/19 13:27:00 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/02/20 05:23:00 UTC, 8 replies.
- [jira] [Comment Edited] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs - posted by "Jan Finis (Jira)" <ji...@apache.org> on 2023/02/20 09:56:00 UTC, 5 replies.
- [GitHub] [parquet-mr] shangxinli commented on pull request #1026: PARQUET-2228: ParquetRewriter supports more than one input file - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/02/20 17:33:43 UTC, 0 replies.
- [GitHub] [parquet-mr] parthchandra opened a new pull request, #1032: PARQUET-2164: Check size of buffered data to prevent page data from overflowing - posted by "parthchandra (via GitHub)" <gi...@apache.org> on 2023/02/20 19:51:49 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2164) CapacityByteArrayOutputStream overflow while writing causes negative row group sizes to be written - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/20 19:52:00 UTC, 8 replies.
- [GitHub] [parquet-mr] cxzl25 commented on pull request #1032: PARQUET-2164: Check size of buffered data to prevent page data from overflowing - posted by "cxzl25 (via GitHub)" <gi...@apache.org> on 2023/02/21 03:02:44 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1032: PARQUET-2164: Check size of buffered data to prevent page data from overflowing - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/21 03:08:34 UTC, 2 replies.
- [GitHub] [parquet-mr] gszadovszky merged pull request #1026: PARQUET-2228: ParquetRewriter supports more than one input file - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/21 10:43:12 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2228) ParquetRewriter supports more than one input file - posted by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2023/02/21 10:44:00 UTC, 0 replies.
- [GitHub] [parquet-mr] gszadovszky commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/21 14:04:53 UTC, 6 replies.
- [GitHub] [parquet-mr] gszadovszky merged pull request #1025: PARQUET-2241: Fix ByteStreamSplitValuesReader with nulls - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/21 15:26:41 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2241) ByteStreamSplitDecoder broken in presence of nulls - posted by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2023/02/21 15:28:00 UTC, 0 replies.
- [GitHub] [parquet-mr] parthchandra commented on a diff in pull request #1032: PARQUET-2164: Check size of buffered data to prevent page data from overflowing - posted by "parthchandra (via GitHub)" <gi...@apache.org> on 2023/02/21 23:01:46 UTC, 3 replies.
- [GitHub] [parquet-mr] zhongyujiang commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly - posted by "zhongyujiang (via GitHub)" <gi...@apache.org> on 2023/02/22 02:03:27 UTC, 1 replies.
- [jira] [Assigned] (PARQUET-2247) Fail-fast if CapacityByteArrayOutputStream write overflow - posted by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2023/02/22 08:17:00 UTC, 1 replies.
- [GitHub] [parquet-mr] gszadovszky merged pull request #1031: PARQUET-2247: Fail-fast if CapacityByteArrayOutputStream write overflow - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/22 08:17:30 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2247) Fail-fast if CapacityByteArrayOutputStream write overflow - posted by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2023/02/22 08:18:00 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1030: PARQUET-2246: Add short circuit logic to column index filter - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/22 08:26:55 UTC, 0 replies.
- [GitHub] [parquet-mr] ggershinsky commented on a diff in pull request #1019: PARQUET-2103: Fix crypto exception in print toPrettyJSON - posted by "ggershinsky (via GitHub)" <gi...@apache.org> on 2023/02/22 12:22:15 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2103) crypto exception in print toPrettyJSON - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/22 12:23:00 UTC, 3 replies.
- [GitHub] [parquet-mr] gszadovszky merged pull request #1027: PARQUET-2243: Support zstd-jni in DirectCodecFactory - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/23 06:14:35 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2243) Support zstd-jni in DirectCodecFactory - posted by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2023/02/23 06:15:00 UTC, 0 replies.
- [GitHub] [parquet-mr] gszadovszky merged pull request #1030: PARQUET-2246: Add short circuit logic to column index filter - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/23 13:36:50 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2246) Add short circuit logic to column index filter - posted by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2023/02/23 13:37:00 UTC, 0 replies.
- [jira] [Assigned] (PARQUET-2246) Add short circuit logic to column index filter - posted by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2023/02/23 13:37:00 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2198) Vulnerabilities in jackson-databind - posted by "Brais Couce (Jira)" <ji...@apache.org> on 2023/02/23 14:22:00 UTC, 2 replies.
- [jira] [Created] (PARQUET-2250) Expose column descriptor through RecordReader - posted by "fatemah (Jira)" <ji...@apache.org> on 2023/02/23 21:08:00 UTC, 0 replies.
- [jira] [Updated] (PARQUET-2250) Expose column descriptor through RecordReader - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/23 21:09:00 UTC, 0 replies.
- Canceled event: Parquet Sync @ Tue Feb 28, 2023 8:30am - 9:30am (PST) (dev@parquet.apache.org) - posted by sh...@uber.com.INVALID on 2023/02/23 21:50:55 UTC, 0 replies.
- [jira] [Created] (PARQUET-2251) Avoid generating Bloomfilter when all pages of one column are encoded by dictionary - posted by "Mars (Jira)" <ji...@apache.org> on 2023/02/24 03:26:00 UTC, 0 replies.
- [jira] [Updated] (PARQUET-2251) Avoid generating Bloomfilter when all pages of one column are encoded by dictionary - posted by "Mars (Jira)" <ji...@apache.org> on 2023/02/24 03:26:00 UTC, 2 replies.
- [jira] [Updated] (PARQUET-2251) Avoid generating Bloomfilter when all pages of a column are encoded by dictionary - posted by "Mars (Jira)" <ji...@apache.org> on 2023/02/24 03:34:00 UTC, 0 replies.
- [GitHub] [parquet-mr] yabola opened a new pull request, #1033: PARQUET-2251 Avoid generating Bloomfilter when all pages of a column are encoded by dictionary in parquet pageV1 - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/02/24 03:35:12 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2251) Avoid generating Bloomfilter when all pages of a column are encoded by dictionary - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/24 03:36:00 UTC, 5 replies.
- [GitHub] [parquet-mr] yabola commented on pull request #1033: PARQUET-2251 Avoid generating Bloomfilter when all pages of a column are encoded by dictionary in parquet pageV1 - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/02/24 04:24:07 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2201) Add Stress test for RecordReader SkipRecords - posted by "Micah Kornfield (Jira)" <ji...@apache.org> on 2023/02/24 04:59:00 UTC, 0 replies.
- [jira] [Assigned] (PARQUET-2201) Add Stress test for RecordReader SkipRecords - posted by "Micah Kornfield (Jira)" <ji...@apache.org> on 2023/02/24 04:59:00 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1033: PARQUET-2251 Avoid generating Bloomfilter when all pages of a column are encoded by dictionary in parquet pageV1 - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/24 05:23:54 UTC, 0 replies.
- [GitHub] [parquet-mr] yabola commented on a diff in pull request #1033: PARQUET-2251 Avoid generating Bloomfilter when all pages of a column are encoded by dictionary in parquet v1 - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/02/24 06:24:15 UTC, 0 replies.
- [GitHub] [parquet-mr] whcdjj commented on pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader - posted by "whcdjj (via GitHub)" <gi...@apache.org> on 2023/02/24 06:47:44 UTC, 2 replies.
- [jira] [Assigned] (PARQUET-1629) Page-level CRC checksum verification for DataPageV2 - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/02/24 09:04:00 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac opened a new pull request, #1034: PARQUET-2230: Add a new rewrite command powered by ParquetRewriter - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/24 10:05:07 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2230) Add a new rewrite command powered by ParquetRewriter - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/24 10:06:00 UTC, 8 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1034: PARQUET-2230: Add a new rewrite command powered by ParquetRewriter - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/24 10:59:37 UTC, 3 replies.
- [GitHub] [parquet-mr] parthchandra commented on pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader - posted by "parthchandra (via GitHub)" <gi...@apache.org> on 2023/02/24 17:38:47 UTC, 2 replies.
- [GitHub] [parquet-mr] jianchun closed pull request #1022: PARQUET-831: fix estimate page size check overflow corrupting parquet - posted by "jianchun (via GitHub)" <gi...@apache.org> on 2023/02/24 23:27:37 UTC, 0 replies.
- [GitHub] [parquet-mr] jerolba opened a new pull request, #1035: PARQUET-2202: Review usage and implementation of Preconditions.checkargument method - posted by "jerolba (via GitHub)" <gi...@apache.org> on 2023/02/26 16:04:32 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2202) Redundant String allocation on the hot path in CapacityByteArrayOutputStream.setByte - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/26 16:05:00 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages - posted by "Xuwei Fu (Jira)" <ji...@apache.org> on 2023/02/27 05:56:00 UTC, 9 replies.
- [GitHub] [parquet-mr] gszadovszky merged pull request #1034: PARQUET-2230: Add a new rewrite command powered by ParquetRewriter - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/27 07:43:31 UTC, 0 replies.
- [GitHub] [parquet-mr] gszadovszky merged pull request #1033: PARQUET-2251 Avoid generating Bloomfilter when all pages of a column are encoded by dictionary in parquet v1 - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/27 07:45:49 UTC, 0 replies.
- [GitHub] [parquet-format] wgtmac opened a new pull request, #193: PARQUET-2222: RLE encoding spec incorrect for v2 data pages - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/27 09:01:30 UTC, 0 replies.
- [GitHub] [parquet-format] wgtmac commented on pull request #193: PARQUET-2222: RLE encoding spec incorrect for v2 data pages - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/27 09:08:59 UTC, 1 reply.
- [GitHub] [parquet-format] pitrou commented on pull request #193: PARQUET-2222: RLE encoding spec incorrect for v2 data pages - posted by "pitrou (via GitHub)" <gi...@apache.org> on 2023/02/27 09:18:50 UTC, 0 replies.
- [GitHub] [parquet-mr] gszadovszky commented on pull request #1034: PARQUET-2230: Add a new rewrite command powered by ParquetRewriter - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/02/27 09:18:54 UTC, 0 replies.
- [GitHub] [parquet-mr] yabola commented on pull request #1033: PARQUET-2251 Avoid generating Bloomfilter when all pages of a column are encoded by dictionary in parquet v1 - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/02/27 09:51:02 UTC, 0 replies.
- [GitHub] [parquet-mr] satish-mittal commented on pull request #1005: PARQUET-2198 : Updating jackson data bind version to fix CVEs - posted by "satish-mittal (via GitHub)" <gi...@apache.org> on 2023/02/27 13:52:33 UTC, 0 replies.
- Gang Wu as new Apache Parquet committer - posted by Xinli shang <sh...@uber.com.INVALID> on 2023/02/27 14:27:23 UTC, 0 replies.
- [GitHub] [parquet-mr] shangxinli commented on pull request #1005: PARQUET-2198 : Updating jackson data bind version to fix CVEs - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/02/27 17:44:00 UTC, 0 replies.
- Parquet Null logical type question - posted by Jerry Adair <Je...@sas.com.INVALID> on 2023/02/27 17:46:32 UTC, 0 replies.
- Re: [IANA #1264350] application/vnd.apache.parquet registration request - posted by Scott Lutwyche <Sc...@des.qld.gov.au> on 2023/02/27 23:22:27 UTC, 1 reply.
- [GitHub] [parquet-mr] wgtmac opened a new pull request, #1036: PARQUET-2230: [CLI] Deprecate commands replaced by rewrite - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/28 05:05:20 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1036: PARQUET-2230: [CLI] Deprecate commands replaced by rewrite - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/28 06:05:52 UTC, 0 replies.
- [GitHub] [parquet-mr] ggershinsky commented on pull request #1019: PARQUET-2103: Fix crypto exception in print toPrettyJSON - posted by "ggershinsky (via GitHub)" <gi...@apache.org> on 2023/02/28 07:01:32 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1019: PARQUET-2103: Fix crypto exception in print toPrettyJSON - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/02/28 07:04:14 UTC, 0 replies.
- [GitHub] [parquet-mr] ggershinsky merged pull request #1019: PARQUET-2103: Fix crypto exception in print toPrettyJSON - posted by "ggershinsky (via GitHub)" <gi...@apache.org> on 2023/02/28 08:54:27 UTC, 0 replies.
- Fallback Encoding for Very Sparse or Sorted Datasets - posted by Patrick Hansert <ha...@informatik.uni-kl.de> on 2023/02/28 15:29:44 UTC, 0 replies.