- Re: Fallback Encoding for Very Sparse or Sorted Datasets - posted by Gang Wu <us...@gmail.com> on 2023/03/01 02:09:41 UTC, 4 replies.
- Re: Parquet Null logical type question - posted by Micah Kornfield <em...@gmail.com> on 2023/03/01 03:09:25 UTC, 1 replies.
- [GitHub] [parquet-mr] shangxinli opened a new pull request, #1037: Add Gang Wu as committer - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/03/01 04:39:20 UTC, 0 replies.
- [GitHub] [parquet-mr] shangxinli commented on pull request #1037: Add Gang Wu as committer - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/03/01 04:40:04 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1037: Add Gang Wu as committer - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/01 04:41:26 UTC, 0 replies.
- [GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "jiangjiguang (via GitHub)" <gi...@apache.org> on 2023/03/01 05:13:34 UTC, 2 replies.
- [GitHub] [parquet-mr] jiangjiguang closed pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "jiangjiguang (via GitHub)" <gi...@apache.org> on 2023/03/01 05:13:35 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/01 05:14:00 UTC, 26 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1036: PARQUET-2230: [CLI] Deprecate commands replaced by rewrite - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/01 05:16:23 UTC, 1 replies.
- [jira] [Commented] (PARQUET-2230) Add a new rewrite command powered by ParquetRewriter - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/01 05:17:00 UTC, 4 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/01 05:17:49 UTC, 1 replies.
- [GitHub] [parquet-mr] jiangjiguang opened a new pull request, #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "jiangjiguang (via GitHub)" <gi...@apache.org> on 2023/03/01 05:35:04 UTC, 0 replies.
- [GitHub] [parquet-mr] shangxinli merged pull request #1037: Add Gang Wu as committer - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/03/01 06:22:37 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/01 06:40:28 UTC, 0 replies.
- [GitHub] [parquet-mr] jiangjiguang commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization - posted by "jiangjiguang (via GitHub)" <gi...@apache.org> on 2023/03/01 06:45:59 UTC, 4 replies.
- [jira] [Created] (PARQUET-2252) Make some methods public to allow external projects to implement page skipping - posted by "Yujiang Zhong (Jira)" <ji...@apache.org> on 2023/03/01 11:32:00 UTC, 0 replies.
- [GitHub] [parquet-mr] zhongyujiang opened a new pull request, #1038: PARQUET-2252: Make some methods public to allow external projects to … - posted by "zhongyujiang (via GitHub)" <gi...@apache.org> on 2023/03/01 11:49:16 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2252) Make some methods public to allow external projects to implement page skipping - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/01 11:50:00 UTC, 10 replies.
- [GitHub] [parquet-mr] zhongyujiang commented on pull request #1038: PARQUET-2252: Make some methods public to allow external projects to … - posted by "zhongyujiang (via GitHub)" <gi...@apache.org> on 2023/03/01 11:50:32 UTC, 0 replies.
- [GitHub] [parquet-mr] gszadovszky commented on pull request #1036: PARQUET-2230: [CLI] Deprecate commands replaced by rewrite - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/03/01 13:15:47 UTC, 1 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1038: PARQUET-2252: Make some methods public to allow external projects to … - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/01 15:39:33 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac merged pull request #1036: PARQUET-2230: [CLI] Deprecate commands replaced by rewrite - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/01 15:43:23 UTC, 0 replies.
- [GitHub] [parquet-mr] gszadovszky commented on pull request #1038: PARQUET-2252: Make some methods public to allow external projects to … - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/03/01 16:08:35 UTC, 0 replies.
- [GitHub] [parquet-format] wgtmac commented on a diff in pull request #190: Minor: add FIXED_LEN_BYTE_ARRAY under Types in doc - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/02 01:28:18 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1038: PARQUET-2252: Make some methods public to allow external projects to … - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/02 01:34:38 UTC, 1 replies.
- [jira] [Resolved] (PARQUET-2251) Avoid generating Bloomfilter when all pages of a column are encoded by dictionary - posted by "Mars (Jira)" <ji...@apache.org> on 2023/03/02 02:12:00 UTC, 0 replies.
- [GitHub] [parquet-mr] jiangjiguang commented on a diff in pull request #1011: PARQUET-2159: Vectorized BytePacker decoder using Java VectorAPI - posted by "jiangjiguang (via GitHub)" <gi...@apache.org> on 2023/03/02 02:33:19 UTC, 6 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1011: PARQUET-2159: Vectorized BytePacker decoder using Java VectorAPI - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/02 06:56:07 UTC, 1 replies.
- [GitHub] [parquet-format] gszadovszky commented on a diff in pull request #190: MINOR: Add FIXED_LEN_BYTE_ARRAY Type - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/03/02 08:10:35 UTC, 0 replies.
- [GitHub] [parquet-mr] gszadovszky commented on a diff in pull request #1011: PARQUET-2159: Vectorized BytePacker decoder using Java VectorAPI - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/03/02 08:33:17 UTC, 1 replies.
- [GitHub] [parquet-mr] nikhilenr commented on pull request #1005: PARQUET-2198 : Updating jackson data bind version to fix CVEs - posted by "nikhilenr (via GitHub)" <gi...@apache.org> on 2023/03/02 08:45:24 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2198) Vulnerabilities in jackson-databind - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/02 08:46:00 UTC, 7 replies.
- Workaround lack of union data type - posted by Hinko Kocevar <Hi...@ess.eu.INVALID> on 2023/03/02 09:00:14 UTC, 2 replies.
- [GitHub] [parquet-mr] botchniaque commented on pull request #1005: PARQUET-2198 : Updating jackson data bind version to fix CVEs - posted by "botchniaque (via GitHub)" <gi...@apache.org> on 2023/03/02 09:40:45 UTC, 0 replies.
- [GitHub] [parquet-mr] steveloughran commented on pull request #1005: PARQUET-2198 : Updating jackson data bind version to fix CVEs - posted by "steveloughran (via GitHub)" <gi...@apache.org> on 2023/03/02 11:34:51 UTC, 1 replies.
- [jira] [Created] (PARQUET-2253) Postpone dictionary encoding decision for starting null pages. - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/02 11:39:00 UTC, 0 replies.
- [GitHub] [parquet-format] wgtmac commented on a diff in pull request #190: MINOR: Add FIXED_LEN_BYTE_ARRAY Type - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/02 11:41:27 UTC, 0 replies.
- [GitHub] [parquet-format] XinyuZeng commented on a diff in pull request #190: MINOR: Add FIXED_LEN_BYTE_ARRAY Type - posted by "XinyuZeng (via GitHub)" <gi...@apache.org> on 2023/03/02 11:57:03 UTC, 0 replies.
- [GitHub] [parquet-mr] zhongyujiang commented on a diff in pull request #1038: PARQUET-2252: Make some methods public to allow external projects to … - posted by "zhongyujiang (via GitHub)" <gi...@apache.org> on 2023/03/02 12:34:22 UTC, 1 replies.
- [GitHub] [parquet-format] wgtmac merged pull request #190: MINOR: Add FIXED_LEN_BYTE_ARRAY Type - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/02 14:06:35 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2250) Expose column descriptor through RecordReader - posted by "Weston Pace (Jira)" <ji...@apache.org> on 2023/03/03 00:44:00 UTC, 0 replies.
- [jira] [Assigned] (PARQUET-2250) Expose column descriptor through RecordReader - posted by "Weston Pace (Jira)" <ji...@apache.org> on 2023/03/03 00:45:00 UTC, 0 replies.
- [GitHub] [parquet-mr] yabola commented on pull request #1039: PARQUET-2237 Improve performance by skipping BloomFilter when column has a dictionary filter - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/03/03 02:00:39 UTC, 1 replies.
- [jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/03 02:01:00 UTC, 6 replies.
- [jira] [Resolved] (PARQUET-2225) [C++] Allow reading dense with RecordReader - posted by "Micah Kornfield (Jira)" <ji...@apache.org> on 2023/03/03 19:42:00 UTC, 0 replies.
- [GitHub] [parquet-mr] jatin-bhateja commented on pull request #1011: PARQUET-2159: Vectorized BytePacker decoder using Java VectorAPI - posted by "jatin-bhateja (via GitHub)" <gi...@apache.org> on 2023/03/04 03:49:32 UTC, 0 replies.
- [GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: Vectorized BytePacker decoder using Java VectorAPI - posted by "jiangjiguang (via GitHub)" <gi...@apache.org> on 2023/03/04 13:05:40 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac merged pull request #1011: PARQUET-2159: Vectorized BytePacker decoder using Java VectorAPI - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/04 13:51:33 UTC, 0 replies.
- Re: Gang Wu as new Apache Parquet committer - posted by Micah Kornfield <em...@gmail.com> on 2023/03/04 20:40:34 UTC, 2 replies.
- [GitHub] [parquet-mr] rdblue commented on a diff in pull request #1038: PARQUET-2252: Make some methods public to allow external projects to … - posted by "rdblue (via GitHub)" <gi...@apache.org> on 2023/03/05 21:37:05 UTC, 0 replies.
- [GitHub] [parquet-mr] yabola closed pull request #1039: PARQUET-2237 Improve performance by skipping BloomFilter when column has a dictionary filter - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/03/06 00:48:53 UTC, 0 replies.
- [GitHub] [parquet-mr] jiangjiguang closed pull request #1006: PARQUET-2190 byte array has better performance than ByteBuffer - posted by "jiangjiguang (via GitHub)" <gi...@apache.org> on 2023/03/06 02:56:29 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2190) byte array has better performance than ByteBuffer - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/06 02:57:00 UTC, 0 replies.
- [GitHub] [parquet-mr] rdblue merged pull request #1038: PARQUET-2252: Make some methods public to allow external projects to … - posted by "rdblue (via GitHub)" <gi...@apache.org> on 2023/03/07 00:24:15 UTC, 0 replies.
- [GitHub] [parquet-mr] rdblue commented on pull request #1038: PARQUET-2252: Make some methods public to allow external projects to … - posted by "rdblue (via GitHub)" <gi...@apache.org> on 2023/03/07 00:24:43 UTC, 0 replies.
- [jira] [Assigned] (PARQUET-2252) Make some methods public to allow external projects to implement page skipping - posted by "Yujiang Zhong (Jira)" <ji...@apache.org> on 2023/03/07 01:22:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2252) Make some methods public to allow external projects to implement page skipping - posted by "Yujiang Zhong (Jira)" <ji...@apache.org> on 2023/03/07 01:24:00 UTC, 0 replies.
- [jira] [Created] (PARQUET-2254) Build a BloomFilter with a more precise size - posted by "Mars (Jira)" <ji...@apache.org> on 2023/03/07 07:11:00 UTC, 0 replies.
- [jira] [Updated] (PARQUET-2254) Build a BloomFilter with a more precise size - posted by "Mars (Jira)" <ji...@apache.org> on 2023/03/07 07:14:00 UTC, 2 replies.
- [GitHub] [parquet-mr] yabola commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/03/07 07:15:03 UTC, 1 replies.
- [GitHub] [parquet-mr] gszadovszky commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/03/07 08:03:59 UTC, 0 replies.
- [jira] [Assigned] (PARQUET-2254) Build a BloomFilter with a more precise size - posted by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2023/03/07 08:09:00 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2254) Build a BloomFilter with a more precise size - posted by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2023/03/07 08:43:00 UTC, 15 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/07 14:31:27 UTC, 0 replies.
- [GitHub] [parquet-mr] vectorijk commented on a diff in pull request #1026: PARQUET-2228: ParquetRewriter supports more than one input file - posted by "vectorijk (via GitHub)" <gi...@apache.org> on 2023/03/07 23:01:57 UTC, 2 replies.
- [jira] [Commented] (PARQUET-2228) ParquetRewriter supports more than one input file - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/07 23:02:00 UTC, 3 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1026: PARQUET-2228: ParquetRewriter supports more than one input file - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/08 01:20:44 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1035: PARQUET-2202: Review usage and implementation of Preconditions.checkargument method - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/08 01:28:49 UTC, 1 replies.
- [jira] [Commented] (PARQUET-2202) Redundant String allocation on the hot path in CapacityByteArrayOutputStream.setByte - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/08 01:29:00 UTC, 5 replies.
- [GitHub] [parquet-mr] jerolba commented on a diff in pull request #1035: PARQUET-2202: Review usage and implementation of Preconditions.checkargument method - posted by "jerolba (via GitHub)" <gi...@apache.org> on 2023/03/08 11:15:02 UTC, 1 replies.
- [GitHub] [parquet-mr] jerolba commented on pull request #1035: PARQUET-2202: Review usage and implementation of Preconditions.checkargument method - posted by "jerolba (via GitHub)" <gi...@apache.org> on 2023/03/08 11:17:39 UTC, 0 replies.
- [jira] [Assigned] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly - posted by "Mars (Jira)" <ji...@apache.org> on 2023/03/08 16:10:00 UTC, 0 replies.
- [jira] [Commented] (PARQUET-1889) Register a MIME type for the Parquet format. - posted by "Bryce Mecum (Jira)" <ji...@apache.org> on 2023/03/08 21:11:00 UTC, 0 replies.
- [GitHub] [parquet-mr] vimal3271 commented on pull request #1005: PARQUET-2198 : Updating jackson data bind version to fix CVEs - posted by "vimal3271 (via GitHub)" <gi...@apache.org> on 2023/03/10 08:19:04 UTC, 0 replies.
- [GitHub] [parquet-mr] shangxinli commented on pull request #1005: PARQUET-2198 : Updating jackson data bind version to fix CVEs - posted by "shangxinli (via GitHub)" <gi...@apache.org> on 2023/03/10 15:21:39 UTC, 1 replies.
- [GitHub] [parquet-mr] wgtmac merged pull request #1032: PARQUET-2164: Check size of buffered data to prevent page data from overflowing - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/11 14:07:00 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2164) CapacityByteArrayOutputStream overflow while writing causes negative row group sizes to be written - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/11 14:08:00 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac merged pull request #1035: PARQUET-2202: Review usage and implementation of Preconditions.checkargument method - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/11 14:23:11 UTC, 0 replies.
- Writing to Parquet - posted by srinivasarao vundavalli <sr...@gmail.com> on 2023/03/13 13:52:06 UTC, 0 replies.
- [jira] [Created] (PARQUET-2255) BloomFilter and float point is ambiguous - posted by "Xuwei Fu (Jira)" <ji...@apache.org> on 2023/03/13 14:26:00 UTC, 0 replies.
- [jira] [Created] (PARQUET-2256) Adding Compression for BloomFilter - posted by "Xuwei Fu (Jira)" <ji...@apache.org> on 2023/03/13 14:31:00 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2255) BloomFilter and float point is ambiguous - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/13 15:01:00 UTC, 4 replies.
- [jira] [Updated] (PARQUET-2256) Adding Compression for BloomFilter - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/13 15:09:00 UTC, 1 replies.
- [jira] [Created] (PARQUET-2257) [Format] Add bloom_filter_length to ColumnMetaData - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/13 15:09:00 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2256) Adding Compression for BloomFilter - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/13 15:12:00 UTC, 6 replies.
- [GitHub] [parquet-format] wgtmac opened a new pull request, #194: PARQUET-2257: Add bloom_filter_length to ColumnMetaData - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/13 15:17:04 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2257) [Format] Add bloom_filter_length to ColumnMetaData - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/13 15:18:00 UTC, 5 replies.
- [GitHub] [parquet-format] wgtmac commented on pull request #194: PARQUET-2257: Add bloom_filter_length to ColumnMetaData - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/13 15:19:20 UTC, 1 replies.
- [GitHub] [parquet-format] mapleFU commented on a diff in pull request #194: PARQUET-2257: Add bloom_filter_length to ColumnMetaData - posted by "mapleFU (via GitHub)" <gi...@apache.org> on 2023/03/13 15:23:24 UTC, 0 replies.
- [GitHub] [parquet-format] wgtmac commented on a diff in pull request #194: PARQUET-2257: Add bloom_filter_length to ColumnMetaData - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/13 15:26:19 UTC, 1 replies.
- Apache Parquet Jackson Update - posted by Shrushti Patel <sh...@slack-corp.com.INVALID> on 2023/03/15 17:24:46 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2230) Add a new rewrite command powered by ParquetRewriter - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/16 09:27:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2219) ParquetFileReader throws a runtime exception when a file contains only headers and now row data - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/16 09:28:00 UTC, 0 replies.
- [jira] [Created] (PARQUET-2258) Storing toString fields in FilterPredicate instances can lead to memory pressure - posted by "László Bodor (Jira)" <ji...@apache.org> on 2023/03/16 11:55:00 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2258) Storing toString fields in FilterPredicate instances can lead to memory pressure - posted by "László Bodor (Jira)" <ji...@apache.org> on 2023/03/16 11:57:00 UTC, 6 replies.
- [GitHub] [parquet-mr] abstractdog opened a new pull request, #1040: PARQUET-2258: Storing toString fields in FilterPredicate instances can lead to memory pressure - posted by "abstractdog (via GitHub)" <gi...@apache.org> on 2023/03/16 12:17:40 UTC, 0 replies.
- [jira] [Updated] (PARQUET-2258) Storing toString fields in FilterPredicate instances can lead to memory pressure - posted by "László Bodor (Jira)" <ji...@apache.org> on 2023/03/16 13:23:00 UTC, 3 replies.
- [jira] [Assigned] (PARQUET-2258) Storing toString fields in FilterPredicate instances can lead to memory pressure - posted by "László Bodor (Jira)" <ji...@apache.org> on 2023/03/16 13:32:00 UTC, 0 replies.
- [GitHub] [parquet-mr] abstractdog commented on pull request #1040: PARQUET-2258: Storing toString fields in FilterPredicate instances can lead to memory pressure - posted by "abstractdog (via GitHub)" <gi...@apache.org> on 2023/03/16 13:33:00 UTC, 1 replies.
- [jira] [Created] (PARQUET-2259) Update parquet site - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/16 15:41:00 UTC, 0 replies.
- [jira] [Updated] (PARQUET-2259) [Site] Update parquet site - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/16 15:42:00 UTC, 1 replies.
- [GitHub] [parquet-mr] wgtmac merged pull request #1040: PARQUET-2258: Storing toString fields in FilterPredicate instances can lead to memory pressure - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/17 01:35:30 UTC, 0 replies.
- [jira] [Commented] (PARQUET-1690) Integer Overflow of BinaryStatistics#isSmallerThan() - posted by "Alexey Diomin (Jira)" <ji...@apache.org> on 2023/03/17 07:15:00 UTC, 2 replies.
- [jira] [Comment Edited] (PARQUET-1690) Integer Overflow of BinaryStatistics#isSmallerThan() - posted by "Alexey Diomin (Jira)" <ji...@apache.org> on 2023/03/17 07:18:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2258) Storing toString fields in FilterPredicate instances can lead to memory pressure - posted by "László Bodor (Jira)" <ji...@apache.org> on 2023/03/17 07:26:00 UTC, 0 replies.
- [GitHub] [parquet-site] wgtmac commented on pull request #31: PARQUET-2259: Update site to sync with latest parquet-format - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/17 07:27:54 UTC, 2 replies.
- [GitHub] [parquet-format] mapleFU opened a new pull request, #195: PARQUET-2256: Add BloomFilter Compression - posted by "mapleFU (via GitHub)" <gi...@apache.org> on 2023/03/17 08:09:55 UTC, 0 replies.
- [GitHub] [parquet-format] mapleFU commented on pull request #195: PARQUET-2256: Add BloomFilter Compression - posted by "mapleFU (via GitHub)" <gi...@apache.org> on 2023/03/17 08:10:19 UTC, 0 replies.
- [GitHub] [parquet-site] gszadovszky commented on pull request #31: PARQUET-2259: Update site to sync with latest parquet-format - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/03/17 08:15:02 UTC, 2 replies.
- [jira] [Assigned] (PARQUET-2256) Adding Compression for BloomFilter - posted by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2023/03/17 08:27:00 UTC, 0 replies.
- [GitHub] [parquet-format] wgtmac commented on pull request #193: PARQUET-2222: RLE encoding spec incorrect for v2 data pages - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/17 08:27:05 UTC, 4 replies.
- [jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/17 08:28:00 UTC, 16 replies.
- [GitHub] [parquet-format] gszadovszky commented on pull request #195: PARQUET-2256: Add BloomFilter Compression - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/03/17 10:18:38 UTC, 0 replies.
- [GitHub] [parquet-format] gszadovszky commented on pull request #193: PARQUET-2222: RLE encoding spec incorrect for v2 data pages - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/03/17 11:04:20 UTC, 0 replies.
- [GitHub] [parquet-mr] mdadil-dk commented on pull request #1005: PARQUET-2198 : Updating jackson data bind version to fix CVEs - posted by "mdadil-dk (via GitHub)" <gi...@apache.org> on 2023/03/17 12:10:18 UTC, 0 replies.
- [GitHub] [parquet-format] wgtmac commented on pull request #195: PARQUET-2256: Add BloomFilter Compression - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/17 14:37:26 UTC, 0 replies.
- [GitHub] [parquet-format] pitrou commented on pull request #193: PARQUET-2222: RLE encoding spec incorrect for v2 data pages - posted by "pitrou (via GitHub)" <gi...@apache.org> on 2023/03/17 16:11:39 UTC, 6 replies.
- [GitHub] [parquet-format] mapleFU commented on pull request #193: PARQUET-2222: RLE encoding spec incorrect for v2 data pages - posted by "mapleFU (via GitHub)" <gi...@apache.org> on 2023/03/17 17:04:14 UTC, 2 replies.
- [GitHub] [parquet-mr] hamza-tam opened a new pull request, #1041: Uniformizing booleans naming in ParquetWriter - posted by "hamza-tam (via GitHub)" <gi...@apache.org> on 2023/03/17 22:58:26 UTC, 0 replies.
- [GitHub] [parquet-mr] yabola opened a new pull request, #1042: PARQUET-2254 Support building dynamic bloom filter that adapts to the data - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/03/19 03:15:48 UTC, 0 replies.
- [jira] [Created] (PARQUET-2260) Bloom filter bytes size should't be larger than `parquet.bloom.filter.max.bytes` in the configuration - posted by "Mars (Jira)" <ji...@apache.org> on 2023/03/19 09:59:00 UTC, 0 replies.
- [jira] [Updated] (PARQUET-2260) Bloom filter bytes size should't be larger than maxBytes size in the configuration - posted by "Mars (Jira)" <ji...@apache.org> on 2023/03/19 09:59:00 UTC, 2 replies.
- [jira] [Assigned] (PARQUET-2260) Bloom filter bytes size should't be larger than maxBytes size in the configuration - posted by "Mars (Jira)" <ji...@apache.org> on 2023/03/19 09:59:00 UTC, 0 replies.
- [jira] [Updated] (PARQUET-2260) Bloom filter bytes size shouldn't be larger than maxBytes size in the configuration - posted by "Mars (Jira)" <ji...@apache.org> on 2023/03/19 10:22:00 UTC, 2 replies.
- [GitHub] [parquet-mr] yabola opened a new pull request, #1043: Bloom filter bytes size shouldn't be larger than maxBytes size in the configuration - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/03/19 10:22:46 UTC, 0 replies.
- [GitHub] [parquet-mr] yabola commented on pull request #1043: PARQUET-2260 Bloom filter size shouldn't be larger than maxBytes in the configuration - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/03/19 12:44:37 UTC, 1 replies.
- [jira] [Commented] (PARQUET-2260) Bloom filter bytes size shouldn't be larger than maxBytes size in the configuration - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/19 12:45:00 UTC, 19 replies.
- [GitHub] [parquet-mr] yabola commented on pull request #1042: PARQUET-2254 Support building dynamic bloom filter that adapts to the data - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/03/19 13:00:54 UTC, 2 replies.
- [GitHub] [parquet-mr] yabola commented on a diff in pull request #1042: PARQUET-2254 Support building dynamic bloom filter that adapts to the data - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/03/19 13:00:56 UTC, 5 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1043: PARQUET-2260 Bloom filter size shouldn't be larger than maxBytes in the configuration - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/21 04:59:23 UTC, 4 replies.
- [GitHub] [parquet-mr] yabola commented on a diff in pull request #1043: PARQUET-2260 Bloom filter size shouldn't be larger than maxBytes in the configuration - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/03/21 06:05:19 UTC, 11 replies.
- [GitHub] [parquet-format] JFinis opened a new pull request, #196: PARQUET-2249: Add nan_count to handle NaNs in statistics - posted by "JFinis (via GitHub)" <gi...@apache.org> on 2023/03/22 11:41:11 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/22 11:42:00 UTC, 30 replies.
- [GitHub] [parquet-mr] wgtmac opened a new pull request, #1044: PARQUET-1629: Support CRC for data page v2 - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/23 05:34:44 UTC, 0 replies.
- [jira] [Commented] (PARQUET-1629) Page-level CRC checksum verification for DataPageV2 - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/23 05:35:00 UTC, 0 replies.
- [GitHub] [parquet-format] wgtmac commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/23 09:43:19 UTC, 1 replies.
- [GitHub] [parquet-mr] hazelnutsgz commented on pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader - posted by "hazelnutsgz (via GitHub)" <gi...@apache.org> on 2023/03/23 09:52:18 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2149) Implement async IO for Parquet file reader - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/23 09:53:00 UTC, 3 replies.
- [GitHub] [parquet-format] JFinis commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics - posted by "JFinis (via GitHub)" <gi...@apache.org> on 2023/03/23 10:47:36 UTC, 16 replies.
- [GitHub] [parquet-format] mapleFU commented on a diff in pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics - posted by "mapleFU (via GitHub)" <gi...@apache.org> on 2023/03/23 11:03:17 UTC, 4 replies.
- [GitHub] [parquet-format] zhongyujiang commented on pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics - posted by "zhongyujiang (via GitHub)" <gi...@apache.org> on 2023/03/23 13:53:03 UTC, 1 replies.
- [GitHub] [parquet-format] JFinis commented on pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics - posted by "JFinis (via GitHub)" <gi...@apache.org> on 2023/03/23 15:10:28 UTC, 1 replies.
- [GitHub] [parquet-mr] parthchandra commented on pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader - posted by "parthchandra (via GitHub)" <gi...@apache.org> on 2023/03/24 01:29:34 UTC, 1 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/24 01:31:01 UTC, 0 replies.
- [GitHub] [parquet-format] wgtmac merged pull request #193: PARQUET-2222: RLE encoding spec incorrect for v2 data pages - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/24 09:52:12 UTC, 0 replies.
- [DISCUSS] Add a Plain Encoding Size Bytes to Parquet Metadata - posted by Micah Kornfield <em...@gmail.com> on 2023/03/24 16:26:51 UTC, 9 replies.
- [jira] [Created] (PARQUET-2261) [Format] Add statistics that reflect decoded size to metadata - posted by "Micah Kornfield (Jira)" <ji...@apache.org> on 2023/03/26 02:30:00 UTC, 0 replies.
- [GitHub] [parquet-format] mapleFU commented on a diff in pull request #197: PARQUET-2261: Initial proposal for unencoded/uncompressed statistics - posted by "mapleFU (via GitHub)" <gi...@apache.org> on 2023/03/26 05:19:50 UTC, 2 replies.
- [jira] [Commented] (PARQUET-2261) [Format] Add statistics that reflect decoded size to metadata - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/26 05:20:00 UTC, 46 replies.
- [GitHub] [parquet-format] wgtmac commented on a diff in pull request #197: PARQUET-2261: Initial proposal for unencoded/uncompressed statistics - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/26 05:27:45 UTC, 3 replies.
- [jira] [Resolved] (PARQUET-2202) Redundant String allocation on the hot path in CapacityByteArrayOutputStream.setByte - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 05:43:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2164) CapacityByteArrayOutputStream overflow while writing causes negative row group sizes to be written - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 05:45:00 UTC, 0 replies.
- [jira] [Updated] (PARQUET-2252) Make some methods public to allow external projects to implement page skipping - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 05:46:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2159) Parquet bit-packing de/encode optimization - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 05:47:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2103) crypto exception in print toPrettyJSON - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 05:50:00 UTC, 0 replies.
- [GitHub] [parquet-format] emkornfield commented on a diff in pull request #197: PARQUET-2261: Initial proposal for unencoded/uncompressed statistics - posted by "emkornfield (via GitHub)" <gi...@apache.org> on 2023/03/26 05:51:18 UTC, 7 replies.
- [jira] [Resolved] (PARQUET-2224) Publish SBOM artifacts - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:00:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2208) Add details to nested column encryption config doc and exception text - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:01:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2198) Vulnerabilities in jackson-databind - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:02:00 UTC, 0 replies.
- [jira] [Assigned] (PARQUET-2195) Add scan command to parquet-cli - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:03:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2195) Add scan command to parquet-cli - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:03:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2177) Fix parquet-cli not to fail showing descriptions - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:03:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-1711) [parquet-protobuf] stack overflow when work with well known json type - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:05:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2176) Parquet writers should allow for configurable index/statistics truncation - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:07:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2197) Document uniform encryption - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:08:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2185) ParquetReader constructed using builder fails to read encrypted files - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:09:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2192) Add Java 17 build test to GitHub action - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:10:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2191) Upgrade Scala to 2.12.17 - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:11:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2167) CLI show footer command fails if Parquet file contains date fields - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:12:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2169) Upgrade Avro to version 1.11.1 - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:12:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2155) Upgrade protobuf version to 3.17.3 - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:13:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2134) Incorrect type checking in HadoopStreams.wrap - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:13:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2138) Add ShowBloomFilterCommand to parquet-cli - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:15:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2161) Row positions are computed incorrectly when range or offset metadata filter is used - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:16:00 UTC, 0 replies.
- [jira] [Resolved] (PARQUET-2154) ParquetFileReader should close its input stream when `filterRowGroups` throw Exception in constructor - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/26 06:17:00 UTC, 0 replies.
- [VOTE] Release Apache Parquet 1.12.4 RC0 - posted by Gang Wu <us...@gmail.com> on 2023/03/26 14:42:08 UTC, 16 replies.
- [jira] [Commented] (PARQUET-2224) Publish SBOM artifacts - posted by "Dongjoon Hyun (Jira)" <ji...@apache.org> on 2023/03/27 00:15:00 UTC, 12 replies.
- [jira] [Assigned] (PARQUET-2224) Publish SBOM artifacts - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/27 01:18:00 UTC, 0 replies.
- [jira] [Created] (PARQUET-2262) Fix local build failure from maven-surefire-plugin due to missing surefire.argLine - posted by "Gang Wu (Jira)" <ji...@apache.org> on 2023/03/27 01:57:00 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac opened a new pull request, #1045: PARQUET-2262: Fix local build failure due to missing surefire.argLine - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/27 02:00:23 UTC, 0 replies.
- [jira] [Commented] (PARQUET-2262) Fix local build failure from maven-surefire-plugin due to missing surefire.argLine - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/27 02:01:00 UTC, 1 replies.
- [GitHub] [parquet-format] wgtmac commented on pull request #197: PARQUET-2261: Initial proposal for unencoded/uncompressed statistics - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/27 03:55:16 UTC, 0 replies.
- [GitHub] [parquet-format] emkornfield commented on a diff in pull request #197: PARQUET-2261: Proposal for unencoded/uncompressed statistics - posted by "emkornfield (via GitHub)" <gi...@apache.org> on 2023/03/27 04:19:01 UTC, 13 replies.
- [GitHub] [parquet-format] wgtmac commented on a diff in pull request #197: PARQUET-2261: Proposal for unencoded/uncompressed statistics - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/27 05:10:14 UTC, 7 replies.
- [GitHub] [parquet-mr] wgtmac commented on pull request #1045: PARQUET-2262: Fix local build failure due to missing surefire.argLine - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/27 14:23:26 UTC, 0 replies.
- [GitHub] [parquet-format] mapleFU commented on a diff in pull request #197: PARQUET-2261: Proposal for unencoded/uncompressed statistics - posted by "mapleFU (via GitHub)" <gi...@apache.org> on 2023/03/27 14:25:33 UTC, 1 replies.
- [GitHub] [parquet-format] mapleFU commented on pull request #197: PARQUET-2261: Proposal for unencoded/uncompressed statistics - posted by "mapleFU (via GitHub)" <gi...@apache.org> on 2023/03/27 14:28:33 UTC, 0 replies.
- [GitHub] [parquet-format] emkornfield commented on pull request #197: PARQUET-2261: Proposal for unencoded/uncompressed statistics - posted by "emkornfield (via GitHub)" <gi...@apache.org> on 2023/03/27 19:06:16 UTC, 0 replies.
- [DISCUSS](PARQUET-2249) Add nan_count to handle NaNs in statistics - posted by Jan Finis <jp...@gmail.com> on 2023/03/28 08:21:55 UTC, 0 replies.
- [GitHub] [parquet-format] yqiu2 commented on a diff in pull request #197: PARQUET-2261: Proposal for unencoded/uncompressed statistics - posted by "yqiu2 (via GitHub)" <gi...@apache.org> on 2023/03/28 19:28:09 UTC, 4 replies.
- A Message from the Board to PMC members - posted by Rich Bowen <rb...@apache.org> on 2023/03/29 12:42:18 UTC, 0 replies.
- [GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1042: PARQUET-2254 Support building dynamic bloom filter that adapts to the data - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/03/31 09:16:29 UTC, 0 replies.
- [GitHub] [parquet-mr] ConeyLiu commented on a diff in pull request #456: PARQUET-1211: Column indexes: read/write API - posted by "ConeyLiu (via GitHub)" <gi...@apache.org> on 2023/03/31 09:27:55 UTC, 0 replies.
- [jira] [Commented] (PARQUET-1211) Column indexes: read/write API - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/31 09:28:00 UTC, 1 replies.
- [GitHub] [parquet-mr] gszadovszky commented on a diff in pull request #1043: PARQUET-2260 Bloom filter size shouldn't be larger than maxBytes in the configuration - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/03/31 11:48:18 UTC, 0 replies.
- [GitHub] [parquet-format] gszadovszky commented on pull request #196: PARQUET-2249: Add nan_count to handle NaNs in statistics - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/03/31 13:00:28 UTC, 0 replies.
- [GitHub] [parquet-mr] gszadovszky commented on a diff in pull request #456: PARQUET-1211: Column indexes: read/write API - posted by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/03/31 14:48:33 UTC, 0 replies.