- [GitHub] [spark] zhenlineo opened a new pull request, #40628: [SPARK-42999][Connect] Dataset#foreach, foreachPartition - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/01 00:08:34 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #39136: Create stable names for dynamically generated classes - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/01 00:19:37 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #39130: [SPARK-xxxxx][DOCUMENTATION][PYTHON] Fix grammar in docstring for toDF(). - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/01 00:19:39 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #39114: [SPARK-40708][SQL][WIP] Auto update partition statistics based on write metrics - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/01 00:19:40 UTC, 0 replies.
- [GitHub] [spark] zhenlineo commented on pull request #40564: [SPARK-42519] [Test] [Connect] Add More WriteTo Tests In Spark Connect Client - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/01 00:21:10 UTC, 0 replies.
- [GitHub] [spark] liuzqt opened a new pull request, #40629: [SPARK-42980][CORE] Implement a lightweight SmallBroadcast - posted by "liuzqt (via GitHub)" <gi...@apache.org> on 2023/04/01 00:24:36 UTC, 0 replies.
- [GitHub] [spark] aokolnychyi opened a new pull request, #40630: [SPARK-42997][SQL] TableOutputResolver must use correct column paths in error messages for arrays and maps - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/01 00:55:06 UTC, 0 replies.
- [GitHub] [spark] aokolnychyi commented on a diff in pull request #40630: [SPARK-42997][SQL] TableOutputResolver must use correct column paths in error messages for arrays and maps - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/01 00:57:06 UTC, 5 replies.
- [GitHub] [spark] aokolnychyi commented on pull request #40630: [SPARK-42997][SQL] TableOutputResolver must use correct column paths in error messages for arrays and maps - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/01 00:59:08 UTC, 1 replies.
- [GitHub] [spark] clownxc closed pull request #40626: [SPARK-42860][SQL] Add analysed logical mode in org.apache.spark.sql.execution.ExplainMode - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/01 01:05:57 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40564: [SPARK-42519] [Test] [Connect] Add More WriteTo Tests In Spark Connect Client - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/01 01:13:30 UTC, 1 replies.
- [GitHub] [spark] clownxc opened a new pull request, #40631: [SPARK-42860] [SQL] Add analysed logical mode in org.apache.spark.sql.execution.ExplainMode - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/01 01:15:55 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40564: [SPARK-42519] [Test] [Connect] Add More WriteTo Tests In Spark Connect Client - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/01 01:16:38 UTC, 6 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40627: [SPARK-42998][CONNECT][PYTHON] Fix DataFrame.collect with null struct - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/01 01:35:14 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40627: [SPARK-42998][CONNECT][PYTHON] Fix DataFrame.collect with null struct - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/01 01:36:10 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40607: [SPARK-42993][ML][CONNECT] Make PyTorch Distributor support Spark Connect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/01 01:56:01 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on a diff in pull request #40630: [SPARK-42997][SQL] TableOutputResolver must use correct column paths in error messages for arrays and maps - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/01 02:03:27 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40555: [SPARK-42926][BUILD][SQL] Upgrade Parquet to 1.12.4 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/01 02:07:50 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X opened a new pull request, #40632: [SPARK-42298][SQL] Assign name to _LEGACY_ERROR_TEMP_2132 - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/01 02:53:24 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on a diff in pull request #40632: [SPARK-42298][SQL] Assign name to _LEGACY_ERROR_TEMP_2132 - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/01 02:56:21 UTC, 9 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/01 03:03:20 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on a diff in pull request #40564: [SPARK-42519][CONNECT][TESTS] Add More WriteTo Tests In Spark Connect Client - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/01 03:06:22 UTC, 5 replies.
- [GitHub] [spark] shrprasa commented on pull request #40258: [SPARK-42655][SQL] Incorrect ambiguous column reference error - posted by "shrprasa (via GitHub)" <gi...@apache.org> on 2023/04/01 03:44:22 UTC, 2 replies.
- [GitHub] [spark] shrprasa commented on pull request #40128: [SPARK-42466][K8S]: Cleanup k8s upload directory when job terminates - posted by "shrprasa (via GitHub)" <gi...@apache.org> on 2023/04/01 03:47:35 UTC, 1 replies.
- [GitHub] [spark] wangyum closed pull request #40555: [SPARK-42926][BUILD][SQL] Upgrade Parquet to 1.12.4 - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/01 04:54:49 UTC, 0 replies.
- [GitHub] [spark] wangyum opened a new pull request, #40633: [SPARK-43000][SQL] Do not cast to double type in `PromoteStrings` - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/01 07:09:11 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40613: [SPARK-42974][CORE] Restore `Utils.createTempDir` to use the `ShutdownHookManager` and clean up `JavaUtils.createTempDir` method. - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/01 07:24:33 UTC, 3 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40564: [SPARK-42519][CONNECT][TESTS] Add More WriteTo Tests In Spark Connect Client - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/01 07:40:01 UTC, 2 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40564: [SPARK-42519][CONNECT][TESTS] Add More WriteTo Tests In Spark Connect Client - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/01 08:34:41 UTC, 1 replies.
- [GitHub] [spark] filozof50 commented on pull request #39136: Create stable names for dynamically generated classes - posted by "filozof50 (via GitHub)" <gi...@apache.org> on 2023/04/01 09:36:57 UTC, 0 replies.
- [GitHub] [spark] yabola commented on a diff in pull request #39950: [SPARK-42388][SQL] Avoid parquet footer reads twice in vectorized reader - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/04/01 12:37:08 UTC, 2 replies.
- [GitHub] [spark] Leibnizhu opened a new pull request, #40634: [SPARK-42840][SQL] Rename the error class _LEGACY_ERROR_TEMP_2004 to NO_DEFAULT_FOR_DATA_TYPE - posted by "Leibnizhu (via GitHub)" <gi...@apache.org> on 2023/04/01 12:48:18 UTC, 0 replies.
- [GitHub] [spark] grundprinzip commented on pull request #40603: [MINOR][CONNECT] Adding Proto Debug String to Job Description. - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/01 13:41:20 UTC, 1 replies.
- [GitHub] [spark] srowen commented on a diff in pull request #40613: [SPARK-42974][CORE] Restore `Utils.createTempDir` to use the `ShutdownHookManager` and clean up `JavaUtils.createTempDir` method. - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/01 13:45:46 UTC, 0 replies.
- [GitHub] [spark] juanvisoler commented on pull request #40608: [SPARK-35198][CONNECT][CORE][PYTHON][SQL] Add support for calling debugCodegen from Python & Java - posted by "juanvisoler (via GitHub)" <gi...@apache.org> on 2023/04/01 16:15:21 UTC, 1 replies.
- [GitHub] [spark] tamama commented on pull request #37206: [SPARK-39696][CORE] Ensure Concurrent r/w `TaskMetrics` not throw Exception - posted by "tamama (via GitHub)" <gi...@apache.org> on 2023/04/01 16:21:35 UTC, 2 replies.
- [GitHub] [spark] srowen commented on pull request #40619: fix typo in StorageLevel __eq__() - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/01 19:21:51 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #40620: fix typo in pyspark/pandas/config.py - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/01 19:22:06 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #40606: Debugging is awesome - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/01 19:22:56 UTC, 0 replies.
- [GitHub] [spark] srowen closed pull request #39880: typo: StogeLevel -> StorageLevel - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/01 19:24:00 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #39880: typo: StogeLevel -> StorageLevel - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/01 19:24:02 UTC, 0 replies.
- [GitHub] [spark] grundprinzip commented on pull request #40606: Debugging is awesome - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/01 19:24:26 UTC, 0 replies.
- [GitHub] [spark] grundprinzip closed pull request #40606: Debugging is awesome - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/01 19:24:27 UTC, 0 replies.
- [GitHub] [spark] srowen commented on a diff in pull request #40588: [SPARK-42964][SQL] PosgresDialect '42P07' also means table already exists - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/01 19:27:59 UTC, 1 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40613: [SPARK-42974][CORE] Restore `Utils.createTempDir` to use the `ShutdownHookManager` and clean up `JavaUtils.createTempDir` method. - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/01 22:53:11 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #39136: Create stable names for dynamically generated classes - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/02 00:20:10 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #39114: [SPARK-40708][SQL][WIP] Auto update partition statistics based on write metrics - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/02 00:20:11 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #35785: [SPARK-38213][STREAMING] Adding KafkaSink Metrics feature - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/02 00:20:12 UTC, 0 replies.
- [GitHub] [spark] clownxc closed pull request #40631: [SPARK-42860][SQL] Add analysed logical mode in org.apache.spark.sql.execution.ExplainMode - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/02 03:01:17 UTC, 0 replies.
- [GitHub] [spark] clownxc opened a new pull request, #40635: Add analysed logical mode in org.apache.spark.sql.execution.ExplainMode - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/02 03:01:41 UTC, 0 replies.
- [GitHub] [spark] clownxc opened a new pull request, #40636: [SPARK-42774][SQL]Expose VectorTypes API for DataSourceV2 Batch Scans - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/02 05:23:36 UTC, 0 replies.
- [GitHub] [spark] thyecust commented on pull request #40622: fix typo in ResourceRequest.equals() - posted by "thyecust (via GitHub)" <gi...@apache.org> on 2023/04/02 15:07:14 UTC, 0 replies.
- [GitHub] [spark] clownxc closed pull request #40636: [SPARK-42774][SQL]Expose VectorTypes API for DataSourceV2 Batch Scans - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/02 15:19:05 UTC, 0 replies.
- [GitHub] [spark] ShreyeshArangath opened a new pull request, #40637: Modify yarn client application report logging frequency to reduce noise - posted by "ShreyeshArangath (via GitHub)" <gi...@apache.org> on 2023/04/02 17:20:01 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40438: [SPARK-42806][SPARK-42811][CONNECT] Add `Catalog` support - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/02 23:49:59 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40438: [SPARK-42806][SPARK-42811][CONNECT] Add `Catalog` support - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/02 23:50:21 UTC, 0 replies.
- [GitHub] [spark] wangyum opened a new pull request, #40555: [SPARK-42926][BUILD][SQL] Upgrade Parquet to 1.12.4 - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/02 23:55:13 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40561: [SPARK-42931][SS] Introduce dropDuplicatesWithinWatermark - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/02 23:56:51 UTC, 7 replies.
- [GitHub] [spark] wangyum commented on pull request #40633: [SPARK-43000][SQL] Do not cast to double type in `PromoteStrings` - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/02 23:57:25 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #39184: [SPARK-41635][SQL] GROUP BY ALL - ansi mode test case - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/03 00:18:53 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #35785: [SPARK-38213][STREAMING] Adding KafkaSink Metrics feature - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/03 00:18:55 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40635: [SPARK-42860][SQL] Add analysed logical mode in org.apache.spark.sql.execution.ExplainMode - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 00:20:15 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40637: Modify yarn client application report logging frequency to reduce noise - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 00:20:47 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40633: [SPARK-43000][SQL] Do not cast to double type in `PromoteStrings` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 00:21:25 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40628: [SPARK-42999][Connect] Dataset#foreach, foreachPartition - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 00:23:15 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40607: [SPARK-42993][ML][CONNECT] Make PyTorch Distributor compatible with Spark Connect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/03 00:25:34 UTC, 5 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40615: [WIP][SPARK-16484][SQL] Add support for Datasketches HllSketch - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 00:26:54 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40614: [SPARK-42987][DOCS] Correction of protobuf sql documentation - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 00:28:12 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40607: [SPARK-42993][ML][CONNECT] Make PyTorch Distributor compatible with Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 00:35:25 UTC, 2 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40622: fix typo in ResourceRequest.equals() - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 00:37:35 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40607: [SPARK-42993][ML][CONNECT] Make PyTorch Distributor compatible with Spark Connect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/03 00:53:05 UTC, 22 replies.
- [GitHub] [spark] clownxc opened a new pull request, #40638: [SPARK-42774][SQL]Expose VectorTypes API for DataSourceV2 Batch Scans - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/03 01:04:36 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40525: [SPARK-42859][CONNECT][PS] Basic support for pandas API on Spark Connect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/03 01:31:38 UTC, 1 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40438: [SPARK-42806][SPARK-42811][CONNECT] Add `Catalog` support - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/03 02:02:27 UTC, 0 replies.
- [GitHub] [spark] WeichenXu123 commented on a diff in pull request #40607: [SPARK-42993][ML][CONNECT] Make PyTorch Distributor compatible with Spark Connect - posted by "WeichenXu123 (via GitHub)" <gi...@apache.org> on 2023/04/03 02:12:45 UTC, 14 replies.
- [GitHub] [spark] ulysses-you commented on pull request #32987: [SPARK-35564][SQL] Support subexpression elimination for conditionally evaluated expressions - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/03 02:33:39 UTC, 0 replies.
- [GitHub] [spark] itholic commented on a diff in pull request #40525: [SPARK-42859][CONNECT][PS] Basic support for pandas API on Spark Connect - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/03 02:40:37 UTC, 19 replies.
- [GitHub] [spark] srowen closed pull request #40622: [SPARK-43004][CORE] Fix typo in ResourceRequest.equals() - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/03 03:36:13 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #40622: [SPARK-43004][CORE] Fix typo in ResourceRequest.equals() - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/03 03:37:11 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40564: [SPARK-42519][CONNECT][TESTS] Add More WriteTo Tests In Spark Connect Client - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 05:29:30 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40564: [SPARK-42519][CONNECT][TESTS] Add More WriteTo Tests In Spark Connect Client - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 05:29:51 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on pull request #40564: [SPARK-42519][CONNECT][TESTS] Add More WriteTo Tests In Spark Connect Client - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/03 05:34:04 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on a diff in pull request #40616: [SPARK-42991][SQL] Disable string type +/- interval in ANSI mode - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/03 06:07:28 UTC, 1 replies.
- [GitHub] [spark] Yikf commented on pull request #40437: [SPARK-41259][SQL] SparkSQLDriver Output schema and result string should be consistent - posted by "Yikf (via GitHub)" <gi...@apache.org> on 2023/04/03 06:12:55 UTC, 8 replies.
- [GitHub] [spark] huaxingao commented on a diff in pull request #40630: [SPARK-42997][SQL] TableOutputResolver must use correct column paths in error messages for arrays and maps - posted by "huaxingao (via GitHub)" <gi...@apache.org> on 2023/04/03 06:17:34 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40639: [SPARK-43007][BUILD] Upgrade rocksdbjni to 8.0.0 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/03 06:59:06 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40630: [SPARK-42997][SQL] TableOutputResolver must use correct column paths in error messages for arrays and maps - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/03 07:43:20 UTC, 0 replies.
- [GitHub] [spark] yaooqinn closed pull request #40583: [SPARK-42955][SQL] Skip classifyException and wrap AnalysisException for SparkThrowable - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/03 07:51:30 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40583: [SPARK-42955][SQL] Skip classifyException and wrap AnalysisException for SparkThrowable - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/03 07:51:48 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40588: [SPARK-42964][SQL] PosgresDialect '42P07' also means table already exists - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/03 07:52:26 UTC, 1 replies.
- [GitHub] [spark] olaky commented on pull request #40124: [SPARK-37980][SQL] Access row_index via _metadata if possible in tests - posted by "olaky (via GitHub)" <gi...@apache.org> on 2023/04/03 08:10:55 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40640: [SPARK-43008][BUILD] Upgrade joda-time from 2.12.2 to 2.12.5 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/03 08:38:03 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40437: [SPARK-41259][SQL] SparkSQLDriver Output schema and result string should be consistent - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/03 08:40:34 UTC, 2 replies.
- [GitHub] [spark] jaceklaskowski commented on a diff in pull request #40607: [SPARK-42993][ML][CONNECT] Make PyTorch Distributor compatible with Spark Connect - posted by "jaceklaskowski (via GitHub)" <gi...@apache.org> on 2023/04/03 08:45:35 UTC, 0 replies.
- [GitHub] [spark] Leibnizhu commented on pull request #40634: [SPARK-42840][SQL] Rename the error class _LEGACY_ERROR_TEMP_2004 to NO_DEFAULT_FOR_DATA_TYPE - posted by "Leibnizhu (via GitHub)" <gi...@apache.org> on 2023/04/03 08:49:04 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable` - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/03 08:55:30 UTC, 6 replies.
- [GitHub] [spark] Yikf commented on a diff in pull request #40437: [SPARK-41259][SQL] SparkSQLDriver Output schema and result string should be consistent - posted by "Yikf (via GitHub)" <gi...@apache.org> on 2023/04/03 08:56:05 UTC, 3 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40437: [SPARK-41259][SQL] SparkSQLDriver Output schema and result string should be consistent - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/03 09:04:10 UTC, 1 replies.
- [GitHub] [spark] jaceklaskowski commented on a diff in pull request #40638: [SPARK-42774][SQL]Expose VectorTypes API for DataSourceV2 Batch Scans - posted by "jaceklaskowski (via GitHub)" <gi...@apache.org> on 2023/04/03 09:05:05 UTC, 3 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/03 09:06:05 UTC, 2 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/03 09:24:31 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40623: [SPARK-43009][SQL] Parameterized `sql()` with `Any` constants - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/03 09:29:42 UTC, 3 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40588: [SPARK-42964][SQL] PosgresDialect '42P07' also means table already exists - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/03 10:31:02 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40641: [SPARK-43011][SQL] `array_insert` should fail with 0 index - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/03 10:57:16 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40641: [SPARK-43011][SQL] `array_insert` should fail with 0 index - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/03 10:59:30 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40641: [SPARK-43011][SQL] `array_insert` should fail with 0 index - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/03 10:59:37 UTC, 1 replies.
- [GitHub] [spark] itholic opened a new pull request, #40642: [SPARK-43010][PYTHON] Migrate Column errors into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/03 10:59:48 UTC, 0 replies.
- [GitHub] [spark] itholic commented on pull request #39937: [SPARK-42309][SQL] Introduce `INCOMPATIBLE_DATA_TO_TABLE` and sub classes. - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/03 11:00:38 UTC, 1 replies.
- [GitHub] [spark] attilapiros commented on pull request #39775: [SPARK-42219][CORE] Introducing a config to close all active SparkContexts after the Main method has finished - posted by "attilapiros (via GitHub)" <gi...@apache.org> on 2023/04/03 11:04:15 UTC, 1 replies.
- [GitHub] [spark] itholic commented on a diff in pull request #40641: [SPARK-43011][SQL] `array_insert` should fail with 0 index - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/03 11:06:00 UTC, 2 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40641: [SPARK-43011][SQL] `array_insert` should fail with 0 index - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/03 11:41:01 UTC, 0 replies.
- [GitHub] [spark] eejbyfeldt commented on pull request #37206: [SPARK-39696][CORE] Ensure Concurrent r/w `TaskMetrics` not throw Exception - posted by "eejbyfeldt (via GitHub)" <gi...@apache.org> on 2023/04/03 11:44:45 UTC, 1 replies.
- [GitHub] [spark] itholic opened a new pull request, #40643: [SPARK-43013][PYTHON] Migrate `ValueError` from DataFrame into `PySparkValueError`. - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/03 11:52:24 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40525: [SPARK-42859][CONNECT][PS] Basic support for pandas API on Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 12:15:22 UTC, 24 replies.
- [GitHub] [spark] MaxGekk opened a new pull request, #40644: [MINOR][DOCS] Add Java 8 types to value types of Scala/Java APIs - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/03 12:29:57 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40623: [SPARK-43009][SQL] Parameterized `sql()` with `Any` constants - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/03 12:31:39 UTC, 0 replies.
- [GitHub] [spark] beliefer commented on a diff in pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable` - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/03 12:44:36 UTC, 0 replies.
- [GitHub] [spark] clownxc commented on a diff in pull request #40638: [SPARK-42774][SQL]Expose VectorTypes API for DataSourceV2 Batch Scans - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/03 13:01:11 UTC, 2 replies.
- [GitHub] [spark] zhouyifan279 opened a new pull request, #40645: [SPARK-43014] Do not override `spark.app.submitTime` in k8s cluster mode driver - posted by "zhouyifan279 (via GitHub)" <gi...@apache.org> on 2023/04/03 13:02:39 UTC, 0 replies.
- [GitHub] [spark] clownxc commented on pull request #40635: [SPARK-42860][SQL] Add analysed logical mode in org.apache.spark.sql.execution.ExplainMode - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/03 13:02:55 UTC, 1 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40438: [SPARK-42806][SPARK-42811][CONNECT] Add `Catalog` support - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/03 13:14:20 UTC, 0 replies.
- [GitHub] [spark] srowen closed pull request #40620: [SPARK-43005][PYSPARK] Fix typo in pyspark/pandas/config.py - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/03 13:24:28 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #40620: [SPARK-43005][PYSPARK] Fix typo in pyspark/pandas/config.py - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/03 13:25:09 UTC, 0 replies.
- [GitHub] [spark] srowen closed pull request #40619: [SPARK-43006][PYSPARK] Fix typo in StorageLevel __eq__() - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/03 13:26:09 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #40619: [SPARK-43006][PYSPARK] Fix typo in StorageLevel __eq__() - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/03 13:26:42 UTC, 0 replies.
- [GitHub] [spark] srowen closed pull request #40613: [SPARK-42974][CORE] Restore `Utils.createTempDir` to use the `ShutdownHookManager` and clean up `JavaUtils.createTempDir` method. - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/03 13:28:34 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #40613: [SPARK-42974][CORE] Restore `Utils.createTempDir` to use the `ShutdownHookManager` and clean up `JavaUtils.createTempDir` method. - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/03 13:28:37 UTC, 2 replies.
- [GitHub] [spark] jiangjiguang opened a new pull request, #40646: [WIP][SPARK-42696]Speed up parquet reading with Java Vector API - posted by "jiangjiguang (via GitHub)" <gi...@apache.org> on 2023/04/03 13:29:43 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40647: [SPARK-42974][CORE][3.4] Restore `Utils.createTempDir` to use the `ShutdownHookManager` and clean up `JavaUtils.createTempDir` method. - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/03 13:57:03 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40647: [SPARK-42974][CORE][3.4] Restore `Utils.createTempDir` to use the `ShutdownHookManager` and clean up `JavaUtils.createTempDir` method. - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/03 13:59:33 UTC, 1 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40646: [WIP][SPARK-42696]Speed up parquet reading with Java Vector API - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/03 14:00:18 UTC, 2 replies.
- [GitHub] [spark] MaxGekk commented on pull request #40644: [MINOR][DOCS] Add Java 8 types to value types of Scala/Java APIs - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/03 14:09:17 UTC, 0 replies.
- [GitHub] [spark] MaxGekk closed pull request #40644: [MINOR][DOCS] Add Java 8 types to value types of Scala/Java APIs - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/03 14:09:48 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40648: [MINOR][CONNECT][TESTS] Merge two `SparkVersion` test to one - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/03 14:31:36 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40581: [SPARK-42953][Connect] Typed filter, map, flatMap, mapPartitions - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/03 15:13:19 UTC, 4 replies.
- [GitHub] [spark] Hisoka-X opened a new pull request, #40649: [SPARK-41628][CONNECT][SERVER] The Design for support async query execution - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/03 15:41:30 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on pull request #40649: [SPARK-41628][CONNECT][SERVER] The Design for support async query execution - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/03 15:46:20 UTC, 2 replies.
- [GitHub] [spark] pengzhon-db commented on a diff in pull request #40586: [SPARK-42939][SS][CONNECT] Core streaming Python API for Spark Connect - posted by "pengzhon-db (via GitHub)" <gi...@apache.org> on 2023/04/03 16:01:11 UTC, 2 replies.
- [GitHub] [spark] liangyu-1 closed pull request #40621: Fix ExecutorAllocationManager cannot allocate new instances when all … - posted by "liangyu-1 (via GitHub)" <gi...@apache.org> on 2023/04/03 16:18:41 UTC, 0 replies.
- [GitHub] [spark] liangyu-1 opened a new pull request, #40621: Fix ExecutorAllocationManager cannot allocate new instances when all … - posted by "liangyu-1 (via GitHub)" <gi...@apache.org> on 2023/04/03 16:22:39 UTC, 0 replies.
- [GitHub] [spark] liangyu-1 commented on pull request #40621: Fix ExecutorAllocationManager cannot allocate new instances when all … - posted by "liangyu-1 (via GitHub)" <gi...@apache.org> on 2023/04/03 16:23:32 UTC, 1 reply.
- [GitHub] [spark] zhenlineo commented on a diff in pull request #40581: [SPARK-42953][Connect] Typed filter, map, flatMap, mapPartitions - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/03 16:39:44 UTC, 5 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40586: [SPARK-42939][SS][CONNECT] Core streaming Python API for Spark Connect - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/03 16:46:40 UTC, 4 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40586: [SPARK-42939][SS][CONNECT] Core streaming Python API for Spark Connect - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/03 16:49:35 UTC, 10 replies.
- [GitHub] [spark] grundprinzip commented on a diff in pull request #40586: [SPARK-42939][SS][CONNECT] Core streaming Python API for Spark Connect - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/03 17:30:28 UTC, 0 replies.
- [GitHub] [spark] liuzqt commented on pull request #40629: [SPARK-42980][CORE] Implement a lightweight SmallBroadcast - posted by "liuzqt (via GitHub)" <gi...@apache.org> on 2023/04/03 17:51:32 UTC, 3 replies.
- [GitHub] [spark] lucaspompeun commented on pull request #40614: [SPARK-42987][DOCS] Correction of protobuf sql documentation - posted by "lucaspompeun (via GitHub)" <gi...@apache.org> on 2023/04/03 17:53:47 UTC, 2 replies.
- [GitHub] [spark] ueshin commented on pull request #40619: [SPARK-43006][PYSPARK] Fix typo in StorageLevel __eq__() - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/03 17:57:42 UTC, 1 reply.
- [GitHub] [spark] HeartSaVioR commented on a diff in pull request #40561: [SPARK-42931][SS] Introduce dropDuplicatesWithinWatermark - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/03 18:15:55 UTC, 39 replies.
- [GitHub] [spark] mridulm commented on pull request #37206: [SPARK-39696][CORE] Ensure Concurrent r/w `TaskMetrics` not throw Exception - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/03 18:24:48 UTC, 1 reply.
- [GitHub] [spark] mridulm commented on pull request #40629: [SPARK-42980][CORE] Implement a lightweight SmallBroadcast - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/03 18:29:15 UTC, 5 replies.
- [GitHub] [spark] gengliangwang commented on a diff in pull request #40616: [SPARK-42991][SQL] Disable string type +/- interval in ANSI mode - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/03 18:30:51 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40641: [SPARK-43011][SQL] `array_insert` should fail with 0 index - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/03 18:33:06 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on pull request #40616: [SPARK-42991][SQL] Disable string type +/- interval in ANSI mode - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/03 18:33:29 UTC, 0 replies.
- [GitHub] [spark] ueshin opened a new pull request, #40650: [SPARK-43006][PYSPARK][TESTS] Fix DataFrameTests.test_cache_dataframe - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/03 18:40:07 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on a diff in pull request #40633: [SPARK-43000][SQL] Do not cast to double type in `PromoteStrings` - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/03 18:40:27 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #40650: [SPARK-43006][PYSPARK][TESTS] Fix DataFrameTests.test_cache_dataframe - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/03 18:50:54 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40628: [SPARK-42999][Connect] Dataset#foreach, foreachPartition - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/03 19:18:14 UTC, 3 replies.
- [GitHub] [spark] jaceklaskowski commented on a diff in pull request #34558: [SPARK-37019][SQL] Add codegen support to array higher-order functions - posted by "jaceklaskowski (via GitHub)" <gi...@apache.org> on 2023/04/03 19:37:40 UTC, 0 replies.
- [GitHub] [spark] ueshin commented on pull request #40015: [SPARK-42437][PYSPARK][CONNECT] PySpark catalog.cacheTable will allow to specify storage level - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/03 20:15:02 UTC, 0 replies.
- [GitHub] [spark] amaliujia opened a new pull request, #40651: [WIP] Add ordering to PhysicalDataType - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/03 22:07:09 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun closed pull request #40589: [SPARK-38697][SQL] Extend SparkSessionExtensions to inject rules into AQE query stage optimizer - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/03 22:12:56 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40589: [SPARK-38697][SQL] Extend SparkSessionExtensions to inject rules into AQE query stage optimizer - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/03 22:14:50 UTC, 2 replies.
- [GitHub] [spark] dongjoon-hyun closed pull request #40630: [SPARK-42997][SQL] TableOutputResolver must use correct column paths in error messages for arrays and maps - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/03 22:21:16 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40630: [SPARK-42997][SQL] TableOutputResolver must use correct column paths in error messages for arrays and maps - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/03 22:26:33 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40650: [SPARK-43006][PYTHON][TESTS] Fix DataFrameTests.test_cache_dataframe - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 23:02:55 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40650: [SPARK-43006][PYTHON][TESTS] Fix DataFrameTests.test_cache_dataframe - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 23:03:19 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40648: [MINOR][CONNECT][TESTS] Merge two `SparkVersion` related tests to one - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 23:12:59 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40648: [MINOR][CONNECT][TESTS] Merge two `SparkVersion` related tests to one - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/03 23:13:17 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on pull request #40614: [SPARK-42987][DOCS] Correction of protobuf sql documentation - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/03 23:58:16 UTC, 1 reply.
- [GitHub] [spark] dtenedor opened a new pull request, #40652: [SPARK-43018][SQL] Fix bug for INSERT commands with timetstamp literals - posted by "dtenedor (via GitHub)" <gi...@apache.org> on 2023/04/04 00:12:16 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #39184: [SPARK-41635][SQL] GROUP BY ALL - ansi mode test case - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/04 00:18:33 UTC, 0 replies.
- [GitHub] [spark] RyanBerti commented on a diff in pull request #40615: [WIP][SPARK-16484][SQL] Add support for Datasketches HllSketch - posted by "RyanBerti (via GitHub)" <gi...@apache.org> on 2023/04/04 00:28:35 UTC, 14 replies.
- [GitHub] [spark] srielau commented on a diff in pull request #39937: [SPARK-42309][SQL] Introduce `INCOMPATIBLE_DATA_TO_TABLE` and sub classes. - posted by "srielau (via GitHub)" <gi...@apache.org> on 2023/04/04 00:31:00 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40624: [SPARK-42995][CONNECT][PYTHON] Migrate Spark Connect DataFrame errors into error class - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/04 00:34:26 UTC, 1 reply.
- [GitHub] [spark] itholic commented on a diff in pull request #40624: [SPARK-42995][CONNECT][PYTHON] Migrate Spark Connect DataFrame errors into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/04 00:40:16 UTC, 5 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40633: [SPARK-43000][SQL] Do not cast to double type in `PromoteStrings` - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/04 00:42:01 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40629: [SPARK-42980][CORE] Implement a lightweight SmallBroadcast - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/04 00:49:40 UTC, 1 reply.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40438: [SPARK-42806][SPARK-42811][CONNECT] Add `Catalog` support - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/04 01:07:52 UTC, 3 replies.
- [GitHub] [spark] jiangjiguang commented on pull request #40646: [WIP][SPARK-42696]Speed up parquet reading with Java Vector API - posted by "jiangjiguang (via GitHub)" <gi...@apache.org> on 2023/04/04 01:17:04 UTC, 1 reply.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40651: [SPARK-43019][SQL] Move Ordering to PhysicalDataType - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/04 01:20:43 UTC, 21 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40651: [SPARK-43019][SQL] Move Ordering to PhysicalDataType - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/04 01:21:58 UTC, 18 replies.
- [GitHub] [spark] ulysses-you opened a new pull request, #40653: [SPARK-42963][SQL] Extend SparkSessionExtensions to inject rules into AQE query stage optimizer - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/04 01:26:02 UTC, 0 replies.
- [GitHub] [spark] ulysses-you commented on pull request #40589: [SPARK-38697][SQL] Extend SparkSessionExtensions to inject rules into AQE query stage optimizer - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/04 01:27:08 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40623: [SPARK-43009][SQL] Parameterized `sql()` with `Any` constants - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/04 02:06:15 UTC, 0 replies.
- [GitHub] [spark] hvanhovell closed pull request #40603: [MINOR][CONNECT] Adding Proto Debug String to Job Description. - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/04 02:09:37 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40649: [SPARK-41628][CONNECT][SERVER] The Design for support async query execution - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/04 02:14:26 UTC, 0 replies.
- [GitHub] [spark] grundprinzip commented on a diff in pull request #40623: [SPARK-43009][SQL] Parameterized `sql()` with `Any` constants - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/04 02:19:21 UTC, 1 reply.
- [GitHub] [spark] itholic commented on a diff in pull request #39937: [SPARK-42309][SQL] Introduce `INCOMPATIBLE_DATA_TO_TABLE` and sub classes. - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/04 02:20:36 UTC, 4 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40648: [MINOR][CONNECT][TESTS] Merge two `SparkVersion` related tests to one - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/04 02:21:42 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40623: [SPARK-43009][SQL] Parameterized `sql()` with `Any` constants - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/04 02:31:25 UTC, 2 replies.
- [GitHub] [spark] wangyum commented on a diff in pull request #40633: [SPARK-43000][SQL] Do not cast to double type in `PromoteStrings` - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/04 02:38:27 UTC, 0 replies.
- [GitHub] [spark] grundprinzip commented on pull request #40623: [SPARK-43009][SQL] Parameterized `sql()` with `Any` constants - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/04 02:43:20 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40654: [SPARK-43022][CONNECT] Support protobuf functions for Scala client - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/04 03:00:55 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40654: [SPARK-43022][CONNECT] Support protobuf functions for Scala client - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/04 03:09:00 UTC, 8 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40654: [SPARK-43022][CONNECT] Support protobuf functions for Scala client - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/04 03:11:50 UTC, 1 reply.
- [GitHub] [spark] aokolnychyi opened a new pull request, #40655: [SPARK-42855][SQL] Use runtime null checks in TableOutputResolver - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/04 03:23:53 UTC, 0 replies.
- [GitHub] [spark] aokolnychyi commented on a diff in pull request #40655: [SPARK-42855][SQL] Use runtime null checks in TableOutputResolver - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/04 03:24:56 UTC, 10 replies.
- [GitHub] [spark] aokolnychyi commented on pull request #40655: [SPARK-42855][SQL] Use runtime null checks in TableOutputResolver - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/04 03:33:38 UTC, 4 replies.
- [GitHub] [spark] itholic commented on pull request #40525: [SPARK-42859][CONNECT][PS] Basic support for pandas API on Spark Connect - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/04 03:34:22 UTC, 3 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40656: [SPARK-43023][CONNECT][TESTS] Add switch catalog testing scenario for `CatalogSuite` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/04 03:52:23 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40656: [SPARK-43023][CONNECT][TESTS] Add switch catalog testing scenario for `CatalogSuite` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/04 03:53:58 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR commented on pull request #40561: [SPARK-42931][SS] Introduce dropDuplicatesWithinWatermark - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/04 04:28:12 UTC, 6 replies.
- [GitHub] [spark] liang3zy22 opened a new pull request, #40657: [SPARK-42844][SQL]Update error_class "_LEGACY_ERROR_TEMP_2008" to "INVALID_URL_STRING" - posted by "liang3zy22 (via GitHub)" <gi...@apache.org> on 2023/04/04 04:51:38 UTC, 0 replies.
- [GitHub] [spark] zhouyifan279 commented on pull request #40645: [SPARK-43014] Do not overwrite `spark.app.submitTime` in k8s cluster mode driver - posted by "zhouyifan279 (via GitHub)" <gi...@apache.org> on 2023/04/04 04:52:40 UTC, 5 replies.
- [GitHub] [spark] aokolnychyi commented on a diff in pull request #40308: [SPARK-42151][SQL] Align UPDATE assignments with table attributes - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/04 05:08:08 UTC, 25 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40645: [SPARK-43014] Do not overwrite `spark.app.submitTime` in k8s cluster mode driver - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/04 05:23:49 UTC, 2 replies.
- [GitHub] [spark] itholic opened a new pull request, #40658: [SPARK-43024][PS] Upgrade pandas to 2.0.0 - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/04 05:42:57 UTC, 0 replies.
- [GitHub] [spark] itholic commented on a diff in pull request #40657: [SPARK-42844][SQL]Update error_class "_LEGACY_ERROR_TEMP_2008" to "INVALID_URL_STRING" - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/04 05:48:45 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40525: [SPARK-42859][CONNECT][PS] Basic support for pandas API on Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/04 06:09:08 UTC, 5 replies.
- [GitHub] [spark] khalidmammadov commented on pull request #40015: [SPARK-42437][PYSPARK][CONNECT] PySpark catalog.cacheTable will allow to specify storage level - posted by "khalidmammadov (via GitHub)" <gi...@apache.org> on 2023/04/04 06:10:10 UTC, 0 replies.
- [GitHub] [spark] liang3zy22 commented on a diff in pull request #40657: [SPARK-42844][SQL]Update error_class "_LEGACY_ERROR_TEMP_2008" to "INVALID_URL_STRING" - posted by "liang3zy22 (via GitHub)" <gi...@apache.org> on 2023/04/04 06:36:09 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun closed pull request #40653: [SPARK-42963][SQL] Extend SparkSessionExtensions to inject rules into AQE query stage optimizer - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/04 07:09:08 UTC, 0 replies.
- [GitHub] [spark] ulysses-you opened a new pull request, #40659: [SPARK-43026][SQL] Apply AQE with non-exchange table cache - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/04 07:16:20 UTC, 0 replies.
- [GitHub] [spark] ulysses-you commented on pull request #40659: [SPARK-43026][SQL] Apply AQE with non-exchange table cache - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/04 07:16:37 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on pull request #40641: [SPARK-43011][SQL] `array_insert` should fail with 0 index - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/04 07:20:15 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40647: [SPARK-42974][CORE][3.4] Restore `Utils.createTempDir` to use the `ShutdownHookManager` and clean up `JavaUtils.createTempDir` method. - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/04 07:21:47 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun closed pull request #40647: [SPARK-42974][CORE][3.4] Restore `Utils.createTempDir` to use the `ShutdownHookManager` and clean up `JavaUtils.createTempDir` method. - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/04 07:21:47 UTC, 0 replies.
- [GitHub] [spark] MaxGekk closed pull request #40641: [SPARK-43011][SQL] `array_insert` should fail with 0 index - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/04 07:22:28 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40655: [SPARK-42855][SQL] Use runtime null checks in TableOutputResolver - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/04 07:35:59 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40634: [SPARK-42840][SQL] Rename the error class _LEGACY_ERROR_TEMP_2004 to NO_DEFAULT_FOR_DATA_TYPE - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/04 07:56:00 UTC, 1 reply.
- [GitHub] [spark] yaooqinn closed pull request #40588: [SPARK-42964][SQL] PosgresDialect '42P07' also means table already exists - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/04 08:24:54 UTC, 0 replies.
- [GitHub] [spark] allisonwang-db opened a new pull request, #40660: [SPARK-43028][SQL] Add error class SQL_CONF_NOT_FOUND - posted by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/04/04 08:25:46 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40659: [SPARK-43026][SQL] Apply AQE with non-exchange table cache - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/04 09:11:36 UTC, 1 reply.
- [GitHub] [spark] zhouyifan279 commented on a diff in pull request #40645: [SPARK-43014] Do not overwrite `spark.app.submitTime` in k8s cluster mode driver - posted by "zhouyifan279 (via GitHub)" <gi...@apache.org> on 2023/04/04 09:25:47 UTC, 1 reply.
- [GitHub] [spark] DHKold commented on a diff in pull request #40491: [SPARK-41006][K8S] Generate new ConfigMap names for each run - posted by "DHKold (via GitHub)" <gi...@apache.org> on 2023/04/04 09:28:46 UTC, 0 replies.
- [GitHub] [spark] DHKold commented on pull request #40491: [SPARK-41006][K8S] Generate new ConfigMap names for each run - posted by "DHKold (via GitHub)" <gi...@apache.org> on 2023/04/04 09:29:05 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40657: [SPARK-42844][SQL]Update error_class "_LEGACY_ERROR_TEMP_2008" to "INVALID_URL_STRING" - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/04 09:31:12 UTC, 0 replies.
- [GitHub] [spark] WeichenXu123 commented on pull request #40607: [SPARK-42993][ML][CONNECT] Make PyTorch Distributor compatible with Spark Connect - posted by "WeichenXu123 (via GitHub)" <gi...@apache.org> on 2023/04/04 09:36:35 UTC, 1 reply.
- [GitHub] [spark] ulysses-you commented on a diff in pull request #40659: [SPARK-43026][SQL] Apply AQE with non-exchange table cache - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/04 09:46:37 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40639: [SPARK-43007][BUILD] Upgrade rocksdbjni to 8.0.0 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/04 09:51:02 UTC, 2 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40586: [SPARK-42939][SS][CONNECT] Core streaming Python API for Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/04 10:13:26 UTC, 13 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40660: [SPARK-43028][SQL] Add error class SQL_CONF_NOT_FOUND - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/04 10:38:42 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40632: [SPARK-42298][SQL] Assign name to _LEGACY_ERROR_TEMP_2132 - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/04 10:50:17 UTC, 4 replies.
- [GitHub] [spark] beliefer commented on pull request #40563: [SPARK-41232][SPARK-41233][FOLLOWUP] Refactor `array_append` and `array_prepend` with `RuntimeReplaceable` - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/04 11:41:22 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40609: [SPARK-42316][SQL] Assign name to _LEGACY_ERROR_TEMP_2044 - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/04 11:56:35 UTC, 0 replies.
- [GitHub] [spark] beliefer opened a new pull request, #40661: [SPARK-43025][SQL] Eliminate Union if filters have the same child plan - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/04 12:00:14 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40659: [SPARK-43026][SQL] Apply AQE with non-exchange table cache - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/04 13:13:18 UTC, 0 replies.
- [GitHub] [spark] Leibnizhu commented on a diff in pull request #40634: [SPARK-42840][SQL] Rename the error class _LEGACY_ERROR_TEMP_2004 to NO_DEFAULT_FOR_DATA_TYPE - posted by "Leibnizhu (via GitHub)" <gi...@apache.org> on 2023/04/04 13:14:27 UTC, 2 replies.
- [GitHub] [spark] cloud-fan closed pull request #40659: [SPARK-43026][SQL] Apply AQE with non-exchange table cache - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/04 13:14:46 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40258: [SPARK-42655][SQL] Incorrect ambiguous column reference error - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/04 13:15:39 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40258: [SPARK-42655][SQL] Incorrect ambiguous column reference error - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/04 13:16:16 UTC, 0 replies.
- [GitHub] [spark] liang3zy22 commented on pull request #40657: [SPARK-42844][SQL]Update error_class "_LEGACY_ERROR_TEMP_2008" to "INVALID_URL_STRING" - posted by "liang3zy22 (via GitHub)" <gi...@apache.org> on 2023/04/04 14:11:55 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #39937: [SPARK-42309][SQL] Introduce `INCOMPATIBLE_DATA_TO_TABLE` and sub classes. - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/04 14:27:46 UTC, 2 replies.
- [GitHub] [spark] cloud-fan opened a new pull request, #40662: [SPARK-43030][SQL] Deduplicate relations with metadata columns - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/04 14:38:50 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40662: [SPARK-43030][SQL] Deduplicate relations with metadata columns - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/04 14:39:10 UTC, 2 replies.
- [GitHub] [spark] MaxGekk commented on pull request #40657: [SPARK-42844][SQL]Update error_class "_LEGACY_ERROR_TEMP_2008" to "INVALID_URL_STRING" - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/04 14:41:26 UTC, 0 replies.
- [GitHub] [spark] eejbyfeldt opened a new pull request, #40663: [SPARK-39696][CORE][WIP] Test case showing race in access to TaskMetrics.externalAccums - posted by "eejbyfeldt (via GitHub)" <gi...@apache.org> on 2023/04/04 14:49:55 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40623: [SPARK-43009][SQL] Parameterized `sql()` with `Any` constants - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/04 15:22:58 UTC, 0 replies.
- [GitHub] [spark] srielau commented on a diff in pull request #38867: [SPARK-41234][SQL][PYTHON] Add `array_insert` function - posted by "srielau (via GitHub)" <gi...@apache.org> on 2023/04/04 15:44:25 UTC, 2 replies.
- [GitHub] [spark] MaxGekk commented on pull request #40623: [SPARK-43009][SQL] Parameterized `sql()` with `Any` constants - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/04 15:53:47 UTC, 1 reply.
- [GitHub] [spark] Hisoka-X commented on a diff in pull request #40609: [SPARK-42316][SQL] Assign name to _LEGACY_ERROR_TEMP_2044 - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/04 16:13:27 UTC, 1 reply.
- [GitHub] [spark] tanvn commented on pull request #38053: [SPARK-40600] Support recursiveFileLookup for partitioned datasource - posted by "tanvn (via GitHub)" <gi...@apache.org> on 2023/04/04 16:15:48 UTC, 0 replies.
- [GitHub] [spark] dzhigimont opened a new pull request, #40664: [SPARK-43024][PS][INFRA] Upgrade pandas to 2.0.0 - posted by "dzhigimont (via GitHub)" <gi...@apache.org> on 2023/04/04 16:22:52 UTC, 0 replies.
- [GitHub] [spark] srielau commented on a diff in pull request #40641: [SPARK-43011][SQL] `array_insert` should fail with 0 index - posted by "srielau (via GitHub)" <gi...@apache.org> on 2023/04/04 16:23:28 UTC, 0 replies.
- [GitHub] [spark] dzhigimont opened a new pull request, #40665: [SPARK-42621][PS] Add inclusive parameter for pd.date_range - posted by "dzhigimont (via GitHub)" <gi...@apache.org> on 2023/04/04 16:46:39 UTC, 0 replies.
- [GitHub] [spark] dtenedor commented on pull request #40652: [SPARK-43018][SQL] Fix bug for INSERT commands with timetstamp literals - posted by "dtenedor (via GitHub)" <gi...@apache.org> on 2023/04/04 17:26:24 UTC, 0 replies.
- [GitHub] [spark] MaxGekk opened a new pull request, #40666: [SPARK-43009][SQL][3.4] Parameterized `sql()` with `Any` constants - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/04 17:30:54 UTC, 0 replies.
- [GitHub] [spark] yliou commented on pull request #35939: [SPARK-38617][SQL][WEBUI] Show Spark rule and phase timings in SQL UI and REST API - posted by "yliou (via GitHub)" <gi...@apache.org> on 2023/04/04 17:38:03 UTC, 0 replies.
- [GitHub] [spark] tgravescs commented on pull request #40622: [SPARK-43004][CORE] Fix typo in ResourceRequest.equals() - posted by "tgravescs (via GitHub)" <gi...@apache.org> on 2023/04/04 18:02:41 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on a diff in pull request #40652: [SPARK-43018][SQL] Fix bug for INSERT commands with timetstamp literals - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/04 18:11:10 UTC, 11 replies.
- [GitHub] [spark] Kimahriman commented on pull request #32987: [SPARK-35564][SQL] Support subexpression elimination for conditionally evaluated expressions - posted by "Kimahriman (via GitHub)" <gi...@apache.org> on 2023/04/04 18:15:38 UTC, 0 replies.
- [GitHub] [spark] dtenedor commented on a diff in pull request #40652: [SPARK-43018][SQL] Fix bug for INSERT commands with timetstamp literals - posted by "dtenedor (via GitHub)" <gi...@apache.org> on 2023/04/04 18:19:01 UTC, 6 replies.
- [GitHub] [spark] amaliujia commented on pull request #40586: [SPARK-42939][SS][CONNECT] Core streaming Python API for Spark Connect - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/04 18:54:34 UTC, 0 replies.
- [GitHub] [spark] ksumit opened a new pull request, #40667: Improve IDE build experience against jdk11 - posted by "ksumit (via GitHub)" <gi...@apache.org> on 2023/04/04 19:12:08 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40645: [SPARK-43014] Do not overwrite `spark.app.submitTime` in k8s cluster mode driver - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/04 20:11:51 UTC, 1 reply.
- [GitHub] [spark] dongjoon-hyun commented on a diff in pull request #40655: [SPARK-42855][SQL] Use runtime null checks in TableOutputResolver - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/04 20:15:22 UTC, 0 replies.
- [GitHub] [spark] Kimahriman commented on a diff in pull request #34558: [SPARK-37019][SQL] Add codegen support to array higher-order functions - posted by "Kimahriman (via GitHub)" <gi...@apache.org> on 2023/04/04 20:44:15 UTC, 1 reply.
- [GitHub] [spark] Kimahriman commented on pull request #34558: [SPARK-37019][SQL] Add codegen support to array higher-order functions - posted by "Kimahriman (via GitHub)" <gi...@apache.org> on 2023/04/04 20:44:56 UTC, 0 replies.
- [GitHub] [spark] justaparth opened a new pull request, #40668: spark protobuf: add materializeDefaults option to spark-protobuf - posted by "justaparth (via GitHub)" <gi...@apache.org> on 2023/04/04 21:13:28 UTC, 0 replies.
- [GitHub] [spark] justaparth closed pull request #40668: spark protobuf: add materializeDefaults option to spark-protobuf - posted by "justaparth (via GitHub)" <gi...@apache.org> on 2023/04/04 21:13:36 UTC, 0 replies.
- [GitHub] [spark] shardulm94 commented on pull request #40637: [SPARK-43002][YARN] Modify yarn client application report logging frequency to reduce noise - posted by "shardulm94 (via GitHub)" <gi...@apache.org> on 2023/04/04 21:41:21 UTC, 1 reply.
- [GitHub] [spark] WweiL commented on a diff in pull request #40586: [SPARK-42939][SS][CONNECT] Core streaming Python API for Spark Connect - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/04 22:45:11 UTC, 0 replies.
- [GitHub] [spark] ueshin opened a new pull request, #40669: [SPARK-42983][CONNECT][PYTHON] Fix createDataFrame to handle 0-dim numpy array properly - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/04 23:10:52 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on a diff in pull request #40655: [SPARK-42855][SQL] Use runtime null checks in TableOutputResolver - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/04 23:17:49 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on pull request #40655: [SPARK-42855][SQL] Use runtime null checks in TableOutputResolver - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/04 23:35:53 UTC, 1 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40649: [SPARK-41628][CONNECT][SERVER] The Design for support async query execution - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/04 23:56:43 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40665: [SPARK-42621][PS] Add inclusive parameter for pd.date_range - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 00:08:46 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40664: [SPARK-43024][PS][INFRA] Upgrade pandas to 2.0.0 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 00:11:39 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40666: [SPARK-43009][SQL][3.4] Parameterized `sql()` with `Any` constants - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 00:12:00 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40666: [SPARK-43009][SQL][3.4] Parameterized `sql()` with `Any` constants - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 00:12:30 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon opened a new pull request, #40670: [MINOR][PYTHON][CONNECT][DOCS] Deduplicate versionchanged directive in Catalog - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 00:44:58 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40670: [MINOR][PYTHON][CONNECT][DOCS] Deduplicate versionchanged directive in Catalog - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 00:45:14 UTC, 1 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40611: [SPARK-42981][CONNECT] Add direct arrow serialization - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/05 00:57:29 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40651: [SPARK-43019][SQL] Move Ordering to PhysicalDataType - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/05 01:05:55 UTC, 13 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40670: [MINOR][PYTHON][CONNECT][DOCS] Deduplicate versionchanged directive in Catalog - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 01:48:22 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40124: [SPARK-37980][SQL] Access row_index via _metadata if possible in tests - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/05 01:55:35 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40124: [SPARK-37980][SQL] Access row_index via _metadata if possible in tests - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/05 01:55:52 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon opened a new pull request, #40671: [MINOR][CONNECT][DOCS] Clarify Spark Connect option in Spark scripts - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 01:56:25 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40671: [MINOR][CONNECT][DOCS] Clarify Spark Connect option in Spark scripts - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 01:56:45 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40662: [SPARK-43030][SQL] Deduplicate relations with metadata columns - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/05 02:58:49 UTC, 6 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40669: [SPARK-42983][CONNECT][PYTHON] Fix createDataFrame to handle 0-dim numpy array properly - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 03:04:03 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40669: [SPARK-42983][CONNECT][PYTHON] Fix createDataFrame to handle 0-dim numpy array properly - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 03:04:40 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on a diff in pull request #40652: [SPARK-43018][SQL] Fix bug for INSERT commands with timestamp literals - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/05 03:43:38 UTC, 1 replies.
- [GitHub] [spark] wankunde commented on a diff in pull request #40523: [SPARK-42897][SQL] Avoid evaluate more than once for the variables from the left side in the FullOuter SMJ condition - posted by "wankunde (via GitHub)" <gi...@apache.org> on 2023/04/05 04:56:12 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40671: [MINOR][CONNECT][DOCS] Clarify Spark Connect option in Spark scripts - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 06:01:04 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #39294: [SPARK-41537][INFRA][TESTS] Github Workflow Check for Breaking Changes in Spark Connect Proto - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 06:25:09 UTC, 2 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #39294: [SPARK-41537][INFRA][TESTS] Github Workflow Check for Breaking Changes in Spark Connect Proto - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 06:29:03 UTC, 0 replies.
- [GitHub] [spark] dzhigimont closed pull request #40664: [SPARK-43024][PS][INFRA] Upgrade pandas to 2.0.0 - posted by "dzhigimont (via GitHub)" <gi...@apache.org> on 2023/04/05 07:05:03 UTC, 0 replies.
- [GitHub] [spark] dzhigimont commented on a diff in pull request #40664: [SPARK-43024][PS][INFRA] Upgrade pandas to 2.0.0 - posted by "dzhigimont (via GitHub)" <gi...@apache.org> on 2023/04/05 07:05:59 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #37079: [SPARK-39610][INFRA] Add GITHUB_WORKSPACE to git trust safe.directory for container based job - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 07:18:06 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40657: [SPARK-42844][SQL]Update error_class "_LEGACY_ERROR_TEMP_2008" to "INVALID_URL" - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/05 07:22:11 UTC, 1 replies.
- [GitHub] [spark] allisonwang-db opened a new pull request, #40672: [SPARK-43035][CONNECT] Add error class in Spark Connect server's ErrorInfo - posted by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/04/05 07:36:47 UTC, 0 replies.
- [GitHub] [spark] liang3zy22 commented on a diff in pull request #40657: [SPARK-42844][SQL] Update the error class `_LEGACY_ERROR_TEMP_2008` to `INVALID_URL` - posted by "liang3zy22 (via GitHub)" <gi...@apache.org> on 2023/04/05 07:38:37 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on pull request #40609: [SPARK-42316][SQL] Assign name to _LEGACY_ERROR_TEMP_2044 - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/05 07:40:55 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon opened a new pull request, #40673: [SPARK-41537][INFRA][CONNECT][FOLLOW-UP] Removes breaking changes within master branch - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 07:40:58 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40673: [SPARK-41537][INFRA][CONNECT][FOLLOW-UP] Removes breaking changes within master branch - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 07:43:13 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40657: [SPARK-42844][SQL] Update the error class `_LEGACY_ERROR_TEMP_2008` to `INVALID_URL` - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/05 07:47:38 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on pull request #40609: [SPARK-42316][SQL] Assign name to _LEGACY_ERROR_TEMP_2044 - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/05 07:50:44 UTC, 1 replies.
- [GitHub] [spark] MaxGekk closed pull request #40609: [SPARK-42316][SQL] Assign name to _LEGACY_ERROR_TEMP_2044 - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/05 07:58:31 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40672: [SPARK-43035][CONNECT] Add error class in Spark Connect server's ErrorInfo - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 08:03:39 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40655: [SPARK-42855][SQL] Use runtime null checks in TableOutputResolver - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/05 08:17:14 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40655: [SPARK-42855][SQL] Use runtime null checks in TableOutputResolver - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/05 08:18:14 UTC, 0 replies.
- [GitHub] [spark] Yikun commented on pull request #37079: [SPARK-39610][INFRA] Add GITHUB_WORKSPACE to git trust safe.directory for container based job - posted by "Yikun (via GitHub)" <gi...@apache.org> on 2023/04/05 10:01:53 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon opened a new pull request, #40674: [MINOR][INFRA] Remove workaround for CVE-2022-24765 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 10:23:18 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40624: [SPARK-42995][CONNECT][PYTHON] Migrate Spark Connect DataFrame errors into error class - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 10:33:40 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40624: [SPARK-42995][CONNECT][PYTHON] Migrate Spark Connect DataFrame errors into error class - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 10:33:59 UTC, 0 replies.
- [GitHub] [spark] vicennial opened a new pull request, #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "vicennial (via GitHub)" <gi...@apache.org> on 2023/04/05 11:16:16 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40674: [MINOR][INFRA] Remove workaround for CVE-2022-24765 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 11:44:51 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40674: [MINOR][INFRA] Remove workaround for CVE-2022-24765 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 11:45:15 UTC, 0 replies.
- [GitHub] [spark] eejbyfeldt commented on pull request #40663: [SPARK-39696][CORE] Test case showing race in access to TaskMetrics.externalAccums - posted by "eejbyfeldt (via GitHub)" <gi...@apache.org> on 2023/04/05 12:08:41 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40663: [SPARK-39696][CORE] Fix data race in access to TaskMetrics.externalAccums - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/05 12:27:26 UTC, 5 replies.
- [GitHub] [spark] juliuszsompolski opened a new pull request, #40676: [SPARK-42656][FOLLOWUP] Add BUILD option to Spark Connect scripts - posted by "juliuszsompolski (via GitHub)" <gi...@apache.org> on 2023/04/05 14:06:16 UTC, 0 replies.
- [GitHub] [spark] juliuszsompolski commented on pull request #40676: [SPARK-42656][FOLLOWUP] Add BUILD option to Spark Connect scripts - posted by "juliuszsompolski (via GitHub)" <gi...@apache.org> on 2023/04/05 14:06:39 UTC, 3 replies.
- [GitHub] [spark] VindhyaG commented on pull request #40553: [SPARK-39722] [SQL] getString API for Dataset - posted by "VindhyaG (via GitHub)" <gi...@apache.org> on 2023/04/05 14:35:17 UTC, 0 replies.
- [GitHub] [spark] ryan-johnson-databricks opened a new pull request, #40677: [SPARK-43039] Support custom fields in the file source _metadata column. - posted by "ryan-johnson-databricks (via GitHub)" <gi...@apache.org> on 2023/04/05 15:45:41 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/05 15:49:56 UTC, 7 replies.
- [GitHub] [spark] dtenedor commented on a diff in pull request #40652: [SPARK-43018][SQL] Fix bug for INSERT commands with timestamp literals - posted by "dtenedor (via GitHub)" <gi...@apache.org> on 2023/04/05 16:30:10 UTC, 1 replies.
- [GitHub] [spark] xinrong-meng commented on a diff in pull request #40525: [SPARK-42859][CONNECT][PS] Basic support for pandas API on Spark Connect - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/05 17:48:28 UTC, 0 replies.
- [GitHub] [spark] xinrong-meng commented on a diff in pull request #40665: [SPARK-42621][PS] Add inclusive parameter for pd.date_range - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/05 18:12:35 UTC, 0 replies.
- [GitHub] [spark] ueshin commented on a diff in pull request #40525: [SPARK-42859][CONNECT][PS] Basic support for pandas API on Spark Connect - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/05 18:34:07 UTC, 1 replies.
- [GitHub] [spark] viirya commented on a diff in pull request #40662: [SPARK-43030][SQL] Deduplicate relations with metadata columns - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/05 18:34:18 UTC, 2 replies.
- [GitHub] [spark] tianhanhu opened a new pull request, #40678: [SPARK-43040][SQL] Improve TimestampNTZ type support in JDBC data source - posted by "tianhanhu (via GitHub)" <gi...@apache.org> on 2023/04/05 18:50:06 UTC, 0 replies.
- [GitHub] [spark] eejbyfeldt commented on pull request #40663: [SPARK-39696][CORE] Fix data race in access to TaskMetrics.externalAccums - posted by "eejbyfeldt (via GitHub)" <gi...@apache.org> on 2023/04/05 19:20:27 UTC, 0 replies.
- [GitHub] [spark] tgravescs commented on pull request #40637: [SPARK-43002][YARN] Modify yarn client application report logging frequency to reduce noise - posted by "tgravescs (via GitHub)" <gi...@apache.org> on 2023/04/05 19:21:11 UTC, 1 replies.
- [GitHub] [spark] robreeves commented on a diff in pull request #40637: [SPARK-43002][YARN] Modify yarn client application report logging frequency to reduce noise - posted by "robreeves (via GitHub)" <gi...@apache.org> on 2023/04/05 20:06:46 UTC, 0 replies.
- [GitHub] [spark] robreeves commented on pull request #40637: [SPARK-43002][YARN] Modify yarn client application report logging frequency to reduce noise - posted by "robreeves (via GitHub)" <gi...@apache.org> on 2023/04/05 20:11:27 UTC, 0 replies.
- [GitHub] [spark] ShreyeshArangath commented on pull request #40637: [SPARK-43002][YARN] Modify yarn client application report logging frequency to reduce noise - posted by "ShreyeshArangath (via GitHub)" <gi...@apache.org> on 2023/04/05 20:13:53 UTC, 1 replies.
- [GitHub] [spark] aokolnychyi opened a new pull request, #40679: [SPARK-43041][SQL] Restore constructors of exceptions for compatibility in connector API - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/05 20:17:25 UTC, 0 replies.
- [GitHub] [spark] aokolnychyi commented on pull request #40679: [SPARK-43041][SQL] Restore constructors of exceptions for compatibility in connector API - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/05 20:18:52 UTC, 4 replies.
- [GitHub] [spark] aokolnychyi commented on a diff in pull request #40679: [SPARK-43041][SQL] Restore constructors of exceptions for compatibility in connector API - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/05 20:22:27 UTC, 13 replies.
- [GitHub] [spark] dtenedor commented on a diff in pull request #40662: [SPARK-43030][SQL] Deduplicate relations with metadata columns - posted by "dtenedor (via GitHub)" <gi...@apache.org> on 2023/04/05 20:58:41 UTC, 0 replies.
- [GitHub] [spark] tirumaleshn2458 opened a new pull request, #40680: Master clone - posted by "tirumaleshn2458 (via GitHub)" <gi...@apache.org> on 2023/04/05 21:05:46 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on a diff in pull request #40679: [SPARK-43041][SQL] Restore constructors of exceptions for compatibility in connector API - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/05 21:13:02 UTC, 3 replies.
- [GitHub] [spark] sadikovi commented on a diff in pull request #40678: [SPARK-43040][SQL] Improve TimestampNTZ type support in JDBC data source - posted by "sadikovi (via GitHub)" <gi...@apache.org> on 2023/04/05 22:21:49 UTC, 1 replies.
- [GitHub] [spark] tianhanhu-db commented on a diff in pull request #40678: [SPARK-43040][SQL] Improve TimestampNTZ type support in JDBC data source - posted by "tianhanhu-db (via GitHub)" <gi...@apache.org> on 2023/04/05 22:36:00 UTC, 17 replies.
- [GitHub] [spark] HyukjinKwon opened a new pull request, #40681: [MINOR][INFRA] Partially brings workaround for CVE-2022-24765 back. - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 22:41:10 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40561: [SPARK-42931][SS] Introduce dropDuplicatesWithinWatermark - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/05 22:44:16 UTC, 8 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40681: [MINOR][INFRA] Partially brings workaround for CVE-2022-24765 back. - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 22:45:36 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40681: [MINOR][INFRA] Partially brings workaround for CVE-2022-24765 back. - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/05 22:45:37 UTC, 0 replies.
- [GitHub] [spark] clownxc commented on pull request #40638: [SPARK-42774][SQL]Expose VectorTypes API for DataSourceV2 Batch Scans - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/05 23:40:56 UTC, 2 replies.
- [GitHub] [spark] amaliujia commented on pull request #40651: [SPARK-43019][SQL] Move Ordering to PhysicalDataType - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/05 23:45:23 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on pull request #40663: [SPARK-39696][CORE] Fix data race in access to TaskMetrics.externalAccums - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/06 00:04:29 UTC, 1 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40672: [SPARK-43035][CONNECT] Add error class in Spark Connect server's ErrorInfo - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/06 00:04:38 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on a diff in pull request #40663: [SPARK-39696][CORE] Fix data race in access to TaskMetrics.externalAccums - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/06 00:07:48 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40679: [SPARK-43041][SQL] Restore constructors of exceptions for compatibility in connector API - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/06 00:46:48 UTC, 0 replies.
- [GitHub] [spark] allisonwang-db commented on a diff in pull request #40672: [SPARK-43035][CONNECT] Add error class in Spark Connect server's ErrorInfo - posted by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/04/06 00:52:15 UTC, 1 replies.
- [GitHub] [spark] panbingkun opened a new pull request, #40682: [SPARK-43044][CONNECT][BUILD] Upgrade buf to v1.17.0 - posted by "panbingkun (via GitHub)" <gi...@apache.org> on 2023/04/06 01:37:21 UTC, 0 replies.
- [GitHub] [spark] frankliee commented on pull request #40646: [WIP][SPARK-42696]Speed up parquet reading with Java Vector API - posted by "frankliee (via GitHub)" <gi...@apache.org> on 2023/04/06 02:09:59 UTC, 0 replies.
- [GitHub] [spark] itholic commented on a diff in pull request #40672: [SPARK-43035][CONNECT] Add error class in Spark Connect server's ErrorInfo - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/06 02:10:12 UTC, 5 replies.
- [GitHub] [spark] sadikovi commented on pull request #40678: [SPARK-43040][SQL] Improve TimestampNTZ type support in JDBC data source - posted by "sadikovi (via GitHub)" <gi...@apache.org> on 2023/04/06 02:21:45 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40282: [SPARK-42672][PYTHON][DOCS] Document error class list - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/06 02:59:17 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40642: [SPARK-43010][PYTHON] Migrate Column errors into error class - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/06 02:59:33 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40282: [SPARK-42672][PYTHON][DOCS] Document error class list - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/06 02:59:58 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40642: [SPARK-43010][PYTHON] Migrate Column errors into error class - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/06 03:00:00 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40282: [SPARK-42672][PYTHON][DOCS] Document error class list - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/06 03:00:26 UTC, 0 replies.
- [GitHub] [spark] itholic commented on a diff in pull request #40282: [SPARK-42672][PYTHON][DOCS] Document error class list - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/06 03:02:13 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40586: [SPARK-42939][SS][CONNECT] Core streaming Python API for Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/06 03:25:40 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40586: [SPARK-42939][SS][CONNECT] Core streaming Python API for Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/06 03:26:00 UTC, 0 replies.
- [GitHub] [spark] allisonwang-db commented on pull request #40660: [SPARK-43028][SQL] Add error class SQL_CONF_NOT_FOUND - posted by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/04/06 03:27:42 UTC, 0 replies.
- [GitHub] [spark] gatorsmile commented on a diff in pull request #40672: [SPARK-43035][CONNECT] Add error class in Spark Connect server's ErrorInfo - posted by "gatorsmile (via GitHub)" <gi...@apache.org> on 2023/04/06 03:59:28 UTC, 0 replies.
- [GitHub] [spark] ueshin commented on a diff in pull request #40672: [SPARK-43035][CONNECT] Add error class in Spark Connect server's ErrorInfo - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/06 04:12:39 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on pull request #40652: [SPARK-43018][SQL] Fix bug for INSERT commands with timestamp literals - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/06 04:16:41 UTC, 0 replies.
- [GitHub] [spark] gengliangwang closed pull request #40652: [SPARK-43018][SQL] Fix bug for INSERT commands with timestamp literals - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/06 04:17:04 UTC, 0 replies.
- [GitHub] [spark] zsxwing commented on a diff in pull request #40561: [SPARK-42931][SS] Introduce dropDuplicatesWithinWatermark - posted by "zsxwing (via GitHub)" <gi...@apache.org> on 2023/04/06 05:00:59 UTC, 2 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40678: [SPARK-43040][SQL] Improve TimestampNTZ type support in JDBC data source - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/06 05:08:29 UTC, 14 replies.
- [GitHub] [spark] gengliangwang commented on pull request #40679: [SPARK-43041][SQL] Restore constructors of exceptions for compatibility in connector API - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/06 05:10:31 UTC, 0 replies.
- [GitHub] [spark] gengliangwang closed pull request #40679: [SPARK-43041][SQL] Restore constructors of exceptions for compatibility in connector API - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/06 05:10:59 UTC, 0 replies.
- [GitHub] [spark] shrprasa commented on pull request #30057: [SPARK-32838][SQL]Check DataSource insert command path with actual path - posted by "shrprasa (via GitHub)" <gi...@apache.org> on 2023/04/06 05:27:31 UTC, 0 replies.
- [GitHub] [spark] shrprasa commented on pull request #35608: [SPARK-32838][SQL] Static partition overwrite could use staging dir insert - posted by "shrprasa (via GitHub)" <gi...@apache.org> on 2023/04/06 05:31:52 UTC, 0 replies.
- [GitHub] [spark] beliefer commented on a diff in pull request #40678: [SPARK-43040][SQL] Improve TimestampNTZ type support in JDBC data source - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/06 05:41:33 UTC, 9 replies.
- [GitHub] [spark] yaooqinn opened a new pull request, #40683: [SPARK-43049][SQL] Use CLOB instead of VARCHAR(255) for StringType for Oracle JDBC - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/06 05:54:28 UTC, 0 replies.
- [GitHub] [spark] eejbyfeldt commented on a diff in pull request #40663: [SPARK-39696][CORE] Fix data race in access to TaskMetrics.externalAccums - posted by "eejbyfeldt (via GitHub)" <gi...@apache.org> on 2023/04/06 06:10:15 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on pull request #40657: [SPARK-42844][SQL] Update the error class `_LEGACY_ERROR_TEMP_2008` to `INVALID_URL` - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/06 06:53:54 UTC, 2 replies.
- [GitHub] [spark] MaxGekk commented on pull request #40634: [SPARK-42840][SQL] Rename the error class _LEGACY_ERROR_TEMP_2004 to NO_DEFAULT_FOR_DATA_TYPE - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/06 06:56:45 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on pull request #40660: [SPARK-43028][SQL] Add error class SQL_CONF_NOT_FOUND - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/06 07:00:07 UTC, 1 replies.
- [GitHub] [spark] liang3zy22 commented on pull request #40657: [SPARK-42844][SQL] Update the error class `_LEGACY_ERROR_TEMP_2008` to `INVALID_URL` - posted by "liang3zy22 (via GitHub)" <gi...@apache.org> on 2023/04/06 07:04:48 UTC, 1 replies.
- [GitHub] [spark] vicennial commented on a diff in pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "vicennial (via GitHub)" <gi...@apache.org> on 2023/04/06 07:08:52 UTC, 12 replies.
- [GitHub] [spark] MaxGekk closed pull request #40657: [SPARK-42844][SQL] Update the error class `_LEGACY_ERROR_TEMP_2008` to `INVALID_URL` - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/06 07:10:47 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40663: [SPARK-39696][CORE] Fix data race in access to TaskMetrics.externalAccums - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/06 07:42:04 UTC, 7 replies.
- [GitHub] [spark] Yikun commented on a diff in pull request #40607: [SPARK-42993][ML][CONNECT] Make PyTorch Distributor compatible with Spark Connect - posted by "Yikun (via GitHub)" <gi...@apache.org> on 2023/04/06 08:25:18 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40603: [MINOR][CONNECT] Adding Proto Debug String to Job Description. - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/06 08:46:30 UTC, 1 replies.
- [GitHub] [spark] grundprinzip commented on a diff in pull request #40603: [MINOR][CONNECT] Adding Proto Debug String to Job Description. - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/06 09:06:17 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40643: [SPARK-43013][PYTHON] Migrate `ValueError` from DataFrame into `PySparkValueError`. - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/06 09:10:26 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40643: [SPARK-43013][PYTHON] Migrate `ValueError` from DataFrame into `PySparkValueError`. - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/06 09:10:46 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40603: [MINOR][CONNECT] Adding Proto Debug String to Job Description. - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/06 09:37:22 UTC, 0 replies.
- [GitHub] [spark] frankliee commented on a diff in pull request #40646: [WIP][SPARK-42696]Speed up parquet reading with Java Vector API - posted by "frankliee (via GitHub)" <gi...@apache.org> on 2023/04/06 10:00:42 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40678: [SPARK-43040][SQL] Improve TimestampNTZ type support in JDBC data source - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/06 12:26:28 UTC, 0 replies.
- [GitHub] [spark] Leibnizhu commented on pull request #40634: [SPARK-42840][SQL] Change `_LEGACY_ERROR_TEMP_2004` error to internal error - posted by "Leibnizhu (via GitHub)" <gi...@apache.org> on 2023/04/06 12:28:29 UTC, 1 replies.
- [GitHub] [spark] allisonwang-db commented on a diff in pull request #40662: [SPARK-43030][SQL] Deduplicate relations with metadata columns - posted by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/04/06 12:28:32 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on pull request #40634: [SPARK-42840][SQL] Change `_LEGACY_ERROR_TEMP_2004` error to internal error - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/06 12:52:46 UTC, 1 replies.
- [GitHub] [spark] MaxGekk closed pull request #40634: [SPARK-42840][SQL] Change `_LEGACY_ERROR_TEMP_2004` error to internal error - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/06 12:53:20 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40437: [SPARK-41259][SQL] SparkSQLDriver Output schema and result string should be consistent - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/06 13:17:47 UTC, 6 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40563: [SPARK-41233][FOLLOWUP] Refactor `array_prepend` with `RuntimeReplaceable` - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/06 13:21:52 UTC, 2 replies.
- [GitHub] [spark] RossKen commented on pull request #34442: [SPARK-37165][SQL] Add REPEATABLE in TABLESAMPLE to specify seed - posted by "RossKen (via GitHub)" <gi...@apache.org> on 2023/04/06 13:26:47 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40523: [SPARK-42897][SQL] Avoid evaluate variables multiple times for SMJ and SHJ fullOuter join - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/06 13:34:15 UTC, 8 replies.
- [GitHub] [spark] Hisoka-X opened a new pull request, #40684: [SPARK-41532][CONNECT][CLIENT] Add check for operations that involve multiple data frames - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/06 13:59:58 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on a diff in pull request #40684: [SPARK-41532][CONNECT][CLIENT] Add check for operations that involve multiple data frames - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/06 14:03:46 UTC, 6 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40684: [SPARK-41532][CONNECT][CLIENT] Add check for operations that involve multiple data frames - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/06 14:26:37 UTC, 6 replies.
- [GitHub] [spark] wankunde commented on a diff in pull request #40523: [SPARK-42897][SQL] Avoid evaluate variables multiple times for SMJ and SHJ fullOuter join - posted by "wankunde (via GitHub)" <gi...@apache.org> on 2023/04/06 14:38:14 UTC, 3 replies.
- [GitHub] [spark] wangyum opened a new pull request, #40685: [SPARK-43050][SQL] Fix construct aggregate expressions by replacing grouping functions - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/06 14:48:45 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on pull request #40621: Fix ExecutorAllocationManager cannot allocate new instances when all … - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/06 14:53:27 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on a diff in pull request #40685: [SPARK-43050][SQL] Fix construct aggregate expressions by replacing grouping functions - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/06 14:56:40 UTC, 5 replies.
- [GitHub] [spark] justaparth opened a new pull request, #40686: [SPARK-43051] Add option to materialize zero values when deserializing protobufs - posted by "justaparth (via GitHub)" <gi...@apache.org> on 2023/04/06 15:03:32 UTC, 0 replies.
- [GitHub] [spark] tgravescs commented on a diff in pull request #40637: [SPARK-43002][YARN] Modify yarn client application report logging frequency to reduce noise - posted by "tgravescs (via GitHub)" <gi...@apache.org> on 2023/04/06 15:08:31 UTC, 0 replies.
- [GitHub] [spark] warrenzhu25 opened a new pull request, #40687: [SPARK-43052][CORE] Handle stacktrace with null file name in event log - posted by "warrenzhu25 (via GitHub)" <gi...@apache.org> on 2023/04/06 15:19:02 UTC, 0 replies.
- [GitHub] [spark] warrenzhu25 commented on pull request #39280: [SPARK-41766][CORE] Handle decommission request sent before executor registration - posted by "warrenzhu25 (via GitHub)" <gi...@apache.org> on 2023/04/06 15:19:51 UTC, 1 replies.
- [GitHub] [spark] huaxingao commented on pull request #34442: [SPARK-37165][SQL] Add REPEATABLE in TABLESAMPLE to specify seed - posted by "huaxingao (via GitHub)" <gi...@apache.org> on 2023/04/06 15:33:28 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on pull request #40687: [SPARK-43052][CORE] Handle stacktrace with null file name in event log - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/06 16:02:09 UTC, 4 replies.
- [GitHub] [spark] warrenzhu25 commented on pull request #40687: [SPARK-43052][CORE] Handle stacktrace with null file name in event log - posted by "warrenzhu25 (via GitHub)" <gi...@apache.org> on 2023/04/06 16:06:32 UTC, 5 replies.
- [GitHub] [spark] mridulm commented on a diff in pull request #40687: [SPARK-43052][CORE] Handle stacktrace with null file name in event log - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/06 16:06:37 UTC, 2 replies.
- [GitHub] [spark] mridulm commented on pull request #39280: [SPARK-41766][CORE] Handle decommission request sent before executor registration - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/06 16:09:50 UTC, 0 replies.
- [GitHub] [spark] zzzzming95 opened a new pull request, #40688: [SPARK-43021] `CoalesceBucketsInJoin` not work when using AQE - posted by "zzzzming95 (via GitHub)" <gi...@apache.org> on 2023/04/06 16:12:14 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40677: [SPARK-43039] Support custom fields in the file source _metadata column. - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/06 16:30:50 UTC, 1 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #39280: [SPARK-41766][CORE] Handle decommission request sent before executor registration - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/06 16:43:13 UTC, 1 replies.
- [GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39280: [SPARK-41766][CORE] Handle decommission request sent before executor registration - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/06 16:45:51 UTC, 2 replies.
- [GitHub] [spark] ryan-johnson-databricks commented on a diff in pull request #40677: [SPARK-43039] Support custom fields in the file source _metadata column. - posted by "ryan-johnson-databricks (via GitHub)" <gi...@apache.org> on 2023/04/06 17:03:53 UTC, 2 replies.
- [GitHub] [spark] WweiL opened a new pull request, #40689: [SPARK-42951][SS][Connect] DataStreamReader APIs - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/06 17:56:40 UTC, 0 replies.
- [GitHub] [spark] dbtsai commented on pull request #40686: [SPARK-43051] Add option to materialize zero values when deserializing protobufs - posted by "dbtsai (via GitHub)" <gi...@apache.org> on 2023/04/06 18:00:14 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40581: [SPARK-42953][Connect] Typed filter, map, flatMap, mapPartitions - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/06 18:56:40 UTC, 0 replies.
- [GitHub] [spark] hvanhovell closed pull request #40581: [SPARK-42953][Connect] Typed filter, map, flatMap, mapPartitions - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/06 19:26:03 UTC, 0 replies.
- [GitHub] [spark] viirya commented on a diff in pull request #40686: [SPARK-43051] Add option to materialize zero values when deserializing protobufs - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/06 19:44:43 UTC, 1 replies.
- [GitHub] [spark] pang-wu commented on a diff in pull request #40686: [SPARK-43051] Add option to materialize zero values when deserializing protobufs - posted by "pang-wu (via GitHub)" <gi...@apache.org> on 2023/04/06 22:01:26 UTC, 4 replies.
- [GitHub] [spark] jiangxb1987 opened a new pull request, #40690: [SPARK-43043][CORE] Improve the performance of MapOutputTracker.updateMapOutput - posted by "jiangxb1987 (via GitHub)" <gi...@apache.org> on 2023/04/06 22:10:32 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on pull request #40685: [SPARK-43050][SQL] Fix construct aggregate expressions by replacing grouping functions - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/06 22:18:04 UTC, 2 replies.
- [GitHub] [spark] WweiL opened a new pull request, #40691: [SPARK-43031] [SS] [Connect] Enable unit test and doctest for streaming - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/06 22:24:42 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on a diff in pull request #40691: [SPARK-43031] [SS] [Connect] Enable unit test and doctest for streaming - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/06 22:26:18 UTC, 22 replies.
- [GitHub] [spark] warrenzhu25 commented on a diff in pull request #40687: [SPARK-43052][CORE] Handle stacktrace with null file name in event log - posted by "warrenzhu25 (via GitHub)" <gi...@apache.org> on 2023/04/06 22:40:40 UTC, 2 replies.
- [GitHub] [spark] justaparth commented on a diff in pull request #40686: [SPARK-43051] Add option to materialize zero values when deserializing protobufs - posted by "justaparth (via GitHub)" <gi...@apache.org> on 2023/04/06 22:47:45 UTC, 5 replies.
- [GitHub] [spark] xinrong-meng commented on pull request #40628: [SPARK-42999][Connect] Dataset#foreach, foreachPartition - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/06 23:08:07 UTC, 0 replies.
- [GitHub] [spark] ueshin opened a new pull request, #40692: [SPARK-43055][CONNECT][PYTHON] Support duplicated nested field names - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/06 23:42:54 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40673: [SPARK-41537][INFRA][CONNECT][FOLLOW-UP] Removes breaking changes within master branch - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/07 00:09:34 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40692: [SPARK-43055][CONNECT][PYTHON] Support duplicated nested field names - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/07 00:15:03 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40607: [SPARK-42993][ML][CONNECT] Make PyTorch Distributor compatible with Spark Connect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/07 00:16:13 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40691: [SPARK-43031] [SS] [Connect] Enable unit test and doctest for streaming - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/07 00:16:48 UTC, 2 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #39219: [WIP][SPARK-41277] Auto infer bucketing info for shuffled actions - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/07 00:17:11 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #38896: [WIP][SQL] Replace `require()` by an internal error in catalyst - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/07 00:17:14 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #38893: [Spark-40099][SQL] Merge adjacent CaseWhen branches if their values are the same - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/07 00:17:16 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40663: [SPARK-39696][CORE] Fix data race in access to TaskMetrics.externalAccums - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/07 01:13:49 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40663: [SPARK-39696][CORE] Fix data race in access to TaskMetrics.externalAccums - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/07 01:14:16 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40692: [SPARK-43055][CONNECT][PYTHON] Support duplicated nested field names - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/07 01:15:31 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40682: [SPARK-43044][CONNECT][BUILD] Upgrade buf to v1.17.0 - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/07 01:20:24 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40682: [SPARK-43044][CONNECT][BUILD] Upgrade buf to v1.17.0 - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/07 01:20:56 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40628: [SPARK-42999][Connect] Dataset#foreach, foreachPartition - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/07 01:28:21 UTC, 0 replies.
- [GitHub] [spark] hvanhovell closed pull request #40628: [SPARK-42999][Connect] Dataset#foreach, foreachPartition - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/07 01:28:56 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40684: [SPARK-41532][CONNECT][CLIENT] Add check for operations that involve multiple data frames - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/07 01:28:58 UTC, 2 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #39541: [SPARK-42043][CONNECT] Scala Client Result with E2E Tests - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/07 01:31:18 UTC, 1 replies.
- [GitHub] [spark] zhenlineo commented on a diff in pull request #39541: [SPARK-42043][CONNECT] Scala Client Result with E2E Tests - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/07 01:39:10 UTC, 0 replies.
- [GitHub] [spark] hvanhovell closed pull request #40676: [SPARK-42656][FOLLOWUP] Add BUILD and SCCLASSPATH options to Spark Connect scripts - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/07 01:40:48 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40651: [SPARK-43019][SQL] Move Ordering to PhysicalDataType - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/07 01:42:49 UTC, 0 replies.
- [GitHub] [spark] hvanhovell closed pull request #40651: [SPARK-43019][SQL] Move Ordering to PhysicalDataType - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/07 01:46:12 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40683: [SPARK-43049][SQL] Use CLOB instead of VARCHAR(255) for StringType for Oracle JDBC - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/07 01:46:30 UTC, 1 replies.
- [GitHub] [spark] hvanhovell closed pull request #40656: [SPARK-43023][CONNECT][TESTS] Add switch catalog testing scenario for `CatalogSuite` - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/07 01:51:57 UTC, 0 replies.
- [GitHub] [spark] ueshin commented on pull request #40692: [SPARK-43055][CONNECT][PYTHON] Support duplicated nested field names - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/07 01:55:49 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40605: [SPARK-42958][CONNECT] Refactor `connect-jvm-client-mima-check` to support mima check with avro module - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/07 02:00:52 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40676: [SPARK-42656][FOLLOWUP] Add BUILD and SCCLASSPATH options to Spark Connect scripts - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/07 02:01:37 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40654: [SPARK-43022][CONNECT] Support protobuf functions for Scala client - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/07 02:08:17 UTC, 1 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40692: [SPARK-43055][CONNECT][PYTHON] Support duplicated nested field names - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/07 02:14:18 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40415: [Do not merge] Add JDBC to DataFrameWriter - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/07 02:16:40 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40352: [SPARK-42664][CONNECT] Support `bloomFilter` function for `DataFrameStatFunctions` - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/07 02:25:54 UTC, 4 replies.
- [GitHub] [spark] amaliujia commented on pull request #40684: [SPARK-41532][CONNECT][CLIENT] Add check for operations that involve multiple data frames - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/07 02:39:17 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40684: [SPARK-41532][CONNECT][CLIENT] Add check for operations that involve multiple data frames - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/07 02:41:41 UTC, 2 replies.
- [GitHub] [spark] amaliujia opened a new pull request, #40693: [SPARK-43058] Move Numeric to PhysicalDataType - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/07 02:48:31 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40694: [SPARK-42057][CONNECT][PYTHON] Migrate Spark Connect Column errors into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/07 02:52:43 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40678: [SPARK-43040][SQL] Improve TimestampNTZ type support in JDBC data source - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/07 02:53:07 UTC, 0 replies.
- [GitHub] [spark] yaooqinn closed pull request #40683: [SPARK-43049][SQL] Use CLOB instead of VARCHAR(255) for StringType for Oracle JDBC - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/07 02:55:53 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40693: [SPARK-43058] Move Numeric to PhysicalDataType - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/07 02:57:07 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40695: [SPARK-42994][ML][CONNECT] PyTorch Distributor support Local Mode with GPU - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/07 02:58:03 UTC, 0 replies.
- [GitHub] [spark] beliefer commented on pull request #40415: [Do not merge] Add JDBC to DataFrameWriter - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/07 03:13:54 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on a diff in pull request #40690: [SPARK-43043][CORE] Improve the performance of MapOutputTracker.updateMapOutput - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/07 03:24:10 UTC, 0 replies.
- [GitHub] [spark] beliefer commented on pull request #40563: [SPARK-41233][FOLLOWUP] Refactor `array_prepend` with `RuntimeReplaceable` - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/07 03:31:38 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40685: [SPARK-43050][SQL] Fix construct aggregate expressions by replacing grouping functions - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/07 03:35:16 UTC, 0 replies.
- [GitHub] [spark] beliefer commented on pull request #39170: [SPARK-41674][SQL] Runtime filter should supports multi level shuffle join side as filter creation side - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/07 03:39:38 UTC, 2 replies.
- [GitHub] [spark] dongjoon-hyun commented on a diff in pull request #40683: [SPARK-43049][SQL] Use CLOB instead of VARCHAR(255) for StringType for Oracle JDBC - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/07 03:45:07 UTC, 1 replies.
- [GitHub] [spark] cloud-fan closed pull request #40662: [SPARK-43030][SQL] Deduplicate relations with metadata columns - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/07 03:47:00 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40555: [SPARK-42926][BUILD][SQL] Upgrade Parquet to 1.13.0 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/07 03:47:05 UTC, 4 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40683: [SPARK-43049][SQL] Use CLOB instead of VARCHAR(255) for StringType for Oracle JDBC - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/07 03:52:57 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40663: [SPARK-39696][CORE] Fix data race in access to TaskMetrics.externalAccums - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/07 04:01:59 UTC, 3 replies.
- [GitHub] [spark] anishshri-db opened a new pull request, #40696: [SPARK-43056] RocksDB state store commit should continue b/ground work only if its paused - posted by "anishshri-db (via GitHub)" <gi...@apache.org> on 2023/04/07 04:04:09 UTC, 0 replies.
- [GitHub] [spark] anishshri-db commented on pull request #40696: [SPARK-43056][SS] RocksDB state store commit should continue b/ground work only if its paused - posted by "anishshri-db (via GitHub)" <gi...@apache.org> on 2023/04/07 04:05:14 UTC, 2 replies.
- [GitHub] [spark] ueshin commented on a diff in pull request #40692: [SPARK-43055][CONNECT][PYTHON] Support duplicated nested field names - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/07 04:12:44 UTC, 3 replies.
- [GitHub] [spark] dongjoon-hyun closed pull request #40639: [SPARK-43007][BUILD] Upgrade rocksdbjni to 8.0.0 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/07 05:14:11 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40639: [SPARK-43007][BUILD] Upgrade rocksdbjni to 8.0.0 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/07 05:14:18 UTC, 1 replies.
- [GitHub] [spark] HeartSaVioR commented on pull request #40639: [SPARK-43007][BUILD] Upgrade rocksdbjni to 8.0.0 - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/07 05:16:51 UTC, 0 replies.
- [GitHub] [spark] cloud-fan opened a new pull request, #40697: [SPARK-43061][SQL] Introduce TaskEvaluator for SQL operator execution - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/07 05:41:22 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40697: [SPARK-43061][SQL] Introduce TaskEvaluator for SQL operator execution - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/07 05:42:57 UTC, 2 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40697: [SPARK-43061][SQL] Introduce TaskEvaluator for SQL operator execution - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/07 05:43:42 UTC, 2 replies.
- [GitHub] [spark] dongjoon-hyun commented on a diff in pull request #40697: [SPARK-43061][SQL] Introduce TaskEvaluator for SQL operator execution - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/07 06:02:05 UTC, 1 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40688: [SPARK-43021][SQL] `CoalesceBucketsInJoin` not work when using AQE - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/07 06:10:30 UTC, 3 replies.
- [GitHub] [spark] amaliujia commented on pull request #40693: [SPARK-43058] Move Numeric and Fractional to PhysicalDataType - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/07 06:22:36 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR commented on pull request #40696: [SPARK-43056][SS] RocksDB state store commit should continue b/ground work only if its paused - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/07 06:24:17 UTC, 0 replies.
- [GitHub] [spark] ueshin opened a new pull request, #40698: [SPARK-43062][INFRA][PYTHON][TESTS] Add options to lint-python to run each test separately - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/07 06:43:13 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40697: [SPARK-43061][SQL] Introduce TaskEvaluator for SQL operator execution - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/07 06:48:14 UTC, 0 replies.
- [GitHub] [spark] viirya commented on a diff in pull request #40688: [SPARK-43021][SQL] `CoalesceBucketsInJoin` not work when using AQE - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/07 06:48:34 UTC, 0 replies.
- [GitHub] [spark] beliefer commented on pull request #40697: [SPARK-43061][SQL] Introduce TaskEvaluator for SQL operator execution - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/07 06:55:05 UTC, 1 replies.
- [GitHub] [spark] zzzzming95 commented on pull request #40688: [SPARK-43021][SQL] `CoalesceBucketsInJoin` not work when using AQE - posted by "zzzzming95 (via GitHub)" <gi...@apache.org> on 2023/04/07 07:00:15 UTC, 6 replies.
- [GitHub] [spark] viirya commented on a diff in pull request #40697: [SPARK-43061][SQL] Introduce TaskEvaluator for SQL operator execution - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/07 07:01:50 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40697: [SPARK-43061][SQL] Introduce TaskEvaluator for SQL operator execution - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/07 07:10:53 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40697: [SPARK-43061][SQL] Introduce TaskEvaluator for SQL operator execution - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/07 07:31:08 UTC, 2 replies.
- [GitHub] [spark] Yikf opened a new pull request, #40699: [SPARK-43063][SQL] `df.show` handle null should print NULL instead of null - posted by "Yikf (via GitHub)" <gi...@apache.org> on 2023/04/07 07:47:02 UTC, 0 replies.
- [GitHub] [spark] caican00 opened a new pull request, #40700: [SPARK][SQL]Set job description for tpcds queries - posted by "caican00 (via GitHub)" <gi...@apache.org> on 2023/04/07 07:57:33 UTC, 0 replies.
- [GitHub] [spark] caican00 commented on pull request #40700: [SPARK][SQL]Set job description for tpcds queries - posted by "caican00 (via GitHub)" <gi...@apache.org> on 2023/04/07 08:03:20 UTC, 0 replies.
- [GitHub] [spark] AngersZhuuuu opened a new pull request, #40701: [SPARK-43064][SQL] Spark SQL CLI SQL tab should only show once statement once - posted by "AngersZhuuuu (via GitHub)" <gi...@apache.org> on 2023/04/07 08:32:41 UTC, 0 replies.
- [GitHub] [spark] AngersZhuuuu commented on pull request #40701: [SPARK-43064][SQL] Spark SQL CLI SQL tab should only show once statement once - posted by "AngersZhuuuu (via GitHub)" <gi...@apache.org> on 2023/04/07 08:33:13 UTC, 1 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40605: [SPARK-42958][CONNECT] Refactor `connect-jvm-client-mima-check` to support mima check with avro module - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/07 08:34:40 UTC, 1 replies.
- [GitHub] [spark] AngersZhuuuu commented on pull request #40314: [SPARK-42698][CORE] SparkSubmit should also stop SparkContext when exit program in yarn mode and pass exitCode to AM side - posted by "AngersZhuuuu (via GitHub)" <gi...@apache.org> on 2023/04/07 08:36:52 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40701: [SPARK-43064][SQL] Spark SQL CLI SQL tab should only show once statement once - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/07 08:48:30 UTC, 1 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40352: [SPARK-42664][CONNECT] Support `bloomFilter` function for `DataFrameStatFunctions` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/07 08:58:06 UTC, 5 replies.
- [GitHub] [spark] AngersZhuuuu commented on a diff in pull request #40701: [SPARK-43064][SQL] Spark SQL CLI SQL tab should only show once statement once - posted by "AngersZhuuuu (via GitHub)" <gi...@apache.org> on 2023/04/07 09:04:01 UTC, 1 replies.
- [GitHub] [spark] AngersZhuuuu commented on pull request #40315: [SPARK-42699][CONNECT] SparkConnectServer should make client and AM same exit code - posted by "AngersZhuuuu (via GitHub)" <gi...@apache.org> on 2023/04/07 09:14:26 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on pull request #40114: [SPARK-42513][SQL] Push down topK through join - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/07 09:22:48 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40437: [SPARK-41259][SQL] SparkSQLDriver Output schema and result string should be consistent - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/07 09:25:51 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR opened a new pull request, #40702: [SPARK-43066][SQL] Add test for dropDuplicates in JavaDatasetSuite - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/07 10:01:43 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40352: [SPARK-42664][CONNECT] Support `bloomFilter` function for `DataFrameStatFunctions` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/07 10:09:16 UTC, 1 replies.
- [GitHub] [spark] clownxc opened a new pull request, #40703: [SPARK-43033][SQL] Avoid task retries due to AssertNotNull checks - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/07 12:38:55 UTC, 0 replies.
- [GitHub] [spark] MaxGekk opened a new pull request, #40704: [WIP][SPARK-43038][SQL] Support the CBC mode by `aes_encrypt()`/`aes_decrypt()` - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/07 12:43:25 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR opened a new pull request, #40705: [SPARK-43067][SS] Correct the location of error class resource file in Kafka connector - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/07 12:50:45 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR commented on pull request #40705: [SPARK-43067][SS] Correct the location of error class resource file in Kafka connector - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/07 12:51:50 UTC, 2 replies.
- [GitHub] [spark] itholic opened a new pull request, #40706: [SPARK-43059][CONNECT][PYTHON] Migrate TypeError from DataFrame(Reader|Writer) into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/07 13:13:02 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40701: [SPARK-43064][SQL] Spark SQL CLI SQL tab should only show once statement once - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/07 14:18:29 UTC, 0 replies.
- [GitHub] [spark] warrenzhu25 commented on a diff in pull request #39280: [SPARK-41766][CORE] Handle decommission request sent before executor registration - posted by "warrenzhu25 (via GitHub)" <gi...@apache.org> on 2023/04/07 15:07:16 UTC, 1 replies.
- [GitHub] [spark] warrenzhu25 commented on pull request #38852: [SPARK-41341][CORE] Wait shuffle fetch to finish when decommission executor - posted by "warrenzhu25 (via GitHub)" <gi...@apache.org> on 2023/04/07 15:10:48 UTC, 2 replies.
- [GitHub] [spark] rangadi closed pull request #40373: [Draft] Streaming Spark Connect POC - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/07 16:09:53 UTC, 0 replies.
- [GitHub] [spark] aokolnychyi commented on pull request #40308: [SPARK-42151][SQL] Align UPDATE assignments with table attributes - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/07 16:29:18 UTC, 2 replies.
- [GitHub] [spark] zzzzming95 commented on a diff in pull request #40688: [SPARK-43021][SQL] `CoalesceBucketsInJoin` not work when using AQE - posted by "zzzzming95 (via GitHub)" <gi...@apache.org> on 2023/04/07 17:17:27 UTC, 7 replies.
- [GitHub] [spark] clownxc opened a new pull request, #40707: [SPARK-43033][SQL] Avoid task retries due to AssertNotNull checks - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/07 17:33:12 UTC, 0 replies.
- [GitHub] [spark] clownxc closed pull request #40703: [SPARK-43033][SQL] Avoid task retries due to AssertNotNull checks - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/07 17:33:30 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40656: [SPARK-43023][CONNECT][TESTS] Add switch catalog testing scenario for `CatalogSuite` - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/07 18:09:04 UTC, 0 replies.
- [GitHub] [spark] RyanBerti commented on pull request #40615: [WIP][SPARK-16484][SQL] Add support for Datasketches HllSketch - posted by "RyanBerti (via GitHub)" <gi...@apache.org> on 2023/04/07 18:09:08 UTC, 2 replies.
- [GitHub] [spark] amaliujia commented on pull request #40315: [SPARK-42699][CONNECT] SparkConnectServer should make client and AM same exit code - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/07 18:29:39 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun opened a new pull request, #40708: [SPARK-43069][BUILD] Use `sbt-eclipse` instead of `sbteclipse-plugin` - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/07 18:59:55 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40708: [SPARK-43069][BUILD] Use `sbt-eclipse` instead of `sbteclipse-plugin` - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/07 19:11:23 UTC, 2 replies.
- [GitHub] [spark] ueshin commented on a diff in pull request #40015: [SPARK-42437][PYTHON][CONNECT] PySpark catalog.cacheTable will allow to specify storage level - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/07 19:13:55 UTC, 1 replies.
- [GitHub] [spark] viirya commented on pull request #40708: [SPARK-43069][BUILD] Use `sbt-eclipse` instead of `sbteclipse-plugin` - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/07 19:39:11 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun closed pull request #40708: [SPARK-43069][BUILD] Use `sbt-eclipse` instead of `sbteclipse-plugin` - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/07 19:54:17 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun opened a new pull request, #40709: [SPARK-43070][BUILD] Upgrade sbt-unidoc to 0.5.0 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/07 20:13:43 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on pull request #40691: [SPARK-43031] [SS] [Connect] Enable unit test and doctest for streaming - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/07 20:40:15 UTC, 2 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40709: [SPARK-43070][BUILD] Upgrade `sbt-unidoc` to 0.5.0 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/07 20:42:17 UTC, 4 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40692: [SPARK-43055][CONNECT][PYTHON] Support duplicated nested field names - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/07 20:46:52 UTC, 3 replies.
- [GitHub] [spark] jiangxb1987 commented on pull request #40690: [SPARK-43043][CORE] Improve the performance of MapOutputTracker.updateMapOutput - posted by "jiangxb1987 (via GitHub)" <gi...@apache.org> on 2023/04/07 21:11:50 UTC, 0 replies.
- [GitHub] [spark] jiangxb1987 commented on a diff in pull request #40690: [SPARK-43043][CORE] Improve the performance of MapOutputTracker.updateMapOutput - posted by "jiangxb1987 (via GitHub)" <gi...@apache.org> on 2023/04/07 21:19:30 UTC, 0 replies.
- [GitHub] [spark] zhenlineo closed pull request #40274: [SPARK-42215][CONNECT] Simplify Scala Client IT tests - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/07 21:42:05 UTC, 0 replies.
- [GitHub] [spark] dtenedor opened a new pull request, #40710: [SPARK-43071][SQL] Support SELECT DEFAULT with ORDER BY, LIMIT, OFFSET for INSERT source relation - posted by "dtenedor (via GitHub)" <gi...@apache.org> on 2023/04/07 22:41:50 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on a diff in pull request #40710: [SPARK-43071][SQL] Support SELECT DEFAULT with ORDER BY, LIMIT, OFFSET for INSERT source relation - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/07 22:54:30 UTC, 1 replies.
- [GitHub] [spark] dtenedor commented on a diff in pull request #40710: [SPARK-43071][SQL] Support SELECT DEFAULT with ORDER BY, LIMIT, OFFSET for INSERT source relation - posted by "dtenedor (via GitHub)" <gi...@apache.org> on 2023/04/07 22:58:20 UTC, 1 replies.
- [GitHub] [spark] dongjoon-hyun closed pull request #40709: [SPARK-43070][BUILD] Upgrade `sbt-unidoc` to 0.5.0 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/07 23:00:35 UTC, 0 replies.
- [GitHub] [spark] gengliangwang opened a new pull request, #40711: [SPARK-43072][DOC] Include TIMESTAMP_NTZ type in ANSI Compliance doc - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/07 23:44:28 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on pull request #40711: [SPARK-43072][DOC] Include TIMESTAMP_NTZ type in ANSI Compliance doc - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/07 23:46:52 UTC, 1 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #39259: [SPARK-41739][SQL] CheckRule should not be executed when analyze view child - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/08 00:16:53 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #39219: [WIP][SPARK-41277] Auto infer bucketing info for shuffled actions - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/08 00:16:56 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #39021: [SPARK-41483][CORE] Last metrics system report should have a timeout, avoid to lead shutdown hook timeout - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/08 00:16:57 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #38896: [WIP][SQL] Replace `require()` by an internal error in catalyst - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/08 00:16:59 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #38893: [Spark-40099][SQL] Merge adjacent CaseWhen branches if their values are the same - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/08 00:17:00 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on pull request #40690: [SPARK-43043][CORE] Improve the performance of MapOutputTracker.updateMapOutput - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/08 07:03:04 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on a diff in pull request #40707: [SPARK-43033][SQL] Avoid task retries due to AssertNotNull checks - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/08 07:07:24 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40525: [SPARK-42859][CONNECT][PS] Basic support for pandas API on Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/08 07:27:30 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40698: [SPARK-43062][INFRA][PYTHON][TESTS] Add options to lint-python to run each test separately - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/08 07:28:05 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40698: [SPARK-43062][INFRA][PYTHON][TESTS] Add options to lint-python to run each test separately - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/08 07:28:23 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40702: [SPARK-43066][SQL] Add test for dropDuplicates in JavaDatasetSuite - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/08 07:28:56 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40702: [SPARK-43066][SQL] Add test for dropDuplicates in JavaDatasetSuite - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/08 07:29:11 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40712: [SPARK-43073][CONNECT] Add proto data types constants - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/08 07:56:09 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40695: [SPARK-42994][ML][CONNECT] PyTorch Distributor support Local Mode with GPU - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/08 07:57:03 UTC, 2 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40263: [SPARK-42659][ML] Reimplement `FPGrowthModel.transform` with dataframe operations - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/08 08:02:59 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40699: [SPARK-43063][SQL] `df.show` handle null should print NULL instead of null - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/08 08:20:01 UTC, 1 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40693: [SPARK-43058] Move Numeric and Fractional to PhysicalDataType - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/08 10:55:15 UTC, 1 replies.
- [GitHub] [spark] HeartSaVioR commented on pull request #40702: [SPARK-43066][SQL] Add test for dropDuplicates in JavaDatasetSuite - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/08 11:13:09 UTC, 0 replies.
- [GitHub] [spark] wankunde opened a new pull request, #40713: [WIP][SPARK-42551][SQL] Support more subexpression elimination cases - posted by "wankunde (via GitHub)" <gi...@apache.org> on 2023/04/08 12:12:30 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR closed pull request #40561: [SPARK-42931][SS] Introduce dropDuplicatesWithinWatermark - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/08 13:45:05 UTC, 0 replies.
- [GitHub] [spark] zzzzming95 opened a new pull request, #40714: [SPARK-43074] Add the function without constant parameters of `SessionState#executePlan` - posted by "zzzzming95 (via GitHub)" <gi...@apache.org> on 2023/04/08 13:50:51 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40715: [SPARK-43007][TESTS][FOLLOWUP] Regenerate benchmark results of `StateStoreBasicOperationsBenchmark` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/08 14:14:05 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40715: [SPARK-43007][TESTS][FOLLOWUP] Regenerate benchmark results of `StateStoreBasicOperationsBenchmark` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/08 14:17:22 UTC, 1 replies.
- [GitHub] [spark] bjornjorgensen commented on pull request #40525: [SPARK-42859][CONNECT][PS] Basic support for pandas API on Spark Connect - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/08 19:27:13 UTC, 3 replies.
- [GitHub] [spark] bjornjorgensen opened a new pull request, #40716: Change `gRPC` to `grpcio` when it is not installed. - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/08 20:07:14 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR closed pull request #40705: [SPARK-43067][SS] Correct the location of error class resource file in Kafka connector - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/08 21:39:18 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR commented on a diff in pull request #40715: [SPARK-43007][TESTS][FOLLOWUP] Regenerate benchmark results of `StateStoreBasicOperationsBenchmark` - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/08 22:13:04 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40693: [SPARK-43058] Move Numeric and Fractional to PhysicalDataType - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/08 23:05:29 UTC, 10 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #39259: [SPARK-41739][SQL] CheckRule should not be executed when analyze view child - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/09 00:19:45 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #39021: [SPARK-41483][CORE] Last metrics system report should have a timeout, avoid to lead shutdown hook timeout - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/09 00:19:47 UTC, 0 replies.
- [GitHub] [spark] zzzzming95 commented on a diff in pull request #40714: [SPARK-43074] Add the function without constant parameters of `SessionState#executePlan` - posted by "zzzzming95 (via GitHub)" <gi...@apache.org> on 2023/04/09 03:34:36 UTC, 2 replies.
- [GitHub] [spark] grundprinzip commented on a diff in pull request #40695: [SPARK-42994][ML][CONNECT] PyTorch Distributor support Local Mode with GPU - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/09 04:39:47 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40716: [SPARK-43075][CONNECT] Change `gRPC` to `grpcio` when it is not installed. - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/09 05:50:49 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40716: [SPARK-43075][CONNECT] Change `gRPC` to `grpcio` when it is not installed. - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/09 05:51:05 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40673: [SPARK-41537][INFRA][CONNECT][FOLLOW-UP] Removes breaking changes within master branch - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/09 05:55:51 UTC, 0 replies.
- [GitHub] [spark] khalidmammadov commented on pull request #40015: [SPARK-42437][PYTHON][CONNECT] PySpark catalog.cacheTable will allow to specify storage level - posted by "khalidmammadov (via GitHub)" <gi...@apache.org> on 2023/04/09 11:10:30 UTC, 1 replies.
- [GitHub] [spark] khalidmammadov commented on a diff in pull request #40015: [SPARK-42437][PYTHON][CONNECT] PySpark catalog.cacheTable will allow to specify storage level - posted by "khalidmammadov (via GitHub)" <gi...@apache.org> on 2023/04/09 11:18:36 UTC, 0 replies.
- [GitHub] [spark] grundprinzip commented on pull request #40525: [SPARK-42859][CONNECT][PS] Basic support for pandas API on Spark Connect - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/09 12:30:45 UTC, 3 replies.
- [GitHub] [spark] bersprockets opened a new pull request, #40717: [MINOR][SQL][TESTS] Tests in `SubquerySuite` should not drop view created in `beforeAll` - posted by "bersprockets (via GitHub)" <gi...@apache.org> on 2023/04/09 19:43:10 UTC, 0 replies.
- [GitHub] [spark] bersprockets commented on a diff in pull request #40717: [MINOR][SQL][TESTS] Tests in `SubquerySuite` should not drop view created in `beforeAll` - posted by "bersprockets (via GitHub)" <gi...@apache.org> on 2023/04/09 19:51:07 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR commented on pull request #40715: [SPARK-43007][TESTS][FOLLOWUP] Regenerate benchmark results of `StateStoreBasicOperationsBenchmark` - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/09 23:52:51 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR closed pull request #40715: [SPARK-43007][TESTS][FOLLOWUP] Regenerate benchmark results of `StateStoreBasicOperationsBenchmark` - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/09 23:53:24 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #37348: [SPARK-39854][SQL] replaceWithAliases should keep the original children for Generate - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/10 00:18:22 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40717: [MINOR][SQL][TESTS] Tests in `SubquerySuite` should not drop view created in `beforeAll` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 00:41:56 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40717: [MINOR][SQL][TESTS] Tests in `SubquerySuite` should not drop view created in `beforeAll` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 00:42:16 UTC, 0 replies.
- [GitHub] [spark] WeichenXu123 commented on pull request #40695: [SPARK-42994][ML][CONNECT] PyTorch Distributor support Local Mode with GPU - posted by "WeichenXu123 (via GitHub)" <gi...@apache.org> on 2023/04/10 00:53:08 UTC, 6 replies.
- [GitHub] [spark] clownxc commented on a diff in pull request #40707: [SPARK-43033][SQL] Avoid task retries due to AssertNotNull checks - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/10 01:08:35 UTC, 16 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40712: [SPARK-43073][CONNECT] Add proto data types constants - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/10 01:36:51 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40712: [SPARK-43073][CONNECT] Add proto data types constants - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/10 01:37:16 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40714: [SPARK-43074] Add the function without constant parameters of `SessionState#executePlan` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 02:09:32 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40711: [SPARK-43072][DOC] Include TIMESTAMP_NTZ type in ANSI Compliance doc - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 02:09:57 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40711: [SPARK-43072][DOC] Include TIMESTAMP_NTZ type in ANSI Compliance doc - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 02:10:13 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40700: [SPARK-43065][SQL]Set job description for tpcds queries - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 02:12:18 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40707: [SPARK-43033][SQL] Avoid task retries due to AssertNotNull checks - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 02:32:01 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40697: [SPARK-43061][CORE][SQL] Introduce TaskEvaluator for SQL operator execution - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 02:48:35 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on pull request #40326: [SPARK-42708][DOCS] Improve doc about protobuf java file can't be indexed. - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/10 02:55:14 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X closed pull request #40326: [SPARK-42708][DOCS] Improve doc about protobuf java file can't be indexed. - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/10 02:55:15 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40677: [SPARK-43039][SQL] Support custom fields in the file source _metadata column. - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 03:01:37 UTC, 2 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40628: [SPARK-42999][Connect] Dataset#foreach, foreachPartition - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/10 03:06:53 UTC, 2 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40695: [SPARK-42994][ML][CONNECT] PyTorch Distributor support Local Mode with GPU - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/10 03:13:03 UTC, 3 replies.
- [GitHub] [spark] yaooqinn opened a new pull request, #40718: [SPARK-43077][SQL] Improve the error message of UNRECOGNIZED_SQL_TYPE - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/10 03:18:36 UTC, 0 replies.
- [GitHub] [spark] jiangjiguang closed pull request #40646: [WIP][SPARK-42696]Speed up parquet reading with Java Vector API - posted by "jiangjiguang (via GitHub)" <gi...@apache.org> on 2023/04/10 03:21:03 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40614: [SPARK-42987][DOCS] Correction of protobuf sql documentation - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 03:23:20 UTC, 0 replies.
- [GitHub] [spark] jiangjiguang opened a new pull request, #40719: [WIP]Speed up parquet reading with Java Vector API - posted by "jiangjiguang (via GitHub)" <gi...@apache.org> on 2023/04/10 03:26:13 UTC, 0 replies.
- [GitHub] [spark] jiangjiguang commented on pull request #40719: [WIP]Speed up parquet reading with Java Vector API - posted by "jiangjiguang (via GitHub)" <gi...@apache.org> on 2023/04/10 03:29:42 UTC, 2 replies.
- [GitHub] [spark] grundprinzip commented on pull request #40695: [SPARK-42994][ML][CONNECT] PyTorch Distributor support Local Mode with GPU - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/10 04:30:59 UTC, 1 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40491: [SPARK-41006][K8S] Generate new ConfigMap names for each run - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/10 05:15:34 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40190: [SPARK-42597][SQL] Support unwrap date type to timestamp type - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/10 05:17:45 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on a diff in pull request #40637: [SPARK-43002][YARN] Modify yarn client application report logging frequency to reduce noise - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/10 05:22:30 UTC, 3 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40688: [SPARK-43021][SQL] `CoalesceBucketsInJoin` not work when using AQE - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/10 05:26:20 UTC, 5 replies.
- [GitHub] [spark] tirumaleshn2458 opened a new pull request, #40720: Master clone2 - posted by "tirumaleshn2458 (via GitHub)" <gi...@apache.org> on 2023/04/10 05:34:32 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40677: [SPARK-43039][SQL] Support custom fields in the file source _metadata column. - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/10 05:36:16 UTC, 7 replies.
- [GitHub] [spark] dongjoon-hyun closed pull request #40680: Master clone - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/10 05:38:33 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun closed pull request #40720: Master clone2 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/10 05:38:49 UTC, 0 replies.
- [GitHub] [spark] zzzzming95 closed pull request #40714: [SPARK-43074] Add the function without constant parameters of `SessionState#executePlan` - posted by "zzzzming95 (via GitHub)" <gi...@apache.org> on 2023/04/10 05:44:12 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR commented on pull request #40696: [SPARK-43056][SS] RocksDB state store commit should continue background work only if its paused - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/10 05:46:43 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR closed pull request #40696: [SPARK-43056][SS] RocksDB state store commit should continue background work only if its paused - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/10 05:47:28 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40721: [WIP][BUILD] Upgrade zstd-jni to 1.5.5-1 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/10 06:00:03 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40706: [SPARK-43059][CONNECT][PYTHON] Migrate TypeError from DataFrame(Reader|Writer) into error class - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 06:14:36 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40694: [SPARK-42057][CONNECT][PYTHON] Migrate Spark Connect Column errors into error class - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 06:14:41 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40706: [SPARK-43059][CONNECT][PYTHON] Migrate TypeError from DataFrame(Reader|Writer) into error class - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 06:15:04 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40694: [SPARK-42057][CONNECT][PYTHON] Migrate Spark Connect Column errors into error class - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 06:15:18 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40700: [SPARK-43065][SQL][TESTS] Set job description in `TPCDSQueryBenchmark` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 06:19:13 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40700: [SPARK-43065][SQL][TESTS] Set job description in `TPCDSQueryBenchmark` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 06:19:30 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40722: [SPARK-43076][PS][CONNECT] Removing the dependency on `grpcio` when remote session is not used. - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/10 06:21:59 UTC, 0 replies.
- [GitHub] [spark] itholic commented on pull request #40722: [SPARK-43076][PS][CONNECT] Removing the dependency on `grpcio` when remote session is not used. - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/10 06:23:05 UTC, 2 replies.
- [GitHub] [spark] viirya commented on a diff in pull request #40697: [SPARK-43061][CORE][SQL] Introduce PartitionEvaluator for SQL operator execution - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/10 06:45:53 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #39170: [SPARK-41674][SQL] Runtime filter should supports multi level shuffle join side as filter creation side - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/10 07:02:05 UTC, 16 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40700: [SPARK-43065][SQL][TESTS] Set job description in `TPCDSQueryBenchmark` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/10 07:38:09 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40693: [SPARK-43058] Move Numeric and Fractional to PhysicalDataType - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/10 07:57:06 UTC, 10 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40697: [SPARK-43061][CORE][SQL] Introduce PartitionEvaluator for SQL operator execution - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/10 07:59:41 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40722: [SPARK-43076][PS][CONNECT] Removing the dependency on `grpcio` when remote session is not used. - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 08:00:11 UTC, 31 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40723: [MINOR][CONNECT][TESTS] Move `withTable` from `RemoteSparkSession` to `SQLHelper` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/10 08:53:35 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40705: [SPARK-43067][SS] Correct the location of error class resource file in Kafka connector - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/10 08:56:35 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40721: [SPARK-43080][BUILD] Upgrade `zstd-jni` to 1.5.5-1 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/10 09:03:50 UTC, 3 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40718: [SPARK-43077][SQL] Improve the error message of UNRECOGNIZED_SQL_TYPE - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/10 09:20:55 UTC, 1 replies.
- [GitHub] [spark] Hisoka-X commented on pull request #40684: [SPARK-41532][CONNECT][CLIENT] Add check for operations that involve multiple data frames - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/10 09:25:39 UTC, 3 replies.
- [GitHub] [spark] wangyum commented on pull request #40555: [SPARK-42926][BUILD][SQL] Upgrade Parquet to 1.13.0 - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/10 09:27:34 UTC, 3 replies.
- [GitHub] [spark] wangyum commented on a diff in pull request #40190: [SPARK-42597][SQL] Support unwrap date type to timestamp type - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/10 09:28:19 UTC, 1 replies.
- [GitHub] [spark] WeichenXu123 opened a new pull request, #40724: [SPARK-43081] [ML] [CONNECT] Add torch distributor data loader that loads data from spark partition data - posted by "WeichenXu123 (via GitHub)" <gi...@apache.org> on 2023/04/10 09:59:36 UTC, 0 replies.
- [GitHub] [spark] beliefer commented on a diff in pull request #39170: [SPARK-41674][SQL] Runtime filter should supports multi level shuffle join side as filter creation side - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/10 10:25:49 UTC, 5 replies.
- [GitHub] [spark] AngersZhuuuu commented on pull request #40437: [SPARK-41259][SQL] SparkSQLDriver Output schema and result string should be consistent - posted by "AngersZhuuuu (via GitHub)" <gi...@apache.org> on 2023/04/10 10:34:34 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40724: [SPARK-43081] [ML] [CONNECT] Add torch distributor data loader that loads data from spark partition data - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/10 11:17:59 UTC, 0 replies.
- [GitHub] [spark] WeichenXu123 commented on a diff in pull request #40724: [SPARK-43081] [ML] [CONNECT] Add torch distributor data loader that loads data from spark partition data - posted by "WeichenXu123 (via GitHub)" <gi...@apache.org> on 2023/04/10 11:40:37 UTC, 8 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40699: [SPARK-43063][SQL] `df.show` handle null should print NULL instead of null - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/10 12:24:56 UTC, 2 replies.
- [GitHub] [spark] Yikf commented on a diff in pull request #40699: [SPARK-43063][SQL] `df.show` handle null should print NULL instead of null - posted by "Yikf (via GitHub)" <gi...@apache.org> on 2023/04/10 12:51:07 UTC, 1 replies.
- [GitHub] [spark] ryan-johnson-databricks commented on a diff in pull request #40677: [SPARK-43039][SQL] Support custom fields in the file source _metadata column. - posted by "ryan-johnson-databricks (via GitHub)" <gi...@apache.org> on 2023/04/10 13:03:08 UTC, 21 replies.
- [GitHub] [spark] justaparth commented on a diff in pull request #40686: [SPARK-43051][PROTOBUF] Add option to materialize zero values when deserializing protobufs - posted by "justaparth (via GitHub)" <gi...@apache.org> on 2023/04/10 14:13:14 UTC, 2 replies.
- [GitHub] [spark] zhenlineo commented on a diff in pull request #40628: [SPARK-42999][Connect] Dataset#foreach, foreachPartition - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/10 15:20:19 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40308: [SPARK-42151][SQL] Align UPDATE assignments with table attributes - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/10 15:53:57 UTC, 4 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40685: [SPARK-43050][SQL] Fix construct aggregate expressions by replacing grouping functions - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/10 16:01:26 UTC, 2 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40718: [SPARK-43077][SQL] Improve the error message of UNRECOGNIZED_SQL_TYPE - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/10 16:07:23 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40697: [SPARK-43061][CORE][SQL] Introduce PartitionEvaluator for SQL operator execution - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/10 16:08:00 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40697: [SPARK-43061][CORE][SQL] Introduce PartitionEvaluator for SQL operator execution - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/10 16:08:43 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40685: [SPARK-43050][SQL] Fix construct aggregate expressions by replacing grouping functions - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/10 16:15:48 UTC, 3 replies.
- [GitHub] [spark] mkaravel commented on a diff in pull request #40615: [WIP][SPARK-16484][SQL] Add support for Datasketches HllSketch - posted by "mkaravel (via GitHub)" <gi...@apache.org> on 2023/04/10 16:46:31 UTC, 3 replies.
- [GitHub] [spark] xinrong-meng opened a new pull request, #40725: [SPARK-43082][Connect][PYTHON] Arrow-optimized Python UDFs in Spark Connect - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/10 17:54:36 UTC, 0 replies.
- [GitHub] [spark] xinrong-meng commented on a diff in pull request #40725: [SPARK-43082][Connect][PYTHON] Arrow-optimized Python UDFs in Spark Connect - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/10 18:30:05 UTC, 2 replies.
- [GitHub] [spark] aokolnychyi commented on a diff in pull request #40707: [SPARK-43033][SQL] Avoid task retries due to AssertNotNull checks - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/10 18:39:56 UTC, 2 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40697: [SPARK-43061][CORE][SQL] Introduce PartitionEvaluator for SQL operator execution - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/10 18:50:41 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40691: [SPARK-43031] [SS] [Connect] Enable unit test and doctest for streaming - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/10 18:52:28 UTC, 13 replies.
- [GitHub] [spark] rangadi commented on pull request #40689: [SPARK-42951][SS][Connect] DataStreamReader APIs - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/10 18:53:26 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on a diff in pull request #40721: [SPARK-43080][BUILD] Upgrade `zstd-jni` to 1.5.5-1 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/10 18:56:44 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun opened a new pull request, #40726: [SPARK-42382][BUILD] Upgrade `cyclonedx-maven-plugin` to 2.7.6 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/10 19:18:39 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40065: [SPARK-42382][BUILD] Upgrade `cyclonedx-maven-plugin` to 2.7.5 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/10 19:28:06 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun opened a new pull request, #40727: [SPARK-43083][SQL][TESTS] Mark `*StateStoreSuite` as `ExtendedSQLTest` - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/10 21:21:27 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40726: [SPARK-42382][BUILD] Upgrade `cyclonedx-maven-plugin` to 2.7.6 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/10 21:53:04 UTC, 3 replies.
- [GitHub] [spark] vkorukanti opened a new pull request, #40728: [WIP][SPARK-39634][SQL] Allow file splitting in combination with row index generation - posted by "vkorukanti (via GitHub)" <gi...@apache.org> on 2023/04/10 22:17:58 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40687: [SPARK-43052][CORE] Handle stacktrace with null file name in event log - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/10 22:30:34 UTC, 1 replies.
- [GitHub] [spark] zhenlineo opened a new pull request, #40729: [WIP][CONNECT] Adding groupByKey + mapGroup functions - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/10 22:40:28 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40727: [SPARK-43083][SQL][TESTS] Mark `*StateStoreSuite` as `ExtendedSQLTest` - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/10 22:46:00 UTC, 2 replies.
- [GitHub] [spark] gengliangwang commented on pull request #40710: [SPARK-43071][SQL] Support SELECT DEFAULT with ORDER BY, LIMIT, OFFSET for INSERT source relation - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/10 23:08:40 UTC, 0 replies.
- [GitHub] [spark] gengliangwang closed pull request #40710: [SPARK-43071][SQL] Support SELECT DEFAULT with ORDER BY, LIMIT, OFFSET for INSERT source relation - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/10 23:09:34 UTC, 0 replies.
- [GitHub] [spark] warrenzhu25 opened a new pull request, #40730: [SPARK-43086][CORE] Support bin pack task scheduling on executors - posted by "warrenzhu25 (via GitHub)" <gi...@apache.org> on 2023/04/10 23:16:54 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun closed pull request #40727: [SPARK-43083][SQL][TESTS] Mark `*StateStoreSuite` as `ExtendedSQLTest` - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/10 23:24:04 UTC, 0 replies.
- [GitHub] [spark] warrenzhu25 commented on pull request #40730: [SPARK-43086][CORE] Support bin pack task scheduling on executors - posted by "warrenzhu25 (via GitHub)" <gi...@apache.org> on 2023/04/10 23:24:05 UTC, 2 replies.
- [GitHub] [spark] wangyum opened a new pull request, #40731: [SPARK-43087][SQL] Support coalesce buckets in join in AQE - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/10 23:25:05 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40726: [SPARK-42382][BUILD] Upgrade `cyclonedx-maven-plugin` to 2.7.6 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 23:54:11 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40726: [SPARK-42382][BUILD] Upgrade `cyclonedx-maven-plugin` to 2.7.6 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 23:55:02 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40689: [SPARK-42951][SS][Connect] DataStreamReader APIs - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 23:56:10 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40689: [SPARK-42951][SS][Connect] DataStreamReader APIs - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/10 23:56:28 UTC, 0 replies.
- [GitHub] [spark] WeichenXu123 commented on a diff in pull request #40695: [SPARK-42994][ML][CONNECT] PyTorch Distributor support Local Mode with GPU - posted by "WeichenXu123 (via GitHub)" <gi...@apache.org> on 2023/04/11 00:10:39 UTC, 5 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #37348: [SPARK-39854][SQL] replaceWithAliases should keep the original children for Generate - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/11 00:19:00 UTC, 0 replies.
- [GitHub] [spark] dtenedor opened a new pull request, #40732: [WIP][SPARK-43085][SQL] Support column DEFAULT assignment for multi-part table names - posted by "dtenedor (via GitHub)" <gi...@apache.org> on 2023/04/11 00:22:42 UTC, 0 replies.
- [GitHub] [spark] xinrong-meng commented on pull request #40725: [SPARK-43082][Connect][PYTHON] Arrow-optimized Python UDFs in Spark Connect - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/11 00:53:13 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40603: [MINOR][CONNECT] Adding Proto Debug String to Job Description. - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/11 01:36:51 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon opened a new pull request, #40733: [SPARK-43089][CONNECT] Redact debug string in UI - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/11 01:44:26 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40718: [SPARK-43077][SQL] Improve the error message of UNRECOGNIZED_SQL_TYPE - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/11 01:48:14 UTC, 0 replies.
- [GitHub] [spark] rithwik-db commented on a diff in pull request #40724: [SPARK-43081] [ML] [CONNECT] Add torch distributor data loader that loads data from spark partition data - posted by "rithwik-db (via GitHub)" <gi...@apache.org> on 2023/04/11 02:04:00 UTC, 3 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40723: [SPARK-43090][CONNECT][TESTS] Move `withTable` from `RemoteSparkSession` to `SQLHelper` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/11 02:07:20 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40726: [SPARK-42382][BUILD] Upgrade `cyclonedx-maven-plugin` to 2.7.6 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/11 02:09:28 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun closed pull request #40723: [SPARK-43090][CONNECT][TESTS] Move `withTable` from `RemoteSparkSession` to `SQLHelper` - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/11 02:13:44 UTC, 0 replies.
- [GitHub] [spark] yaooqinn closed pull request #40718: [SPARK-43077][SQL] Improve the error message of UNRECOGNIZED_SQL_TYPE - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/11 02:47:43 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on pull request #40731: [SPARK-43087][SQL] Support coalesce buckets in join in AQE - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/11 03:12:17 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40733: [SPARK-43089][CONNECT] Redact debug string in UI - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/11 03:21:53 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40733: [SPARK-43089][CONNECT] Redact debug string in UI - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/11 03:22:24 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40707: [SPARK-43033][SQL] Avoid task retries due to AssertNotNull checks - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/11 03:30:16 UTC, 4 replies.
- [GitHub] [spark] aokolnychyi opened a new pull request, #40734: [SPARK-43088][SQL] Respect RequiresDistributionAndOrdering in CTAS/RTAS - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/11 03:31:15 UTC, 0 replies.
- [GitHub] [spark] aokolnychyi commented on a diff in pull request #40734: [SPARK-43088][SQL] Respect RequiresDistributionAndOrdering in CTAS/RTAS - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/11 03:32:16 UTC, 12 replies.
- [GitHub] [spark] aokolnychyi commented on pull request #40734: [SPARK-43088][SQL] Respect RequiresDistributionAndOrdering in CTAS/RTAS - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/11 03:40:12 UTC, 6 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40735: [SPARK-43092][CONNECT] Clean up unimplemented `dropDuplicatesWithinWatermark` series functions from `Dataset` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/11 05:36:17 UTC, 0 replies.
- [GitHub] [spark] pengzhon-db opened a new pull request, #40736: [SPARK-43084] [SS] Add applyInPandasWithState support for spark connect - posted by "pengzhon-db (via GitHub)" <gi...@apache.org> on 2023/04/11 05:39:59 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40737: [SPARK-43093][SQL][TESTS] Refactor `Add a directory when spark.sql.legacy.addSingleFileInAddFile set to false` to use random directories for testing - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/11 05:54:59 UTC, 0 replies.
- [GitHub] [spark] pengzhon-db commented on a diff in pull request #40691: [SPARK-43031] [SS] [Connect] Enable unit test and doctest for streaming - posted by "pengzhon-db (via GitHub)" <gi...@apache.org> on 2023/04/11 06:11:34 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40737: [SPARK-43093][SQL][TESTS] Refactor `Add a directory when spark.sql.legacy.addSingleFileInAddFile set to false` to test using tempDir with non-fixed root dir - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/11 06:38:26 UTC, 2 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40738: [BUILD] Test maven 3.9.1 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/11 08:48:58 UTC, 0 replies.
- [GitHub] [spark] cloud-fan opened a new pull request, #40739: [WIP] Make Python UDAF an AggregateFunction - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/11 09:41:56 UTC, 0 replies.
- [GitHub] [spark] juliuszsompolski commented on a diff in pull request #40676: [SPARK-42656][FOLLOWUP] Add BUILD and SCCLASSPATH options to Spark Connect scripts - posted by "juliuszsompolski (via GitHub)" <gi...@apache.org> on 2023/04/11 10:12:57 UTC, 0 replies.
- [GitHub] [spark] bowenliang123 commented on pull request #40738: [BUILD] Test maven 3.9.1 - posted by "bowenliang123 (via GitHub)" <gi...@apache.org> on 2023/04/11 10:16:18 UTC, 1 replies.
- [GitHub] [spark] juliuszsompolski opened a new pull request, #40740: [SPARK-42656][FOLLOWUP] Rename BUILD parameter to SCBUILD to avoid clashes - posted by "juliuszsompolski (via GitHub)" <gi...@apache.org> on 2023/04/11 10:18:48 UTC, 0 replies.
- [GitHub] [spark] juliuszsompolski commented on pull request #40740: [SPARK-42656][FOLLOWUP] Rename BUILD parameter to SCBUILD to avoid clashes - posted by "juliuszsompolski (via GitHub)" <gi...@apache.org> on 2023/04/11 10:19:02 UTC, 0 replies.
- [GitHub] [spark] steveloughran commented on pull request #40726: [SPARK-42382][BUILD] Upgrade `cyclonedx-maven-plugin` to 2.7.6 - posted by "steveloughran (via GitHub)" <gi...@apache.org> on 2023/04/11 10:39:57 UTC, 0 replies.
- [GitHub] [spark] MaxGekk closed pull request #40660: [SPARK-43028][SQL] Add error class SQL_CONF_NOT_FOUND - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/11 10:43:29 UTC, 0 replies.
- [GitHub] [spark] jaceklaskowski commented on a diff in pull request #40685: [SPARK-43050][SQL] Fix construct aggregate expressions by replacing grouping functions - posted by "jaceklaskowski (via GitHub)" <gi...@apache.org> on 2023/04/11 12:00:32 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40693: [SPARK-43058][SQL] Move Numeric and Fractional to PhysicalDataType - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/11 12:36:23 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40693: [SPARK-43058][SQL] Move Numeric and Fractional to PhysicalDataType - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/11 12:55:07 UTC, 0 replies.
- [GitHub] [spark] mattoh91 commented on pull request #39267: [SPARK-41592][PYTHON][ML] Pytorch file Distributed Training - posted by "mattoh91 (via GitHub)" <gi...@apache.org> on 2023/04/11 13:13:36 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40738: [BUILD] Test maven 3.9.1 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/11 13:43:06 UTC, 1 replies.
- [GitHub] [spark] Hisoka-X opened a new pull request, #40741: [SPARK-41811][CONNECT][CLIENT] Support sql with dataframes and columns - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/11 13:48:56 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40699: [SPARK-43063][SQL] `df.show` handle null should print NULL instead of null - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/11 13:54:16 UTC, 4 replies.
- [GitHub] [spark] Hisoka-X commented on a diff in pull request #40741: [SPARK-41811][CONNECT][CLIENT] Support sql with dataframes and columns - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/11 13:56:23 UTC, 11 replies.
- [GitHub] [spark] wangyum opened a new pull request, #40742: [SPARK-43095][SQL] Avoid Once strategy's idempotence is broken for batch: `Infer Filters` - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/11 14:04:06 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40719: [WIP]Speed up parquet reading with Java Vector API - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/11 14:06:19 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40731: [SPARK-43087][SQL] Support coalesce buckets in join in AQE - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/11 14:11:49 UTC, 0 replies.
- [GitHub] [spark] jaceklaskowski commented on a diff in pull request #40724: [SPARK-43081] [ML] [CONNECT] Add torch distributor data loader that loads data from spark partition data - posted by "jaceklaskowski (via GitHub)" <gi...@apache.org> on 2023/04/11 14:14:14 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40734: [SPARK-43088][SQL] Respect RequiresDistributionAndOrdering in CTAS/RTAS - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/11 14:14:59 UTC, 6 replies.
- [GitHub] [spark] wangyum closed pull request #40731: [SPARK-43087][SQL] Support coalesce buckets in join in AQE - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/11 14:27:25 UTC, 0 replies.
- [GitHub] [spark] wangyum opened a new pull request, #40743: [SPARK-42597][SQL][FOLLOW-UP] Exclude `EqualNullSafe` when unwrapping date type to timestamp type - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/11 14:50:21 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on pull request #40704: [SPARK-43038][SQL] Support the CBC mode by `aes_encrypt()`/`aes_decrypt()` - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/11 14:51:59 UTC, 1 replies.
- [GitHub] [spark] pengzhon-db commented on a diff in pull request #40736: [SPARK-43084] [SS] Add applyInPandasWithState support for spark connect - posted by "pengzhon-db (via GitHub)" <gi...@apache.org> on 2023/04/11 16:40:01 UTC, 4 replies.
- [GitHub] [spark] dongjoon-hyun closed pull request #40721: [SPARK-43080][BUILD] Upgrade `zstd-jni` to 1.5.5-1 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/11 16:51:40 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40738: [BUILD] Test maven 3.9.1 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/11 16:53:36 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40736: [SPARK-43084] [SS] Add applyInPandasWithState support for spark connect - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/11 17:10:59 UTC, 2 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40637: [SPARK-43002][YARN] Modify yarn client application report logging frequency to reduce noise - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/11 17:14:13 UTC, 1 replies.
- [GitHub] [spark] dongjoon-hyun closed pull request #40637: [SPARK-43002][YARN] Modify yarn client application report logging frequency to reduce noise - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/11 17:14:39 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40730: [SPARK-43086][CORE] Support bin pack task scheduling on executors - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/11 17:32:07 UTC, 0 replies.
- [GitHub] [spark] peter-toth opened a new pull request, #40744: [WIP][SPARK-24497][SQL] Support recursive SQL - posted by "peter-toth (via GitHub)" <gi...@apache.org> on 2023/04/11 17:39:09 UTC, 0 replies.
- [GitHub] [spark] rithwik-db commented on pull request #39267: [SPARK-41592][PYTHON][ML] Pytorch file Distributed Training - posted by "rithwik-db (via GitHub)" <gi...@apache.org> on 2023/04/11 17:44:48 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on a diff in pull request #40736: [SPARK-43084] [SS] Add applyInPandasWithState support for spark connect - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/11 18:57:08 UTC, 6 replies.
- [GitHub] [spark] WweiL commented on pull request #40736: [SPARK-43084] [SS] Add applyInPandasWithState support for spark connect - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/11 19:10:37 UTC, 1 replies.
- [GitHub] [spark] zhenlineo opened a new pull request, #40745: [Do not merge] Testing repl build on CI - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/11 19:29:24 UTC, 0 replies.
- [GitHub] [spark] zhenlineo commented on a diff in pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/11 19:38:47 UTC, 0 replies.
- [GitHub] [spark] ueshin opened a new pull request, #40746: [SPARK-42985][CONNECT][PYTHON] Fix createDataFrame to respect the SQL configs - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/11 21:15:37 UTC, 0 replies.
- [GitHub] [spark] alexjinghn opened a new pull request, #40747: [WIP][SPARK-43099] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "alexjinghn (via GitHub)" <gi...@apache.org> on 2023/04/11 22:25:22 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on pull request #40747: [WIP][SPARK-43099][SQL] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/11 22:44:17 UTC, 0 replies.
- [GitHub] [spark] zhenlineo commented on pull request #40745: [Do not merge] Testing repl build on CI - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/11 23:26:12 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on pull request #40742: [SPARK-43095][SQL] Avoid Once strategy's idempotence is broken for batch: `Infer Filters` - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/11 23:45:26 UTC, 1 replies.
- [GitHub] [spark] WeichenXu123 opened a new pull request, #40748: [WIP][SPARK-43097] New pyspark ML logistic regression estimator implemented on top of distributor - posted by "WeichenXu123 (via GitHub)" <gi...@apache.org> on 2023/04/12 00:02:17 UTC, 0 replies.
- [GitHub] [spark] zhouyejoe opened a new pull request, #40749: SPARK-43100 Mismatch of field name in log event writer and parser for push shuffle metrics - posted by "zhouyejoe (via GitHub)" <gi...@apache.org> on 2023/04/12 00:04:07 UTC, 0 replies.
- [GitHub] [spark] zhouyejoe commented on pull request #40749: SPARK-43100 Mismatch of field name in log event writer and parser for push shuffle metrics - posted by "zhouyejoe (via GitHub)" <gi...@apache.org> on 2023/04/12 00:18:34 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on a diff in pull request #40688: [SPARK-43021][SQL] `CoalesceBucketsInJoin` not work when using AQE - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/12 00:18:38 UTC, 2 replies.
- [GitHub] [spark] gengliangwang commented on a diff in pull request #40749: SPARK-43100 Mismatch of field name in log event writer and parser for push shuffle metrics - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/12 00:22:33 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40695: [SPARK-42994][ML][CONNECT] PyTorch Distributor support Local Mode - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/12 00:51:38 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40736: [SPARK-43084] [SS] Add applyInPandasWithState support for spark connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 01:32:46 UTC, 1 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40543: [SPARK-42916][SQL] JDBCTableCatalog Keeps Char/Varchar meta on the read-side - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/12 01:48:58 UTC, 1 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40543: [SPARK-42916][SQL] JDBCTableCatalog Keeps Char/Varchar meta on the read-side - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/12 01:53:43 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40740: [SPARK-42656][FOLLOWUP] Rename BUILD parameter to SCBUILD to avoid clashes - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 01:54:08 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40748: [WIP][SPARK-43097] New pyspark ML logistic regression estimator implemented on top of distributor - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/12 01:54:28 UTC, 2 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40740: [SPARK-42656][FOLLOWUP] Rename BUILD parameter to SCBUILD to avoid clashes - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 01:54:29 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40676: [SPARK-42656][FOLLOWUP] Add BUILD and SCCLASSPATH options to Spark Connect scripts - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 01:56:16 UTC, 1 replies.
- [GitHub] [spark] viirya commented on a diff in pull request #40308: [SPARK-42151][SQL] Align UPDATE assignments with table attributes - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/12 01:56:28 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40724: [SPARK-43081] [ML] [CONNECT] Add torch distributor data loader that loads data from spark partition data - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/12 01:57:16 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Exclude `EqualNullSafe` when unwrapping date type to timestamp type - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 02:20:50 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40741: [SPARK-41811][CONNECT][CLIENT] Support sql with dataframes and columns - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 02:22:22 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40737: [SPARK-43093][SQL][TESTS] Refactor `Add a directory when spark.sql.legacy.addSingleFileInAddFile set to false` to test using tempDir with non-fixed root dir - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 02:23:00 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40737: [SPARK-43093][SQL][TESTS] Refactor `Add a directory when spark.sql.legacy.addSingleFileInAddFile set to false` to test using tempDir with non-fixed root dir - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 02:23:17 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40737: [SPARK-43093][SQL][TESTS] Refactor `Add a directory when spark.sql.legacy.addSingleFileInAddFile set to false` to test using tempDir with non-fixed root dir - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/12 02:24:37 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Exclude `EqualNullSafe` when unwrapping date type to timestamp type - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/12 02:25:33 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40748: [WIP][SPARK-43097] New pyspark ML logistic regression estimator implemented on top of distributor - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 02:26:25 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Exclude `EqualNullSafe` when unwrapping date type to timestamp type - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/12 02:27:24 UTC, 1 replies.
- [GitHub] [spark] Yikf commented on pull request #40699: [SPARK-43063][SQL] `df.show` handle null should print NULL instead of null - posted by "Yikf (via GitHub)" <gi...@apache.org> on 2023/04/12 02:31:54 UTC, 4 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40742: [SPARK-43095][SQL] Avoid Once strategy's idempotence is broken for batch: `Infer Filters` - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/12 02:32:40 UTC, 6 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40750: [WIP][CONNECT] Redact the proto message - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/12 02:35:50 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40750: [WIP][CONNECT] Redact the proto message - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/12 02:40:25 UTC, 5 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40677: [SPARK-43039][SQL] Support custom fields in the file source _metadata column. - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/12 03:03:24 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40677: [SPARK-43039][SQL] Support custom fields in the file source _metadata column. - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/12 03:03:46 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on a diff in pull request #40732: [SPARK-43085][SQL] Support column DEFAULT assignment for multi-part table names - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/12 03:52:30 UTC, 4 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #38384: [SPARK-40657][PROTOBUF] Require shading for Java class jar, improve error handling - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/12 03:54:05 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40704: [SPARK-43038][SQL] Support the CBC mode by `aes_encrypt()`/`aes_decrypt()` - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/12 03:58:25 UTC, 2 replies.
- [GitHub] [spark] WeichenXu123 commented on a diff in pull request #40748: [WIP][SPARK-43097] New pyspark ML logistic regression estimator implemented on top of distributor - posted by "WeichenXu123 (via GitHub)" <gi...@apache.org> on 2023/04/12 04:09:21 UTC, 4 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40738: [BUILD] Test maven 3.9.1 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/12 04:39:08 UTC, 3 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40751: [SPARK-43102][BUILD] Upgrade commons-compress to 1.23.0 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/12 04:55:29 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #38384: [SPARK-40657][PROTOBUF] Require shading for Java class jar, improve error handling - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/12 06:01:20 UTC, 0 replies.
- [GitHub] [spark] amaliujia opened a new pull request, #40752: [WIP][SQL] Moving Integral to PhysicalDataType - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/12 06:03:27 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40741: [SPARK-41811][CONNECT][CLIENT] Support sql with dataframes and columns - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 06:07:40 UTC, 3 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40692: [SPARK-43055][CONNECT][PYTHON] Support duplicated nested field names - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 06:11:32 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40752: [SPARK-43103][SQL] Moving Integral to PhysicalDataType - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/12 06:12:33 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40746: [SPARK-42985][CONNECT][PYTHON] Fix createDataFrame to respect the SQL configs - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 06:13:50 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40746: [SPARK-42985][CONNECT][PYTHON] Fix createDataFrame to respect the SQL configs - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 06:13:55 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40704: [SPARK-43038][SQL] Support the CBC mode by `aes_encrypt()`/`aes_decrypt()` - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/12 06:17:46 UTC, 3 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40750: [WIP][CONNECT] Redact the proto message - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 06:38:48 UTC, 6 replies.
- [GitHub] [spark] gengliangwang commented on a diff in pull request #40750: [WIP][CONNECT] Redact the proto message - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/12 07:07:15 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40747: [WIP][SPARK-43099][SQL] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/12 07:24:31 UTC, 1 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40753: [SPARK-43104][BUILD] Set `shadeTestJar` of `protobuf` module to `false` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/12 07:35:09 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40753: [SPARK-43104][BUILD] Set `shadeTestJar` of `protobuf` module to `false` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/12 07:38:15 UTC, 2 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40695: [SPARK-42994][ML][CONNECT] PyTorch Distributor support Local Mode - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/12 07:52:16 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40747: [WIP][SPARK-43099][SQL] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/12 08:10:20 UTC, 0 replies.
- [GitHub] [spark] ulysses-you opened a new pull request, #40754: [SPARK-37099][SQL][FOLLOWUP] Add numOutputRows metric for WindowGroupLimitExec - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/12 08:31:29 UTC, 0 replies.
- [GitHub] [spark] ulysses-you commented on pull request #40754: [SPARK-37099][SQL][FOLLOWUP] Add numOutputRows metric for WindowGroupLimitExec - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/12 08:32:25 UTC, 2 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40750: [SPARK-43105][CONNECT] Abbreviate Bytes in proto message's debug string - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/12 09:02:35 UTC, 6 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40750: [SPARK-43105][CONNECT] Abbreviate Bytes in proto message's debug string - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/12 09:06:01 UTC, 1 replies.
- [GitHub] [spark] beliefer commented on pull request #40754: [SPARK-37099][SQL][FOLLOWUP] Add numOutputRows metric for WindowGroupLimitExec - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/12 09:19:51 UTC, 1 replies.
- [GitHub] [spark] grundprinzip commented on a diff in pull request #40750: [SPARK-43105][CONNECT] Abbreviate Bytes in proto message's debug string - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/12 09:20:02 UTC, 2 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40734: [SPARK-43088][SQL] Respect RequiresDistributionAndOrdering in CTAS/RTAS - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/12 09:20:23 UTC, 2 replies.
- [GitHub] [spark] juliuszsompolski commented on pull request #40676: [SPARK-42656][FOLLOWUP] Add BUILD and SCCLASSPATH options to Spark Connect scripts - posted by "juliuszsompolski (via GitHub)" <gi...@apache.org> on 2023/04/12 11:10:31 UTC, 0 replies.
- [GitHub] [spark] beliefer commented on a diff in pull request #40754: [SPARK-37099][SQL][FOLLOWUP] Add numOutputRows metric for WindowGroupLimitExec - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/12 11:22:45 UTC, 4 replies.
- [GitHub] [spark] WeichenXu123 commented on pull request #40724: [SPARK-43081] [ML] [CONNECT] Add torch distributor data loader that loads data from spark partition data - posted by "WeichenXu123 (via GitHub)" <gi...@apache.org> on 2023/04/12 11:58:42 UTC, 2 replies.
- [GitHub] [spark] kings129 opened a new pull request, #40755: [SPARK-37829][SQL] Dataframe.joinWith outer-join should return a null value for unmatched row - posted by "kings129 (via GitHub)" <gi...@apache.org> on 2023/04/12 12:05:54 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40747: [WIP][SPARK-43099][SQL] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 12:06:09 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40747: [WIP][SPARK-43099][SQL] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/12 12:07:25 UTC, 0 replies.
- [GitHub] [spark] ulysses-you commented on a diff in pull request #40754: [SPARK-37099][SQL][FOLLOWUP] Add numOutputRows metric for WindowGroupLimitExec - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/12 12:10:05 UTC, 2 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40752: [SPARK-43103][SQL] Moving Integral to PhysicalDataType - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/12 12:25:56 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40752: [SPARK-43103][SQL] Moving Integral to PhysicalDataType - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/12 12:26:30 UTC, 0 replies.
- [GitHub] [spark] grundprinzip commented on a diff in pull request #40750: [SPARK-43105][CONNECT] Abbreviate Bytes and Strings in proto message - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/12 12:49:04 UTC, 1 replies.
- [GitHub] [spark] MaxGekk closed pull request #40704: [SPARK-43038][SQL] Support the CBC mode by `aes_encrypt()`/`aes_decrypt()` - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/12 13:02:40 UTC, 0 replies.
- [GitHub] [spark] steveloughran commented on pull request #40738: [BUILD] Test maven 3.9.1 - posted by "steveloughran (via GitHub)" <gi...@apache.org> on 2023/04/12 13:11:06 UTC, 0 replies.
- [GitHub] [spark] wangyum opened a new pull request, #40756: [SPARK-43107][SQL] Coalesce buckets in join applied on broadcast join stream side - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/12 13:34:18 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Exclude `EqualNullSafe` when unwrapping date type to timestamp type - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/12 14:23:37 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on a diff in pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Exclude `EqualNullSafe` when unwrapping date type to timestamp type - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/12 14:33:38 UTC, 0 replies.
- [GitHub] [spark] juliuszsompolski opened a new pull request, #40757: [SPARK-42656][FOLLOWUP] `chmod+x` for `connector/connect/bin/spark-connect-scala-client-classpath` script - posted by "juliuszsompolski (via GitHub)" <gi...@apache.org> on 2023/04/12 15:45:52 UTC, 0 replies.
- [GitHub] [spark] juliuszsompolski commented on pull request #40757: [SPARK-42656][FOLLOWUP] `chmod+x` for `connector/connect/bin/spark-connect-scala-client-classpath` script - posted by "juliuszsompolski (via GitHub)" <gi...@apache.org> on 2023/04/12 15:46:22 UTC, 0 replies.
- [GitHub] [spark] amaliujia opened a new pull request, #40758: [SPARK-43110][SQL] Move asIntegral to PhysicalDataType - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/12 16:53:09 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40741: [SPARK-41811][CONNECT][CLIENT] Support sql with dataframes and columns - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/12 17:06:51 UTC, 2 replies.
- [GitHub] [spark] mkaravel commented on pull request #40615: [WIP][SPARK-16484][SQL] Add support for Datasketches HllSketch - posted by "mkaravel (via GitHub)" <gi...@apache.org> on 2023/04/12 17:08:30 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on a diff in pull request #40645: [SPARK-43014] Do not overwrite `spark.app.submitTime` in k8s cluster mode driver - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/12 17:09:41 UTC, 0 replies.
- [GitHub] [spark] zhenlineo commented on a diff in pull request #40745: [Do not merge] Testing repl build on CI - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/12 17:45:29 UTC, 0 replies.
- [GitHub] [spark] pengzhon-db commented on pull request #40736: [SPARK-43084] [SS] Add applyInPandasWithState support for spark connect - posted by "pengzhon-db (via GitHub)" <gi...@apache.org> on 2023/04/12 17:56:58 UTC, 1 replies.
- [GitHub] [spark] dtenedor commented on a diff in pull request #40732: [SPARK-43085][SQL] Support column DEFAULT assignment for multi-part table names - posted by "dtenedor (via GitHub)" <gi...@apache.org> on 2023/04/12 18:39:40 UTC, 7 replies.
- [GitHub] [spark] ueshin commented on pull request #40015: [SPARK-42437][PYTHON][CONNECT] PySpark catalog.cacheTable will allow to specify storage level - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/12 20:11:28 UTC, 0 replies.
- [GitHub] [spark] ueshin closed pull request #40015: [SPARK-42437][PYTHON][CONNECT] PySpark catalog.cacheTable will allow to specify storage level - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/12 20:12:08 UTC, 0 replies.
- [GitHub] [spark] bjornjorgensen opened a new pull request, #40759: [SPARK-43111] Merge nested if statements into single if statements - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/12 20:35:45 UTC, 0 replies.
- [GitHub] [spark] ueshin opened a new pull request, #40760: [SPARK-42982][CONNECT][PYTHON] Fix createDataFrame to respect the given schema ddl - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/12 21:54:37 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40758: [SPARK-43110][SQL] Move asIntegral to PhysicalDataType - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/12 22:04:52 UTC, 0 replies.
- [GitHub] [spark] zhouyejoe commented on a diff in pull request #40749: [SPARK-43100][CORE] Mismatch of field name in log event writer and parser for push shuffle metrics - posted by "zhouyejoe (via GitHub)" <gi...@apache.org> on 2023/04/12 22:11:39 UTC, 0 replies.
- [GitHub] [spark] gengliangwang opened a new pull request, #40761: [Minor][SQL] Simplify the method resolveExprsAndAddMissingAttrs - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/12 22:44:59 UTC, 0 replies.
- [GitHub] [spark] zhenlineo opened a new pull request, #40762: [WIP] Fix maven test build for udf tests - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/12 22:50:45 UTC, 0 replies.
- [GitHub] [spark] wangyum opened a new pull request, #40763: [SPARK-43114][SQL][TESTS] Add interval types to TypeCoercionSuite - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/12 23:36:37 UTC, 0 replies.
- [GitHub] [spark] ueshin opened a new pull request, #40764: [SPARK-43115][CONNECT][PS][TESTS] Split pyspark-pandas-connect from pyspark-connect module - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/12 23:42:54 UTC, 0 replies.
- [GitHub] [spark] ahshahid opened a new pull request, #40765: [WIP][SPARK-43112]. Spark may use a column other than the actual specified partitioning column for partitioning, for Hive format tables - posted by "ahshahid (via GitHub)" <gi...@apache.org> on 2023/04/12 23:44:37 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40691: [SPARK-43031] [SS] [Connect] Enable unit test and doctest for streaming - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 00:05:45 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40691: [SPARK-43031] [SS] [Connect] Enable unit test and doctest for streaming - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 00:06:07 UTC, 0 replies.
- [GitHub] [spark] bersprockets opened a new pull request, #40766: [SPARK-43113][SQL] Evaluate stream-side variables when generating code for a bound condition - posted by "bersprockets (via GitHub)" <gi...@apache.org> on 2023/04/13 00:10:55 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #39327: [SPARK-41801][CORE][PYTHON][PS] Remove `*args: Any, **kwargs: Any` for `def transpose` - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/13 00:17:11 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #37967: Scalable SkipGram-Word2Vec implementation - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/13 00:17:13 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40764: [SPARK-43115][CONNECT][PS][TESTS] Split pyspark-pandas-connect from pyspark-connect module - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/13 00:20:49 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40763: [SPARK-43114][SQL][TESTS] Add interval types to TypeCoercionSuite - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/13 00:24:07 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40750: [SPARK-43105][CONNECT] Abbreviate Bytes and Strings in proto message - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 00:25:22 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40730: [SPARK-43086][CORE] Support bin pack task scheduling on executors - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 00:33:38 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40757: [SPARK-42656][FOLLOWUP] `chmod+x` for `connector/connect/bin/spark-connect-scala-client-classpath` script - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 00:34:45 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40757: [SPARK-42656][FOLLOWUP] `chmod+x` for `connector/connect/bin/spark-connect-scala-client-classpath` script - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 00:35:10 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40750: [SPARK-43105][CONNECT] Abbreviate Bytes and Strings in proto message - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/13 00:36:08 UTC, 2 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40766: [SPARK-43113][SQL] Evaluate stream-side variables when generating code for a bound condition - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 00:36:57 UTC, 2 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40762: [WIP] Fix maven test build for udf tests - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 00:37:56 UTC, 0 replies.
- [GitHub] [spark] q-aaronzolnailucas commented on pull request #39572: [SPARK-39979][SQL] Add option to use large variable width vectors for arrow UDF operations - posted by "q-aaronzolnailucas (via GitHub)" <gi...@apache.org> on 2023/04/13 00:56:56 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40761: [MINOR][SQL] Simplify the method resolveExprsAndAddMissingAttrs - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 01:05:51 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40764: [SPARK-43115][CONNECT][PS][TESTS] Split pyspark-pandas-connect from pyspark-connect module - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/13 01:14:07 UTC, 1 replies.
- [GitHub] [spark] ueshin commented on a diff in pull request #40764: [SPARK-43115][CONNECT][PS][TESTS] Split pyspark-pandas-connect from pyspark-connect module - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/13 01:24:03 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40758: [SPARK-43110][SQL] Move asIntegral to PhysicalDataType - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 01:52:20 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40758: [SPARK-43110][SQL] Move asIntegral to PhysicalDataType - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 01:53:12 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40754: [SPARK-37099][SQL][FOLLOWUP] Add numOutputRows metric for WindowGroupLimitExec - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 02:00:52 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40699: [SPARK-43063][SQL] `df.show` handle null should print NULL instead of null - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 02:15:28 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40762: [WIP] Fix maven test build for udf tests - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/13 02:16:29 UTC, 0 replies.
- [GitHub] [spark] bowenliang123 commented on a diff in pull request #40738: [BUILD] Test maven 3.9.1 - posted by "bowenliang123 (via GitHub)" <gi...@apache.org> on 2023/04/13 02:30:04 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40764: [SPARK-43115][CONNECT][PS][TESTS] Split pyspark-pandas-connect from pyspark-connect module - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/13 03:06:09 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40764: [SPARK-43115][CONNECT][PS][TESTS] Split pyspark-pandas-connect from pyspark-connect module - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/13 03:06:27 UTC, 0 replies.
- [GitHub] [spark] jerrypeng opened a new pull request, #40767: [SPARK-43118] Remove unnecessary assert for UninterruptibleThread in KafkaMicroBatchStream - posted by "jerrypeng (via GitHub)" <gi...@apache.org> on 2023/04/13 03:45:49 UTC, 0 replies.
- [GitHub] [spark] yaooqinn opened a new pull request, #40768: [SPARK-43119][SQL] Support Get SQL Keywords Dynamically Thru JDBC API and An Auxiliary Function - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/13 04:15:20 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on pull request #40761: [MINOR][SQL] Simplify the method resolveExprsAndAddMissingAttrs - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/13 05:04:09 UTC, 0 replies.
- [GitHub] [spark] gengliangwang closed pull request #40761: [MINOR][SQL] Simplify the method resolveExprsAndAddMissingAttrs - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/13 05:04:41 UTC, 0 replies.
- [GitHub] [spark] bersprockets commented on a diff in pull request #40766: [SPARK-43113][SQL] Evaluate stream-side variables when generating code for a bound condition - posted by "bersprockets (via GitHub)" <gi...@apache.org> on 2023/04/13 05:20:35 UTC, 1 replies.
- [GitHub] [spark] pan3793 commented on a diff in pull request #40730: [SPARK-43086][CORE] Support bin pack task scheduling on executors - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/13 05:21:55 UTC, 0 replies.
- [GitHub] [spark] yaooqinn closed pull request #40543: [SPARK-42916][SQL] JDBCTableCatalog Keeps Char/Varchar meta on the read-side - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/13 05:30:47 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40768: [SPARK-43119][SQL] Support Get SQL Keywords Dynamically Thru JDBC API and An Auxiliary Function - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/13 05:39:22 UTC, 6 replies.
- [GitHub] [spark] wangyum closed pull request #40601: [SPARK-42975][SQL] Cast result type to timestamp type for string +/- interval - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/13 06:16:04 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40769: [WIP][PYTHON][TESTS] Reduce the required resources in PyTorch related tests - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/13 06:26:07 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40769: [WIP][PYTHON][TESTS] Reduce the required resources in PyTorch related tests - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/13 06:30:10 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40769: [WIP][PYTHON][TESTS] Reduce the required resources in PyTorch related tests - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 06:35:15 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on a diff in pull request #40749: [SPARK-43100][CORE] Mismatch of field name in log event writer and parser for push shuffle metrics - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/13 06:46:13 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on a diff in pull request #40742: [SPARK-43095][SQL] Avoid Once strategy's idempotence is broken for batch: `Infer Filters` - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/13 07:13:16 UTC, 5 replies.
- [GitHub] [spark] anishshri-db opened a new pull request, #40770: [SPARK-43120][SS] Add support for tracking pinned blocks memory usage for RocksDB state store - posted by "anishshri-db (via GitHub)" <gi...@apache.org> on 2023/04/13 07:26:49 UTC, 0 replies.
- [GitHub] [spark] anishshri-db commented on pull request #40770: [SPARK-43120][SS] Add support for tracking pinned blocks memory usage for RocksDB state store - posted by "anishshri-db (via GitHub)" <gi...@apache.org> on 2023/04/13 07:27:15 UTC, 0 replies.
- [GitHub] [spark] dfercode opened a new pull request, #40771: [SPARK-35723] set k8s pod container request, limit memory separately. - posted by "dfercode (via GitHub)" <gi...@apache.org> on 2023/04/13 07:28:25 UTC, 0 replies.
- [GitHub] [spark] WeichenXu123 opened a new pull request, #40772: [MINOR] Fix DistributedDataParallel model code in TorchDistributor suite - posted by "WeichenXu123 (via GitHub)" <gi...@apache.org> on 2023/04/13 07:36:25 UTC, 0 replies.
- [GitHub] [spark] dfercode commented on a diff in pull request #40771: [SPARK-35723] set k8s pod container request, limit memory separately. - posted by "dfercode (via GitHub)" <gi...@apache.org> on 2023/04/13 07:44:54 UTC, 1 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40768: [SPARK-43119][SQL] Support Get SQL Keywords Dynamically Thru JDBC API and An Auxiliary Function - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/13 07:45:13 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR commented on pull request #40767: [SPARK-43118][SS] Remove unnecessary assert for UninterruptibleThread in KafkaMicroBatchStream - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/13 08:09:24 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR closed pull request #40767: [SPARK-43118][SS] Remove unnecessary assert for UninterruptibleThread in KafkaMicroBatchStream - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/13 08:10:06 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40773: [SPARK-43121][SQL] Use `BytesWritable.copyBytes` instead of manual copy in `HiveInspectors` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/13 08:17:33 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #38732: [SPARK-41210][K8S] Window based executor failure tracking mechanism - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/13 08:28:09 UTC, 1 replies.
- [GitHub] [spark] yaooqinn commented on pull request #38732: [SPARK-41210][K8S] Window based executor failure tracking mechanism - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/13 08:32:10 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40774: [SPARK-41210][K8S] Window based executor failure tracking mechanism - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/13 08:34:10 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40775: [MINOR][TESTS][PYTHON] Skip `TorchDistributorLocalUnitTestsIIOnConnect` for now - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/13 08:56:06 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40763: [SPARK-43114][SQL][TESTS] Add interval types to TypeCoercionSuite - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 09:30:06 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40763: [SPARK-43114][SQL][TESTS] Add interval types to TypeCoercionSuite - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 09:30:27 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40760: [SPARK-42982][CONNECT][PYTHON] Fix createDataFrame to respect the given schema ddl - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 09:31:01 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40760: [SPARK-42982][CONNECT][PYTHON] Fix createDataFrame to respect the given schema ddl - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 09:31:24 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40775: [SPARK-42994][TESTS][FOLLOWUP] Skip `TorchDistributorLocalUnitTestsIIOnConnect` for now - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/13 09:49:06 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40775: [SPARK-42994][TESTS][FOLLOWUP] Skip `TorchDistributorLocalUnitTestsIIOnConnect` for now - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 09:52:25 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40775: [SPARK-42994][TESTS][FOLLOWUP] Skip `TorchDistributorLocalUnitTestsIIOnConnect` for now - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/13 09:52:38 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on a diff in pull request #40756: [SPARK-43107][SQL] Coalesce buckets in join applied on broadcast join stream side - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/13 09:54:50 UTC, 2 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40774: [SPARK-41210][K8S] Window based executor failure tracking mechanism - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/13 09:56:15 UTC, 8 replies.
- [GitHub] [spark] cloud-fan closed pull request #40734: [SPARK-43088][SQL] Respect RequiresDistributionAndOrdering in CTAS/RTAS - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 09:56:58 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on a diff in pull request #40774: [SPARK-41210][K8S] Window based executor failure tracking mechanism - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/13 10:03:06 UTC, 5 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40774: [SPARK-41210][K8S] Window based executor failure tracking mechanism - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/13 10:07:46 UTC, 2 replies.
- [GitHub] [spark] pan3793 commented on pull request #40774: [SPARK-41210][K8S] Window based executor failure tracking mechanism - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/13 10:19:11 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni) - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/13 11:06:04 UTC, 1 replies.
- [GitHub] [spark] jaceklaskowski commented on a diff in pull request #40713: [WIP][SPARK-42551][SQL] Support more subexpression elimination cases - posted by "jaceklaskowski (via GitHub)" <gi...@apache.org> on 2023/04/13 11:13:09 UTC, 0 replies.
- [GitHub] [spark] cloud-fan opened a new pull request, #40776: [SPARK-43123][SQL] Internal field metadata should not be leaked to catalogs - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 11:55:48 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40776: [SPARK-43123][SQL] Internal field metadata should not be leaked to catalogs - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 11:56:20 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40768: [SPARK-43119][SQL] Support Get SQL Keywords Dynamically Thru JDBC API and An Auxiliary Function - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 12:00:28 UTC, 3 replies.
- [GitHub] [spark] juliuszsompolski opened a new pull request, #40777: [CONNECT] Dump of query cancellation hacking - posted by "juliuszsompolski (via GitHub)" <gi...@apache.org> on 2023/04/13 12:50:08 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40778: [WIP][PYTHON][TESTS] Test test_parity_torch_distributor with timeout - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/13 12:59:50 UTC, 0 replies.
- [GitHub] [spark] peter-toth opened a new pull request, #40779: [SPARK-43124][SQL] Dataset.show projects CommandResults locally - posted by "peter-toth (via GitHub)" <gi...@apache.org> on 2023/04/13 12:59:54 UTC, 0 replies.
- [GitHub] [spark] peter-toth commented on pull request #40779: [SPARK-43124][SQL] Dataset.show projects CommandResults locally - posted by "peter-toth (via GitHub)" <gi...@apache.org> on 2023/04/13 13:05:56 UTC, 3 replies.
- [GitHub] [spark] HeartSaVioR commented on pull request #40770: [SPARK-43120][SS] Add support for tracking pinned blocks memory usage for RocksDB state store - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/13 13:16:11 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR closed pull request #40770: [SPARK-43120][SS] Add support for tracking pinned blocks memory usage for RocksDB state store - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/13 13:16:41 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40779: [SPARK-43124][SQL] Dataset.show projects CommandResults locally - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 13:17:01 UTC, 1 replies.
- [GitHub] [spark] Hisoka-X opened a new pull request, #40780: [SPARK-43125][CONNECT][SERVER] Fix Connect Server Can't Handle Exception With Null Message - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/13 13:22:52 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on pull request #40780: [SPARK-43125][CONNECT][SERVER] Fix Connect Server Can't Handle Exception With Null Message - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/13 13:24:24 UTC, 0 replies.
- [GitHub] [spark] cloud-fan opened a new pull request, #40781: [SPARK-43126][SQL] Mark two Hive UDF expressions as stateful - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 13:31:51 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40781: [SPARK-43126][SQL] Mark two Hive UDF expressions as stateful - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 13:32:16 UTC, 0 replies.
- [GitHub] [spark] beliefer opened a new pull request, #40782: [SPARK-42669][CONNECT] Short circuit local relation RPCs - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/13 13:37:40 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40779: [SPARK-43124][SQL] Dataset.show projects CommandResults locally - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 13:52:42 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on pull request #39170: [SPARK-41674][SQL] Runtime filter should supports multi level shuffle join side as filter creation side - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 13:54:05 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #39170: [SPARK-41674][SQL] Runtime filter should supports multi level shuffle join side as filter creation side - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/13 13:54:54 UTC, 0 replies.
- [GitHub] [spark] peter-toth commented on a diff in pull request #40779: [SPARK-43124][SQL] Dataset.show projects CommandResults locally - posted by "peter-toth (via GitHub)" <gi...@apache.org> on 2023/04/13 14:07:48 UTC, 1 replies.
- [GitHub] [spark] zhenlineo closed pull request #40745: [Do not merge] Testing repl build on CI - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/13 14:28:35 UTC, 0 replies.
- [GitHub] [spark] wankunde commented on a diff in pull request #40713: [WIP][SPARK-42551][SQL] Support more subexpression elimination cases - posted by "wankunde (via GitHub)" <gi...@apache.org> on 2023/04/13 14:39:26 UTC, 0 replies.
- [GitHub] [spark] bjornjorgensen commented on pull request #40759: [SPARK-43111][PS][CONNECT][PYTHON] Merge nested `if` statements into single `if` statements - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/13 15:45:20 UTC, 1 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40755: [SPARK-37829][SQL] Dataframe.joinWith outer-join should return a null value for unmatched row - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/13 15:50:17 UTC, 0 replies.
- [GitHub] [spark] tgravescs commented on a diff in pull request #40730: [SPARK-43086][CORE] Support bin pack task scheduling on executors - posted by "tgravescs (via GitHub)" <gi...@apache.org> on 2023/04/13 15:56:43 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40729: [WIP][CONNECT] Adding groupByKey + mapGroup functions - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/13 16:00:53 UTC, 2 replies.
- [GitHub] [spark] tgravescs commented on pull request #40730: [SPARK-43086][CORE] Support bin pack task scheduling on executors - posted by "tgravescs (via GitHub)" <gi...@apache.org> on 2023/04/13 16:01:37 UTC, 1 replies.
- [GitHub] [spark] ryan-johnson-databricks commented on a diff in pull request #40732: [SPARK-43085][SQL] Support column DEFAULT assignment for multi-part table names - posted by "ryan-johnson-databricks (via GitHub)" <gi...@apache.org> on 2023/04/13 16:05:24 UTC, 2 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40780: [SPARK-43125][CONNECT][SERVER] Fix Connect Server Can't Handle Exception With Null Message - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/13 16:23:49 UTC, 1 replies.
- [GitHub] [spark] michelsciortino commented on pull request #17293: [SPARK-19950][SQL] Fix to ignore nullable when df.load() is executed for file-based data source - posted by "michelsciortino (via GitHub)" <gi...@apache.org> on 2023/04/13 16:28:34 UTC, 0 replies.
- [GitHub] [spark] zhenlineo commented on a diff in pull request #40729: [WIP][CONNECT] Adding groupByKey + mapGroup functions - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/13 17:21:21 UTC, 0 replies.
- [GitHub] [spark] kings129 commented on pull request #40755: [SPARK-37829][SQL] Dataframe.joinWith outer-join should return a null value for unmatched row - posted by "kings129 (via GitHub)" <gi...@apache.org> on 2023/04/13 17:52:23 UTC, 1 replies.
- [GitHub] [spark] rangadi opened a new pull request, #40783: [SPARK-43129] Scala core API for streaming Spark Connect - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/13 18:47:40 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on pull request #40783: [SPARK-43129] Scala core API for streaming Spark Connect - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/13 18:49:55 UTC, 2 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40783: [SPARK-43129] Scala core API for streaming Spark Connect - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/13 19:00:35 UTC, 11 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40783: [SPARK-43129] Scala core API for streaming Spark Connect - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/13 19:06:11 UTC, 8 replies.
- [GitHub] [spark] amaliujia opened a new pull request, #40784: [SPARK-43130][SQL] Move InternalType to PhysicalDataType - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/13 19:10:00 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40784: [SPARK-43130][SQL] Move InternalType to PhysicalDataType - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/13 19:10:15 UTC, 2 replies.
- [GitHub] [spark] WweiL opened a new pull request, #40785: [SPARK-42960] Add await_termination() and exception() API for Streaming Query - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/13 21:36:49 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on pull request #40785: [SPARK-42960] Add await_termination() and exception() API for Streaming Query - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/13 21:37:37 UTC, 2 replies.
- [GitHub] [spark] gengliangwang commented on pull request #40732: [SPARK-43085][SQL] Support column DEFAULT assignment for multi-part table names - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/13 21:57:41 UTC, 0 replies.
- [GitHub] [spark] gengliangwang closed pull request #40732: [SPARK-43085][SQL] Support column DEFAULT assignment for multi-part table names - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/13 21:58:18 UTC, 0 replies.
- [GitHub] [spark] warrenzhu25 commented on a diff in pull request #40730: [SPARK-43086][CORE] Support bin pack task scheduling on executors - posted by "warrenzhu25 (via GitHub)" <gi...@apache.org> on 2023/04/13 22:00:22 UTC, 1 replies.
- [GitHub] [spark] alexjinghn commented on a diff in pull request #40747: [SPARK-43099][SQL] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "alexjinghn (via GitHub)" <gi...@apache.org> on 2023/04/13 22:00:32 UTC, 3 replies.
- [GitHub] [spark] dongjoon-hyun opened a new pull request, #40786: [SPARK-43135][INFRA] Remove `branch-3.2` from `publish_snapshot` GitHub Action job - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/13 22:01:44 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40785: [SPARK-42960] Add await_termination() and exception() API for Streaming Query - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/13 22:02:28 UTC, 4 replies.
- [GitHub] [spark] alexjinghn commented on pull request #40747: [SPARK-43099][SQL] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "alexjinghn (via GitHub)" <gi...@apache.org> on 2023/04/13 22:04:01 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #40786: [SPARK-43135][INFRA] Remove `branch-3.2` from `publish_snapshot` GitHub Action job - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/13 22:34:08 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40786: [SPARK-43135][INFRA] Remove `branch-3.2` from `publish_snapshot` GitHub Action job - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/13 22:34:31 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40786: [SPARK-43135][INFRA] Remove `branch-3.2` from `publish_snapshot` GitHub Action job - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/13 22:38:55 UTC, 1 replies.
- [GitHub] [spark] WweiL commented on a diff in pull request #40785: [SPARK-42960] Add await_termination() and exception() API for Streaming Query - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/13 22:46:57 UTC, 13 replies.
- [GitHub] [spark] sunchao commented on a diff in pull request #40701: [SPARK-43064][SQL] Spark SQL CLI SQL tab should only show once statement once - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/13 23:37:37 UTC, 0 replies.
- [GitHub] [spark] zhenlineo commented on pull request #40783: [SPARK-43129] Scala core API for streaming Spark Connect - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/13 23:57:46 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #39327: [SPARK-41801][CORE][PYTHON][PS] Remove `*args: Any, **kwargs: Any` for `def transpose` - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/14 00:17:32 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #37967: Scalable SkipGram-Word2Vec implementation - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/14 00:17:33 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #40781: [SPARK-43126][SQL] Mark two Hive UDF expressions as stateful - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/14 00:47:01 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40781: [SPARK-43126][SQL] Mark two Hive UDF expressions as stateful - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/14 00:47:22 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #40773: [SPARK-43121][SQL] Use `BytesWritable.copyBytes` instead of manual copy in `HiveInspectors - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/14 00:48:51 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40773: [SPARK-43121][SQL] Use `BytesWritable.copyBytes` instead of manual copy in `HiveInspectors - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/14 00:49:58 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40781: [SPARK-43126][SQL] Mark two Hive UDF expressions as stateful - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/14 00:52:50 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40773: [SPARK-43121][SQL] Use `BytesWritable.copyBytes` instead of manual copy in `HiveInspectors - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/14 00:53:14 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40750: [SPARK-43105][CONNECT] Abbreviate Bytes and Strings in proto message - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/14 00:59:17 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40750: [SPARK-43105][CONNECT] Abbreviate Bytes and Strings in proto message - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/14 00:59:36 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on a diff in pull request #40730: [SPARK-43086][CORE] Support bin pack task scheduling on executors - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/14 01:11:32 UTC, 2 replies.
- [GitHub] [spark] beliefer commented on pull request #40782: [SPARK-42669][CONNECT] Short circuit local relation RPCs - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/14 01:18:20 UTC, 2 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40787: [SPARK-42994][TESTS][FOLLOWUP] Skip `TorchDistributorLocalUnitTestsOnConnect` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/14 01:21:18 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40787: [SPARK-42994][TESTS][FOLLOWUP] Skip `TorchDistributorLocalUnitTestsOnConnect` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/14 01:22:07 UTC, 3 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40782: [SPARK-42669][CONNECT] Short circuit local relation RPCs - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/14 01:26:07 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40772: [MINOR][TESTS] Fix DistributedDataParallel model code in TorchDistributor suite - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/14 01:29:37 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40781: [SPARK-43126][SQL] Mark two Hive UDF expressions as stateful - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/14 01:40:44 UTC, 0 replies.
- [GitHub] [spark] zhenlineo commented on pull request #40729: [SPARK-43136][CONNECT] Adding groupByKey + mapGroup + coGroup functions - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/14 01:40:51 UTC, 1 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40773: [SPARK-43121][SQL] Use `BytesWritable.copyBytes` instead of manual copy in `HiveInspectors - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/14 02:02:01 UTC, 0 replies.
- [GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni) - posted by "lyy-pineapple (via GitHub)" <gi...@apache.org> on 2023/04/14 02:12:22 UTC, 3 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40783: [SPARK-43129] Scala core API for streaming Spark Connect - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/14 02:21:33 UTC, 3 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40774: [SPARK-41210][K8S] Port executor failure tracker from Spark on YARN to K8s - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/14 03:10:13 UTC, 2 replies.
- [GitHub] [spark] pan3793 commented on a diff in pull request #40774: [SPARK-41210][K8S] Port executor failure tracker from Spark on YARN to K8s - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/14 03:10:58 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40754: [SPARK-37099][SQL][FOLLOWUP] Add numOutputRows metric for WindowGroupLimitExec - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/14 03:12:14 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40754: [SPARK-37099][SQL][FOLLOWUP] Add numOutputRows metric for WindowGroupLimitExec - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/14 03:13:22 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40772: [MINOR][TESTS] Fix DistributedDataParallel model code in TorchDistributor suite - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/14 03:15:36 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40772: [MINOR][TESTS] Fix DistributedDataParallel model code in TorchDistributor suite - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/14 03:15:56 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40688: [SPARK-43021][SQL] `CoalesceBucketsInJoin` not work when using AQE - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/14 03:19:12 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40688: [SPARK-43021][SQL] `CoalesceBucketsInJoin` not work when using AQE - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/14 03:19:27 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40788: [SPARK-42452][BUILD] Cleanup `hadoop-2` profile - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/14 03:21:44 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40787: [SPARK-42994][TESTS][FOLLOWUP] Skip `TorchDistributorLocalUnitTestsOnConnect` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/14 03:29:54 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40787: [SPARK-42994][TESTS][FOLLOWUP] Skip `TorchDistributorLocalUnitTestsOnConnect` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/14 03:30:13 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40782: [SPARK-42669][CONNECT] Short circuit local relation RPCs - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/14 03:31:45 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40788: [SPARK-42452][BUILD] Remove `hadoop-2` profile from Apache Spark 3.5.0 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/14 03:33:53 UTC, 3 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40782: [SPARK-42669][CONNECT] Short circuit local relation RPCs - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/14 03:34:19 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40780: [SPARK-43125][CONNECT] Fix Connect Server Can't Handle Exception With Null Message - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/14 03:35:59 UTC, 0 replies.
- [GitHub] [spark] dfercode commented on pull request #40771: [SPARK-35723] set k8s pod container request, limit memory separately. - posted by "dfercode (via GitHub)" <gi...@apache.org> on 2023/04/14 03:37:07 UTC, 2 replies.
- [GitHub] [spark] wangyum commented on pull request #40756: [SPARK-43107][SQL] Coalesce buckets in join applied on broadcast join stream side - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/14 03:39:50 UTC, 1 replies.
- [GitHub] [spark] Hisoka-X commented on a diff in pull request #40780: [SPARK-43125][CONNECT] Fix Connect Server Can't Handle Exception With Null Message - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/14 03:40:44 UTC, 1 replies.
- [GitHub] [spark] gengliangwang commented on a diff in pull request #40776: [SPARK-43123][SQL] Internal field metadata should not be leaked to catalogs - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/14 03:41:49 UTC, 1 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40783: [SPARK-43129] Scala core API for streaming Spark Connect - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/14 03:43:14 UTC, 14 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40747: [SPARK-43099][SQL] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/14 03:44:30 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40780: [SPARK-43125][CONNECT] Fix Connect Server Can't Handle Exception With Null Message - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/14 03:45:54 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40780: [SPARK-43125][CONNECT] Fix Connect Server Can't Handle Exception With Null Message - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/14 03:46:12 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40759: [SPARK-43111][PS][CONNECT][PYTHON] Merge nested `if` statements into single `if` statements - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/14 03:46:35 UTC, 1 replies.
- [GitHub] [spark] beliefer opened a new pull request, #40789: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and equals to zero. - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/14 03:53:17 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40736: [SPARK-43084] [SS] Add applyInPandasWithState support for spark connect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/14 04:16:07 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40788: [SPARK-42452][BUILD] Remove `hadoop-2` profile from Apache Spark 3.5.0 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/14 04:59:32 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40787: [SPARK-42994][TESTS][FOLLOWUP] Skip `TorchDistributorLocalUnitTestsOnConnect` - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/14 05:20:43 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40610: [SPARK-42626][CONNECT] Add Destructive Iterator for SparkResult - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/14 05:43:04 UTC, 1 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40747: [SPARK-43099][SQL] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/14 05:47:44 UTC, 1 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40747: [SPARK-43099][SQL] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/14 05:52:54 UTC, 5 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40771: [SPARK-35723] set k8s pod container request, limit memory separately. - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/14 06:03:04 UTC, 0 replies.
- [GitHub] [spark] beliefer commented on a diff in pull request #40782: [SPARK-42669][CONNECT] Short circuit local relation RPCs - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/14 06:35:48 UTC, 5 replies.
- [GitHub] [spark] bjornjorgensen commented on pull request #40788: [SPARK-42452][BUILD] Remove `hadoop-2` profile from Apache Spark 3.5.0 - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/14 06:47:02 UTC, 1 replies.
- [GitHub] [spark] grundprinzip commented on a diff in pull request #40736: [SPARK-43084] [SS] Add applyInPandasWithState support for spark connect - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/14 06:54:00 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40605: [SPARK-42958][CONNECT] Refactor `connect-jvm-client-mima-check` to support mima check with avro module - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/14 07:05:06 UTC, 0 replies.
- [GitHub] [spark] daureg commented on pull request #31073: [SPARK-33995][SQL] Expose make_interval as a Scala function - posted by "daureg (via GitHub)" <gi...@apache.org> on 2023/04/14 07:34:44 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40776: [SPARK-43123][SQL] Internal field metadata should not be leaked to catalogs - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/14 09:08:42 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on pull request #31073: [SPARK-33995][SQL] Expose make_interval as a Scala function - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/14 09:14:56 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni) - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/14 09:34:34 UTC, 4 replies.
- [GitHub] [spark] jackylee-ch opened a new pull request, #40790: [SPARK-43116][SQL] Fix Cast.forceNullable - posted by "jackylee-ch (via GitHub)" <gi...@apache.org> on 2023/04/14 10:01:15 UTC, 0 replies.
- [GitHub] [spark] wangyum opened a new pull request, #40791: [SPARK-43140][SQL][TESTS] Override computeStats in `DummyLeafNode` - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/14 10:03:04 UTC, 0 replies.
- [GitHub] [spark] beliefer commented on pull request #40789: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and equals to zero. - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/14 10:27:38 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon opened a new pull request, #40792: [SPARK-43141][BUILD] Ignore generated Java files in checkstyle - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/14 11:38:35 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40792: [SPARK-43141][BUILD] Ignore generated Java files in checkstyle - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/14 11:39:18 UTC, 5 replies.
- [GitHub] [spark] grundprinzip commented on a diff in pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/14 11:58:24 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40769: [WIP][PYTHON][TESTS] Reduce the required resources in PyTorch related tests - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/14 12:24:09 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40793: [WIP][SPARK-43122][TESTS] Reenable TorchDistributorLocalUnitTestsOnConnect and TorchDistributorLocalUnitTestsIIOnConnect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/14 12:35:49 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40793: [WIP][SPARK-43122][TESTS] Reenable TorchDistributorLocalUnitTestsOnConnect and TorchDistributorLocalUnitTestsIIOnConnect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/14 12:42:30 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40789: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and equals to zero. - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/14 12:58:45 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40755: [SPARK-37829][SQL] Dataframe.joinWith outer-join should return a null value for unmatched row - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/14 13:07:31 UTC, 1 replies.
- [GitHub] [spark] rshkv opened a new pull request, #40794: [SPARK-43142] Fix DSL expressions on attributes with special characters - posted by "rshkv (via GitHub)" <gi...@apache.org> on 2023/04/14 13:43:47 UTC, 0 replies.
- [GitHub] [spark] jaceklaskowski commented on a diff in pull request #40748: [WIP][SPARK-43097] New pyspark ML logistic regression estimator implemented on top of distributor - posted by "jaceklaskowski (via GitHub)" <gi...@apache.org> on 2023/04/14 14:19:23 UTC, 0 replies.
- [GitHub] [spark] zhenlineo commented on pull request #40762: [WIP] Fix maven test build for udf tests - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/14 15:15:11 UTC, 0 replies.
- [GitHub] [spark] wankunde commented on pull request #40157: [SPARK-42551][SQL] Support more subexpression elimination cases - posted by "wankunde (via GitHub)" <gi...@apache.org> on 2023/04/14 15:25:03 UTC, 0 replies.
- [GitHub] [spark] wankunde closed pull request #40157: [SPARK-42551][SQL] Support more subexpression elimination cases - posted by "wankunde (via GitHub)" <gi...@apache.org> on 2023/04/14 15:25:04 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40792: [SPARK-43141][BUILD] Ignore generated Java files in checkstyle - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/14 15:31:59 UTC, 2 replies.
- [GitHub] [spark] sunchao closed pull request #40753: [SPARK-43104][BUILD] Set `shadeTestJar` of `protobuf` module to `false` - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/14 15:41:47 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40753: [SPARK-43104][BUILD] Set `shadeTestJar` of `protobuf` module to `false` - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/14 15:42:02 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #40701: [SPARK-43064][SQL] Spark SQL CLI SQL tab should only show once statement once - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/14 15:43:52 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40701: [SPARK-43064][SQL] Spark SQL CLI SQL tab should only show once statement once - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/14 15:44:28 UTC, 0 replies.
- [GitHub] [spark] Yikf commented on pull request #40437: [SPARK-41259][SQL] SparkSQLDriver use the spark result string that is consistent with that of `df.show` - posted by "Yikf (via GitHub)" <gi...@apache.org> on 2023/04/14 16:15:05 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40701: [SPARK-43064][SQL] Spark SQL CLI SQL tab should only show once statement once - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/14 16:15:46 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40753: [SPARK-43104][BUILD] Set `shadeTestJar` of `protobuf` module to `false` - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/14 16:16:13 UTC, 0 replies.
- [GitHub] [spark] rangadi opened a new pull request, #40795: [TEMP] Scala m1 temp - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/14 17:35:20 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on pull request #40780: [SPARK-43125][CONNECT] Fix Connect Server Can't Handle Exception With Null Message - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/14 19:11:45 UTC, 4 replies.
- [GitHub] [spark] sunchao commented on pull request #40788: [SPARK-42452][BUILD] Remove `hadoop-2` profile from Apache Spark 3.5.0 - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/14 19:25:54 UTC, 3 replies.
- [GitHub] [spark] WweiL commented on a diff in pull request #40783: [SPARK-43129] Scala core API for streaming Spark Connect - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/14 19:28:02 UTC, 1 replies.
- [GitHub] [spark] zhenlineo opened a new pull request, #40796: [WIP]Typed agg functions - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/14 19:39:43 UTC, 0 replies.
- [GitHub] [spark] kings129 commented on a diff in pull request #40755: [SPARK-37829][SQL] Dataframe.joinWith outer-join should return a null value for unmatched row - posted by "kings129 (via GitHub)" <gi...@apache.org> on 2023/04/14 20:28:26 UTC, 2 replies.
- [GitHub] [spark] WweiL opened a new pull request, #40797: [SPARK-43042] [SS] [Connect] Add table() API support for DataStreamReader - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/14 20:32:52 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on pull request #40797: [SPARK-43042] [SS] [Connect] Add table() API support for DataStreamReader - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/14 20:36:06 UTC, 1 replies.
- [GitHub] [spark] zhenlineo commented on a diff in pull request #40783: [SPARK-43129] Scala core API for streaming Spark Connect - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/14 21:26:45 UTC, 0 replies.
- [GitHub] [spark] zhenlineo commented on a diff in pull request #40747: [SPARK-43099][SQL] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/14 21:40:31 UTC, 0 replies.
- [GitHub] [spark] DerekTBrown opened a new pull request, #40798: fix: name docker users - posted by "DerekTBrown (via GitHub)" <gi...@apache.org> on 2023/04/14 21:55:08 UTC, 0 replies.
- [GitHub] [spark] yigress opened a new pull request, #40799: [SPARK-43145][SQL] Reduce ClassNotFound of hive storage handler table - posted by "yigress (via GitHub)" <gi...@apache.org> on 2023/04/14 22:17:40 UTC, 0 replies.
- [GitHub] [spark] ueshin opened a new pull request, #40800: [SPARK-43146][CONNECT][PYTHON] Implement eager evaluation for __repr__ and _repr_html_ - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/14 23:19:01 UTC, 0 replies.
- [GitHub] [spark] ueshin commented on a diff in pull request #40747: [SPARK-43099][SQL] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/14 23:25:06 UTC, 0 replies.
- [GitHub] [spark] WweiL opened a new pull request, #40801: [SPARK-43147] fix flake8 lint for local check - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/15 00:10:47 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on pull request #40801: [SPARK-43147] fix flake8 lint for local check - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/15 00:11:18 UTC, 0 replies.
- [GitHub] [spark] rangadi closed pull request #40795: [TEMP] Scala m1 temp - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/15 00:21:56 UTC, 0 replies.
- [GitHub] [spark-docker] Yikun opened a new pull request, #33: Add Apache Spark 3.4.0 Dockerfiles - posted by "Yikun (via GitHub)" <gi...@apache.org> on 2023/04/15 01:06:09 UTC, 0 replies.
- [GitHub] [spark] wangyum closed pull request #40756: [SPARK-43107][SQL] Coalesce buckets in join applied on broadcast join stream side - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/15 01:07:15 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40797: [SPARK-43042] [SS] [Connect] Add table() API support for DataStreamReader - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/15 01:08:16 UTC, 1 replies.
- [GitHub] [spark] wangyum closed pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Only rewrite `EqualNullSafe` when the left side is non-nullable - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/15 01:10:21 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Only rewrite `EqualNullSafe` when the left side is non-nullable - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/15 01:10:47 UTC, 0 replies.
- [GitHub] [spark-docker] Yikun closed pull request #32: Test on 3.4.0-rc7 - posted by "Yikun (via GitHub)" <gi...@apache.org> on 2023/04/15 01:12:12 UTC, 0 replies.
- [GitHub] [spark] wangyum closed pull request #40555: [SPARK-42926][BUILD][SQL] Upgrade Parquet to 1.13.0 - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/15 01:15:46 UTC, 0 replies.
- [GitHub] [spark] wangyum closed pull request #40742: [SPARK-43095][SQL] Avoid Once strategy's idempotence is broken for batch: `Infer Filters` - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/15 01:24:54 UTC, 0 replies.
- [GitHub] [spark] wangyum closed pull request #40685: [SPARK-43050][SQL] Fix construct aggregate expressions by replacing grouping functions - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/15 01:27:34 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on pull request #40780: [SPARK-43125][CONNECT] Fix Connect Server Can't Handle Exception With Null Message - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/15 02:18:29 UTC, 1 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40802: [SPARK-43150][SQL] Remove workaround for PARQUET-2160 - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/15 02:20:48 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40802: [SPARK-43150][SQL] Remove workaround for PARQUET-2160 - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/15 02:21:09 UTC, 0 replies.
- [GitHub] [spark] panbingkun opened a new pull request, #40803: [MINOR][CONNECT][PYTHON] Typo fixes - posted by "panbingkun (via GitHub)" <gi...@apache.org> on 2023/04/15 02:52:01 UTC, 0 replies.
- [GitHub] [spark-docker] Yikun commented on pull request #33: [SPARK-43148] Add Apache Spark 3.4.0 Dockerfiles - posted by "Yikun (via GitHub)" <gi...@apache.org> on 2023/04/15 03:40:45 UTC, 2 replies.
- [GitHub] [spark] sunchao closed pull request #40802: [SPARK-43150][SQL] Remove workaround for PARQUET-2160 - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/15 04:50:47 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40802: [SPARK-43150][SQL] Remove workaround for PARQUET-2160 - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/15 04:51:00 UTC, 0 replies.
- [GitHub] [spark] yigress commented on pull request #40799: [SPARK-43145][SQL] Reduce ClassNotFound of hive storage handler table - posted by "yigress (via GitHub)" <gi...@apache.org> on 2023/04/15 05:43:31 UTC, 0 replies.
- [GitHub] [spark] gengliangwang opened a new pull request, #40804: [SPARK-43151][DOC] Update the prerequisites for generating Python API docs - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/15 07:14:12 UTC, 0 replies.
- [GitHub] [spark] ivoson commented on pull request #40610: [SPARK-42626][CONNECT] Add Destructive Iterator for SparkResult - posted by "ivoson (via GitHub)" <gi...@apache.org> on 2023/04/15 10:58:34 UTC, 2 replies.
- [GitHub] [spark] ivoson commented on a diff in pull request #40610: [SPARK-42626][CONNECT] Add Destructive Iterator for SparkResult - posted by "ivoson (via GitHub)" <gi...@apache.org> on 2023/04/15 10:58:58 UTC, 4 replies.
- [GitHub] [spark] wangyum opened a new pull request, #40805: [SPARK-40609][SQL] Unwrap cast in the join condition to unlock bucketed read - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/15 14:25:24 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on a diff in pull request #38047: [SPARK-40609][SQL] Casts types according to bucket info for Equality expressions - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/15 14:27:38 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on a diff in pull request #39950: [SPARK-42388][SQL] Avoid parquet footer reads twice in vectorized reader - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/15 17:56:01 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on a diff in pull request #40797: [SPARK-43042] [SS] [Connect] Add table() API support for DataStreamReader - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/15 19:09:38 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40797: [SPARK-43042] [SS] [Connect] Add table() API support for DataStreamReader - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/15 19:30:22 UTC, 0 replies.
- [GitHub] [spark] ueshin opened a new pull request, #40806: [SPARK-43153][CONNECT] Skip Spark execution when the dataframe is local - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/15 21:37:42 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #39187: [SPARK-41670] WIP builtin schema - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/16 00:19:34 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #38660: [SPARK-40199][SQL][WIP] Provide useful error when encountering null values in non-null fields - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/16 00:19:36 UTC, 0 replies.
- [GitHub] [spark] wangyum opened a new pull request, #40807: [SPARK-43139][SQL][DOCS] Fix incorrect column names in sql-ref-syntax-dml-insert-table.md - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/16 00:27:57 UTC, 0 replies.
- [GitHub] [spark] wangyum closed pull request #40803: [MINOR][CONNECT][PYTHON] Typo fixes - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/16 00:29:50 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on pull request #40803: [MINOR][CONNECT][PYTHON] Typo fixes - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/16 00:30:00 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on a diff in pull request #40790: [SPARK-43116][SQL] Fix Cast.forceNullable - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/16 00:37:50 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40804: [SPARK-43151][DOC] Update the prerequisites for generating Python API docs - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/16 02:13:29 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on a diff in pull request #40804: [SPARK-43151][DOC] Update the prerequisites for generating Python API docs - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/16 03:58:19 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #39950: [SPARK-42388][SQL] Avoid parquet footer reads twice in vectorized reader - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/16 04:10:07 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #39950: [SPARK-42388][SQL] Avoid parquet footer reads twice in vectorized reader - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/16 04:10:30 UTC, 1 replies.
- [GitHub] [spark] amaliujia commented on pull request #40804: [SPARK-43151][DOC] Update the prerequisites for generating Python API docs - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/16 06:05:09 UTC, 0 replies.
- [GitHub] [spark] yabola commented on pull request #39950: [SPARK-42388][SQL] Avoid parquet footer reads twice in vectorized reader - posted by "yabola (via GitHub)" <gi...@apache.org> on 2023/04/16 06:49:11 UTC, 1 replies.
- [GitHub] [spark] wangyum closed pull request #40807: [SPARK-43139][SQL][DOCS] Fix incorrect column names in sql-ref-syntax-dml-insert-table.md - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/16 06:58:27 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on pull request #40807: [SPARK-43139][SQL][DOCS] Fix incorrect column names in sql-ref-syntax-dml-insert-table.md - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/16 07:00:12 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40792: [SPARK-43141][BUILD] Ignore generated Java files in checkstyle - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/16 10:46:44 UTC, 0 replies.
- [GitHub] [spark] eejbyfeldt opened a new pull request, #40808: [SPARK-43138]: Fix ClassNotFoundException during migration - posted by "eejbyfeldt (via GitHub)" <gi...@apache.org> on 2023/04/16 12:50:31 UTC, 0 replies.
- [GitHub] [spark] ericsun95 commented on a diff in pull request #39550: [SPARK-42056][SQL][PROTOBUF] Add missing options for Protobuf functions - posted by "ericsun95 (via GitHub)" <gi...@apache.org> on 2023/04/16 15:06:53 UTC, 5 replies.
- [GitHub] [spark] SandishKumarHN commented on a diff in pull request #39550: [SPARK-42056][SQL][PROTOBUF] Add missing options for Protobuf functions - posted by "SandishKumarHN (via GitHub)" <gi...@apache.org> on 2023/04/16 16:33:21 UTC, 5 replies.
- [GitHub] [spark] tirumaleshn2458 opened a new pull request, #40809: Master clone3 - posted by "tirumaleshn2458 (via GitHub)" <gi...@apache.org> on 2023/04/16 16:51:14 UTC, 0 replies.
- [GitHub] [spark] SandishKumarHN commented on pull request #39550: [SPARK-42056][SQL][PROTOBUF] Add missing options for Protobuf functions - posted by "SandishKumarHN (via GitHub)" <gi...@apache.org> on 2023/04/16 17:56:17 UTC, 0 replies.
- [GitHub] [spark] jaceklaskowski commented on a diff in pull request #40800: [SPARK-43146][CONNECT][PYTHON] Implement eager evaluation for __repr__ and _repr_html_ - posted by "jaceklaskowski (via GitHub)" <gi...@apache.org> on 2023/04/16 18:44:54 UTC, 0 replies.
- [GitHub] [spark] jaceklaskowski commented on a diff in pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni) - posted by "jaceklaskowski (via GitHub)" <gi...@apache.org> on 2023/04/16 19:01:11 UTC, 0 replies.
- [GitHub] [spark] kori73 opened a new pull request, #40810: [SPARK-42317][SQL] Assign name to _LEGACY_ERROR_TEMP_2247: CANNOT_MERGE_SCHEMAS - posted by "kori73 (via GitHub)" <gi...@apache.org> on 2023/04/16 19:14:26 UTC, 0 replies.
- [GitHub] [spark] ueshin commented on a diff in pull request #40800: [SPARK-43146][CONNECT][PYTHON] Implement eager evaluation for __repr__ and _repr_html_ - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/16 19:20:57 UTC, 1 replies.
- [GitHub] [spark] jchen5 opened a new pull request, #40811: [SPARK-43098][SQL] Fix correctness COUNT bug when scalar subquery has group by clause - posted by "jchen5 (via GitHub)" <gi...@apache.org> on 2023/04/16 21:18:32 UTC, 0 replies.
- [GitHub] [spark] jchen5 commented on pull request #40811: [SPARK-43098][SQL] Fix correctness COUNT bug when scalar subquery has group by clause - posted by "jchen5 (via GitHub)" <gi...@apache.org> on 2023/04/16 21:20:33 UTC, 2 replies.
- [GitHub] [spark] pengzhon-db commented on a diff in pull request #40783: [SPARK-43129] Scala core API for streaming Spark Connect - posted by "pengzhon-db (via GitHub)" <gi...@apache.org> on 2023/04/16 21:21:26 UTC, 0 replies.
- [GitHub] [spark] robreeves opened a new pull request, #40812: [WIP][SPARK-43157][SQL] Clone InMemoryRelation cached plan to prevent cloned plan from referencing same objects - posted by "robreeves (via GitHub)" <gi...@apache.org> on 2023/04/16 22:45:02 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #39377: [SPARK-41867][SQL] Selective predicate should respect InMemoryRelation - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/17 00:18:42 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #39187: [SPARK-41670] WIP builtin schema - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/17 00:18:44 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #38660: [SPARK-40199][SQL][WIP] Provide useful error when encountering null values in non-null fields - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/17 00:18:45 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon opened a new pull request, #40813: [SPARK-42475][DOCS][FOLLOW-UP] Fix connect quickstart - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 00:28:04 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40813: [SPARK-42475][DOCS][FOLLOW-UP] Fix PySpark connect Quickstart binder link - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 00:49:26 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40813: [SPARK-42475][DOCS][FOLLOW-UP] Fix PySpark connect Quickstart binder link - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 00:49:42 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon opened a new pull request, #40814: [SPARK-43158][DOCS] Set upperbound of pandas version for Binder integration - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 01:08:30 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40814: [SPARK-43158][DOCS] Set upperbound of pandas version for Binder integration - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 01:11:51 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40814: [SPARK-43158][DOCS] Set upperbound of pandas version for Binder integration - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 01:12:04 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40804: [SPARK-43151][DOC] Update the prerequisites for generating Python API docs - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/17 01:17:56 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40800: [SPARK-43146][CONNECT][PYTHON] Implement eager evaluation for __repr__ and _repr_html_ - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/17 01:25:45 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on a diff in pull request #40805: [SPARK-40609][SQL] Unwrap cast in the join condition to unlock bucketed read - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/17 02:06:54 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on pull request #40805: [SPARK-40609][SQL] Unwrap cast in the join condition to unlock bucketed read - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/17 02:07:40 UTC, 2 replies.
- [GitHub] [spark] wangyum commented on pull request #40791: [SPARK-43140][SQL][TESTS] Override computeStats in `DummyLeafNode` - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/17 02:13:28 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on pull request #40741: [SPARK-41811][CONNECT][CLIENT] Support sql with dataframes and columns - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/17 02:17:28 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon opened a new pull request, #40815: [SPARK-42475][DOCS][FOLLOW-UP] Fix the version string with dev0 to work in Binder integration - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 02:36:54 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40815: [SPARK-42475][DOCS][FOLLOW-UP] Fix the version string with dev0 to work in Binder integration - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 02:39:47 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40815: [SPARK-42475][DOCS][FOLLOW-UP] Fix the version string with dev0 to work in Binder integration - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 02:40:10 UTC, 1 replies.
- [GitHub] [spark] itholic opened a new pull request, #40816: [SPARK-42078][PYTHON][FOLLOWUP] Add `CapturedException` to utils - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/17 02:55:54 UTC, 0 replies.
- [GitHub] [spark] itholic commented on pull request #40816: [SPARK-42078][PYTHON][FOLLOWUP] Add `CapturedException` to utils - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/17 02:56:28 UTC, 0 replies.
- [GitHub] [spark] lyy-pineapple commented on a diff in pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni) - posted by "lyy-pineapple (via GitHub)" <gi...@apache.org> on 2023/04/17 02:59:19 UTC, 6 replies.
- [GitHub] [spark] mingyangge-db commented on a diff in pull request #40816: [SPARK-42078][PYTHON][FOLLOWUP] Add `CapturedException` to utils - posted by "mingyangge-db (via GitHub)" <gi...@apache.org> on 2023/04/17 03:00:04 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40816: [SPARK-42078][PYTHON][FOLLOWUP] Add `CapturedException` to utils - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 03:02:25 UTC, 2 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40804: [SPARK-43151][DOC] Update the prerequisites for generating Python API docs - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 03:04:39 UTC, 1 replies.
- [GitHub] [spark] itholic commented on a diff in pull request #40816: [SPARK-42078][PYTHON][FOLLOWUP] Add `CapturedException` to utils - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/17 03:04:40 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40804: [SPARK-43151][DOC] Update the prerequisites for generating Python API docs - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 03:05:05 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40801: [SPARK-43147] fix flake8 lint for local check - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 03:09:00 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40801: [SPARK-43147] fix flake8 lint for local check - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 03:09:21 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40798: fix: name docker users - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 03:11:37 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40791: [SPARK-43140][SQL][TESTS] Override computeStats in `DummyLeafNode` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 03:12:59 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40791: [SPARK-43140][SQL][TESTS] Override computeStats in `DummyLeafNode` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 03:13:41 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40747: [SPARK-43099][SQL] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/17 03:14:19 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40747: [SPARK-43099][SQL] Use `getName` instead of `getCanonicalName` to get builder class name when registering udf to FunctionRegistry - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/17 03:15:06 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40809: Master clone3 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 03:16:02 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40610: [SPARK-42626][CONNECT] Add Destructive Iterator for SparkResult - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/17 04:04:55 UTC, 6 replies.
- [GitHub] [spark] zsxwing commented on a diff in pull request #40776: [SPARK-43123][SQL] Internal field metadata should not be leaked to catalogs - posted by "zsxwing (via GitHub)" <gi...@apache.org> on 2023/04/17 04:05:45 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40805: [SPARK-40609][SQL] Unwrap cast in the join condition to unlock bucketed read - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/17 04:07:34 UTC, 2 replies.
- [GitHub] [spark] liang3zy22 opened a new pull request, #40817: [SPARK-42845][SQL] Update the error class _LEGACY_ERROR_TEMP_2010 to MERGE_UNSUPPORTED_BY_WINDOW_FUNCTION - posted by "liang3zy22 (via GitHub)" <gi...@apache.org> on 2023/04/17 04:33:34 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40784: [SPARK-43130][SQL] Move InternalType to PhysicalDataType - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/17 04:35:52 UTC, 0 replies.
- [GitHub] [spark] liang3zy22 commented on pull request #40817: [SPARK-42845][SQL] Update the error class _LEGACY_ERROR_TEMP_2010 to MERGE_UNSUPPORTED_BY_WINDOW_FUNCTION - posted by "liang3zy22 (via GitHub)" <gi...@apache.org> on 2023/04/17 04:39:40 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40784: [SPARK-43130][SQL] Move InternalType to PhysicalDataType - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/17 04:45:59 UTC, 0 replies.
- [GitHub] [spark] gengliangwang commented on pull request #40804: [SPARK-43151][DOC] Update the prerequisites for generating Python API docs - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/17 05:25:26 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40776: [SPARK-43123][SQL] Internal field metadata should not be leaked to catalogs - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/17 05:34:43 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40804: [SPARK-43151][DOC] Update the prerequisites for generating Python API docs - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 06:00:21 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40797: [SPARK-43042] [SS] [Connect] Add table() API support for DataStreamReader - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 06:04:30 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40797: [SPARK-43042] [SS] [Connect] Add table() API support for DataStreamReader - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 06:05:37 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40816: [SPARK-42078][PYTHON][FOLLOWUP] Add `CapturedException` to utils - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 06:09:33 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40816: [SPARK-42078][PYTHON][FOLLOWUP] Add `CapturedException` to utils - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 06:09:49 UTC, 0 replies.
- [GitHub] [spark] itholic commented on a diff in pull request #40722: [SPARK-43076][PS][CONNECT] Removing the dependency on `grpcio` when remote session is not used. - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/17 06:22:18 UTC, 17 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40818: [MINOR][CONNECT][PYTHON] Add missing `super().__init__()` in expressions - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/17 06:31:20 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40800: [SPARK-43146][CONNECT][PYTHON] Implement eager evaluation for __repr__ and _repr_html_ - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 06:38:04 UTC, 1 replies.
- [GitHub] [spark] aimtsou opened a new pull request, #40819: [SPARK-43160][PYTHON]: Removed typing.io deprecated namespace - posted by "aimtsou (via GitHub)" <gi...@apache.org> on 2023/04/17 07:31:35 UTC, 0 replies.
- [GitHub] [spark] Yikf opened a new pull request, #40820: [MINOR] Improve spark.sql.files.minPartitionNum's doc - posted by "Yikf (via GitHub)" <gi...@apache.org> on 2023/04/17 07:36:05 UTC, 0 replies.
- [GitHub] [spark] Yikf commented on pull request #40820: [MINOR] Improve spark.sql.files.minPartitionNum's doc - posted by "Yikf (via GitHub)" <gi...@apache.org> on 2023/04/17 07:36:24 UTC, 1 replies.
- [GitHub] [spark] aimtsou commented on pull request #40819: [WIP][SPARK-43160][PYTHON]: Removed typing.io deprecated namespace - posted by "aimtsou (via GitHub)" <gi...@apache.org> on 2023/04/17 07:53:02 UTC, 0 replies.
- [GitHub] [spark] woj-i opened a new pull request, #40821: [SPARK-43152][spark-structured-streaming] Parametrisable output metadata path (_spark_metadata) - posted by "woj-i (via GitHub)" <gi...@apache.org> on 2023/04/17 08:03:03 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40818: [MINOR][CONNECT][PYTHON] Add missing `super().__init__()` in expressions - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/17 08:13:44 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40818: [MINOR][CONNECT][PYTHON] Add missing `super().__init__()` in expressions - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/17 08:18:05 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40793: [SPARK-43122][TESTS] Reenable TorchDistributorLocalUnitTestsOnConnect and TorchDistributorLocalUnitTestsIIOnConnect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/17 09:16:48 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40820: [MINOR] Improve spark.sql.files.minPartitionNum's doc - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 09:38:03 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40793: [SPARK-43122][TESTS] Reenable TorchDistributorLocalUnitTestsOnConnect and TorchDistributorLocalUnitTestsIIOnConnect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/17 09:39:48 UTC, 9 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40793: [SPARK-43122][CONNECT][PYTHON][ML][TESTS] Reenable TorchDistributorLocalUnitTestsOnConnect and TorchDistributorLocalUnitTestsIIOnConnect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/17 09:57:30 UTC, 3 replies.
- [GitHub] [spark] yaooqinn opened a new pull request, #40822: [WIP] cancel-in-progress - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/17 10:44:33 UTC, 0 replies.
- [GitHub] [spark] ulysses-you commented on a diff in pull request #40820: [MINOR] Improve spark.sql.files.minPartitionNum's doc - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/17 11:04:16 UTC, 0 replies.
- [GitHub] [spark] Yikf commented on a diff in pull request #40820: [MINOR] Improve spark.sql.files.minPartitionNum's doc - posted by "Yikf (via GitHub)" <gi...@apache.org> on 2023/04/17 11:12:02 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/17 13:17:30 UTC, 0 replies.
- [GitHub] [spark] hvanhovell closed pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/17 13:18:14 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X opened a new pull request, #40823: [SPARK-42552][SQL] Fix select without parentheses can't be parsed - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/17 13:24:36 UTC, 0 replies.
- [GitHub] [spark] clownxc commented on pull request #40707: [SPARK-43033][SQL] Avoid task retries due to AssertNotNull checks - posted by "clownxc (via GitHub)" <gi...@apache.org> on 2023/04/17 13:30:37 UTC, 5 replies.
- [GitHub] [spark] pan3793 commented on a diff in pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/17 13:41:14 UTC, 0 replies.
- [GitHub] [spark] wankunde opened a new pull request, #40824: [SPARK-32064][SQL] Support temporary table - posted by "wankunde (via GitHub)" <gi...@apache.org> on 2023/04/17 14:00:13 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40806: [SPARK-43153][CONNECT] Skip Spark execution when the dataframe is local - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/17 14:00:56 UTC, 1 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40823: [SPARK-42552][SQL] Fix select without parentheses can't be parsed - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/17 14:22:58 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40823: [SPARK-42552][SQL] Fix select without parentheses can't be parsed - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/17 14:24:21 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on a diff in pull request #40823: [SPARK-42552][SQL] Fix select without parentheses can't be parsed - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/17 14:24:28 UTC, 5 replies.
- [GitHub] [spark] wangyum commented on a diff in pull request #40823: [SPARK-42552][SQL] Fix select without parentheses can't be parsed - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/17 15:20:19 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on a diff in pull request #40823: [SPARK-42552][SQL] Fix select without parentheses can't be parsed - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/17 15:21:59 UTC, 3 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40738: [SPARK-42380][BUILD] Upgrade Apache Maven to 3.9.1 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/17 15:38:06 UTC, 3 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40783: [SPARK-43129] Scala core API for streaming Spark Connect - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/17 16:28:55 UTC, 14 replies.
- [GitHub] [spark-docker] xinrong-meng commented on pull request #33: [SPARK-43148] Add Apache Spark 3.4.0 Dockerfiles - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/17 17:08:19 UTC, 0 replies.
- [GitHub] [spark] ueshin commented on a diff in pull request #40806: [SPARK-43153][CONNECT] Skip Spark execution when the dataframe is local - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/17 17:25:21 UTC, 0 replies.
- [GitHub] [spark] amaliujia opened a new pull request, #40825: [SPARK-43165][SQL] Move canWrite to DataTypeUtils - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/17 17:39:05 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40825: [SPARK-43165][SQL] Move canWrite to DataTypeUtils - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/17 17:40:17 UTC, 3 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/17 17:49:18 UTC, 0 replies.
- [GitHub] [spark] pengzhon-db commented on a diff in pull request #40785: [SPARK-42960] Add await_termination() and exception() API for Streaming Query - posted by "pengzhon-db (via GitHub)" <gi...@apache.org> on 2023/04/17 18:33:27 UTC, 0 replies.
- [GitHub] [spark] amaliujia opened a new pull request, #40826: [SPARK-43168][SQL] Remove get PhysicalDataType method from Datatype class - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/17 18:45:10 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40826: [SPARK-43168][SQL] Remove get PhysicalDataType method from Datatype class - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/17 18:45:32 UTC, 0 replies.
- [GitHub] [spark] zhenlineo commented on a diff in pull request #40796: [WIP]Typed agg functions - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/17 19:11:53 UTC, 0 replies.
- [GitHub] [spark] kori73 commented on pull request #40810: [SPARK-42317][SQL] Assign name to _LEGACY_ERROR_TEMP_2247: CANNOT_MERGE_SCHEMAS - posted by "kori73 (via GitHub)" <gi...@apache.org> on 2023/04/17 19:42:05 UTC, 1 replies.
- [GitHub] [spark] MaxGekk opened a new pull request, #40827: [WIP][SPARK-42585][CONNECT] Streaming the `createDataFrame` implementation - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/17 20:03:58 UTC, 0 replies.
- [GitHub] [spark] ueshin opened a new pull request, #40828: [SPARK-42984][CONNECT][PYTHON][TESTS] Enable test_createDataFrame_with_single_data_type - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/17 20:15:37 UTC, 0 replies.
- [GitHub] [spark] robreeves commented on pull request #40812: [SPARK-43157][SQL] Clone InMemoryRelation cached plan to prevent cloned plan from referencing same objects - posted by "robreeves (via GitHub)" <gi...@apache.org> on 2023/04/17 20:16:43 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on a diff in pull request #40799: [SPARK-43145][SQL] Reduce ClassNotFound of hive storage handler table - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/17 20:42:36 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #39550: [SPARK-42056][SQL][PROTOBUF] Add missing options for Protobuf functions - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/17 20:51:17 UTC, 2 replies.
- [GitHub] [spark] yigress commented on a diff in pull request #40799: [SPARK-43145][SQL] Reduce ClassNotFound of hive storage handler table - posted by "yigress (via GitHub)" <gi...@apache.org> on 2023/04/17 21:24:50 UTC, 0 replies.
- [GitHub] [spark] ueshin opened a new pull request, #40829: [SPARK-41971][SQL] Use deduplicated field names when creating Arrow RecordBatch - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/17 23:33:41 UTC, 0 replies.
- [GitHub] [spark] JoshRosen commented on a diff in pull request #40690: [SPARK-43043][CORE] Improve the performance of MapOutputTracker.updateMapOutput - posted by "JoshRosen (via GitHub)" <gi...@apache.org> on 2023/04/17 23:43:48 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40729: [SPARK-43136][CONNECT] Adding groupByKey + mapGroup + coGroup functions - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/17 23:54:14 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40820: [MINOR] Improve spark.sql.files.minPartitionNum's doc - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/18 00:13:09 UTC, 1 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #39377: [SPARK-41867][SQL] Selective predicate should respect InMemoryRelation - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/18 00:17:48 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40735: [SPARK-43092][CONNECT] Clean up unimplemented `dropDuplicatesWithinWatermark` series functions from `Dataset` - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/18 00:18:34 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40736: [SPARK-43084] [SS] Add applyInPandasWithState support for spark connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 00:21:25 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40736: [SPARK-43084] [SS] Add applyInPandasWithState support for spark connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 00:21:51 UTC, 0 replies.
- [GitHub] [spark] zhenlineo commented on a diff in pull request #40729: [SPARK-43136][CONNECT] Adding groupByKey + mapGroup + coGroup functions - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/18 00:23:02 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40825: [SPARK-43165][SQL] Move canWrite to DataTypeUtils - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/18 00:27:13 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40793: [SPARK-43122][CONNECT][PYTHON][ML][TESTS] Reenable TorchDistributorLocalUnitTestsOnConnect and TorchDistributorLocalUnitTestsIIOnConnect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/18 00:36:44 UTC, 2 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40725: [SPARK-43082][Connect][PYTHON] Arrow-optimized Python UDFs in Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 00:51:01 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40782: [SPARK-42669][CONNECT] Short circuit local relation RPCs - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/18 01:00:46 UTC, 3 replies.
- [GitHub] [spark] jchen5 commented on a diff in pull request #40811: [SPARK-43098][SQL] Fix correctness COUNT bug when scalar subquery has group by clause - posted by "jchen5 (via GitHub)" <gi...@apache.org> on 2023/04/18 01:29:43 UTC, 4 replies.
- [GitHub] [spark] yaooqinn closed pull request #40774: [SPARK-41210][K8S] Port executor failure tracker from Spark on YARN to K8s - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/18 01:51:40 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40774: [SPARK-41210][K8S] Port executor failure tracker from Spark on YARN to K8s - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/18 01:52:11 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40774: [SPARK-41210][K8S] Port executor failure tracker from Spark on YARN to K8s - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/18 01:53:38 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40811: [SPARK-43098][SQL] Fix correctness COUNT bug when scalar subquery has group by clause - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/18 02:01:01 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40811: [SPARK-43098][SQL] Fix correctness COUNT bug when scalar subquery has group by clause - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/18 02:01:33 UTC, 1 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40729: [SPARK-43136][CONNECT] Adding groupByKey + mapGroup + coGroup functions - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/18 02:12:42 UTC, 9 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40828: [SPARK-42984][CONNECT][PYTHON][TESTS] Enable test_createDataFrame_with_single_data_type - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/18 02:14:15 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40828: [SPARK-42984][CONNECT][PYTHON][TESTS] Enable test_createDataFrame_with_single_data_type - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/18 02:15:05 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40826: [SPARK-43168][SQL] Remove get PhysicalDataType method from Datatype class - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/18 02:15:44 UTC, 0 replies.
- [GitHub] [spark] hvanhovell closed pull request #40826: [SPARK-43168][SQL] Remove get PhysicalDataType method from Datatype class - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/18 02:16:36 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40768: [SPARK-43119][SQL] Support Get SQL Keywords Dynamically Thru JDBC API and TVF - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/18 02:22:43 UTC, 4 replies.
- [GitHub] [spark-docker] Yikun closed pull request #33: [SPARK-43148] Add Apache Spark 3.4.0 Dockerfiles - posted by "Yikun (via GitHub)" <gi...@apache.org> on 2023/04/18 02:59:12 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40724: [SPARK-43081] [ML] [CONNECT] Add torch distributor data loader that loads data from spark partition data - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 03:16:58 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40793: [SPARK-43122][CONNECT][PYTHON][ML][TESTS] Reenable TorchDistributorLocalUnitTestsOnConnect and TorchDistributorLocalUnitTestsIIOnConnect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/18 03:23:16 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40830: [SPARK-43169][INFRA] Bump `previousSparkVersion` to 3.4.0 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/18 03:24:17 UTC, 0 replies.
- [GitHub] [spark] allisonwang-db commented on a diff in pull request #40780: [SPARK-43125][CONNECT] Fix Connect Server Can't Handle Exception With Null Message - posted by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/04/18 03:37:55 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40823: [SPARK-42552][SQL] Fix select without parentheses can't be parsed - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 03:52:30 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40808: [SPARK-43138][CORE][WIP]: Fix ClassNotFoundException during migration - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 04:06:34 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40766: [SPARK-43113][SQL] Evaluate stream-side variables when generating code for a bound condition - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 04:09:52 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40759: [SPARK-43111][PS][CONNECT][PYTHON] Merge nested `if` statements into single `if` statements - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 04:13:20 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40831: [SPARK-43171][K8S] Support custom Unix username in Pod - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/18 04:35:50 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40831: [SPARK-43171][K8S] Support custom Unix username in Pod - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/18 04:38:37 UTC, 4 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40824: [SPARK-32064][SQL] Support temporary table - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/18 05:03:28 UTC, 0 replies.
- [GitHub] [spark] allisonwang-db commented on pull request #40575: [SPARK-42945][CONNECT] Support PYSPARK_JVM_STACKTRACE_ENABLED in Spark Connect - posted by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/04/18 05:04:04 UTC, 2 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40830: [SPARK-43169][INFRA] Bump `previousSparkVersion` to 3.4.0 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/18 05:10:24 UTC, 2 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40832: [SPARK-42657][CONNECT][FOLLOWUP] Correct the API version in scaladoc - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/18 05:36:05 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40832: [SPARK-42657][CONNECT][FOLLOWUP] Correct the API version in scaladoc - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/18 05:36:21 UTC, 0 replies.
- [GitHub] [spark] beliefer opened a new pull request, #40833: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and positive. - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/18 05:44:00 UTC, 0 replies.
- [GitHub] [spark] beliefer commented on a diff in pull request #40789: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and equals to zero. - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/18 05:46:42 UTC, 0 replies.
- [GitHub] [spark] grundprinzip commented on a diff in pull request #40575: [SPARK-42945][CONNECT] Support PYSPARK_JVM_STACKTRACE_ENABLED in Spark Connect - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/18 06:00:09 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40437: [SPARK-41259][SQL] SparkSQLDriver use the spark result string that is consistent with that of `df.show` - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/18 06:11:15 UTC, 2 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40823: [SPARK-42552][SQL] Fix select without parentheses can't be parsed - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/18 06:17:26 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40768: [SPARK-43119][SQL] Support Get SQL Keywords Dynamically Thru JDBC API and TVF - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/18 06:20:50 UTC, 3 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40768: [SPARK-43119][SQL] Support Get SQL Keywords Dynamically Thru JDBC API and TVF - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/18 06:21:40 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40768: [SPARK-43119][SQL] Support Get SQL Keywords Dynamically Thru JDBC API and TVF - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/18 06:24:51 UTC, 5 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40832: [SPARK-42657][CONNECT][FOLLOWUP] Correct the API version in scaladoc - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 06:31:55 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40832: [SPARK-42657][CONNECT][FOLLOWUP] Correct the API version in scaladoc - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 06:34:25 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40575: [SPARK-42945][CONNECT] Support PYSPARK_JVM_STACKTRACE_ENABLED in Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 06:37:52 UTC, 1 replies.
- [GitHub] [spark] Hisoka-X commented on pull request #40823: [SPARK-42552][SQL] Fix select without parentheses can't be parsed - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/18 07:05:25 UTC, 1 replies.
- [GitHub] [spark] zwangsheng commented on pull request #40771: [SPARK-35723] set k8s pod container request, limit memory separately. - posted by "zwangsheng (via GitHub)" <gi...@apache.org> on 2023/04/18 07:15:45 UTC, 1 replies.
- [GitHub] [spark] Yikf commented on a diff in pull request #40437: [SPARK-41259][SQL] SparkSQLDriver use the spark result string that is consistent with that of `df.show` - posted by "Yikf (via GitHub)" <gi...@apache.org> on 2023/04/18 07:20:26 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40806: [SPARK-43153][CONNECT] Skip Spark execution when the dataframe is local - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/18 07:32:11 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40806: [SPARK-43153][CONNECT] Skip Spark execution when the dataframe is local - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/18 07:32:21 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40778: [WIP][PYTHON][TESTS] Test test_parity_torch_distributor with timeout - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/18 07:34:34 UTC, 0 replies.
- [GitHub] [spark] viirya commented on a diff in pull request #40811: [SPARK-43098][SQL] Fix correctness COUNT bug when scalar subquery has group by clause - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/18 07:48:39 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40308: [SPARK-42151][SQL] Align UPDATE assignments with table attributes - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/18 07:59:07 UTC, 1 replies.
- [GitHub] [spark] cloud-fan closed pull request #40308: [SPARK-42151][SQL] Align UPDATE assignments with table attributes - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/18 08:01:04 UTC, 0 replies.
- [GitHub] [spark] bogao007 opened a new pull request, #40834: [SPARK-43046] [SS] [Connect] Implemented Python API dropDuplicatesWithinWatermark for Spark Connect - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/18 08:15:23 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40823: [SPARK-42552][SQL] Fix select without parentheses can't be parsed - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/18 08:43:09 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40707: [SPARK-43033][SQL] Avoid task retries due to AssertNotNull checks - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/18 08:53:33 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40833: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and positive. - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/18 08:57:25 UTC, 1 replies.
- [GitHub] [spark] mridulm commented on pull request #40707: [SPARK-43033][SQL] Avoid task retries due to AssertNotNull checks - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/18 09:07:38 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40835: [SPARK-42552][SQL] Correct the two-stage parsing strategy of Antlr pa… - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/18 09:29:40 UTC, 0 replies.
- [GitHub] [spark] nija-at opened a new pull request, #40836: [SPARK-43172] expose host and token from spark connect client - posted by "nija-at (via GitHub)" <gi...@apache.org> on 2023/04/18 09:37:26 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40837: [SPARK-43173][CONNECT][TESTS] Ignore `write jdbc` when test `ClientE2ETestSuite` without `-Phive` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/18 09:41:53 UTC, 0 replies.
- [GitHub] [spark] wangyum opened a new pull request, #40838: [SPARK-43174][SQL] Fix SparkSQLCLIDriver completer - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/18 10:07:03 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on a diff in pull request #40838: [SPARK-43174][SQL] Fix SparkSQLCLIDriver completer - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/18 10:08:06 UTC, 6 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40837: [SPARK-43173][CONNECT][TESTS] Ignore `write jdbc` when test `ClientE2ETestSuite` without `-Phive` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/18 10:08:26 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40820: [MINOR][SQL][DOCS] Improve spark.sql.files.minPartitionNum's doc - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 10:17:28 UTC, 0 replies.
- [GitHub] [spark] jackylee-ch commented on a diff in pull request #40790: [SPARK-43116][SQL] Fix Cast.forceNullable - posted by "jackylee-ch (via GitHub)" <gi...@apache.org> on 2023/04/18 10:27:17 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/18 11:44:37 UTC, 5 replies.
- [GitHub] [spark] Hisoka-X closed pull request #40823: [SPARK-42552][SQL] Fix select without parentheses can't be parsed - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/18 12:48:19 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40836: [SPARK-43172] [CONNECT] Expose host and token from spark connect client - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 12:52:12 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40836: [SPARK-43172] [CONNECT] Expose host and token from spark connect client - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 12:52:34 UTC, 0 replies.
- [GitHub] [spark] vicennial commented on pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "vicennial (via GitHub)" <gi...@apache.org> on 2023/04/18 13:07:41 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40839: [SPARK-43176][CONNECT][PYTHON][TESTS] Deduplicate imports in Connect Tests - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/18 13:20:02 UTC, 0 replies.
- [GitHub] [spark] bersprockets commented on pull request #40766: [SPARK-43113][SQL] Evaluate stream-side variables when generating code for a bound condition - posted by "bersprockets (via GitHub)" <gi...@apache.org> on 2023/04/18 13:29:37 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #37630: [SPARK-40193][SQL] Merge subquery plans with different filters - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/18 13:32:19 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40830: [SPARK-43169][INFRA] Bump `previousSparkVersion` to 3.4.0 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/18 13:58:21 UTC, 1 replies.
- [GitHub] [spark] sunchao closed pull request #40788: [SPARK-42452][BUILD] Remove `hadoop-2` profile from Apache Spark 3.5.0 - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/18 15:58:17 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40839: [SPARK-43176][CONNECT][PYTHON][TESTS] Deduplicate imports in Connect Tests - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/18 16:39:57 UTC, 0 replies.
- [GitHub] [spark] srielau commented on pull request #40768: [SPARK-43119][SQL] Support Get SQL Keywords Dynamically Thru JDBC API and TVF - posted by "srielau (via GitHub)" <gi...@apache.org> on 2023/04/18 17:28:18 UTC, 0 replies.
- [GitHub] [spark] ueshin commented on a diff in pull request #39384: [SPARK-40307][PYTHON] Introduce Arrow-optimized Python UDFs - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/18 18:44:28 UTC, 0 replies.
- [GitHub] [spark] aokolnychyi commented on pull request #40707: [SPARK-43033][SQL] Avoid task retries due to AssertNotNull checks - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/18 18:45:18 UTC, 0 replies.
- [GitHub] [spark] kori73 commented on a diff in pull request #40632: [SPARK-42298][SQL] Assign name to _LEGACY_ERROR_TEMP_2132 - posted by "kori73 (via GitHub)" <gi...@apache.org> on 2023/04/18 19:34:23 UTC, 0 replies.
- [GitHub] [spark] samkenxstream opened a new pull request, #40840: SamKenX sync - posted by "samkenxstream (via GitHub)" <gi...@apache.org> on 2023/04/18 19:46:32 UTC, 0 replies.
- [GitHub] [spark] aokolnychyi commented on a diff in pull request #37879: [SPARK-40425][SQL] DROP TABLE does not need to do table lookup - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/18 19:49:14 UTC, 12 replies.
- [GitHub] [spark] viirya commented on a diff in pull request #37879: [SPARK-40425][SQL] DROP TABLE does not need to do table lookup - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/18 20:19:06 UTC, 2 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40785: [SPARK-42960] Add await_termination() and exception() API for Streaming Query - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/18 21:50:02 UTC, 3 replies.
- [GitHub] [spark] dependabot[bot] opened a new pull request, #40841: Bump jetty-server from 9.4.51.v20230217 to 10.0.14 - posted by "dependabot[bot] (via GitHub)" <gi...@apache.org> on 2023/04/18 22:52:01 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40841: Bump jetty-server from 9.4.51.v20230217 to 10.0.14 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 23:58:10 UTC, 0 replies.
- [GitHub] [spark] dependabot[bot] commented on pull request #40841: Bump jetty-server from 9.4.51.v20230217 to 10.0.14 - posted by "dependabot[bot] (via GitHub)" <gi...@apache.org> on 2023/04/18 23:58:13 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40840: SamKenX sync - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/18 23:58:17 UTC, 0 replies.
- [GitHub] [spark] xkrogen commented on pull request #35969: [SPARK-38651][SQL] Add `spark.sql.legacy.allowEmptySchemaWrite` - posted by "xkrogen (via GitHub)" <gi...@apache.org> on 2023/04/19 00:02:09 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40830: [SPARK-43169][INFRA] Bump `previousSparkVersion` to 3.4.0 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 00:28:43 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40830: [SPARK-43169][INFRA] Bump `previousSparkVersion` to 3.4.0 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 00:29:04 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40837: [SPARK-43173][CONNECT][TESTS] Ignore `write jdbc` when test `ClientE2ETestSuite` without `-Phive` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 00:29:43 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40837: [SPARK-43173][CONNECT][TESTS] Ignore `write jdbc` when test `ClientE2ETestSuite` without `-Phive` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 00:30:02 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #37879: [SPARK-40425][SQL] DROP TABLE does not need to do table lookup - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/19 00:32:56 UTC, 2 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40834: [SPARK-43046] [SS] [Connect] Implemented Python API dropDuplicatesWithinWatermark for Spark Connect - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/19 00:39:42 UTC, 5 replies.
- [GitHub] [spark] bogao007 commented on a diff in pull request #40834: [SPARK-43046] [SS] [Connect] Implemented Python API dropDuplicatesWithinWatermark for Spark Connect - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/19 00:51:03 UTC, 4 replies.
- [GitHub] [spark] bogao007 commented on pull request #40834: [SPARK-43046] [SS] [Connect] Implemented Python API dropDuplicatesWithinWatermark for Spark Connect - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/19 00:53:15 UTC, 1 replies.
- [GitHub] [spark] rangadi commented on pull request #40834: [SPARK-43046] [SS] [Connect] Implemented Python API dropDuplicatesWithinWatermark for Spark Connect - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/19 01:13:34 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40800: [SPARK-43146][CONNECT][PYTHON] Implement eager evaluation for __repr__ and _repr_html_ - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 01:23:27 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40800: [SPARK-43146][CONNECT][PYTHON] Implement eager evaluation for __repr__ and _repr_html_ - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 01:23:56 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40811: [SPARK-43098][SQL] Fix correctness COUNT bug when scalar subquery has group by clause - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/19 01:42:25 UTC, 0 replies.
- [GitHub] [spark] tirumaleshn2458 opened a new pull request, #40842: Master clone8 - posted by "tirumaleshn2458 (via GitHub)" <gi...@apache.org> on 2023/04/19 02:00:06 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40837: [SPARK-43173][CONNECT][TESTS] Ignore `write jdbc` when test `ClientE2ETestSuite` without `-Phive` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/19 02:06:11 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40766: [SPARK-43113][SQL] Evaluate stream-side variables when generating code for a bound condition - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/19 02:44:08 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40842: Master clone8 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 03:25:34 UTC, 0 replies.
- [GitHub] [spark] beliefer commented on pull request #40833: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and positive. - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/19 04:40:01 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40839: [SPARK-43176][CONNECT][PYTHON][TESTS] Deduplicate imports in Connect Tests - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/19 05:12:20 UTC, 0 replies.
- [GitHub] [spark] allisonwang-db commented on a diff in pull request #40811: [SPARK-43098][SQL] Fix correctness COUNT bug when scalar subquery has group by clause - posted by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/04/19 05:30:32 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40825: [SPARK-43165][SQL] Move canWrite to DataTypeUtils - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/19 05:35:49 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40825: [SPARK-43165][SQL] Move canWrite to DataTypeUtils - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/19 05:36:28 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on pull request #40768: [SPARK-43119][SQL] Support Get SQL Keywords Dynamically Thru JDBC API and TVF - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/19 06:04:51 UTC, 0 replies.
- [GitHub] [spark] otterc opened a new pull request, #40843: [SPARK-43179][SHUFFLE] Allowing apps to control whether their metadata gets saved in the db by the External Shuffle Service - posted by "otterc (via GitHub)" <gi...@apache.org> on 2023/04/19 06:05:31 UTC, 0 replies.
- [GitHub] [spark] otterc commented on pull request #40843: [SPARK-43179][SHUFFLE] Allowing apps to control whether their metadata gets saved in the db by the External Shuffle Service - posted by "otterc (via GitHub)" <gi...@apache.org> on 2023/04/19 06:10:07 UTC, 2 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40824: [SPARK-32064][SQL] Support temporary table - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/19 06:46:19 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on pull request #40824: [SPARK-32064][SQL] Support temporary table - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/19 06:54:12 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40835: [SPARK-42552][SQL] Correct the two-stage parsing strategy of antlr parser - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 08:22:03 UTC, 0 replies.
- [GitHub] [spark] panbingkun opened a new pull request, #40844: [SPARK-43181][SQL] spark-sql console should display the Spark WEB UI address - posted by "panbingkun (via GitHub)" <gi...@apache.org> on 2023/04/19 08:29:48 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40835: [SPARK-42552][SQL] Correct the two-stage parsing strategy of antlr parser - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/19 08:36:36 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40835: [SPARK-42552][SQL] Correct the two-stage parsing strategy of antlr parser - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/19 08:37:57 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40844: [SPARK-43181][SQL] spark-sql console should display the Spark WEB UI address - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/19 09:07:36 UTC, 0 replies.
- [GitHub] [spark] vicennial commented on pull request #40783: [SPARK-43129] Scala core API for streaming Spark Connect - posted by "vicennial (via GitHub)" <gi...@apache.org> on 2023/04/19 10:02:10 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR opened a new pull request, #40845: [SPARK-43183][SS] Introduce a new callback "onQueryIdle" to StreamingQueryListener - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/19 10:02:43 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR commented on pull request #40845: [SPARK-43183][SS] Introduce a new callback "onQueryIdle" to StreamingQueryListener - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/19 10:09:39 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40839: [SPARK-43176][CONNECT][PYTHON][TESTS] Deduplicate imports in Connect Tests - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 10:33:27 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40839: [SPARK-43176][CONNECT][PYTHON][TESTS] Deduplicate imports in Connect Tests - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 10:34:00 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40846: [SPARK-43184][YARN] Resume using enumeration to compare `NodeState.DECOMMISSIONING` state - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/19 11:04:26 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40798: SPARK-43166: name docker users - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 11:07:17 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40847: [SPARK-43185][BUILD] Inline `hadoop-client` related properties in `pom.xml` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/19 11:13:35 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40848: [SPARK-43186][SQL][HIVE] Remove workaround for FileSinkDesc - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 11:20:44 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40848: [SPARK-43186][SQL][HIVE] Remove workaround for FileSinkDesc - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 11:33:23 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40849: [SPARK-43187][TEST] Remove workaround for MiniKdc's BindException - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 11:58:56 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40850: [SPARK-43191][CORE] Replace reflection w/ direct calling for Hadoop CallerContext - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 12:48:29 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40850: [SPARK-43191][CORE] Replace reflection w/ direct calling for Hadoop CallerContext - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 12:51:00 UTC, 0 replies.
- [GitHub] [spark] cloud-fan opened a new pull request, #40851: [SPARK-43190][SQL] ListQuery.childOutput should be consistent with child output - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/19 12:52:33 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40851: [SPARK-43190][SQL] ListQuery.childOutput should be consistent with child output - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/19 12:55:11 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40851: [SPARK-43190][SQL] ListQuery.childOutput should be consistent with child output - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/19 12:55:12 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40852: [SPARK-43193][SS] Remove workaround for HADOOP-12074 - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 13:11:13 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40852: [SPARK-43193][SS] Remove workaround for HADOOP-12074 - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 13:12:43 UTC, 3 replies.
- [GitHub] [spark] nija-at opened a new pull request, #40853: [SPARK-43192] [CONNECT] Remove user agent validations - posted by "nija-at (via GitHub)" <gi...@apache.org> on 2023/04/19 13:16:11 UTC, 0 replies.
- [GitHub] [spark] grundprinzip commented on a diff in pull request #40853: [SPARK-43192] [CONNECT] Remove user agent validations - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/19 13:19:00 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40854: [SPARK-43195][CORE] Remove unnecessary serializable wrapper in HadoopFSUtils - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 13:31:58 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40854: [SPARK-43195][CORE] Remove unnecessary serializable wrapper in HadoopFSUtils - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 13:32:18 UTC, 1 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40855: [SPARK-43196][YARN] Replace reflection w/ direct calling for `ContainerLaunchContext#setTokensConf` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/19 13:52:45 UTC, 0 replies.
- [GitHub] [spark] peter-toth commented on pull request #29210: [WIP][SPARK-24497][SQL] Support recursive SQL query - posted by "peter-toth (via GitHub)" <gi...@apache.org> on 2023/04/19 13:54:39 UTC, 0 replies.
- [GitHub] [spark] allisonwang-db commented on a diff in pull request #40575: [SPARK-42945][CONNECT] Support PYSPARK_JVM_STACKTRACE_ENABLED in Spark Connect - posted by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/04/19 13:57:38 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40755: [SPARK-37829][SQL] Dataframe.joinWith outer-join should return a null value for unmatched row - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/19 14:05:19 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40755: [SPARK-37829][SQL] Dataframe.joinWith outer-join should return a null value for unmatched row - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/19 14:06:03 UTC, 1 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40855: [SPARK-43196][YARN] Replace reflection w/ direct calling for `ContainerLaunchContext#setTokensConf` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/19 14:08:18 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40833: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and positive. - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/19 14:11:58 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40854: [SPARK-43195][CORE] Remove unnecessary serializable wrapper in HadoopFSUtils - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/19 14:12:09 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40833: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and positive. - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/19 14:12:37 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40849: [SPARK-43187][TEST] Remove workaround for MiniKdc's BindException - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 14:58:03 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on pull request #40843: [SPARK-43179][SHUFFLE] Allowing apps to control whether their metadata gets saved in the db by the External Shuffle Service - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/19 15:04:54 UTC, 2 replies.
- [GitHub] [spark] tgravescs commented on pull request #40843: [SPARK-43179][SHUFFLE] Allowing apps to control whether their metadata gets saved in the db by the External Shuffle Service - posted by "tgravescs (via GitHub)" <gi...@apache.org> on 2023/04/19 15:59:10 UTC, 1 replies.
- [GitHub] [spark] pan3793 commented on a diff in pull request #40854: [SPARK-43195][CORE] Remove unnecessary serializable wrapper in HadoopFSUtils - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 16:06:25 UTC, 0 replies.
- [GitHub] [spark] peter-toth opened a new pull request, #40856: [SPARK-43199][SQL] Make InlineCTE idempotent - posted by "peter-toth (via GitHub)" <gi...@apache.org> on 2023/04/19 16:55:47 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #40848: [SPARK-43186][SQL][HIVE] Remove workaround for FileSinkDesc - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/19 17:25:54 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40848: [SPARK-43186][SQL][HIVE] Remove workaround for FileSinkDesc - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/19 17:26:11 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #40849: [SPARK-43187][TEST] Remove workaround for MiniKdc's BindException - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/19 17:26:50 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40849: [SPARK-43187][TEST] Remove workaround for MiniKdc's BindException - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/19 17:27:04 UTC, 0 replies.
- [GitHub] [spark] xinrong-meng commented on a diff in pull request #40729: [SPARK-43136][CONNECT] Adding groupByKey + mapGroup + coGroup functions - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/19 17:38:04 UTC, 0 replies.
- [GitHub] [spark] xinrong-meng commented on pull request #40729: [SPARK-43136][CONNECT] Adding groupByKey + mapGroup + coGroup functions - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/19 17:39:44 UTC, 1 replies.
- [GitHub] [spark] RyanBerti commented on pull request #40615: [SPARK-16484][SQL] Add support for Datasketches HllSketch - posted by "RyanBerti (via GitHub)" <gi...@apache.org> on 2023/04/19 17:49:11 UTC, 4 replies.
- [GitHub] [spark] ryan-johnson-databricks commented on a diff in pull request #37879: [SPARK-40425][SQL] DROP TABLE does not need to do table lookup - posted by "ryan-johnson-databricks (via GitHub)" <gi...@apache.org> on 2023/04/19 18:06:51 UTC, 0 replies.
- [GitHub] [spark] zhenlineo commented on a diff in pull request #40762: [SPARK-42953][Connect][Followup] Fix maven test build for Scala client UDF tests - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/19 18:13:17 UTC, 2 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40857: [SPARK-43200][DOCS] Remove Hadoop 2 reference in docs - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 18:15:40 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40857: [SPARK-43200][DOCS] Remove Hadoop 2 reference in docs - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 18:16:15 UTC, 0 replies.
- [GitHub] [spark] viirya commented on a diff in pull request #40845: [SPARK-43183][SS] Introduce a new callback "onQueryIdle" to StreamingQueryListener - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/19 18:28:33 UTC, 1 replies.
- [GitHub] [spark] kings129 opened a new pull request, #40858: [SPARK-37829][SQL] Dataframe.joinWith outer-join should return a null… - posted by "kings129 (via GitHub)" <gi...@apache.org> on 2023/04/19 19:14:00 UTC, 0 replies.
- [GitHub] [spark] zhouyejoe commented on a diff in pull request #40843: [SPARK-43179][SHUFFLE] Allowing apps to control whether their metadata gets saved in the db by the External Shuffle Service - posted by "zhouyejoe (via GitHub)" <gi...@apache.org> on 2023/04/19 19:17:32 UTC, 0 replies.
- [GitHub] [spark] otterc commented on a diff in pull request #40843: [SPARK-43179][SHUFFLE] Allowing apps to control whether their metadata gets saved in the db by the External Shuffle Service - posted by "otterc (via GitHub)" <gi...@apache.org> on 2023/04/19 19:22:03 UTC, 4 replies.
- [GitHub] [spark] viirya commented on pull request #40858: [SPARK-37829][SQL] Dataframe.joinWith outer-join should return a null… - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/19 19:33:38 UTC, 0 replies.
- [GitHub] [spark] khalidmammadov opened a new pull request, #40859: [FOLLOW-UP][SPARK-42437][PYTHON][CONNECT] Storage level proto converters - posted by "khalidmammadov (via GitHub)" <gi...@apache.org> on 2023/04/19 20:05:58 UTC, 0 replies.
- [GitHub] [spark] khalidmammadov commented on pull request #40859: [FOLLOW-UP][SPARK-42437][PYTHON][CONNECT] Storage level proto converters - posted by "khalidmammadov (via GitHub)" <gi...@apache.org> on 2023/04/19 20:06:48 UTC, 0 replies.
- [GitHub] [spark] ueshin commented on pull request #40859: [SPARK-42437][CONNECT][PYTHON][FOLLOW-UP] Storage level proto converters - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/19 20:30:25 UTC, 0 replies.
- [GitHub] [spark] ueshin closed pull request #40859: [SPARK-42437][CONNECT][PYTHON][FOLLOW-UP] Storage level proto converters - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/19 20:31:05 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40860: [SPARK-43197][YARN] Replace reflection w/ direct calling for YARN Resource API - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/19 20:35:56 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR commented on a diff in pull request #40845: [SPARK-43183][SS] Introduce a new callback "onQueryIdle" to StreamingQueryListener - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/19 20:43:36 UTC, 5 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40845: [SPARK-43183][SS] Introduce a new callback "onQueryIdle" to StreamingQueryListener - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/19 21:01:12 UTC, 1 replies.
- [GitHub] [spark] xinrong-meng commented on a diff in pull request #39384: [SPARK-40307][PYTHON] Introduce Arrow-optimized Python UDFs - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/19 21:47:24 UTC, 0 replies.
- [GitHub] [spark] WweiL opened a new pull request, #40861: [SPARK-43032] streaming query manager - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/19 22:14:37 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40861: [SPARK-43032] streaming query manager - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/19 22:20:07 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40853: [SPARK-43192] [CONNECT] Remove user agent validations - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/19 22:25:32 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on a diff in pull request #40861: [SPARK-43032] streaming query manager - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/19 23:14:26 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40575: [SPARK-42945][CONNECT] Support PYSPARK_JVM_STACKTRACE_ENABLED in Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 23:29:10 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40575: [SPARK-42945][CONNECT] Support PYSPARK_JVM_STACKTRACE_ENABLED in Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 23:29:33 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 23:31:59 UTC, 3 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40783: [SPARK-43129] Scala core API for streaming Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 23:33:28 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40783: [SPARK-43129] Scala core API for streaming Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 23:33:45 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40785: [SPARK-42960] Add await_termination() and exception() API for Streaming Query - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 23:33:54 UTC, 2 replies.
- [GitHub] [spark] zhenlineo commented on pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/19 23:37:07 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40830: [SPARK-43169][INFRA] Bump `previousSparkVersion` to 3.4.0 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/19 23:37:39 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40862: [SPARK-43169][INFRA][FOLLOWUP] Add more memory for mima check - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 00:04:19 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40862: [SPARK-43169][INFRA][FOLLOWUP] Add more memory for mima check - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/20 00:19:24 UTC, 4 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40862: [SPARK-43169][INFRA][FOLLOWUP] Add more memory for mima check - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 00:20:41 UTC, 16 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40628: [SPARK-42999][Connect] Dataset#foreach, foreachPartition - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/20 00:21:27 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40785: [SPARK-42960] Add await_termination() and exception() API for Streaming Query - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/20 00:24:13 UTC, 3 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40847: [SPARK-43185][BUILD] Inline `hadoop-client` related properties in `pom.xml` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/20 00:35:27 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40863: [SPARK-43207][CONNECT] Add helper functions for extract value from literal expression - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/20 00:35:41 UTC, 0 replies.
- [GitHub] [spark] xinrong-meng opened a new pull request, #40864: Nested DataType compatibility in Arrow-optimized Python UDF - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/20 00:38:58 UTC, 0 replies.
- [GitHub] [spark] ahshahid commented on pull request #40765: [WIP][SPARK-43112]. Spark may use a column other than the actual specified partitioning column for partitioning, for Hive format tables - posted by "ahshahid (via GitHub)" <gi...@apache.org> on 2023/04/20 01:15:13 UTC, 1 replies.
- [GitHub] [spark] ahshahid closed pull request #40765: [WIP][SPARK-43112]. Spark may use a column other than the actual specified partitioning column for partitioning, for Hive format tables - posted by "ahshahid (via GitHub)" <gi...@apache.org> on 2023/04/20 01:15:18 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/20 02:22:10 UTC, 0 replies.
- [GitHub] [spark] beliefer closed pull request #40789: [SPARK-43137][SQL] Improve ArrayInsert if the position is foldable and equals to zero. - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/20 02:53:13 UTC, 0 replies.
- [GitHub] [spark] Yikun commented on pull request #40831: [SPARK-43171][K8S] Support custom Unix username in Pod - posted by "Yikun (via GitHub)" <gi...@apache.org> on 2023/04/20 04:01:17 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #40854: [SPARK-43195][CORE] Remove unnecessary serializable wrapper in HadoopFSUtils - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 04:02:58 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40854: [SPARK-43195][CORE] Remove unnecessary serializable wrapper in HadoopFSUtils - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 04:03:13 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #40857: [SPARK-43200][DOCS] Remove Hadoop 2 reference in docs - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 04:07:53 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40857: [SPARK-43200][DOCS] Remove Hadoop 2 reference in docs - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 04:08:06 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #40850: [SPARK-43191][CORE] Replace reflection w/ direct calling for Hadoop CallerContext - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 04:08:42 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40850: [SPARK-43191][CORE] Replace reflection w/ direct calling for Hadoop CallerContext - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 04:08:56 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #40855: [SPARK-43196][YARN] Replace reflection w/ direct calling for `ContainerLaunchContext#setTokensConf` - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 04:09:28 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40855: [SPARK-43196][YARN] Replace reflection w/ direct calling for `ContainerLaunchContext#setTokensConf` - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 04:09:39 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40860: [SPARK-43197][YARN] Replace reflection w/ direct calling for YARN Resource API - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/20 04:10:54 UTC, 1 replies.
- [GitHub] [spark] Hisoka-X opened a new pull request, #40865: [SPARK-43156][SQL] Fix `COUNT(*) is null` bug in correlated scalar subquery - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/20 04:54:12 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40863: [SPARK-43207][CONNECT] Add helper functions to extract value from literal expression - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/20 05:22:25 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40575: [SPARK-42945][CONNECT] Support PYSPARK_JVM_STACKTRACE_ENABLED in Spark Connect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/20 05:24:46 UTC, 1 replies.
- [GitHub] [spark] itholic opened a new pull request, #40866: [SPARK-43178][CONNECT][PYTHON] Migrate UDF errors into PySpark error framework - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/20 05:30:49 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40867: [SPARK-43208][HIVE] IsolatedClassLoader should close barrier class InputStream after reading - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/20 05:34:37 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40867: [SPARK-43208][HIVE] IsolatedClassLoader should close barrier class InputStream after reading - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 05:35:44 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on a diff in pull request #40867: [SPARK-43208][HIVE] IsolatedClassLoader should close barrier class InputStream after reading - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/20 05:36:03 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40867: [SPARK-43208][HIVE] IsolatedClassLoader should close barrier class InputStream after reading - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/20 05:37:07 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #39950: [SPARK-42388][SQL] Avoid parquet footer reads twice in vectorized reader - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/20 05:43:54 UTC, 1 replies.
- [GitHub] [spark] grundprinzip commented on pull request #40853: [SPARK-43192] [CONNECT] Remove user agent validations - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/20 05:56:56 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40862: [SPARK-43169][INFRA][FOLLOWUP] Add more memory for mima check - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/20 06:05:07 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40862: [SPARK-43169][INFRA][FOLLOWUP] Add more memory for mima check - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/20 06:24:15 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40868: [SPARK-43210][CONNECT][PYTHON] Introduce `PySparkAssertionError` - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/20 06:26:01 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40869: [SPARK-43209][CONNECT][PYTHON] Migrate Expression errors into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/20 06:42:51 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40870: [SPARK-43211][HIVE] Remove Hadoop2 support in IsolatedClientLoader - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/20 07:02:39 UTC, 0 replies.
- [GitHub] [spark] viirya commented on pull request #40851: [SPARK-43190][SQL] ListQuery.childOutput should be consistent with child output - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/20 07:02:51 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40847: [SPARK-43185][BUILD] Inline `hadoop-client` related properties in `pom.xml` - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/20 07:09:10 UTC, 2 replies.
- [GitHub] [spark] pan3793 commented on pull request #40870: [SPARK-43211][HIVE] Remove Hadoop2 support in IsolatedClientLoader - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/20 07:10:28 UTC, 1 replies.
- [GitHub] [spark] cloud-fan closed pull request #40851: [SPARK-43190][SQL] ListQuery.childOutput should be consistent with child output - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/20 07:13:18 UTC, 0 replies.
- [GitHub] [spark] peter-toth commented on pull request #40856: [SPARK-43199][SQL] Make InlineCTE idempotent - posted by "peter-toth (via GitHub)" <gi...@apache.org> on 2023/04/20 07:14:25 UTC, 2 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40847: [SPARK-43185][BUILD] Inline `hadoop-client` related properties in `pom.xml` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 07:23:00 UTC, 6 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40827: [WIP][SPARK-42585][CONNECT] Streaming of local relations - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/20 07:40:16 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40725: [SPARK-43082][Connect][PYTHON] Arrow-optimized Python UDFs in Spark Connect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/20 08:04:34 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40863: [SPARK-43207][CONNECT] Add helper functions to extract value from literal expression - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/20 08:06:50 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on a diff in pull request #40843: [SPARK-43179][SHUFFLE] Allowing apps to control whether their metadata gets saved in the db by the External Shuffle Service - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/20 08:15:55 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR closed pull request #40845: [SPARK-43183][SS] Introduce a new callback "onQueryIdle" to StreamingQueryListener - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/20 08:21:11 UTC, 0 replies.
- [GitHub] [spark] nija-at commented on a diff in pull request #40853: [SPARK-43192] [CONNECT] Remove user agent validations - posted by "nija-at (via GitHub)" <gi...@apache.org> on 2023/04/20 08:26:06 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on pull request #40862: [SPARK-43169][INFRA][FOLLOWUP] Add more memory for mima check - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/20 08:33:54 UTC, 1 replies.
- [GitHub] [spark] cloud-fan opened a new pull request, #40871: Revert [SPARK-39203][SQL] Rewrite table location to absolute URI based on database URI - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/20 08:44:52 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40871: Revert [SPARK-39203][SQL] Rewrite table location to absolute URI based on database URI - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/20 08:45:01 UTC, 0 replies.
- [GitHub] [spark] peter-toth commented on pull request #40744: [WIP][SPARK-24497][SQL] Support recursive SQL - posted by "peter-toth (via GitHub)" <gi...@apache.org> on 2023/04/20 08:57:34 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40872: [MINOR][DOCS] Fix the doc of parameter `num` in `DataFrame.offset` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/20 09:04:30 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40872: [MINOR][DOCS] Fix the doc of parameter `num` in `DataFrame.offset` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/20 09:09:56 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40873: [SPARK-43213][PYTHON] Add `DataFrame.offset` to PySpark - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/20 09:13:00 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40874: [WIP][ML][PYTHON][CONNECT][TESTS] Reduce the memory requirement in torch-related tests - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/20 09:18:43 UTC, 0 replies.
- [GitHub] [spark] rshkv commented on pull request #40794: [SPARK-43142] Fix DSL expressions on attributes with special characters - posted by "rshkv (via GitHub)" <gi...@apache.org> on 2023/04/20 09:23:53 UTC, 2 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40846: [SPARK-43184][YARN] Resume using enumeration to compare `NodeState.DECOMMISSIONING` state - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/20 09:37:32 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40846: [SPARK-43184][YARN] Resume using enumeration to compare `NodeState.DECOMMISSIONING` state - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/20 09:37:49 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40794: [SPARK-43142] Fix DSL expressions on attributes with special characters - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/20 09:41:38 UTC, 2 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40829: [SPARK-41971][SQL] Use deduplicated field names when creating Arrow RecordBatch - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/20 09:42:56 UTC, 0 replies.
- [GitHub] [spark] cfmcgrady opened a new pull request, #40875: [SPARK-43214][SQL] Post driver-side metrics for LocalTableScanExec/CommandResultExec - posted by "cfmcgrady (via GitHub)" <gi...@apache.org> on 2023/04/20 10:42:27 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on pull request #40865: [SPARK-43156][SQL] Fix `COUNT(*) is null` bug in correlated scalar subquery - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/20 11:02:04 UTC, 5 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40846: [SPARK-43184][YARN] Resume using enumeration to compare `NodeState.DECOMMISSIONING` state - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 11:41:20 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40876: [SPARK-43215][YARN] Remove `isYarnResourceTypesAvailable` from `ResourceRequestHelper` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 11:44:42 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40876: [SPARK-43215][YARN] Remove `isYarnResourceTypesAvailable` from `ResourceRequestHelper` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 11:46:50 UTC, 2 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40865: [SPARK-43156][SQL] Fix `COUNT(*) is null` bug in correlated scalar subquery - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/20 12:33:25 UTC, 6 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40865: [SPARK-43156][SQL] Fix `COUNT(*) is null` bug in correlated scalar subquery - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/20 12:34:24 UTC, 16 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40877: [SPARK-31733][YARN][TESTS] Make `specify a more specific type for the application` in `ClientSuite` pass in Hadoop 3 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 12:43:37 UTC, 0 replies.
- [GitHub] [spark] bjornjorgensen opened a new pull request, #40878: [SPARK-42780][BUILD] Upgrade `Tink` to 1.9.0 - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/20 12:44:54 UTC, 0 replies.
- [GitHub] [spark] bjornjorgensen commented on pull request #40878: [SPARK-42780][BUILD] Upgrade `Tink` to 1.9.0 - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/20 12:46:44 UTC, 1 replies.
- [GitHub] [spark] bjornjorgensen commented on pull request #40408: [SPARK-42780][BUILD] Upgrade `Tink` to 1.8.0 - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/20 12:47:56 UTC, 0 replies.
- [GitHub] [spark] bjornjorgensen closed pull request #40408: [SPARK-42780][BUILD] Upgrade `Tink` to 1.8.0 - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/20 12:48:31 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40878: [SPARK-42780][BUILD] Upgrade `Tink` to 1.9.0 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 12:51:07 UTC, 2 replies.
- [GitHub] [spark] johanl-db opened a new pull request, #40879: [SPARK-43217] Correctly recurse in nested maps/arrays in findNestedField - posted by "johanl-db (via GitHub)" <gi...@apache.org> on 2023/04/20 12:52:15 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on a diff in pull request #40865: [SPARK-43156][SQL] Fix `COUNT(*) is null` bug in correlated scalar subquery - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/20 12:52:42 UTC, 19 replies.
- [GitHub] [spark] jaceklaskowski commented on a diff in pull request #40821: [SPARK-43152][spark-structured-streaming] Parametrisable output metadata path (_spark_metadata) - posted by "jaceklaskowski (via GitHub)" <gi...@apache.org> on 2023/04/20 13:01:32 UTC, 0 replies.
- [GitHub] [spark] tgravescs commented on a diff in pull request #40843: [SPARK-43179][SHUFFLE] Allowing apps to control whether their metadata gets saved in the db by the External Shuffle Service - posted by "tgravescs (via GitHub)" <gi...@apache.org> on 2023/04/20 13:39:45 UTC, 1 replies.
- [GitHub] [spark] juliuszsompolski commented on a diff in pull request #40160: [SPARK-41725][CONNECT] Eager Execution of DF.sql() - posted by "juliuszsompolski (via GitHub)" <gi...@apache.org> on 2023/04/20 13:40:30 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40856: [SPARK-43199][SQL] Make InlineCTE idempotent - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/20 14:01:37 UTC, 1 replies.
- [GitHub] [spark] fryz commented on pull request #40381: [SPARK-42761][BUILD][K8S] Upgrade `kubernetes-client` to 6.5.0 - posted by "fryz (via GitHub)" <gi...@apache.org> on 2023/04/20 14:15:23 UTC, 0 replies.
- [GitHub] [spark] bjornjorgensen commented on pull request #40381: [SPARK-42761][BUILD][K8S] Upgrade `kubernetes-client` to 6.5.0 - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/20 15:06:20 UTC, 1 replies.
- [GitHub] [spark] itholic opened a new pull request, #40880: [SPARK-43212][SS][PYTHON] Migrate Structured Streaming errors into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/20 15:09:44 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40876: [SPARK-43215][YARN] Remove `isYarnResourceTypesAvailable` from `ResourceRequestHelper` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 15:47:20 UTC, 0 replies.
- [GitHub] [spark] LuciferYang closed pull request #40876: [SPARK-43215][YARN] Remove `isYarnResourceTypesAvailable` from `ResourceRequestHelper` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 15:47:21 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40860: [SPARK-43197][YARN] Replace reflection w/ direct calling for YARN Resource API - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 15:48:27 UTC, 0 replies.
- [GitHub] [spark] bersprockets opened a new pull request, #40881: [SPARK-43113][SQL][FOLLOWUP] Add comment about copying steam-side variables - posted by "bersprockets (via GitHub)" <gi...@apache.org> on 2023/04/20 15:48:58 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40860: [SPARK-43202][YARN] Replace reflection w/ direct calling for YARN Resource API - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/20 15:50:00 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40860: [SPARK-43202][YARN] Replace reflection w/ direct calling for YARN Resource API - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 15:51:04 UTC, 0 replies.
- [GitHub] [spark] mkaravel commented on pull request #40615: [SPARK-16484][SQL] Add support for Datasketches HllSketch - posted by "mkaravel (via GitHub)" <gi...@apache.org> on 2023/04/20 16:01:16 UTC, 3 replies.
- [GitHub] [spark] gatorsmile commented on pull request #40615: [SPARK-16484][SQL] Add support for Datasketches HllSketch - posted by "gatorsmile (via GitHub)" <gi...@apache.org> on 2023/04/20 16:22:56 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40847: [SPARK-43185][BUILD] Inline `hadoop-client` related properties in `pom.xml` - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 16:25:10 UTC, 3 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40882: [SPARK-43222][CORE][SQL][K8S][TESTS] Remove `isHadoop3` check from tests - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 16:31:30 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40882: [SPARK-43222][CORE][SQL][K8S][TESTS] Remove `isHadoop3` check from tests - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/20 16:32:55 UTC, 3 replies.
- [GitHub] [spark] mkaravel commented on a diff in pull request #40615: [SPARK-16484][SQL] Add support for Datasketches HllSketch - posted by "mkaravel (via GitHub)" <gi...@apache.org> on 2023/04/20 16:51:03 UTC, 7 replies.
- [GitHub] [spark] sunchao closed pull request #40867: [SPARK-43208][SQL][HIVE] IsolatedClassLoader should close barrier class InputStream after reading - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 16:53:55 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40867: [SPARK-43208][SQL][HIVE] IsolatedClassLoader should close barrier class InputStream after reading - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 16:54:33 UTC, 1 replies.
- [GitHub] [spark] mridulm commented on a diff in pull request #40850: [SPARK-43191][CORE] Replace reflection w/ direct calling for Hadoop CallerContext - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/20 16:56:33 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40867: [SPARK-43208][SQL][HIVE] IsolatedClassLoader should close barrier class InputStream after reading - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/20 17:17:01 UTC, 0 replies.
- [GitHub] [spark] yorksity opened a new pull request, #40883: [SPARK-43221][CORE] the BlockManager with the persisted block is pref… - posted by "yorksity (via GitHub)" <gi...@apache.org> on 2023/04/20 17:20:32 UTC, 1 replies.
- [GitHub] [spark] RyanBerti commented on a diff in pull request #40615: [SPARK-16484][SQL] Add support for Datasketches HllSketch - posted by "RyanBerti (via GitHub)" <gi...@apache.org> on 2023/04/20 17:24:46 UTC, 14 replies.
- [GitHub] [spark] xkrogen commented on pull request #40847: [SPARK-43185][BUILD] Inline `hadoop-client` related properties in `pom.xml` - posted by "xkrogen (via GitHub)" <gi...@apache.org> on 2023/04/20 17:27:06 UTC, 0 replies.
- [GitHub] [spark] ueshin commented on pull request #40829: [SPARK-41971][SQL] Use deduplicated field names when creating Arrow RecordBatch - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/20 17:31:35 UTC, 0 replies.
- [GitHub] [spark] jaceklaskowski commented on a diff in pull request #40730: [SPARK-43086][CORE] Support bin pack task scheduling on executors - posted by "jaceklaskowski (via GitHub)" <gi...@apache.org> on 2023/04/20 17:48:58 UTC, 0 replies.
- [GitHub] [spark] xinrong-meng commented on a diff in pull request #40725: [SPARK-43082][CONNECT][PYTHON] Arrow-optimized Python UDFs in Spark Connect - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/20 17:50:39 UTC, 5 replies.
- [GitHub] [spark] zsxwing commented on pull request #40852: [SPARK-43193][SS] Remove workaround for HADOOP-12074 - posted by "zsxwing (via GitHub)" <gi...@apache.org> on 2023/04/20 17:53:19 UTC, 1 replies.
- [GitHub] [spark] jaceklaskowski commented on a diff in pull request #39719: [SPARK-42169] [SQL] Implement code generation for to_csv function (StructsToCsv) - posted by "jaceklaskowski (via GitHub)" <gi...@apache.org> on 2023/04/20 17:54:44 UTC, 0 replies.
- [GitHub] [spark] srielau opened a new pull request, #40884: [SPARK-43205] IDENTIFIER() Steel thread - posted by "srielau (via GitHub)" <gi...@apache.org> on 2023/04/20 17:55:50 UTC, 0 replies.
- [GitHub] [spark] ryan-johnson-databricks opened a new pull request, #40885: [WIP][SPARK-XXXXX] Define extractors for file-constant metadata - posted by "ryan-johnson-databricks (via GitHub)" <gi...@apache.org> on 2023/04/20 17:58:31 UTC, 0 replies.
- [GitHub] [spark] ueshin commented on a diff in pull request #40725: [SPARK-43082][CONNECT][PYTHON] Arrow-optimized Python UDFs in Spark Connect - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/20 17:59:35 UTC, 1 replies.
- [GitHub] [spark] bjornjorgensen commented on a diff in pull request #40878: [SPARK-42780][BUILD] Upgrade `Tink` to 1.9.0 - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/20 18:23:52 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40827: [WIP][SPARK-42585][CONNECT] Streaming of local relations - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/20 18:29:07 UTC, 0 replies.
- [GitHub] [spark] jaceklaskowski commented on a diff in pull request #40864: [WIP] Nested DataType compatibility in Arrow-optimized Python UDF and Pandas UDF - posted by "jaceklaskowski (via GitHub)" <gi...@apache.org> on 2023/04/20 19:12:57 UTC, 0 replies.
- [GitHub] [spark] holdenk commented on pull request #38852: [SPARK-41341][CORE] Wait shuffle fetch to finish when decommission executor - posted by "holdenk (via GitHub)" <gi...@apache.org> on 2023/04/20 19:32:54 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #40860: [SPARK-43202][YARN] Replace reflection w/ direct calling for YARN Resource API - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 20:01:44 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40860: [SPARK-43202][YARN] Replace reflection w/ direct calling for YARN Resource API - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 20:02:00 UTC, 0 replies.
- [GitHub] [spark] rshkv commented on a diff in pull request #40794: [SPARK-43142] Fix DSL expressions on attributes with special characters - posted by "rshkv (via GitHub)" <gi...@apache.org> on 2023/04/20 20:03:02 UTC, 5 replies.
- [GitHub] [spark] sunchao closed pull request #40870: [SPARK-43211][HIVE] Remove Hadoop2 support in IsolatedClientLoader - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 20:10:26 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40870: [SPARK-43211][HIVE] Remove Hadoop2 support in IsolatedClientLoader - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/20 20:10:42 UTC, 0 replies.
- [GitHub] [spark] WweiL opened a new pull request, #40887: Spark 43144 scala table api - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/20 21:08:53 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40887: [SPARK-43144] Scala Client DataStreamReader table() API - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/20 21:22:49 UTC, 1 replies.
- [GitHub] [spark] WweiL commented on a diff in pull request #40887: [SPARK-43144] Scala Client DataStreamReader table() API - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/20 21:27:50 UTC, 2 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40887: [SPARK-43144] Scala Client DataStreamReader table() API - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/20 22:09:09 UTC, 1 replies.
- [GitHub] [spark] xinrong-meng commented on a diff in pull request #40864: [WIP] Nested DataType compatibility in Arrow-optimized Python UDF and Pandas UDF - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/20 22:18:15 UTC, 1 replies.
- [GitHub] [spark] mridulm commented on pull request #40883: [SPARK-43221][CORE] the BlockManager with the persisted block is pref… - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/20 22:23:34 UTC, 0 replies.
- [GitHub] [spark] zhenlineo commented on pull request #40796: [SPARK-43223][Connect] Typed agg, reduce functions - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/20 22:37:14 UTC, 0 replies.
- [GitHub] [spark] jchen5 commented on a diff in pull request #40865: [SPARK-43156][SQL] Fix `COUNT(*) is null` bug in correlated scalar subquery - posted by "jchen5 (via GitHub)" <gi...@apache.org> on 2023/04/20 22:48:32 UTC, 4 replies.
- [GitHub] [spark] WweiL commented on pull request #40785: [SPARK-42960] [CONNECT] [SS] Add await_termination() and exception() API for Streaming Query in Python - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/20 22:54:27 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40853: [SPARK-43192] [CONNECT] Remove user agent charset validation - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/20 23:43:30 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40785: [SPARK-42960] [CONNECT] [SS] Add await_termination() and exception() API for Streaming Query in Python - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/20 23:57:50 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40785: [SPARK-42960] [CONNECT] [SS] Add await_termination() and exception() API for Streaming Query in Python - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/20 23:58:17 UTC, 0 replies.
- [GitHub] [spark] allisonwang-db opened a new pull request, #40575: [SPARK-42945][CONNECT] Support PYSPARK_JVM_STACKTRACE_ENABLED in Spark Connect - posted by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/04/20 23:58:53 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40864: [WIP] Nested DataType compatibility in Arrow-optimized Python UDF and Pandas UDF - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 00:04:58 UTC, 0 replies.
- [GitHub] [spark] ueshin opened a new pull request, #40888: [SPARK-43055][CONNECT][PYTHON][FOLLOWUP] Fix deduplicate field names and refactor - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/21 00:08:06 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40829: [SPARK-41971][SQL] Use deduplicated field names when creating Arrow RecordBatch - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 00:12:47 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #39312: [SPARK-41788][SQL] Move InsertIntoStatement to basicLogicalOperators - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/21 00:17:42 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40779: [SPARK-43124][SQL] Dataset.show projects CommandResults locally - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 00:33:38 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40779: [SPARK-43124][SQL] Dataset.show projects CommandResults locally - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 00:33:54 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40873: [SPARK-43213][PYTHON] Add `DataFrame.offset` to vanilla PySpark - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/21 00:36:56 UTC, 1 replies.
- [GitHub] [spark] amaliujia commented on pull request #40887: [SPARK-43144] Scala Client DataStreamReader table() API - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/21 01:14:16 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40871: Revert [SPARK-39203][SQL] Rewrite table location to absolute URI based on database URI - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 01:28:07 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40871: Revert [SPARK-39203][SQL] Rewrite table location to absolute URI based on database URI - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 01:28:35 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40872: [MINOR][CONNECT][PYTHON][DOCS] Fix the doc of parameter `num` in `DataFrame.offset` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 01:30:37 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40872: [MINOR][CONNECT][PYTHON][DOCS] Fix the doc of parameter `num` in `DataFrame.offset` - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 01:31:02 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40873: [SPARK-43213][PYTHON] Add `DataFrame.offset` to vanilla PySpark - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 01:31:50 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40881: [SPARK-43113][SQL][FOLLOWUP] Add comment about copying steam-side variables - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 01:33:01 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40684: [SPARK-41532][CONNECT][CLIENT] Add check for operations that involve multiple data frames - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 01:34:29 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40873: [SPARK-43213][PYTHON] Add `DataFrame.offset` to vanilla PySpark - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/21 01:36:30 UTC, 0 replies.
- [GitHub] [spark] huaxingao opened a new pull request, #40889: [SPARK-41660][SQL][3.3] Only propagate metadata columns if they are used - posted by "huaxingao (via GitHub)" <gi...@apache.org> on 2023/04/21 01:43:11 UTC, 0 replies.
- [GitHub] [spark] huaxingao commented on pull request #40889: [SPARK-41660][SQL][3.3] Only propagate metadata columns if they are used - posted by "huaxingao (via GitHub)" <gi...@apache.org> on 2023/04/21 01:44:07 UTC, 1 replies.
- [GitHub] [spark] viirya commented on a diff in pull request #40889: [SPARK-41660][SQL][3.3] Only propagate metadata columns if they are used - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/21 01:48:06 UTC, 1 replies.
- [GitHub] [spark] Hisoka-X opened a new pull request, #40890: [SPARK-43219] Add `INSERT INTO REPLACE WHERE` statement into website - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/21 02:04:04 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40891: [SPARK-43191][CORE][FOLLOWUP] Use renamed import statement for Hadoop classes - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/21 02:04:59 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on a diff in pull request #40850: [SPARK-43191][CORE] Replace reflection w/ direct calling for Hadoop CallerContext - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/21 02:05:48 UTC, 0 replies.
- [GitHub] [spark] huaxingao commented on a diff in pull request #40889: [SPARK-41660][SQL][3.3] Only propagate metadata columns if they are used - posted by "huaxingao (via GitHub)" <gi...@apache.org> on 2023/04/21 02:22:39 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40889: [SPARK-41660][SQL][3.3] Only propagate metadata columns if they are used - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/21 02:23:29 UTC, 0 replies.
- [GitHub] [spark] yaooqinn closed pull request #40768: [SPARK-43119][SQL] Support Get SQL Keywords Dynamically Thru JDBC API and TVF - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/21 02:24:30 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40881: [SPARK-43113][SQL][FOLLOWUP] Add comment about copying steam-side variables - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/21 02:24:33 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40881: [SPARK-43113][SQL][FOLLOWUP] Add comment about copying steam-side variables - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/21 02:25:08 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40877: [SPARK-31733][YARN][TESTS] Make `specify a more specific type for the application` in `ClientSuite` pass in Hadoop 3 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/21 02:38:41 UTC, 2 replies.
- [GitHub] [spark] wangyum commented on pull request #40838: [SPARK-43174][SQL] Fix SparkSQLCLIDriver completer - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/21 02:49:39 UTC, 1 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40892: [SPARK-43128][CONNECT][SS] Make `recentProgress` and `lastProgress` return `StreamingQueryProgress` consistent with the native Scala Api - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/21 02:59:18 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40892: [SPARK-43128][CONNECT][SS] Make `recentProgress` and `lastProgress` return `StreamingQueryProgress` consistent with the native Scala Api - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/21 03:06:29 UTC, 1 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40838: [SPARK-43174][SQL] Fix SparkSQLCLIDriver completer - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/21 03:36:22 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40892: [SPARK-43128][CONNECT] Make `recentProgress` and `lastProgress` return `StreamingQueryProgress` consistent with the native Scala Api - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/21 03:36:32 UTC, 7 replies.
- [GitHub] [spark] wangyum opened a new pull request, #40893: [SPARK-43225][BUILD][SQL] Remove jackson-core-asl jackson-mapper-asl from pre-built distribution - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/21 03:37:43 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40892: [SPARK-43128][CONNECT] Make `recentProgress` and `lastProgress` return `StreamingQueryProgress` consistent with the native Scala Api - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/21 03:38:43 UTC, 0 replies.
- [GitHub] [spark] cfmcgrady commented on pull request #40875: [SPARK-43214][SQL] Post driver-side metrics for LocalTableScanExec/CommandResultExec - posted by "cfmcgrady (via GitHub)" <gi...@apache.org> on 2023/04/21 03:43:11 UTC, 2 replies.
- [GitHub] [spark] pan3793 commented on pull request #40893: [SPARK-43225][BUILD][SQL] Remove jackson-core-asl and jackson-mapper-asl from pre-built distribution - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/21 03:48:16 UTC, 3 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40852: [SPARK-43193][SS] Remove workaround for HADOOP-12074 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 04:01:21 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40852: [SPARK-43193][SS] Remove workaround for HADOOP-12074 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 04:01:45 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40892: [SPARK-43128][CONNECT] Make `recentProgress` and `lastProgress` return `StreamingQueryProgress` consistent with the native Scala Api - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 04:02:37 UTC, 0 replies.
- [GitHub] [spark] attilapiros commented on pull request #40843: [SPARK-43179][SHUFFLE] Allowing apps to control whether their metadata gets saved in the db by the External Shuffle Service - posted by "attilapiros (via GitHub)" <gi...@apache.org> on 2023/04/21 04:25:19 UTC, 0 replies.
- [GitHub] [spark] yorksity commented on pull request #40883: [SPARK-43221][CORE] the BlockManager with the persisted block is pref… - posted by "yorksity (via GitHub)" <gi...@apache.org> on 2023/04/21 04:29:08 UTC, 0 replies.
- [GitHub] [spark] yorksity closed pull request #40883: [SPARK-43221][CORE] the BlockManager with the persisted block is pref… - posted by "yorksity (via GitHub)" <gi...@apache.org> on 2023/04/21 04:29:09 UTC, 0 replies.
- [GitHub] [spark] attilapiros commented on pull request #40883: [SPARK-43221][CORE] the BlockManager with the persisted block is pref… - posted by "attilapiros (via GitHub)" <gi...@apache.org> on 2023/04/21 04:30:20 UTC, 1 replies.
- [GitHub] [spark] yorksity commented on pull request #40883: [WIP][SPARK-43221][CORE] the BlockManager with the persisted block is preferred - posted by "yorksity (via GitHub)" <gi...@apache.org> on 2023/04/21 04:36:50 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40873: [SPARK-43213][PYTHON] Add `DataFrame.offset` to vanilla PySpark - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/21 04:55:42 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40868: [SPARK-43210][CONNECT][PYTHON] Introduce `PySparkAssertionError` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/21 05:09:12 UTC, 1 replies.
- [GitHub] [spark] sadikovi commented on pull request #40699: [SPARK-43063][SQL] `df.show` handle null should print NULL instead of null - posted by "sadikovi (via GitHub)" <gi...@apache.org> on 2023/04/21 05:12:53 UTC, 3 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40751: [SPARK-43102][BUILD] Upgrade commons-compress to 1.23.0 - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/21 05:22:53 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40751: [SPARK-43102][BUILD] Upgrade commons-compress to 1.23.0 - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/21 05:23:07 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40751: [SPARK-43102][BUILD] Upgrade commons-compress to 1.23.0 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/21 05:33:21 UTC, 0 replies.
- [GitHub] [spark] holdenk commented on pull request #40128: [SPARK-42466][K8S]: Cleanup k8s upload directory when job terminates - posted by "holdenk (via GitHub)" <gi...@apache.org> on 2023/04/21 05:34:24 UTC, 0 replies.
- [GitHub] [spark] vicennial opened a new pull request, #40894: [SPARK-43198][CONNECT] Fix "Could not initialise class ammonite..." error when using filter - posted by "vicennial (via GitHub)" <gi...@apache.org> on 2023/04/21 05:37:43 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40888: [SPARK-43055][CONNECT][PYTHON][FOLLOWUP] Fix deduplicate field names and refactor - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 05:57:35 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40888: [SPARK-43055][CONNECT][PYTHON][FOLLOWUP] Fix deduplicate field names and refactor - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 05:58:07 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40575: [SPARK-42945][CONNECT] Support PYSPARK_JVM_STACKTRACE_ENABLED in Spark Connect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/21 05:58:50 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40882: [SPARK-43222][CORE][SQL][K8S][TESTS] Remove `isHadoop3` check from tests - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 05:59:04 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40882: [SPARK-43222][CORE][SQL][K8S][TESTS] Remove `isHadoop3` check from tests - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 05:59:24 UTC, 0 replies.
- [GitHub] [spark] bogao007 opened a new pull request, #40895: [SPARK-43128] [SS] [Connect] Implemented StreamingQueryProgress for Spark Connect - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/21 06:04:06 UTC, 0 replies.
- [GitHub] [spark] bogao007 commented on a diff in pull request #40895: [SPARK-43128] [SS] [Connect] Implemented StreamingQueryProgress for Spark Connect - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/21 06:06:05 UTC, 7 replies.
- [GitHub] [spark] woj-i commented on a diff in pull request #40821: [SPARK-43152][spark-structured-streaming] Parametrisable output metadata path (_spark_metadata) - posted by "woj-i (via GitHub)" <gi...@apache.org> on 2023/04/21 06:17:27 UTC, 9 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40853: [SPARK-43192] [CONNECT] Remove user agent charset validation - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 06:23:30 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40853: [SPARK-43192] [CONNECT] Remove user agent charset validation - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 06:23:52 UTC, 0 replies.
- [GitHub] [spark] mridulm closed pull request #40891: [SPARK-43191][CORE][FOLLOWUP] Use renamed import statement for Hadoop classes - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/21 06:28:05 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on pull request #40891: [SPARK-43191][CORE][FOLLOWUP] Use renamed import statement for Hadoop classes - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/21 06:28:43 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40790: [SPARK-43116][SQL] Fix Cast.forceNullable - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/21 06:42:39 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40834: [SPARK-43046] [SS] [Connect] Implemented Python API dropDuplicatesWithinWatermark for Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/21 07:04:11 UTC, 2 replies.
- [GitHub] [spark] bogao007 commented on pull request #40895: [SPARK-43128] [SS] [Connect] Implemented StreamingQueryProgress for Spark Connect - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/21 07:08:52 UTC, 0 replies.
- [GitHub] [spark] bogao007 closed pull request #40895: [SPARK-43128] [SS] [Connect] Implemented StreamingQueryProgress for Spark Connect - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/21 07:08:56 UTC, 0 replies.
- [GitHub] [spark] bogao007 commented on a diff in pull request #40892: [SPARK-43128][CONNECT] Make `recentProgress` and `lastProgress` return `StreamingQueryProgress` consistent with the native Scala Api - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/21 07:15:17 UTC, 2 replies.
- [GitHub] [spark] bogao007 commented on pull request #40892: [SPARK-43128][CONNECT] Make `recentProgress` and `lastProgress` return `StreamingQueryProgress` consistent with the native Scala Api - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/21 07:15:52 UTC, 0 replies.
- [GitHub] [spark] beliefer commented on pull request #40528: [SPARK-42584][CONNECT] Improve output of Column.explain - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/21 07:20:36 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40895: [SPARK-43128] [SS] [Connect] Implemented StreamingQueryProgress for Spark Connect - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/21 07:29:04 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40896: [WIP][ML][PYTHON][CONNECT] Support Barrier Python UDF - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/21 07:37:32 UTC, 0 replies.
- [GitHub] [spark] johanl-db commented on a diff in pull request #40885: [SPARK-43226] Define extractors for file-constant metadata - posted by "johanl-db (via GitHub)" <gi...@apache.org> on 2023/04/21 07:55:51 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40838: [SPARK-43174][SQL] Fix SparkSQLCLIDriver completer - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/21 08:46:46 UTC, 1 replies.
- [GitHub] [spark] wangyum opened a new pull request, #40897: [SPARK-43228][SQL] Join keys also match PartitioningCollection in CoalesceBucketsInJoin - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/21 08:47:49 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40856: [SPARK-43199][SQL] Make InlineCTE idempotent - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/21 08:52:21 UTC, 6 replies.
- [GitHub] [spark] bjornjorgensen commented on pull request #40893: [SPARK-43225][BUILD][SQL] Remove jackson-core-asl and jackson-mapper-asl from pre-built distribution - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/21 08:54:20 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40898: [SPARK-43230][CONNECT] Simplify `DataFrameNaFunctions.fillna` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/21 09:16:49 UTC, 0 replies.
- [GitHub] [spark] peter-toth commented on a diff in pull request #40856: [SPARK-43199][SQL] Make InlineCTE idempotent - posted by "peter-toth (via GitHub)" <gi...@apache.org> on 2023/04/21 09:27:40 UTC, 4 replies.
- [GitHub] [spark] Yikf commented on a diff in pull request #40824: [SPARK-32064][SQL] Support temporary table - posted by "Yikf (via GitHub)" <gi...@apache.org> on 2023/04/21 09:44:07 UTC, 0 replies.
- [GitHub] [spark] allisonwang-db commented on a diff in pull request #40865: [SPARK-43156][SQL] Fix `COUNT(*) is null` bug in correlated scalar subquery - posted by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/04/21 09:48:32 UTC, 1 replies.
- [GitHub] [spark] NarekDW commented on a diff in pull request #39719: [SPARK-42169] [SQL] Implement code generation for to_csv function (StructsToCsv) - posted by "NarekDW (via GitHub)" <gi...@apache.org> on 2023/04/21 10:40:27 UTC, 2 replies.
- [GitHub] [spark] NarekDW commented on pull request #39719: [SPARK-42169] [SQL] Implement code generation for to_csv function (StructsToCsv) - posted by "NarekDW (via GitHub)" <gi...@apache.org> on 2023/04/21 10:43:04 UTC, 0 replies.
- [GitHub] [spark] jchen5 commented on pull request #40865: [SPARK-43156][SQL] Fix `COUNT(*) is null` bug in correlated scalar subquery - posted by "jchen5 (via GitHub)" <gi...@apache.org> on 2023/04/21 11:41:07 UTC, 1 replies.
- [GitHub] [spark] wangyum commented on pull request #40897: [SPARK-43228][SQL] Join keys also match PartitioningCollection in CoalesceBucketsInJoin - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/21 11:52:13 UTC, 0 replies.
- [GitHub] [spark] grundprinzip commented on a diff in pull request #40160: [SPARK-41725][CONNECT] Eager Execution of DF.sql() - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/21 11:57:14 UTC, 0 replies.
- [GitHub] [spark] grundprinzip opened a new pull request, #40899: [MINOR][CONNECT] Fix missing stats for SQL Command - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/21 12:36:26 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40794: [SPARK-43142] Fix DSL expressions on attributes with special characters - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/21 12:54:12 UTC, 1 replies.
- [GitHub] [spark] cloud-fan closed pull request #40794: [SPARK-43142] Fix DSL expressions on attributes with special characters - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/21 12:55:02 UTC, 0 replies.
- [GitHub] [spark] peter-toth commented on pull request #40266: [SPARK-42660][SQL] Infer filters for Join produced by IN and EXISTS clause (RewritePredicateSubquery rule) - posted by "peter-toth (via GitHub)" <gi...@apache.org> on 2023/04/21 12:56:26 UTC, 0 replies.
- [GitHub] [spark] advancedxy commented on pull request #37417: [SPARK-33782][K8S][CORE]Place spark.files, spark.jars and spark.files under the current working directory on the driver in K8S cluster mode - posted by "advancedxy (via GitHub)" <gi...@apache.org> on 2023/04/21 12:57:36 UTC, 0 replies.
- [GitHub] [spark] juliuszsompolski commented on pull request #40899: [MINOR][CONNECT] Fix missing stats for SQL Command - posted by "juliuszsompolski (via GitHub)" <gi...@apache.org> on 2023/04/21 12:58:05 UTC, 0 replies.
- [GitHub] [spark] ted-jenks commented on pull request #39907: [SPARK-42359][SQL] Support row skipping when reading CSV files - posted by "ted-jenks (via GitHub)" <gi...@apache.org> on 2023/04/21 13:03:16 UTC, 0 replies.
- [GitHub] [spark] ryan-johnson-databricks commented on a diff in pull request #40885: [SPARK-43226] Define extractors for file-constant metadata - posted by "ryan-johnson-databricks (via GitHub)" <gi...@apache.org> on 2023/04/21 13:16:26 UTC, 24 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40900: [SPARK-43196][YARN][FOLLOWUP] Remove unnecessary Hadoop version check - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/21 13:35:23 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40900: [SPARK-43196][YARN][FOLLOWUP] Remove unnecessary Hadoop version check - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/21 13:39:03 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on pull request #40616: [SPARK-42991][SQL] Disable string type +/- interval in ANSI mode - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/21 13:53:23 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40794: [SPARK-43142] Fix DSL expressions on attributes with special characters - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/21 14:32:56 UTC, 1 replies.
- [GitHub] [spark] huaxingao closed pull request #40889: [SPARK-41660][SQL][3.3] Only propagate metadata columns if they are used - posted by "huaxingao (via GitHub)" <gi...@apache.org> on 2023/04/21 14:47:12 UTC, 0 replies.
- [GitHub] [spark] wangyum commented on pull request #40794: [SPARK-43142] Fix DSL expressions on attributes with special characters - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/21 14:57:05 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #40893: [SPARK-43225][BUILD][SQL] Remove jackson-core-asl and jackson-mapper-asl from pre-built distribution - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/21 15:25:16 UTC, 2 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40901: [SPARK-43195][FOLLOWUP] Fix mima check for Scala 2.13 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/21 15:37:35 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40901: [SPARK-43195][BUILD][FOLLOWUP] Fix mima check for Scala 2.13 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/21 15:42:29 UTC, 1 replies.
- [GitHub] [spark] WweiL commented on pull request #40887: [SPARK-43144] Scala Client DataStreamReader table() API - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/21 16:53:41 UTC, 2 replies.
- [GitHub] [spark] mridulm closed pull request #40843: [SPARK-43179][SHUFFLE] Allowing apps to control whether their metadata gets saved in the db by the External Shuffle Service - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/21 17:21:46 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #40900: [SPARK-43196][YARN][FOLLOWUP] Remove unnecessary Hadoop version check - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/21 17:26:01 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40900: [SPARK-43196][YARN][FOLLOWUP] Remove unnecessary Hadoop version check - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/21 17:26:26 UTC, 0 replies.
- [GitHub] [spark] ryan-johnson-databricks commented on pull request #40885: [SPARK-43226] Define extractors for file-constant metadata - posted by "ryan-johnson-databricks (via GitHub)" <gi...@apache.org> on 2023/04/21 17:47:24 UTC, 0 replies.
- [GitHub] [spark] woj-i commented on pull request #40821: [SPARK-43152][spark-structured-streaming] Parametrisable output metadata path (_spark_metadata) - posted by "woj-i (via GitHub)" <gi...@apache.org> on 2023/04/21 18:42:35 UTC, 0 replies.
- [GitHub] [spark] rshkv opened a new pull request, #40902: [SPARK-43142] Fix DSL expressions on attributes with special characters - posted by "rshkv (via GitHub)" <gi...@apache.org> on 2023/04/21 20:03:13 UTC, 0 replies.
- [GitHub] [spark] xinrong-meng commented on pull request #40864: [WIP] Nested DataType compatibility in Arrow-optimized Python UDF and Pandas UDF - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/21 20:10:53 UTC, 0 replies.
- [GitHub] [spark] xinrong-meng closed pull request #40864: [WIP] Nested DataType compatibility in Arrow-optimized Python UDF and Pandas UDF - posted by "xinrong-meng (via GitHub)" <gi...@apache.org> on 2023/04/21 20:10:54 UTC, 0 replies.
- [GitHub] [spark] sweisdb opened a new pull request, #40903: [WIP][SPARK-NNNNN] Updating AES-CBC support to not use OpenSSL's KDF - posted by "sweisdb (via GitHub)" <gi...@apache.org> on 2023/04/21 20:25:34 UTC, 0 replies.
- [GitHub] [spark] ueshin commented on a diff in pull request #40782: [SPARK-42669][CONNECT] Short circuit local relation RPCs - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/21 20:59:23 UTC, 0 replies.
- [GitHub] [spark] pengzhon-db opened a new pull request, #40904: [WIP][POC] foreachbatch spark connect - posted by "pengzhon-db (via GitHub)" <gi...@apache.org> on 2023/04/21 21:00:05 UTC, 0 replies.
- [GitHub] [spark] siying opened a new pull request, #40905: [SPARK-43233] [SS] Add logging for Kafka Batch Reading for topic partition, offset range and task ID - posted by "siying (via GitHub)" <gi...@apache.org> on 2023/04/21 21:38:34 UTC, 0 replies.
- [GitHub] [spark] anishshri-db commented on pull request #40905: [SPARK-43233] [SS] Add logging for Kafka Batch Reading for topic partition, offset range and task ID - posted by "anishshri-db (via GitHub)" <gi...@apache.org> on 2023/04/21 21:39:52 UTC, 1 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40834: [SPARK-43046] [SS] [Connect] Implemented Python API dropDuplicatesWithinWatermark for Spark Connect - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/21 22:16:16 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40796: [SPARK-43223][Connect] Typed agg, reduce functions - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/21 22:26:22 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40796: [SPARK-43223][Connect] Typed agg, reduce functions - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/21 22:27:12 UTC, 0 replies.
- [GitHub] [spark] WweiL opened a new pull request, #40906: [SPARK-43134] [CONNECT] [SS] JVM client StreamingQuery exception() API - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/21 22:47:36 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on pull request #40906: [SPARK-43134] [CONNECT] [SS] JVM client StreamingQuery exception() API - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/21 22:47:55 UTC, 4 replies.
- [GitHub] [spark] alexanderwu-db opened a new pull request, #40907: [PYTHON] Implement `__dir__()` in `pyspark.sql.dataframe.DataFrame` to include columns - posted by "alexanderwu-db (via GitHub)" <gi...@apache.org> on 2023/04/21 23:08:55 UTC, 0 replies.
- [GitHub] [spark] wangyum closed pull request #40838: [SPARK-43174][SQL] Fix SparkSQLCLIDriver completer - posted by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/22 00:16:43 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #39481: [MINOR][SQL] Update the import order of scala package in class `SpecificParquetRecordReaderBase` - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/22 00:18:39 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #39312: [SPARK-41788][SQL] Move InsertIntoStatement to basicLogicalOperators - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/22 00:18:41 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40834: [SPARK-43046] [SS] [Connect] Implemented Python API dropDuplicatesWithinWatermark for Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/22 00:21:03 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40834: [SPARK-43046] [SS] [Connect] Implemented Python API dropDuplicatesWithinWatermark for Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/22 00:21:27 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40907: [PYTHON] Implement `__dir__()` in `pyspark.sql.dataframe.DataFrame` to include columns - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/22 00:24:12 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40907: [PYTHON] Implement `__dir__()` in `pyspark.sql.dataframe.DataFrame` to include columns - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/22 00:24:17 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40901: [SPARK-43195][BUILD][FOLLOWUP] Fix mima check for Scala 2.13 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/22 00:25:15 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40901: [SPARK-43195][BUILD][FOLLOWUP] Fix mima check for Scala 2.13 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/22 00:25:36 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40725: [SPARK-43082][CONNECT][PYTHON] Arrow-optimized Python UDFs in Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/22 00:30:45 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40725: [SPARK-43082][CONNECT][PYTHON] Arrow-optimized Python UDFs in Spark Connect - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/22 00:31:04 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40885: [SPARK-43226] Define extractors for file-constant metadata - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/22 03:54:04 UTC, 5 replies.
- [GitHub] [spark] Hisoka-X opened a new pull request, #40908: [SPARK-42750] Support Insert By Name statement - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/22 04:15:06 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on a diff in pull request #40908: [SPARK-42750] Support Insert By Name statement - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/22 04:17:31 UTC, 0 replies.
- [GitHub] [spark] puneetguptanitj opened a new pull request, #40909: [SPARK-42411] [Kubernetes] Add support for istio with strict mtls - posted by "puneetguptanitj (via GitHub)" <gi...@apache.org> on 2023/04/22 04:33:03 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40910: [SPARK-43234][CONNECT][PYTHON] Migrate `ValueError` from Conect DataFrame into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/22 05:34:26 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40902: [SPARK-43142] Fix DSL expressions on attributes with special characters - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/22 08:45:59 UTC, 3 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #39719: [SPARK-42169] [SQL] Implement code generation for to_csv function (StructsToCsv) - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/22 09:46:54 UTC, 0 replies.
- [GitHub] [spark] rshkv commented on a diff in pull request #40902: [SPARK-43142] Fix DSL expressions on attributes with special characters - posted by "rshkv (via GitHub)" <gi...@apache.org> on 2023/04/22 11:57:47 UTC, 4 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40810: [SPARK-42317][SQL] Assign name to _LEGACY_ERROR_TEMP_2247: CANNOT_MERGE_SCHEMAS - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/22 14:19:48 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40817: [SPARK-42845][SQL] Update the error class _LEGACY_ERROR_TEMP_2010 to MERGE_UNSUPPORTED_BY_WINDOW_FUNCTION - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/22 14:27:03 UTC, 1 replies.
- [GitHub] [spark] warrenzhu25 opened a new pull request, #40911: [SPARK-43237][CORE] Handle null exception message in event log - posted by "warrenzhu25 (via GitHub)" <gi...@apache.org> on 2023/04/22 18:08:36 UTC, 0 replies.
- [GitHub] [spark] warrenzhu25 commented on pull request #40911: [SPARK-43237][CORE] Handle null exception message in event log - posted by "warrenzhu25 (via GitHub)" <gi...@apache.org> on 2023/04/22 18:09:33 UTC, 1 replies.
- [GitHub] [spark] warrenzhu25 opened a new pull request, #40912: [SPARK-43238][CORE] Support only decommission idle workers in standalone - posted by "warrenzhu25 (via GitHub)" <gi...@apache.org> on 2023/04/22 18:27:52 UTC, 0 replies.
- [GitHub] [spark] bjornjorgensen commented on pull request #40216: [SPARK-42593][PS] Deprecate & remove the APIs that will be removed in pandas 2.0. - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/22 19:27:15 UTC, 1 replies.
- [GitHub] [spark] bjornjorgensen opened a new pull request, #40913: [SPARK-43239][PS] Remove `null_counts` from info() - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/22 20:25:39 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #39481: [MINOR][SQL] Update the import order of scala package in class `SpecificParquetRecordReaderBase` - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/23 00:20:19 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #38877: [SPARK-41361] [SQL] Invalid call toAttribute on unresolved object exception caused by WidenSetOperationTypes - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/23 00:20:19 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #38649: [SPARK-41132][SQL] Convert LikeAny and NotLikeAny to InSet if no pattern contains wildcards - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/23 00:20:21 UTC, 0 replies.
- [GitHub] [spark] JkSelf opened a new pull request, #40914: [SPARK-43240] [SQL] Fix the wrong result issue when calling df.describe() method. - posted by "JkSelf (via GitHub)" <gi...@apache.org> on 2023/04/23 03:26:12 UTC, 0 replies.
- [GitHub] [spark] JkSelf commented on pull request #40914: [SPARK-43240] [SQL] Fix the wrong result issue when calling df.describe() method. - posted by "JkSelf (via GitHub)" <gi...@apache.org> on 2023/04/23 03:26:37 UTC, 1 replies.
- [GitHub] [spark] beliefer commented on pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit - posted by "beliefer (via GitHub)" <gi...@apache.org> on 2023/04/23 08:38:10 UTC, 0 replies.
- [GitHub] [spark] ulysses-you opened a new pull request, #40915: [SPARK-43232][SQL] Improve ObjectHashAggregateExec performance for high cardinality - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/23 08:41:21 UTC, 0 replies.
- [GitHub] [spark] SparksFyz commented on a diff in pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni) - posted by "SparksFyz (via GitHub)" <gi...@apache.org> on 2023/04/23 09:01:49 UTC, 1 replies.
- [GitHub] [spark] rshkv commented on pull request #40902: [SPARK-43142] Fix DSL expressions on attributes with special characters - posted by "rshkv (via GitHub)" <gi...@apache.org> on 2023/04/23 10:26:31 UTC, 2 replies.
- [GitHub] [spark] bjornjorgensen commented on pull request #40913: [SPARK-43239][PS] Remove `null_counts` from info() - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/23 12:01:57 UTC, 1 replies.
- [GitHub] [spark] zuston commented on pull request #40312: [SPARK-42695][SQL] Skew join handling in stream side of broadcast hash join - posted by "zuston (via GitHub)" <gi...@apache.org> on 2023/04/23 13:09:06 UTC, 0 replies.
- [GitHub] [spark] srowen commented on a diff in pull request #40687: [SPARK-43052][CORE] Handle stacktrace with null file name in event log - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/23 14:31:00 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #40738: [SPARK-42380][BUILD] Upgrade Apache Maven to 3.9.1 - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/23 14:31:34 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #40430: [SPARK-42798][BUILD] Upgrade protobuf-java to 3.22.3 - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/23 14:32:10 UTC, 1 replies.
- [GitHub] [spark] srowen closed pull request #40640: [SPARK-43008][BUILD] Upgrade joda-time from 2.12.2 to 2.12.5 - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/23 14:35:07 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #40640: [SPARK-43008][BUILD] Upgrade joda-time from 2.12.2 to 2.12.5 - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/23 14:35:31 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40640: [SPARK-43008][BUILD] Upgrade joda-time from 2.12.2 to 2.12.5 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/23 15:50:37 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40430: [SPARK-42798][BUILD] Upgrade protobuf-java to 3.22.3 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/23 15:51:49 UTC, 2 replies.
- [GitHub] [spark] khalidmammadov opened a new pull request, #40916: [SPARK-43243][PySpark][Connect] Add level param to printSchema for Python - posted by "khalidmammadov (via GitHub)" <gi...@apache.org> on 2023/04/23 16:20:30 UTC, 0 replies.
- [GitHub] [spark] kori73 commented on a diff in pull request #40810: [SPARK-42317][SQL] Assign name to _LEGACY_ERROR_TEMP_2247: CANNOT_MERGE_SCHEMAS - posted by "kori73 (via GitHub)" <gi...@apache.org> on 2023/04/23 17:48:33 UTC, 1 replies.
- [GitHub] [spark] bersprockets opened a new pull request, #40917: [SPARK-43113][SQL][3.3] Evaluate stream-side variables when generating code for a bound condition - posted by "bersprockets (via GitHub)" <gi...@apache.org> on 2023/04/23 18:54:19 UTC, 0 replies.
- [GitHub] [spark] Knorreman opened a new pull request, #40918: [WIP][CORE] Add shuffle sort merge joins to RDD API - posted by "Knorreman (via GitHub)" <gi...@apache.org> on 2023/04/23 19:02:14 UTC, 0 replies.
- [GitHub] [spark] aokolnychyi opened a new pull request, #40919: [SPARK-43204][SQL] Align MERGE assignments with table attributes - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/23 19:55:16 UTC, 0 replies.
- [GitHub] [spark] aokolnychyi commented on a diff in pull request #40919: [SPARK-43204][SQL] Align MERGE assignments with table attributes - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/23 19:58:45 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40820: [MINOR][SQL][DOCS] Improve spark.sql.files.minPartitionNum's doc - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/24 00:13:10 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40820: [MINOR][SQL][DOCS] Improve spark.sql.files.minPartitionNum's doc - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/24 00:13:28 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40913: [SPARK-43239][PS] Remove `null_counts` from info() - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/24 00:14:16 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40913: [SPARK-43239][PS] Remove `null_counts` from info() - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/24 00:14:36 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40880: [SPARK-43212][SS][PYTHON] Migrate Structured Streaming errors into error class - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/24 00:17:14 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40869: [SPARK-43209][CONNECT][PYTHON] Migrate Expression errors into error class - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/24 00:17:33 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #38877: [SPARK-41361] [SQL] Invalid call toAttribute on unresolved object exception caused by WidenSetOperationTypes - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/24 00:19:07 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #38649: [SPARK-41132][SQL] Convert LikeAny and NotLikeAny to InSet if no pattern contains wildcards - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/24 00:19:09 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40575: [SPARK-42945][CONNECT] Support PYSPARK_JVM_STACKTRACE_ENABLED in Spark Connect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 00:54:01 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40917: [SPARK-43113][SQL][3.3] Evaluate stream-side variables when generating code for a bound condition - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/24 00:58:03 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40917: [SPARK-43113][SQL][3.3] Evaluate stream-side variables when generating code for a bound condition - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/24 00:58:59 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40914: [SPARK-43240] [SQL] Fix the wrong result issue when calling df.describe() method. - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/24 01:02:23 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40911: [SPARK-43237][CORE] Handle null exception message in event log - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/24 01:03:11 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40898: [SPARK-43230][CONNECT] Simplify `DataFrameNaFunctions.fillna` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 01:08:07 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/24 01:09:04 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40899: [MINOR][CONNECT] Fix missing stats for SQL Command - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 01:11:16 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40868: [SPARK-43210][CONNECT][PYTHON] Introduce `PySparkAssertionError` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 01:14:13 UTC, 0 replies.
- [GitHub] [spark] ulysses-you commented on pull request #40915: [SPARK-43232][SQL] Improve ObjectHashAggregateExec performance for high cardinality - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/24 01:34:26 UTC, 0 replies.
- [GitHub] [spark] ulysses-you commented on a diff in pull request #40915: [SPARK-43232][SQL] Improve ObjectHashAggregateExec performance for high cardinality - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/24 01:50:22 UTC, 6 replies.
- [GitHub] [spark] BeishaoCao-db commented on pull request #40907: [PYTHON] Implement `__dir__()` in `pyspark.sql.dataframe.DataFrame` to include columns - posted by "BeishaoCao-db (via GitHub)" <gi...@apache.org> on 2023/04/24 01:56:06 UTC, 1 replies.
- [GitHub] [spark] Hisoka-X commented on pull request #40890: [SPARK-43219][SQL][DOCS] Add `INSERT INTO REPLACE WHERE` statement into website - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/24 02:14:10 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40914: [SPARK-43240] [SQL] Fix the wrong result issue when calling df.describe() method. - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 02:36:02 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40914: [SPARK-43240] [SQL] Fix the wrong result issue when calling df.describe() method. - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 02:37:00 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR commented on a diff in pull request #40905: [SPARK-43233] [SS] Add logging for Kafka Batch Reading for topic partition, offset range and task ID - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/24 02:43:00 UTC, 1 replies.
- [GitHub] [spark] JkSelf commented on a diff in pull request #40914: [SPARK-43240] [SQL] Fix the wrong result issue when calling df.describe() method. - posted by "JkSelf (via GitHub)" <gi...@apache.org> on 2023/04/24 03:12:53 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on pull request #40908: [SPARK-42750][SQL] Support Insert By Name statement - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/24 03:20:05 UTC, 0 replies.
- [GitHub] [spark] itholic commented on a diff in pull request #40370: [SPARK-42620][PS] Add `inclusive` parameter for (DataFrame|Series).between_time - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/24 03:37:09 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40914: [SPARK-43240] [SQL] Fix the wrong result issue when calling df.describe() method. - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/24 03:58:29 UTC, 1 replies.
- [GitHub] [spark] sadikovi commented on pull request #39950: [SPARK-42388][SQL] Avoid parquet footer reads twice in vectorized reader - posted by "sadikovi (via GitHub)" <gi...@apache.org> on 2023/04/24 04:00:40 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40916: [SPARK-43243][PYTHON][CONNECT] Add level param to printSchema for Python - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/24 04:03:02 UTC, 0 replies.
- [GitHub] [spark] JkSelf commented on pull request #40914: [SPARK-43240][SQL][3.3] Fix the wrong result issue when calling df.describe() method. - posted by "JkSelf (via GitHub)" <gi...@apache.org> on 2023/04/24 05:04:09 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40898: [SPARK-43230][CONNECT] Simplify `DataFrameNaFunctions.fillna` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 05:13:23 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40914: [SPARK-43240][SQL][3.3] Fix the wrong result issue when calling df.describe() method. - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/24 05:14:03 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40869: [SPARK-43209][CONNECT][PYTHON] Migrate Expression errors into error class - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 05:27:53 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40869: [SPARK-43209][CONNECT][PYTHON] Migrate Expression errors into error class - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 05:28:10 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40898: [SPARK-43230][CONNECT] Simplify `DataFrameNaFunctions.fillna` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/24 05:36:50 UTC, 5 replies.
- [GitHub] [spark] JkSelf commented on a diff in pull request #40914: [SPARK-43240][SQL][3.3] Fix the wrong result issue when calling df.describe() method. - posted by "JkSelf (via GitHub)" <gi...@apache.org> on 2023/04/24 05:49:06 UTC, 1 replies.
- [GitHub] [spark] MaxGekk commented on pull request #40810: [SPARK-42317][SQL] Assign name to _LEGACY_ERROR_TEMP_2247: CANNOT_MERGE_SCHEMAS - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/24 06:20:41 UTC, 2 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40898: [SPARK-43230][CONNECT] Simplify `DataFrameNaFunctions.fillna` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 06:37:27 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40914: [SPARK-43240][SQL][3.3] Fix the wrong result issue when calling df.describe() method. - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 06:40:20 UTC, 0 replies.
- [GitHub] [spark] liang3zy22 commented on a diff in pull request #40817: [SPARK-42845][SQL] Update the error class _LEGACY_ERROR_TEMP_2010 to MERGE_UNSUPPORTED_BY_WINDOW_FUNCTION - posted by "liang3zy22 (via GitHub)" <gi...@apache.org> on 2023/04/24 06:41:31 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40910: [SPARK-43234][CONNECT][PYTHON] Migrate `ValueError` from Connect DataFrame into error class - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 06:45:58 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40910: [SPARK-43234][CONNECT][PYTHON] Migrate `ValueError` from Connect DataFrame into error class - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 06:46:15 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40866: [SPARK-43178][CONNECT][PYTHON] Migrate UDF errors into PySpark error framework - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 06:49:47 UTC, 1 replies.
- [GitHub] [spark-docker] Yikun opened a new pull request, #34: [SPARK-40513][DOCS] Add apache/spark docker image overview - posted by "Yikun (via GitHub)" <gi...@apache.org> on 2023/04/24 06:52:52 UTC, 0 replies.
- [GitHub] [spark-docker] Yikun commented on pull request #34: [SPARK-40513][DOCS] Add apache/spark docker image overview - posted by "Yikun (via GitHub)" <gi...@apache.org> on 2023/04/24 06:55:43 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40920: [SPARK-43248][SQL] Unnecessary serialize/deserialize of Path on parallel gather partition stats - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/24 07:01:56 UTC, 0 replies.
- [GitHub] [spark] CavemanIV opened a new pull request, #40921: [SPARK-43242] fix throw 'Unexpected type of BlockId' in diagnose when… - posted by "CavemanIV (via GitHub)" <gi...@apache.org> on 2023/04/24 08:11:26 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40862: [SPARK-43169][INFRA][FOLLOWUP] Add more memory for mima check - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 08:11:39 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40862: [SPARK-43169][INFRA][FOLLOWUP] Add more memory for mima check - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 08:12:20 UTC, 0 replies.
- [GitHub] [spark] cloud-fan opened a new pull request, #40922: [SPARK-43063][SQL][FOLLOWUP] Add ToPrettyString expression for Dataset.show - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/24 08:26:15 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40922: [SPARK-43063][SQL][FOLLOWUP] Add ToPrettyString expression for Dataset.show - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/24 08:26:42 UTC, 4 replies.
- [GitHub] [spark] MaxGekk closed pull request #40810: [SPARK-42317][SQL] Assign name to _LEGACY_ERROR_TEMP_2247: CANNOT_MERGE_SCHEMAS - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/24 08:30:26 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40899: [SPARK-43249][CONNECT] Fix missing stats for SQL Command - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 08:33:49 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40899: [SPARK-43249][CONNECT] Fix missing stats for SQL Command - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 08:35:11 UTC, 0 replies.
- [GitHub] [spark] bjornjorgensen commented on pull request #40658: [WIP][SPARK-43024][PS] Upgrade pandas to 2.0.0 - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/24 09:24:37 UTC, 0 replies.
- [GitHub] [spark] bogao007 opened a new pull request, #40923: [Draft] State API (FlatMapGroupsWithState) in Scala for Spark Connect - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/24 09:32:47 UTC, 0 replies.
- [GitHub] [spark] bogao007 commented on a diff in pull request #40923: [Draft] State API (FlatMapGroupsWithState) in Scala for Spark Connect - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/24 09:42:21 UTC, 2 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40866: [SPARK-43178][CONNECT][PYTHON] Migrate UDF errors into PySpark error framework - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/24 10:18:13 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40563: [SPARK-41233][FOLLOWUP] Refactor `array_prepend` with `RuntimeReplaceable` - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/24 10:30:18 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40915: [SPARK-43232][SQL] Improve ObjectHashAggregateExec performance for high cardinality - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/24 10:39:31 UTC, 4 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40875: [SPARK-43214][SQL] Post driver-side metrics for LocalTableScanExec/CommandResultExec - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/24 10:47:30 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40875: [SPARK-43214][SQL] Post driver-side metrics for LocalTableScanExec/CommandResultExec - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/24 10:48:40 UTC, 0 replies.
- [GitHub] [spark] itholic commented on pull request #40658: [WIP][SPARK-43024][PS] Upgrade pandas to 2.0.0 - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/24 10:54:08 UTC, 1 replies.
- [GitHub] [spark] itholic opened a new pull request, #40924: [SPARK-43260][PYTHON] Migrate the Spark SQL pandas arrow type errors into error class. - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/24 11:29:27 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40925: [SPARK-43246][BUILD] Ignore `privateClasses` and `privateMembers` from connect mima check as default - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/24 12:02:23 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40926: [SPARK-43261][PYTHON] Migrate `TypeError` from Spark SQL types into error class. - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/24 12:27:24 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40927: [SPARK-42419][FOLLOWUP][CONNECT][PYTHON] Remove unused exception - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/24 12:41:39 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40928: [SPARK-43262][CONNECT][SS][PYTHON] Migrate Spark Connect Structured Streaming errors into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/24 13:16:19 UTC, 0 replies.
- [GitHub] [spark] majdyz opened a new pull request, #40929: Avoid allocation of unwritten ColumnVector in VectorizedReader - posted by "majdyz (via GitHub)" <gi...@apache.org> on 2023/04/24 14:24:03 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40929: [SPARK-43264][CORE] Avoid allocation of unwritten ColumnVector in Spark Vectorized Reader - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/24 14:45:44 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40920: [SPARK-43248][SQL] Unnecessary serialize/deserialize of Path on parallel gather partition stats - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/24 15:00:24 UTC, 2 replies.
- [GitHub] [spark] ryan-johnson-databricks opened a new pull request, #40930: [DO NOT MERGE] File constant metadata extractors split - posted by "ryan-johnson-databricks (via GitHub)" <gi...@apache.org> on 2023/04/24 15:07:35 UTC, 0 replies.
- [GitHub] [spark] majdyz closed pull request #40929: [SPARK-43264][SQL] Avoid allocation of unwritten ColumnVector in Spark Vectorized Reader - posted by "majdyz (via GitHub)" <gi...@apache.org> on 2023/04/24 15:13:48 UTC, 0 replies.
- [GitHub] [spark] majdyz opened a new pull request, #40929: [SPARK-43264][SQL] Avoid allocation of unwritten ColumnVector in Spark Vectorized Reader - posted by "majdyz (via GitHub)" <gi...@apache.org> on 2023/04/24 15:13:50 UTC, 0 replies.
- [GitHub] [spark] majdyz commented on pull request #40929: [SPARK-43264][SQL] Avoid allocation of unwritten ColumnVector in Spark Vectorized Reader - posted by "majdyz (via GitHub)" <gi...@apache.org> on 2023/04/24 15:15:08 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on a diff in pull request #40920: [SPARK-43248][SQL] Unnecessary serialize/deserialize of Path on parallel gather partition stats - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/24 15:16:26 UTC, 2 replies.
- [GitHub] [spark] amaliujia opened a new pull request, #40931: [SPARK-43265] Move Error framework to a common utils module - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/24 15:35:30 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40931: [SPARK-43265] Move Error framework to a common utils module - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/24 15:35:39 UTC, 5 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40879: [SPARK-43217] Correctly recurse in nested maps/arrays in findNestedField - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/24 15:39:02 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40879: [SPARK-43217] Correctly recurse in nested maps/arrays in findNestedField - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/24 15:40:59 UTC, 0 replies.
- [GitHub] [spark] peter-toth opened a new pull request, #40932: [SPARK-43266][SQL] Move MergeScalarSubqueries to spark-sql - posted by "peter-toth (via GitHub)" <gi...@apache.org> on 2023/04/24 15:44:47 UTC, 0 replies.
- [GitHub] [spark] peter-toth commented on pull request #37630: [SPARK-40193][SQL] Merge subquery plans with different filters - posted by "peter-toth (via GitHub)" <gi...@apache.org> on 2023/04/24 15:56:34 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40931: [SPARK-43265] Move Error framework to a common utils module - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/24 15:58:49 UTC, 4 replies.
- [GitHub] [spark] pan3793 commented on pull request #40920: [SPARK-43248][SQL] Unnecessary serialize/deserialize of Path on parallel gather partition stats - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/24 16:02:56 UTC, 1 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40931: [SPARK-43265] Move Error framework to a common utils module - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/24 16:18:35 UTC, 6 replies.
- [GitHub] [spark] aokolnychyi commented on pull request #40919: [SPARK-43204][SQL] Align MERGE assignments with table attributes - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/24 16:39:45 UTC, 1 replies.
- [GitHub] [spark] dzhigimont commented on a diff in pull request #40370: [SPARK-42620][PS] Add `inclusive` parameter for (DataFrame|Series).between_time - posted by "dzhigimont (via GitHub)" <gi...@apache.org> on 2023/04/24 16:48:08 UTC, 1 replies.
- [GitHub] [spark] DerekTBrown closed pull request #40798: SPARK-43166: name docker users - posted by "DerekTBrown (via GitHub)" <gi...@apache.org> on 2023/04/24 16:50:11 UTC, 0 replies.
- [GitHub] [spark] DerekTBrown commented on pull request #40798: SPARK-43166: name docker users - posted by "DerekTBrown (via GitHub)" <gi...@apache.org> on 2023/04/24 16:50:11 UTC, 0 replies.
- [GitHub] [spark] dzhigimont commented on a diff in pull request #40665: [SPARK-42621][PS] Add inclusive parameter for pd.date_range - posted by "dzhigimont (via GitHub)" <gi...@apache.org> on 2023/04/24 16:55:59 UTC, 1 replies.
- [GitHub] [spark] sunchao commented on pull request #40893: [SPARK-43225][BUILD][SQL] Remove jackson-core-asl and jackson-mapper-asl from pre-built distribution - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/24 17:37:39 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on pull request #40899: [SPARK-43249][CONNECT] Fix missing stats for SQL Command - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/24 17:58:00 UTC, 0 replies.
- [GitHub] [spark] bjornjorgensen opened a new pull request, #40933: [SPARK-43263][BUILD] Upgrade `FasterXML jackson` to 2.15.0 - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/24 18:59:19 UTC, 0 replies.
- [GitHub] [spark] bjornjorgensen commented on pull request #40933: [SPARK-43263][BUILD] Upgrade `FasterXML jackson` to 2.15.0 - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/24 19:00:56 UTC, 5 replies.
- [GitHub] [spark] pjfanning commented on pull request #40933: [SPARK-43263][BUILD] Upgrade `FasterXML jackson` to 2.15.0 - posted by "pjfanning (via GitHub)" <gi...@apache.org> on 2023/04/24 19:08:37 UTC, 6 replies.
- [GitHub] [spark] aokolnychyi opened a new pull request, #40934: [SPARK-43268][SQL] Use proper error classes when exceptions are constructed with a message - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/24 20:17:43 UTC, 0 replies.
- [GitHub] [spark] aokolnychyi commented on pull request #40934: [SPARK-43268][SQL] Use proper error classes when exceptions are constructed with a message - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/24 20:18:31 UTC, 2 replies.
- [GitHub] [spark] aokolnychyi commented on a diff in pull request #40934: [SPARK-43268][SQL] Use proper error classes when exceptions are constructed with a message - posted by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/24 20:20:42 UTC, 0 replies.
- [GitHub] [spark] WweiL opened a new pull request, #40935: [SPARK-43206] [SS] [CONNECT] [DRAFT] [DO-NOT-REVIEW] StreamingQuery exception() include stack trace - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/24 20:22:26 UTC, 0 replies.
- [GitHub] [spark] BeishaoCao-db commented on a diff in pull request #40907: [PYTHON] Implement `__dir__()` in `pyspark.sql.dataframe.DataFrame` to include columns - posted by "BeishaoCao-db (via GitHub)" <gi...@apache.org> on 2023/04/24 21:51:40 UTC, 0 replies.
- [GitHub] [spark] sadikovi commented on pull request #40922: [SPARK-43063][SQL][FOLLOWUP] Add ToPrettyString expression for Dataset.show - posted by "sadikovi (via GitHub)" <gi...@apache.org> on 2023/04/24 23:39:55 UTC, 2 replies.
- [GitHub] [spark] HeartSaVioR commented on pull request #40905: [SPARK-43233] [SS] Add logging for Kafka Batch Reading for topic partition, offset range and task ID - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/24 23:49:15 UTC, 1 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #38496: [SPARK-40708][SQL] Auto update table statistics based on write metrics - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/25 00:18:26 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #40934: [SPARK-43268][SQL] Use proper error classes when exceptions are constructed with a message - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/25 00:25:36 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40934: [SPARK-43268][SQL] Use proper error classes when exceptions are constructed with a message - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/25 00:25:53 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR closed pull request #40905: [SPARK-43233] [SS] Add logging for Kafka Batch Reading for topic partition, offset range and task ID - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/25 00:33:14 UTC, 0 replies.
- [GitHub] [spark] bogao007 commented on a diff in pull request #40906: [SPARK-43134] [CONNECT] [SS] JVM client StreamingQuery exception() API - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/25 00:38:46 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40916: [SPARK-43243][PYTHON][CONNECT] Add level param to printSchema for Python - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 00:41:46 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on a diff in pull request #40906: [SPARK-43134] [CONNECT] [SS] JVM client StreamingQuery exception() API - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/25 00:47:27 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40927: [SPARK-42419][FOLLOWUP][CONNECT][PYTHON] Remove unused exception - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 01:13:00 UTC, 2 replies.
- [GitHub] [spark] itholic commented on a diff in pull request #40665: [SPARK-42621][PS] Add inclusive parameter for pd.date_range - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/25 01:20:30 UTC, 0 replies.
- [GitHub] [spark] melin commented on pull request #38496: [SPARK-40708][SQL] Auto update table statistics based on write metrics - posted by "melin (via GitHub)" <gi...@apache.org> on 2023/04/25 01:31:25 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40617: [SPARK-42992][PYTHON] Introduce PySparkRuntimeError - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 01:35:16 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40928: [SPARK-43262][CONNECT][SS][PYTHON] Migrate Spark Connect Structured Streaming errors into error class - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 01:37:39 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40928: [SPARK-43262][CONNECT][SS][PYTHON] Migrate Spark Connect Structured Streaming errors into error class - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 01:37:56 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40936: [MINOR][CONNECT] Remove unnecessary creation of `planner` in `handleWriteOperation` and `handleWriteOperationV2` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 01:46:15 UTC, 0 replies.
- [GitHub] [spark] jackylee-ch commented on pull request #38496: [SPARK-40708][SQL] Auto update table statistics based on write metrics - posted by "jackylee-ch (via GitHub)" <gi...@apache.org> on 2023/04/25 01:49:18 UTC, 0 replies.
- [GitHub] [spark] itholic commented on pull request #40927: [SPARK-42419][FOLLOWUP][CONNECT][PYTHON] Remove unused exception - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/25 02:08:23 UTC, 1 replies.
- [GitHub] [spark] itholic commented on a diff in pull request #40617: [SPARK-42992][PYTHON] Introduce PySparkRuntimeError - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/25 02:35:13 UTC, 1 replies.
- [GitHub] [spark] itholic commented on pull request #40617: [SPARK-42992][PYTHON] Introduce PySparkRuntimeError - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/25 02:49:46 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40936: [MINOR][CONNECT] Remove unnecessary creation of `planner` in `handleWriteOperation` and `handleWriteOperationV2` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 03:26:23 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40936: [MINOR][CONNECT] Remove unnecessary creation of `planner` in `handleWriteOperation` and `handleWriteOperationV2` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 03:26:47 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40931: [SPARK-43265] Move Error framework to a common utils module - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/25 05:17:23 UTC, 3 replies.
- [GitHub] [spark] khalidmammadov commented on a diff in pull request #40916: [SPARK-43243][PYTHON][CONNECT] Add level param to printSchema for Python - posted by "khalidmammadov (via GitHub)" <gi...@apache.org> on 2023/04/25 05:39:51 UTC, 0 replies.
- [GitHub] [spark] rangadi opened a new pull request, #40937: [SPARK-42940] Improve session management for streaming queries - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/25 06:12:06 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on pull request #40937: [SPARK-42940] Improve session management for streaming queries - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/25 06:14:37 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40887: [SPARK-43144] Scala Client DataStreamReader table() API - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 06:21:07 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40887: [SPARK-43144] Scala Client DataStreamReader table() API - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 06:21:31 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40924: [SPARK-43260][PYTHON] Migrate the Spark SQL pandas arrow type errors into error class. - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 06:22:26 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40924: [SPARK-43260][PYTHON] Migrate the Spark SQL pandas arrow type errors into error class. - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 06:22:40 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40937: [SPARK-42940] Improve session management for streaming queries - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/25 06:23:55 UTC, 5 replies.
- [GitHub] [spark] mridulm commented on pull request #40921: [SPARK-43242] fix throw 'Unexpected type of BlockId' in diagnose when… - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/25 06:38:14 UTC, 0 replies.
- [GitHub] [spark] bogao007 closed pull request #40923: [Draft] State API (FlatMapGroupsWithState) in Scala for Spark Connect - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/25 06:38:49 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on pull request #40911: [SPARK-43237][CORE] Handle null exception message in event log - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/25 06:46:13 UTC, 2 replies.
- [GitHub] [spark] mridulm commented on a diff in pull request #40911: [SPARK-43237][CORE] Handle null exception message in event log - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/25 06:47:12 UTC, 4 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40874: [SPARK-43231][ML][PYTHON][CONNECT][TESTS] Reduce the memory requirement in torch-related tests - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 07:27:26 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40874: [SPARK-43231][ML][PYTHON][CONNECT][TESTS] Reduce the memory requirement in torch-related tests - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 07:28:24 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40938: [SPARK-43274][SPARK-43275][PYTHON][CONNECT] Introduce `PySparkNotImplementedError` - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/25 07:34:29 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40939: [SPARK-43276][CONNECT][PYTHON] Migrate Spark Connect Window errors into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/25 07:54:24 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40916: [SPARK-43243][PYTHON][CONNECT] Add level param to printSchema for Python - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 08:04:08 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40916: [SPARK-43243][PYTHON][CONNECT] Add level param to printSchema for Python - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 08:04:43 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40885: [SPARK-43226] Define extractors for file-constant metadata - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/25 08:10:51 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40885: [SPARK-43226] Define extractors for file-constant metadata - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/25 08:11:37 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40940: [SPARK-43277][YARN] Clean up deprecation hadoop api usage in `yarn` module - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/25 08:27:56 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40940: [SPARK-43277][YARN] Clean up deprecation hadoop api usage in `yarn` module - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/25 08:28:24 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40919: [SPARK-43204][SQL] Align MERGE assignments with table attributes - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/25 08:35:57 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40941: [MINOR][BUILD] Correct the error message in `dev/connect-check-protos.py` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 08:36:24 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40919: [SPARK-43204][SQL] Align MERGE assignments with table attributes - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/25 08:36:41 UTC, 0 replies.
- [GitHub] [spark] juliuszsompolski commented on a diff in pull request #40937: [SPARK-42940] Improve session management for streaming queries - posted by "juliuszsompolski (via GitHub)" <gi...@apache.org> on 2023/04/25 09:04:14 UTC, 1 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40942: [SPARK-43279][CORE] Cleanup unused members from `SparkHadoopUtil` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/25 09:29:40 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng opened a new pull request, #40943: [SPARK-43280][BUILD] Improve the protobuf breaking change checker script - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 09:29:58 UTC, 0 replies.
- [GitHub] [spark] steveloughran commented on pull request #40738: [SPARK-42380][BUILD] Upgrade Apache Maven to 3.9.1 - posted by "steveloughran (via GitHub)" <gi...@apache.org> on 2023/04/25 10:05:45 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40902: [SPARK-43142] Fix DSL expressions on attributes with special characters - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/25 10:23:05 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40902: [SPARK-43142] Fix DSL expressions on attributes with special characters - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/25 10:23:50 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40927: [SPARK-42419][FOLLOWUP][CONNECT][PYTHON] Remove unused exception - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 11:09:26 UTC, 0 replies.
- [GitHub] [spark] jackylee-ch opened a new pull request, #40944: [SPARK-40708][SQL][WIP] Auto update partition statistics based on write metrics - posted by "jackylee-ch (via GitHub)" <gi...@apache.org> on 2023/04/25 11:13:25 UTC, 0 replies.
- [GitHub] [spark] LuciferYang closed pull request #40738: [SPARK-42380][BUILD] Upgrade Apache Maven to 3.9.1 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/25 11:16:58 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40735: [SPARK-43092][CONNECT] Clean up unimplemented `dropDuplicatesWithinWatermark` series functions from `Dataset` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 11:21:58 UTC, 0 replies.
- [GitHub] [spark] bowenliang123 commented on pull request #40738: [SPARK-42380][BUILD] Upgrade Apache Maven to 3.9.1 - posted by "bowenliang123 (via GitHub)" <gi...@apache.org> on 2023/04/25 11:23:49 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40939: [SPARK-43276][CONNECT][PYTHON] Migrate Spark Connect Window errors into error class - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/25 11:26:52 UTC, 0 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40945: [SPARK-43272][CORE] Directly call `createFile` instead of reflection - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/25 11:44:27 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40945: [SPARK-43272][CORE] Directly call `createFile` instead of reflection - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/25 11:47:58 UTC, 1 replies.
- [GitHub] [spark] pan3793 commented on a diff in pull request #40945: [SPARK-43272][CORE] Directly call `createFile` instead of reflection - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/25 11:56:44 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40940: [SPARK-43277][YARN] Clean up deprecation hadoop api usage in `yarn` module - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/25 12:48:04 UTC, 3 replies.
- [GitHub] [spark] LuciferYang closed pull request #40735: [SPARK-43092][CONNECT] Clean up unimplemented `dropDuplicatesWithinWatermark` series functions from `Dataset` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/25 13:04:02 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40735: [SPARK-43092][CONNECT] Clean up unimplemented `dropDuplicatesWithinWatermark` series functions from `Dataset` - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/25 13:04:02 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40937: [SPARK-42940] Improve session management for streaming queries - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/25 13:37:23 UTC, 0 replies.
- [GitHub] [spark] itholic commented on a diff in pull request #40939: [SPARK-43276][CONNECT][PYTHON] Migrate Spark Connect Window errors into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/25 13:40:29 UTC, 1 replies.
- [GitHub] [spark] srowen closed pull request #40430: [SPARK-42798][BUILD] Upgrade protobuf-java to 3.22.3 - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/25 13:53:05 UTC, 0 replies.
- [GitHub] [spark] srowen closed pull request #40893: [SPARK-43225][BUILD][SQL] Remove jackson-core-asl and jackson-mapper-asl from pre-built distribution - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/25 13:54:10 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #40933: [SPARK-43263][BUILD] Upgrade `FasterXML jackson` to 2.15.0 - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/25 13:56:47 UTC, 3 replies.
- [GitHub] [spark] melin commented on pull request #40944: [SPARK-40708][SQL][WIP] Auto update partition statistics based on write metrics - posted by "melin (via GitHub)" <gi...@apache.org> on 2023/04/25 14:27:14 UTC, 0 replies.
- [GitHub] [spark] jchen5 opened a new pull request, #40946: [SPARK-43156][SPARK-43098][SQL] Extend scalar subquery count bug test with decorrelateInnerQuery disabled - posted by "jchen5 (via GitHub)" <gi...@apache.org> on 2023/04/25 14:46:15 UTC, 0 replies.
- [GitHub] [spark] jchen5 commented on pull request #40946: [SPARK-43156][SPARK-43098][SQL] Extend scalar subquery count bug test with decorrelateInnerQuery disabled - posted by "jchen5 (via GitHub)" <gi...@apache.org> on 2023/04/25 14:47:13 UTC, 1 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40933: [SPARK-43263][BUILD] Upgrade `FasterXML jackson` to 2.15.0 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/25 14:53:44 UTC, 3 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40946: [SPARK-43156][SPARK-43098][SQL] Extend scalar subquery count bug test with decorrelateInnerQuery disabled - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/25 14:57:36 UTC, 0 replies.
- [GitHub] [spark] jackylee-ch commented on pull request #40944: [SPARK-40708][SQL][WIP] Auto update partition statistics based on write metrics - posted by "jackylee-ch (via GitHub)" <gi...@apache.org> on 2023/04/25 15:14:13 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40918: [WIP][CORE] Add shuffle sort merge joins to RDD API - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/25 15:22:25 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40896: [SPARK-43229][ML][PYTHON][CONNECT] Introduce Barrier Python UDF - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/25 15:23:38 UTC, 1 replies.
- [GitHub] [spark] databricks-david-lewis opened a new pull request, #40947: [Spark-43284] Switch back to url-encoded strings - posted by "databricks-david-lewis (via GitHub)" <gi...@apache.org> on 2023/04/25 16:15:26 UTC, 0 replies.
- [GitHub] [spark] vicennial opened a new pull request, #40948: [SPARK-43285] Fix ReplE2ESuite consistently failing with JDK 17 - posted by "vicennial (via GitHub)" <gi...@apache.org> on 2023/04/25 16:19:52 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40738: [SPARK-42380][BUILD] Upgrade Apache Maven to 3.9.1 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/25 16:57:35 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40948: [SPARK-43285] Fix ReplE2ESuite consistently failing with JDK 17 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/25 17:26:15 UTC, 1 replies.
- [GitHub] [spark] thejdeep opened a new pull request, #40949: [DRAFT][SPARK-23607][CORE] Use HDFS extended attributes to store application summary information in SHS - posted by "thejdeep (via GitHub)" <gi...@apache.org> on 2023/04/25 17:34:27 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on pull request #40918: [WIP][CORE] Add shuffle sort merge joins to RDD API - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/25 17:40:50 UTC, 0 replies.
- [GitHub] [spark] vicennial commented on pull request #40948: [SPARK-43285] Fix ReplE2ESuite consistently failing with JDK 17 - posted by "vicennial (via GitHub)" <gi...@apache.org> on 2023/04/25 17:43:24 UTC, 0 replies.
- [GitHub] [spark] Knorreman commented on pull request #40918: [WIP][CORE] Add shuffle sort merge joins to RDD API - posted by "Knorreman (via GitHub)" <gi...@apache.org> on 2023/04/25 18:52:14 UTC, 1 replies.
- [GitHub] [spark] pengzhon-db commented on a diff in pull request #40937: [SPARK-42940] Improve session management for streaming queries - posted by "pengzhon-db (via GitHub)" <gi...@apache.org> on 2023/04/25 18:54:58 UTC, 0 replies.
- [GitHub] [spark] pjfanning commented on a diff in pull request #40933: [SPARK-43263][BUILD] Upgrade `FasterXML jackson` to 2.15.0 - posted by "pjfanning (via GitHub)" <gi...@apache.org> on 2023/04/25 18:57:08 UTC, 5 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40948: [SPARK-43285] Fix ReplE2ESuite consistently failing with JDK 17 - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/25 19:10:26 UTC, 0 replies.
- [GitHub] [spark] otterc commented on a diff in pull request #40921: [SPARK-43242] fix throw 'Unexpected type of BlockId' in diagnose when… - posted by "otterc (via GitHub)" <gi...@apache.org> on 2023/04/25 19:10:49 UTC, 0 replies.
- [GitHub] [spark] hvanhovell closed pull request #40948: [SPARK-43285] Fix ReplE2ESuite consistently failing with JDK 17 - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/25 19:11:11 UTC, 0 replies.
- [GitHub] [spark] WweiL opened a new pull request, #40950: [SPARK-43206] [SS] [CONNECT] [DRAFT] [DO-NOT-REVIEW] StreamingQuery exception() include stack trace - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/25 20:29:47 UTC, 0 replies.
- [GitHub] [spark] WweiL closed pull request #40935: [SPARK-43206] [SS] [CONNECT] [DRAFT] [DO-NOT-REVIEW] StreamingQuery exception() include stack trace - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/25 20:33:34 UTC, 0 replies.
- [GitHub] [spark] BeishaoCao-db commented on pull request #40907: [SPARK-43270][PYTHON] Implement `__dir__()` in `pyspark.sql.dataframe.DataFrame` to include columns - posted by "BeishaoCao-db (via GitHub)" <gi...@apache.org> on 2023/04/25 20:40:18 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40729: [SPARK-43136][CONNECT] Adding groupByKey + mapGroup + coGroup functions - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/25 20:57:25 UTC, 0 replies.
- [GitHub] [spark] hvanhovell closed pull request #40729: [SPARK-43136][CONNECT] Adding groupByKey + mapGroup + coGroup functions - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/25 20:58:03 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40906: [SPARK-43134] [CONNECT] [SS] JVM client StreamingQuery exception() API - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/25 21:09:56 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40938: [SPARK-43274][SPARK-43275][PYTHON][CONNECT] Introduce `PySparkNotImplementedError` - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/25 21:12:21 UTC, 1 replies.
- [GitHub] [spark] zhenlineo commented on pull request #40762: [SPARK-42953][Connect][Followup] Fix maven test build for Scala client UDF tests - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/25 21:57:35 UTC, 0 replies.
- [GitHub] [spark] amousavigourabi opened a new pull request, #40951: [SPARK-43250] Replace the error class `_LEGACY_ERROR_TEMP_2014` with an internal error - posted by "amousavigourabi (via GitHub)" <gi...@apache.org> on 2023/04/25 21:58:10 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on a diff in pull request #40937: [SPARK-42940] Improve session management for streaming queries - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/25 22:26:28 UTC, 11 replies.
- [GitHub] [spark] srielau commented on pull request #40884: [SPARK-43205] IDENTIFIER() clause - posted by "srielau (via GitHub)" <gi...@apache.org> on 2023/04/25 23:16:14 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #38496: [SPARK-40708][SQL] Auto update table statistics based on write metrics - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/26 00:18:32 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40947: [Spark-43284] Switch back to url-encoded strings - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/26 01:19:21 UTC, 3 replies.
- [GitHub] [spark-docker] Yikun opened a new pull request, #35: [WIP] Switch 3.4.0 default Java to Java17 - posted by "Yikun (via GitHub)" <gi...@apache.org> on 2023/04/26 01:21:35 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40941: [MINOR][BUILD] Correct the error message in `dev/connect-check-protos.py` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/26 01:25:00 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40941: [MINOR][BUILD] Correct the error message in `dev/connect-check-protos.py` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/26 01:25:23 UTC, 0 replies.
- [GitHub] [spark] itholic commented on a diff in pull request #40938: [SPARK-43274][SPARK-43275][PYTHON][CONNECT] Introduce `PySparkNotImplementedError` - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/26 01:33:12 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40892: [SPARK-43128][CONNECT][SS] Make `recentProgress` and `lastProgress` return `StreamingQueryProgress` consistent with the native Scala Api - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/26 01:35:07 UTC, 0 replies.
- [GitHub] [spark] ulysses-you opened a new pull request, #40952: [SPARK-43281][SQL] Fix concurrent writer does not update file metrics - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/26 01:51:00 UTC, 0 replies.
- [GitHub] [spark] ulysses-you commented on pull request #40952: [SPARK-43281][SQL] Fix concurrent writer does not update file metrics - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/26 01:53:16 UTC, 2 replies.
- [GitHub] [spark] JoshRosen commented on a diff in pull request #39011: [SPARK-41469][CORE] Avoid unnecessary task rerun on decommissioned executor lost if shuffle data migrated - posted by "JoshRosen (via GitHub)" <gi...@apache.org> on 2023/04/26 02:19:01 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X opened a new pull request, #40953: [SPARK-43267][JDBC] Handle postgres unknown user-defined column as string in array - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/26 03:08:26 UTC, 0 replies.
- [GitHub] [spark] srowen closed pull request #40940: [SPARK-43277][YARN] Clean up deprecation hadoop api usage in `yarn` module - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/26 03:12:44 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #40940: [SPARK-43277][YARN] Clean up deprecation hadoop api usage in `yarn` module - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/26 03:12:45 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40943: [SPARK-43280][BUILD] Reimplement the protobuf breaking change checker - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/26 03:45:41 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40939: [SPARK-43276][CONNECT][PYTHON] Migrate Spark Connect Window errors into error class - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/26 03:53:56 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40939: [SPARK-43276][CONNECT][PYTHON] Migrate Spark Connect Window errors into error class - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/26 03:54:02 UTC, 0 replies.
- [GitHub] [spark] WeichenXu123 opened a new pull request, #40954: [PYSPARK] [CONNECT] [ML] PySpark UDF supports python package dependencies - posted by "WeichenXu123 (via GitHub)" <gi...@apache.org> on 2023/04/26 03:56:52 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40933: [SPARK-43263][BUILD] Upgrade `FasterXML jackson` to 2.15.0 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/26 04:04:31 UTC, 4 replies.
- [GitHub] [spark] ulysses-you commented on a diff in pull request #40952: [SPARK-43281][SQL] Fix concurrent writer does not update file metrics - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/26 04:10:37 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40946: [SPARK-43156][SPARK-43098][SQL] Extend scalar subquery count bug test with decorrelateInnerQuery disabled - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/26 04:13:50 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40946: [SPARK-43156][SPARK-43098][SQL] Extend scalar subquery count bug test with decorrelateInnerQuery disabled - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/26 04:14:38 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40931: [SPARK-43265] Move Error framework to a common utils module - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/26 04:48:22 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40896: [SPARK-43229][ML][PYTHON][CONNECT] Introduce Barrier Python UDF - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/26 04:52:26 UTC, 4 replies.
- [GitHub] [spark] liang3zy22 opened a new pull request, #40955: [SPARK-42843][SQL] Update the error class _LEGACY_ERROR_TEMP_2007 to REGEX_GROUP_INDEX_EXCEED_REGEX_GROUP_COUNT - posted by "liang3zy22 (via GitHub)" <gi...@apache.org> on 2023/04/26 05:20:58 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40931: [SPARK-43265] Move Error framework to a common utils module - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/26 05:23:17 UTC, 2 replies.
- [GitHub] [spark] itholic commented on a diff in pull request #40658: [WIP][SPARK-43024][PS] Upgrade pandas to 2.0.0 - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/26 05:25:14 UTC, 0 replies.
- [GitHub] [spark] sadikovi commented on a diff in pull request #40922: [SPARK-43063][SQL][FOLLOWUP] Add ToPrettyString expression for Dataset.show - posted by "sadikovi (via GitHub)" <gi...@apache.org> on 2023/04/26 05:34:10 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40922: [SPARK-43063][SQL][FOLLOWUP] Add ToPrettyString expression for Dataset.show - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/26 05:43:16 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40938: [SPARK-43274][SPARK-43275][PYTHON][CONNECT] Introduce `PySparkNotImplementedError` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/26 05:47:29 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40938: [SPARK-43274][SPARK-43275][PYTHON][CONNECT] Introduce `PySparkNotImplementedError` - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/26 05:47:52 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40617: [SPARK-42992][PYTHON] Introduce PySparkRuntimeError - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/26 05:52:10 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40617: [SPARK-42992][PYTHON] Introduce PySparkRuntimeError - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/26 05:52:27 UTC, 0 replies.
- [GitHub] [spark] Surbhi-Vijay commented on a diff in pull request #40171: [SPARK-42598][TEST] Refactor TPCH schema to separate file similar to TPCDS for code reuse - posted by "Surbhi-Vijay (via GitHub)" <gi...@apache.org> on 2023/04/26 05:52:57 UTC, 0 replies.
- [GitHub] [spark] databricks-david-lewis commented on a diff in pull request #40947: [Spark-43284] Switch back to url-encoded strings - posted by "databricks-david-lewis (via GitHub)" <gi...@apache.org> on 2023/04/26 06:15:50 UTC, 1 replies.
- [GitHub] [spark] WeichenXu123 commented on a diff in pull request #40954: [PYSPARK] [CONNECT] [ML] PySpark UDF supports python package dependencies - posted by "WeichenXu123 (via GitHub)" <gi...@apache.org> on 2023/04/26 06:20:31 UTC, 4 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40722: [SPARK-43076][PS][CONNECT] Removing the dependency on `grpcio` when remote session is not used. - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 06:22:24 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40722: [SPARK-43076][PS][CONNECT] Removing the dependency on `grpcio` when remote session is not used. - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 06:24:04 UTC, 0 replies.
- [GitHub] [spark] viirya commented on a diff in pull request #40915: [SPARK-43232][SQL] Improve ObjectHashAggregateExec performance for high cardinality - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/26 07:32:57 UTC, 2 replies.
- [GitHub] [spark] cloud-fan closed pull request #40856: [SPARK-43199][SQL] Make InlineCTE idempotent - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/26 07:33:00 UTC, 0 replies.
- [GitHub] [spark] eejbyfeldt commented on pull request #38428: [SPARK-40912][CORE]Overhead of Exceptions in KryoDeserializationStream - posted by "eejbyfeldt (via GitHub)" <gi...@apache.org> on 2023/04/26 07:42:03 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40896: [SPARK-43229][ML][PYTHON][CONNECT] Introduce Barrier Python UDF - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/26 07:50:39 UTC, 1 replies.
- [GitHub] [spark] peter-toth commented on pull request #40932: [SPARK-43266][SQL] Move MergeScalarSubqueries to spark-sql - posted by "peter-toth (via GitHub)" <gi...@apache.org> on 2023/04/26 07:50:47 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40368: [SPARK-42748][CONNECT] Server-side Artifact Management - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/26 08:04:45 UTC, 2 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40956: [SPARK-43292][CONNECT][BUILD] Add `spark-repl` as test dependency of `connect-server` module - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/26 08:16:11 UTC, 0 replies.
- [GitHub] [spark] bjornjorgensen commented on pull request #40722: [SPARK-43076][PS][CONNECT] Removing the dependency on `grpcio` when remote session is not used. - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/26 08:36:28 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on a diff in pull request #40954: [PYSPARK] [CONNECT] [ML] PySpark UDF supports python package dependencies - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/26 08:37:43 UTC, 0 replies.
- [GitHub] [spark] JinHelin404 opened a new pull request, #40957: [SPARK-43257][SQL]Delete the error class _LEGACY_ERROR_TEMP_2010 - posted by "JinHelin404 (via GitHub)" <gi...@apache.org> on 2023/04/26 08:42:15 UTC, 0 replies.
- [GitHub] [spark] JkSelf opened a new pull request, #40958: [SPARK-43240][SQL][3.3] Fix the wrong result issue when calling df.describe() method. #40914 - posted by "JkSelf (via GitHub)" <gi...@apache.org> on 2023/04/26 08:47:59 UTC, 0 replies.
- [GitHub] [spark] JinHelin404 commented on pull request #40957: [SPARK-43257][SQL]Delete the error class _LEGACY_ERROR_TEMP_2022 - posted by "JinHelin404 (via GitHub)" <gi...@apache.org> on 2023/04/26 09:00:27 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X closed pull request #40649: [SPARK-41628][CONNECT][SERVER] The Design for support async query execution - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/26 09:05:36 UTC, 0 replies.
- [GitHub] [spark] justaparth commented on pull request #40686: [SPARK-43051][PROTOBUF] Add option to materialize zero values when deserializing protobufs - posted by "justaparth (via GitHub)" <gi...@apache.org> on 2023/04/26 09:05:54 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40914: [SPARK-43240][SQL][3.3] Fix the wrong result issue when calling df.describe() method. - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/26 09:23:48 UTC, 0 replies.
- [GitHub] [spark] JkSelf closed pull request #40958: [SPARK-43240][SQL][3.3] Fix the wrong result issue when calling df.describe() method. #40914 - posted by "JkSelf (via GitHub)" <gi...@apache.org> on 2023/04/26 09:24:16 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40914: [SPARK-43240][SQL][3.3] Fix the wrong result issue when calling df.describe() method. - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/26 09:25:43 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on pull request #38428: [SPARK-40912][CORE]Overhead of Exceptions in KryoDeserializationStream - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/26 09:26:06 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on pull request #39673: [SPARK-42132][SQL] Deduplicate attributes in groupByKey.cogroup - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/26 09:42:50 UTC, 0 replies.
- [GitHub] [spark] bogao007 opened a new pull request, #40959: [CONNECT][SS]Implemented MapGroupsWithState and FlatMapGroupsWithState APIs for Spark Connect - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/26 09:48:27 UTC, 0 replies.
- [GitHub] [spark] bogao007 commented on a diff in pull request #40959: [CONNECT][SS]Implemented MapGroupsWithState and FlatMapGroupsWithState APIs for Spark Connect - posted by "bogao007 (via GitHub)" <gi...@apache.org> on 2023/04/26 09:50:50 UTC, 0 replies.
- [GitHub] [spark] CavemanIV commented on a diff in pull request #40921: [SPARK-43242] fix throw 'Unexpected type of BlockId' in diagnose when… - posted by "CavemanIV (via GitHub)" <gi...@apache.org> on 2023/04/26 09:51:00 UTC, 0 replies.
- [GitHub] [spark] allisonwang-db commented on a diff in pull request #40926: [SPARK-43261][PYTHON] Migrate `TypeError` from Spark SQL types into error class. - posted by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/04/26 12:04:48 UTC, 0 replies.
- [GitHub] [spark] aimtsou opened a new pull request, #40960: [SPARK-43160][PYTHON-INFRA]: Upgrade mypy and pytest-mypypplugins packages - posted by "aimtsou (via GitHub)" <gi...@apache.org> on 2023/04/26 12:19:29 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR commented on pull request #40906: [SPARK-43134] [CONNECT] [SS] JVM client StreamingQuery exception() API - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/26 12:57:28 UTC, 1 replies.
- [GitHub] [spark] HeartSaVioR closed pull request #40906: [SPARK-43134] [CONNECT] [SS] JVM client StreamingQuery exception() API - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/26 12:58:53 UTC, 0 replies.
- [GitHub] [spark] cloud-fan opened a new pull request, #40961: [SPARK-43293][SQL] `__qualified_access_only` should be ignored in normal columns - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/26 14:30:12 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40961: [SPARK-43293][SQL] `__qualified_access_only` should be ignored in normal columns - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/26 14:30:41 UTC, 1 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #39673: [SPARK-42132][SQL] Deduplicate attributes in groupByKey.cogroup - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/26 14:37:07 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40884: [SPARK-43205] IDENTIFIER() clause - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/26 14:42:07 UTC, 4 replies.
- [GitHub] [spark] bjornjorgensen commented on a diff in pull request #40933: [SPARK-43263][BUILD] Upgrade `FasterXML jackson` to 2.15.0 - posted by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/26 14:46:38 UTC, 1 replies.
- [GitHub] [spark] itholic commented on a diff in pull request #40926: [SPARK-43261][PYTHON] Migrate `TypeError` from Spark SQL types into error class. - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/26 14:51:44 UTC, 2 replies.
- [GitHub] [spark] LuciferYang opened a new pull request, #40962: [SPARK-43294][BUILD] Upgrade zstd-jni to 1.5.5-2 - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/26 15:00:59 UTC, 0 replies.
- [GitHub] [spark] jzhuge opened a new pull request, #40963: [SPARK-43288][SQL] DataSourceV2: CREATE TABLE LIKE - posted by "jzhuge (via GitHub)" <gi...@apache.org> on 2023/04/26 15:40:18 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on a diff in pull request #40762: [SPARK-42953][Connect][Followup] Fix maven test build for Scala client UDF tests - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/26 16:07:00 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40762: [SPARK-42953][Connect][Followup] Fix maven test build for Scala client UDF tests - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/26 16:07:09 UTC, 0 replies.
- [GitHub] [spark] hvanhovell closed pull request #40762: [SPARK-42953][Connect][Followup] Fix maven test build for Scala client UDF tests - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/26 16:07:40 UTC, 0 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40956: [SPARK-43292][BUILD] [CONNECT] Add `spark-repl` as maven test dependency of `connect-server` module - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/26 16:09:18 UTC, 2 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40894: [SPARK-43198][CONNECT] Fix "Could not initialise class ammonite..." error when using filter - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/26 16:11:51 UTC, 0 replies.
- [GitHub] [spark] hvanhovell closed pull request #40894: [SPARK-43198][CONNECT] Fix "Could not initialise class ammonite..." error when using filter - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/26 16:12:36 UTC, 0 replies.
- [GitHub] [spark] srielau commented on a diff in pull request #40884: [SPARK-43205] IDENTIFIER() clause - posted by "srielau (via GitHub)" <gi...@apache.org> on 2023/04/26 16:25:16 UTC, 4 replies.
- [GitHub] [spark] juliuszsompolski closed pull request #40777: [CONNECT] Dump of query cancellation hacking - posted by "juliuszsompolski (via GitHub)" <gi...@apache.org> on 2023/04/26 16:30:53 UTC, 0 replies.
- [GitHub] [spark-docker] HyukjinKwon commented on a diff in pull request #34: [SPARK-40513][DOCS] Add apache/spark docker image overview - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 16:56:34 UTC, 0 replies.
- [GitHub] [spark-docker] HyukjinKwon commented on pull request #34: [SPARK-40513][DOCS] Add apache/spark docker image overview - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 16:57:13 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40964: [SPARK-43296][CONNECT][PYTHON] Migrate Spark Connect session errors into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/26 16:59:17 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40937: [SPARK-42940][SS][CONNECT] Improve session management for streaming queries - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 17:11:11 UTC, 1 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40937: [SPARK-42940][SS][CONNECT] Improve session management for streaming queries - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/26 17:19:47 UTC, 12 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40896: [SPARK-43229][ML][PYTHON][CONNECT] Introduce Barrier Python UDF - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 17:26:27 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40943: [SPARK-43280][BUILD] Reimplement the protobuf breaking change checker - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 17:34:39 UTC, 1 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40943: [SPARK-43280][BUILD] Reimplement the protobuf breaking change checker - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 17:35:09 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40965: [SPARK-42192][FOLLOWUP][PYTHON] Refine improper error class and error type - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/26 17:35:17 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40877: [SPARK-31733][YARN][TESTS] Make `specify a more specific type for the application` in `ClientSuite` pass in Hadoop 3 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 17:35:17 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40877: [SPARK-31733][YARN][TESTS] Make `specify a more specific type for the application` in `ClientSuite` pass in Hadoop 3 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 17:35:39 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40420: [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0 - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 17:41:30 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40436: [SPARK-42619][PS] Add `show_counts` parameter for DataFrame.info - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 17:41:39 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40934: [SPARK-43268][SQL] Use proper error classes when exceptions are constructed with a message - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 17:43:32 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40956: [SPARK-43292][BUILD] [CONNECT] Add `spark-repl` as maven test dependency of `connect-server` module - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/26 17:49:19 UTC, 3 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40907: [SPARK-43270][PYTHON] Implement `__dir__()` in `pyspark.sql.dataframe.DataFrame` to include columns - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 17:50:15 UTC, 2 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40907: [SPARK-43270][PYTHON] Implement `__dir__()` in `pyspark.sql.dataframe.DataFrame` to include columns - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 17:51:04 UTC, 2 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40370: [SPARK-42620][PS] Add `inclusive` parameter for (DataFrame|Series).between_time - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/26 17:51:33 UTC, 0 replies.
- [GitHub] [spark] WweiL opened a new pull request, #40966: [SPARK-43206] [SS] [CONNECT] [DRAFT] [DO-NOT-REVIEW] StreamingQuery exception() include stack trace - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/26 18:58:06 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on pull request #40966: [SPARK-43206] [SS] [CONNECT] [DRAFT] [DO-NOT-REVIEW] StreamingQuery exception() include stack trace - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/26 19:42:43 UTC, 1 replies.
- [GitHub] [spark] WweiL closed pull request #40950: [SPARK-43206] [SS] [CONNECT] [DRAFT] [DO-NOT-REVIEW] StreamingQuery exception() include stack trace - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/26 19:48:59 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40827: [SPARK-42585][CONNECT] Streaming of local relations - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/26 19:53:57 UTC, 3 replies.
- [GitHub] [spark] EnricoMi commented on a diff in pull request #39673: [SPARK-42132][SQL] Deduplicate attributes in groupByKey.cogroup - posted by "EnricoMi (via GitHub)" <gi...@apache.org> on 2023/04/26 20:34:32 UTC, 1 replies.
- [GitHub] [spark] leewyang opened a new pull request, #40967: predict_batch_udf with scalar input fails with batch size of one - posted by "leewyang (via GitHub)" <gi...@apache.org> on 2023/04/26 20:54:24 UTC, 0 replies.
- [GitHub] [spark] leewyang commented on pull request #40967: predict_batch_udf with scalar input fails with batch size of one - posted by "leewyang (via GitHub)" <gi...@apache.org> on 2023/04/26 20:55:10 UTC, 0 replies.
- [GitHub] [spark] WweiL opened a new pull request, #40968: [SPARK-43143] [SS] [CONNECT] Scala StreamingQuery awaitTermination() - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/26 21:02:30 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on pull request #40968: [SPARK-43143] [SS] [CONNECT] Scala StreamingQuery awaitTermination() - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/26 21:06:00 UTC, 1 replies.
- [GitHub] [spark] pengzhon-db commented on a diff in pull request #40937: [SPARK-42940][SS][CONNECT] Improve session management for streaming queries - posted by "pengzhon-db (via GitHub)" <gi...@apache.org> on 2023/04/26 21:06:05 UTC, 2 replies.
- [GitHub] [spark] WweiL commented on a diff in pull request #40937: [SPARK-42940][SS][CONNECT] Improve session management for streaming queries - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/26 21:08:31 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40968: [SPARK-43143] [SS] [CONNECT] Scala StreamingQuery awaitTermination() - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/26 21:19:16 UTC, 2 replies.
- [GitHub] [spark] warrenzhu25 commented on a diff in pull request #40911: [SPARK-43237][CORE] Handle null exception message in event log - posted by "warrenzhu25 (via GitHub)" <gi...@apache.org> on 2023/04/26 22:42:52 UTC, 1 replies.
- [GitHub] [spark] sweisdb opened a new pull request, #40969: [SPARK-43286][SQL] Updates aes_encrypt CBC mode to generate random IVs - posted by "sweisdb (via GitHub)" <gi...@apache.org> on 2023/04/26 22:47:59 UTC, 0 replies.
- [GitHub] [spark] sweisdb commented on pull request #40903: [WIP][SPARK-NNNNN] Updating AES-CBC support to not use OpenSSL's KDF - posted by "sweisdb (via GitHub)" <gi...@apache.org> on 2023/04/26 22:48:56 UTC, 0 replies.
- [GitHub] [spark] sweisdb closed pull request #40903: [WIP][SPARK-NNNNN] Updating AES-CBC support to not use OpenSSL's KDF - posted by "sweisdb (via GitHub)" <gi...@apache.org> on 2023/04/26 22:48:57 UTC, 0 replies.
- [GitHub] [spark] sweisdb opened a new pull request, #40970: [SPARK-43290][SQL] Adds IV and AAD support to aes_encrypt/aes_decrypt - posted by "sweisdb (via GitHub)" <gi...@apache.org> on 2023/04/26 22:58:25 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40968: [SPARK-43143] [SS] [CONNECT] Scala StreamingQuery awaitTermination() - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/26 23:11:45 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40890: [SPARK-43219][SQL][DOCS] Add `INSERT INTO REPLACE WHERE` statement into website - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/27 00:55:14 UTC, 1 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40965: [SPARK-42192][FOLLOWUP][PYTHON] Refine improper error class and error type - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/27 00:56:15 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40965: [SPARK-42192][FOLLOWUP][PYTHON] Refine improper error class and error type - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/27 00:56:31 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40954: [PYSPARK] [CONNECT] [ML] PySpark UDF supports python package dependencies - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/27 01:05:54 UTC, 0 replies.
- [GitHub] [spark] HeartSaVioR commented on a diff in pull request #40937: [SPARK-42940][SS][CONNECT] Improve session management for streaming queries - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/27 01:17:16 UTC, 3 replies.
- [GitHub] [spark] hvanhovell commented on pull request #40931: [SPARK-43265] Move Error framework to a common utils module - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/27 01:23:02 UTC, 0 replies.
- [GitHub] [spark] hvanhovell closed pull request #40931: [SPARK-43265] Move Error framework to a common utils module - posted by "hvanhovell (via GitHub)" <gi...@apache.org> on 2023/04/27 01:23:42 UTC, 0 replies.
- [GitHub] [spark] itholic commented on pull request #40370: [SPARK-42620][PS] Add `inclusive` parameter for (DataFrame|Series).between_time - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/27 02:02:49 UTC, 0 replies.
- [GitHub] [spark] itholic commented on pull request #40436: [SPARK-42619][PS] Add `show_counts` parameter for DataFrame.info - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/27 02:07:20 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on a diff in pull request #40890: [SPARK-43219][SQL][DOCS] Add `INSERT INTO REPLACE WHERE` statement into website - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/27 02:08:21 UTC, 0 replies.
- [GitHub] [spark] itholic commented on pull request #40420: [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0 - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/27 02:10:52 UTC, 1 replies.
- [GitHub] [spark] cloud-fan closed pull request #40961: [SPARK-43293][SQL] `__qualified_access_only` should be ignored in normal columns - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/27 02:52:18 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on pull request #40967: [SPARK-43298][PYTHON][ML] predict_batch_udf with scalar input fails with batch size of one - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/27 03:20:01 UTC, 2 replies.
- [GitHub] [spark] allisonwang-db commented on a diff in pull request #40966: [SPARK-43206] [SS] [CONNECT] StreamingQuery exception() include stack trace - posted by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/04/27 03:39:15 UTC, 0 replies.
- [GitHub] [spark] sunchao closed pull request #40920: [SPARK-43248][SQL] Unnecessary serialize/deserialize of Path on parallel gather partition stats - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/27 03:40:32 UTC, 0 replies.
- [GitHub] [spark] sunchao commented on pull request #40920: [SPARK-43248][SQL] Unnecessary serialize/deserialize of Path on parallel gather partition stats - posted by "sunchao (via GitHub)" <gi...@apache.org> on 2023/04/27 03:40:55 UTC, 0 replies.
- [GitHub] [spark] cxzl25 opened a new pull request, #40972: [SPARK-43301][CORE][SHUFFLE] BlockStoreClient getHostLocalDirs RPC supports IOexception retry - posted by "cxzl25 (via GitHub)" <gi...@apache.org> on 2023/04/27 03:47:42 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40739: [SPARK-43302][SQL] Make Python UDAF an AggregateFunction - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/27 03:55:51 UTC, 1 replies.
- [GitHub] [spark] wgtmac closed pull request #40971: Test Apache ORC 1.7.9-SNAPSHOT - posted by "wgtmac (via GitHub)" <gi...@apache.org> on 2023/04/27 04:12:57 UTC, 0 replies.
- [GitHub] [spark] mridulm closed pull request #40687: [SPARK-43052][CORE] Handle stacktrace with null file name in event log - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/27 04:33:15 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40586: [SPARK-42939][SS][CONNECT] Core streaming Python API for Spark Connect - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/27 04:40:14 UTC, 0 replies.
- [GitHub] [spark] yaooqinn commented on a diff in pull request #40922: [SPARK-43063][SQL][FOLLOWUP] Add ToPrettyString expression for Dataset.show - posted by "yaooqinn (via GitHub)" <gi...@apache.org> on 2023/04/27 05:02:46 UTC, 2 replies.
- [GitHub] [spark] LuciferYang commented on a diff in pull request #40675: [SPARK-42657][CONNECT] Support to find and transfer client-side REPL classfiles to server as artifacts - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/27 05:19:22 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40973: [SPARK-43304][CONNECT][PYTHON] Migrate `NotImplementedError` into `PySparkNotImplementedError` - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/27 08:30:33 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40890: [SPARK-43219][SQL][DOCS] Add `INSERT INTO REPLACE WHERE` statement into website - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/27 08:31:47 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40890: [SPARK-43219][SQL][DOCS] Add `INSERT INTO REPLACE WHERE` statement into website - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/27 08:33:05 UTC, 0 replies.
- [GitHub] [spark] itholic commented on pull request #40926: [SPARK-43261][PYTHON] Migrate `TypeError` from Spark SQL types into error class. - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/27 08:34:03 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40865: [SPARK-43156][SQL] Fix `COUNT(*) is null` bug in correlated scalar subquery - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/27 08:35:47 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40922: [SPARK-43063][SQL][FOLLOWUP] Add ToPrettyString expression for Dataset.show - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/27 08:40:09 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40437: [SPARK-41259][SQL] SparkSQLDriver use the spark result string that is consistent with that of `df.show` - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/27 08:41:25 UTC, 0 replies.
- [GitHub] [spark] JinHelin404 commented on pull request #40957: [SPARK-43257][SQL] replace the error class _LEGACY_ERROR_TEMP_2022 with internal error - posted by "JinHelin404 (via GitHub)" <gi...@apache.org> on 2023/04/27 08:46:53 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40926: [SPARK-43261][PYTHON] Migrate `TypeError` from Spark SQL types into error class. - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/27 08:56:17 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40926: [SPARK-43261][PYTHON] Migrate `TypeError` from Spark SQL types into error class. - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/27 08:56:28 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40827: [SPARK-42585][CONNECT] Streaming of local relations - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/27 09:22:23 UTC, 1 replies.
- [GitHub] [spark] HengQian-chaine opened a new pull request, #40974: [CORE] Clear the bitmap for tracking free pages when invoking cleanUp… - posted by "HengQian-chaine (via GitHub)" <gi...@apache.org> on 2023/04/27 10:43:45 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40975: [SPARK-43306][PYTHON] Migrate `ValueError` from Spark SQL types into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/27 10:53:19 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40976: [SPARK-43307][PYTHON] Migrate PandasUDF value errors into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/27 11:17:33 UTC, 0 replies.
- [GitHub] [spark] justaparth commented on a diff in pull request #40686: [SPARK-43051][PROTOBUF] Add option to materialize zero values for fields without presence information - posted by "justaparth (via GitHub)" <gi...@apache.org> on 2023/04/27 11:26:42 UTC, 3 replies.
- [GitHub] [spark] MaxGekk commented on pull request #40957: [SPARK-43257][SQL] Replace the error class _LEGACY_ERROR_TEMP_2022 by an internal error - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/27 11:34:42 UTC, 1 replies.
- [GitHub] [spark] peter-toth commented on pull request #40744: [SPARK-24497][SQL] Support recursive SQL - posted by "peter-toth (via GitHub)" <gi...@apache.org> on 2023/04/27 11:38:47 UTC, 0 replies.
- [GitHub] [spark] JinHelin404 commented on pull request #40957: [SPARK-43257][SQL] Replace the error class _LEGACY_ERROR_TEMP_2022 by an internal error - posted by "JinHelin404 (via GitHub)" <gi...@apache.org> on 2023/04/27 11:49:28 UTC, 1 replies.
- [GitHub] [spark] Hisoka-X opened a new pull request, #40977: [Cherry-pick][SQL] Cherry pick fix COUNT(*) is null bug in correlated scalar subquery - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/27 11:58:29 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X commented on pull request #40977: [Cherry-pick][SQL] Cherry pick fix COUNT(*) is null bug in correlated scalar subquery - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/27 11:59:38 UTC, 0 replies.
- [GitHub] [spark] MaxGekk closed pull request #40957: [SPARK-43257][SQL] Replace the error class _LEGACY_ERROR_TEMP_2022 by an internal error - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/27 12:01:49 UTC, 0 replies.
- [GitHub] [spark] srowen commented on a diff in pull request #40933: [SPARK-43263][BUILD] Upgrade `FasterXML jackson` to 2.15.0 - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/27 12:54:24 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40686: [SPARK-43051][PROTOBUF] Add option to materialize zero values for fields without presence information - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/27 14:03:15 UTC, 3 replies.
- [GitHub] [spark] bozhang2820 opened a new pull request, #40978: [SPARK-43309][SPARK-38461][CORE] Extend INTERNAL_ERROR with categories and add error class INTERNAL_ERROR_BROADCAST - posted by "bozhang2820 (via GitHub)" <gi...@apache.org> on 2023/04/27 14:06:18 UTC, 0 replies.
- [GitHub] [spark] Hisoka-X opened a new pull request, #40979: [SPARK-43308][SQL] Improve scalar subquery logic plan when result are literal - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/27 14:21:02 UTC, 0 replies.
- [GitHub] [spark] amousavigourabi commented on pull request #40951: [SPARK-43250][SQL] Replace the error class `_LEGACY_ERROR_TEMP_2014` with an internal error - posted by "amousavigourabi (via GitHub)" <gi...@apache.org> on 2023/04/27 14:34:49 UTC, 0 replies.
- [GitHub] [spark] dstrodtman-db commented on pull request #40561: [SPARK-42931][SS] Introduce dropDuplicatesWithinWatermark - posted by "dstrodtman-db (via GitHub)" <gi...@apache.org> on 2023/04/27 15:03:25 UTC, 0 replies.
- [GitHub] [spark] leewyang commented on pull request #40967: [SPARK-43298][PYTHON][ML] predict_batch_udf with scalar input fails with batch size of one - posted by "leewyang (via GitHub)" <gi...@apache.org> on 2023/04/27 15:19:40 UTC, 1 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40970: [WIP][SPARK-43290][SQL] Adds IV and AAD support to aes_encrypt/aes_decrypt - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/27 16:16:18 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on pull request #40827: [SPARK-42585][CONNECT] Streaming of local relations - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/27 16:18:29 UTC, 0 replies.
- [GitHub] [spark] WweiL commented on a diff in pull request #40966: [SPARK-43206] [SS] [CONNECT] StreamingQuery exception() include stack trace - posted by "WweiL (via GitHub)" <gi...@apache.org> on 2023/04/27 18:28:31 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40967: [SPARK-43298][PYTHON][ML] predict_batch_udf with scalar input fails with batch size of one - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/27 18:29:21 UTC, 0 replies.
- [GitHub] [spark] BeishaoCao-db commented on a diff in pull request #40907: [SPARK-43270][PYTHON] Implement `__dir__()` in `pyspark.sql.dataframe.DataFrame` to include columns - posted by "BeishaoCao-db (via GitHub)" <gi...@apache.org> on 2023/04/27 18:33:36 UTC, 2 replies.
- [GitHub] [spark] zhenlineo opened a new pull request, #40980: [SPARK-43136][CONNECT][Followup] Adding tests for KeyAs - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/27 18:52:45 UTC, 0 replies.
- [GitHub] [spark] zhenlineo commented on pull request #40980: [SPARK-43136][CONNECT][Followup] Adding tests for KeyAs - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/27 18:55:51 UTC, 0 replies.
- [GitHub] [spark] leewyang opened a new pull request, #40967: [SPARK-43298][PYTHON][ML] predict_batch_udf with scalar input fails with batch size of one - posted by "leewyang (via GitHub)" <gi...@apache.org> on 2023/04/27 18:57:53 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40955: [SPARK-42843][SQL] Update the error class _LEGACY_ERROR_TEMP_2007 to REGEX_GROUP_INDEX_EXCEED_REGEX_GROUP_COUNT - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/27 19:22:41 UTC, 0 replies.
- [GitHub] [spark] anishshri-db opened a new pull request, #40981: [SPARK-43311] Add RocksDB state store memory management enhancements - posted by "anishshri-db (via GitHub)" <gi...@apache.org> on 2023/04/27 19:29:58 UTC, 0 replies.
- [GitHub] [spark] anishshri-db commented on pull request #40981: [SPARK-43311][SS] Add RocksDB state store provider memory management enhancements - posted by "anishshri-db (via GitHub)" <gi...@apache.org> on 2023/04/27 19:32:15 UTC, 0 replies.
- [GitHub] [spark] viirya commented on pull request #40907: [SPARK-43270][PYTHON] Implement `__dir__()` in `pyspark.sql.dataframe.DataFrame` to include columns - posted by "viirya (via GitHub)" <gi...@apache.org> on 2023/04/27 19:38:57 UTC, 0 replies.
- [GitHub] [spark] amaliujia commented on a diff in pull request #40966: [SPARK-43206] [SS] [CONNECT] StreamingQuery exception() include stack trace - posted by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/04/27 20:27:57 UTC, 0 replies.
- [GitHub] [spark] zhouyejoe commented on a diff in pull request #40412: [SPARK-42784] should still create subDir when the number of subDir in merge dir is less than conf - posted by "zhouyejoe (via GitHub)" <gi...@apache.org> on 2023/04/27 20:48:34 UTC, 0 replies.
- [GitHub] [spark] DerekTBrown commented on pull request #40831: [SPARK-43171][K8S] Support custom Unix username in Pod - posted by "DerekTBrown (via GitHub)" <gi...@apache.org> on 2023/04/27 20:54:42 UTC, 0 replies.
- [GitHub] [spark] liuzqt opened a new pull request, #40982: [SPARK-43300][CORE] NonFateSharingCache wrapper for Guava Cache - posted by "liuzqt (via GitHub)" <gi...@apache.org> on 2023/04/27 21:02:50 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on pull request #40812: [SPARK-43157][SQL] Clone InMemoryRelation cached plan to prevent cloned plan from referencing same objects - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/27 21:06:43 UTC, 0 replies.
- [GitHub] [spark] rangadi opened a new pull request, #40983: [SPARK-43312] Option to convert Any fields into JSON - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/27 21:37:06 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on pull request #40983: [SPARK-43312] Option to convert Any fields into JSON - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/27 21:38:11 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on pull request #40686: [SPARK-43051][CONNECTOR] Add option to materialize zero values for fields without presence information - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/27 21:45:30 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40983: [SPARK-43312][PROTOBUF] Option to convert Any fields into JSON - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/27 22:00:57 UTC, 2 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40381: [SPARK-42761][BUILD][K8S] Upgrade `kubernetes-client` to 6.5.0 - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/27 22:07:30 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40812: [SPARK-43157][SQL] Clone InMemoryRelation cached plan to prevent cloned plan from referencing same objects - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/27 22:11:53 UTC, 0 replies.
- [GitHub] [spark] siying commented on a diff in pull request #40981: [SPARK-43311][SS] Add RocksDB state store provider memory management enhancements - posted by "siying (via GitHub)" <gi...@apache.org> on 2023/04/27 22:30:39 UTC, 1 replies.
- [GitHub] [spark] anishshri-db commented on a diff in pull request #40981: [SPARK-43311][SS] Add RocksDB state store provider memory management enhancements - posted by "anishshri-db (via GitHub)" <gi...@apache.org> on 2023/04/27 22:32:19 UTC, 10 replies.
- [GitHub] [spark] sweisdb commented on a diff in pull request #40970: [WIP][SPARK-43290][SQL] Adds IV and AAD support to aes_encrypt/aes_decrypt - posted by "sweisdb (via GitHub)" <gi...@apache.org> on 2023/04/27 23:15:49 UTC, 0 replies.
- [GitHub] [spark] justaparth commented on a diff in pull request #40686: [SPARK-43051][CONNECTOR] Add option to materialize zero values for fields without presence information - posted by "justaparth (via GitHub)" <gi...@apache.org> on 2023/04/27 23:16:59 UTC, 21 replies.
- [GitHub] [spark] justaparth commented on pull request #40686: [SPARK-43051][CONNECTOR] Add option to materialize zero values for fields without presence information - posted by "justaparth (via GitHub)" <gi...@apache.org> on 2023/04/27 23:17:39 UTC, 1 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40686: [SPARK-43051][CONNECTOR] Add option to materialize zero values for fields without presence information - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/27 23:34:46 UTC, 20 replies.
- [GitHub] [spark] jzhuge commented on pull request #40963: [SPARK-43288][SQL] DataSourceV2: CREATE TABLE LIKE - posted by "jzhuge (via GitHub)" <gi...@apache.org> on 2023/04/27 23:36:22 UTC, 0 replies.
- [GitHub] [spark] luizfna opened a new pull request, #40984: [SPARK-43252] Rename the error class _LEGACY_ERROR_TEMP_2017 to UNRESOLVED_CUSTOM_CLASS - posted by "luizfna (via GitHub)" <gi...@apache.org> on 2023/04/28 00:14:46 UTC, 1 replies.
- [GitHub] [spark] luizfna closed pull request #40984: [SPARK-43252] Rename the error class _LEGACY_ERROR_TEMP_2017 to UNRESOLVED_CUSTOM_CLASS - posted by "luizfna (via GitHub)" <gi...@apache.org> on 2023/04/28 00:22:55 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng closed pull request #40967: [SPARK-43298][PYTHON][ML] predict_batch_udf with scalar input fails with batch size of one - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/28 00:37:54 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40967: [SPARK-43298][PYTHON][ML] predict_batch_udf with scalar input fails with batch size of one - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/28 00:40:30 UTC, 0 replies.
- [GitHub] [spark] bozhang2820 commented on pull request #40978: [SPARK-43309][SPARK-38461][CORE] Extend INTERNAL_ERROR with categories and add error class INTERNAL_ERROR_BROADCAST - posted by "bozhang2820 (via GitHub)" <gi...@apache.org> on 2023/04/28 01:16:10 UTC, 1 replies.
- [GitHub] [spark] HeartSaVioR commented on a diff in pull request #40981: [SPARK-43311][SS] Add RocksDB state store provider memory management enhancements - posted by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/04/28 01:35:42 UTC, 8 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40390: [SPARK-42768][SQL] Enable cached plan apply AQE by default - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/28 01:40:47 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40985: [SPARK-43314][CONNECT][PYTHON] Migrate Spark Connect client errors into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/28 01:44:00 UTC, 0 replies.
- [GitHub] [spark] itholic opened a new pull request, #40986: [SPARK-43315][CONNECT][PYTHON][SS] Migrate remaining errors from DataFrame(Reader|Writer) into error class - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/28 02:19:39 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40812: [SPARK-43157][SQL] Clone InMemoryRelation cached plan to prevent cloned plan from referencing same objects - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/28 02:25:52 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on a diff in pull request #40978: [SPARK-43309][SPARK-38461][CORE] Extend INTERNAL_ERROR with categories and add error class INTERNAL_ERROR_BROADCAST - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/28 02:30:28 UTC, 1 replies.
- [GitHub] [spark] itholic opened a new pull request, #40987: [SPARK-40448][CONNECT][FOLLOWUP] Remove `InputValidationError` and turn into error class. - posted by "itholic (via GitHub)" <gi...@apache.org> on 2023/04/28 02:39:23 UTC, 0 replies.
- [GitHub] [spark] ueshin opened a new pull request, #40988: [SPARK-41971][SQL][PYTHON] Add a config for pandas conversion how to handle struct types - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/28 02:53:46 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40978: [SPARK-43309][SPARK-38461][CORE] Extend INTERNAL_ERROR with categories and add error class INTERNAL_ERROR_BROADCAST - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/28 03:05:26 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40978: [SPARK-43309][SPARK-38461][CORE] Extend INTERNAL_ERROR with categories and add error class INTERNAL_ERROR_BROADCAST - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/28 03:06:06 UTC, 0 replies.
- [GitHub] [spark] cloud-fan commented on pull request #40739: [SPARK-43302][SQL] Make Python UDAF an AggregateFunction - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/28 03:06:41 UTC, 0 replies.
- [GitHub] [spark] cloud-fan closed pull request #40739: [SPARK-43302][SQL] Make Python UDAF an AggregateFunction - posted by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/04/28 03:07:20 UTC, 0 replies.
- [GitHub] [spark] SandishKumarHN commented on a diff in pull request #40983: [SPARK-43312][PROTOBUF] Option to convert Any fields into JSON - posted by "SandishKumarHN (via GitHub)" <gi...@apache.org> on 2023/04/28 03:59:00 UTC, 0 replies.
- [GitHub] [spark] dongjoon-hyun commented on pull request #40390: [SPARK-42768][SQL] Enable cached plan apply AQE by default - posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org> on 2023/04/28 04:14:16 UTC, 1 replies.
- [GitHub] [spark] Hisoka-X commented on a diff in pull request #40984: [SPARK-43252] Rename the error class _LEGACY_ERROR_TEMP_2017 to UNRESOLVED_CUSTOM_CLASS - posted by "Hisoka-X (via GitHub)" <gi...@apache.org> on 2023/04/28 04:33:24 UTC, 1 replies.
- [GitHub] [spark] WeichenXu123 commented on pull request #40896: [SPARK-43229][ML][PYTHON][CONNECT] Introduce Barrier Python UDF - posted by "WeichenXu123 (via GitHub)" <gi...@apache.org> on 2023/04/28 04:53:19 UTC, 1 replies.
- [GitHub] [spark] RunyaoChen opened a new pull request, #40989: [SPARK-43316] Add more CTE SQL tests - posted by "RunyaoChen (via GitHub)" <gi...@apache.org> on 2023/04/28 06:21:12 UTC, 0 replies.
- [GitHub] [spark] LuciferYang commented on pull request #40922: [SPARK-43063][SQL][FOLLOWUP] Add ToPrettyString expression for Dataset.show - posted by "LuciferYang (via GitHub)" <gi...@apache.org> on 2023/04/28 06:35:18 UTC, 0 replies.
- [GitHub] [spark] ulysses-you commented on pull request #40390: [SPARK-42768][SQL] Enable cached plan apply AQE by default - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/28 06:40:54 UTC, 0 replies.
- [GitHub] [spark] ulysses-you commented on a diff in pull request #40390: [SPARK-42768][SQL] Enable cached plan apply AQE by default - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/28 08:47:22 UTC, 1 replies.
- [GitHub] [spark] pan3793 commented on a diff in pull request #40831: [SPARK-43171][K8S] Support custom Unix username in Pod - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/28 09:58:24 UTC, 0 replies.
- [GitHub] [spark] zhengruifeng commented on pull request #40658: [WIP][SPARK-43024][PS] Upgrade pandas to 2.0.0 - posted by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/04/28 09:59:08 UTC, 0 replies.
- [GitHub] [spark] ulysses-you opened a new pull request, #40990: [SPARK-43317][SQL] Support combine adjacent aggregation - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/28 10:08:14 UTC, 0 replies.
- [GitHub] [spark] ulysses-you commented on a diff in pull request #40990: [SPARK-43317][SQL] Support combine adjacent aggregation - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/28 10:11:15 UTC, 0 replies.
- [GitHub] [spark] ulysses-you commented on pull request #40990: [SPARK-43317][SQL] Support combine adjacent aggregation - posted by "ulysses-you (via GitHub)" <gi...@apache.org> on 2023/04/28 10:12:22 UTC, 0 replies.
- [GitHub] [spark] dzhigimont commented on pull request #40420: [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0 - posted by "dzhigimont (via GitHub)" <gi...@apache.org> on 2023/04/28 11:07:29 UTC, 0 replies.
- [GitHub] [spark] kori73 opened a new pull request, #40991: [Spark-42330] Assign name to _LEGACY_ERROR_TEMP_2175: RULE_ID_NOT_FOUND - posted by "kori73 (via GitHub)" <gi...@apache.org> on 2023/04/28 12:31:53 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #32060: [SPARK-34916][SQL] Add condition lambda and rule id to the transform family for early stopping - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/28 13:11:27 UTC, 0 replies.
- [GitHub] [spark] srowen closed pull request #40933: [SPARK-43263][BUILD] Upgrade `FasterXML jackson` to 2.15.0 - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/28 13:30:15 UTC, 0 replies.
- [GitHub] [spark] luizfna commented on a diff in pull request #40984: [SPARK-43252] Rename the error class _LEGACY_ERROR_TEMP_2017 to UNRESOLVED_CUSTOM_CLASS - posted by "luizfna (via GitHub)" <gi...@apache.org> on 2023/04/28 13:31:54 UTC, 0 replies.
- [GitHub] [spark] Kimahriman commented on a diff in pull request #40981: [SPARK-43311][SS] Add RocksDB state store provider memory management enhancements - posted by "Kimahriman (via GitHub)" <gi...@apache.org> on 2023/04/28 13:55:22 UTC, 1 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40992: [SPARK-42769][TEST][FOLLOWUP] Add missing `assert` in integration test - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/28 15:53:30 UTC, 0 replies.
- [GitHub] [spark] nfx opened a new pull request, #40993: [WIP] Make it possible to extend `ChannelBuilder` for `SparkConnectClient` - posted by "nfx (via GitHub)" <gi...@apache.org> on 2023/04/28 16:40:04 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40994: [SPARK-43319][K8S][TEST] Remove usage of deprecated DefaultKubernetesClient - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/28 17:18:08 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40992: [SPARK-42769][TEST][FOLLOWUP] Add missing `assert` in integration test - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/28 17:20:18 UTC, 0 replies.
- [GitHub] [spark] pan3793 opened a new pull request, #40995: [SPARK-43320][SQL][HIVE] Directly call Hive 2.3.9 API - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/28 17:40:02 UTC, 0 replies.
- [GitHub] [spark] pan3793 commented on pull request #40995: [SPARK-43320][SQL][HIVE] Directly call Hive 2.3.9 API - posted by "pan3793 (via GitHub)" <gi...@apache.org> on 2023/04/28 17:40:44 UTC, 0 replies.
- [GitHub] [spark] robreeves commented on a diff in pull request #40812: [SPARK-43157][SQL] Clone InMemoryRelation cached plan to prevent cloned plan from referencing same objects - posted by "robreeves (via GitHub)" <gi...@apache.org> on 2023/04/28 17:49:49 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on pull request #40937: [SPARK-42940][SS][CONNECT] Improve session management for streaming queries - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/28 19:04:50 UTC, 0 replies.
- [GitHub] [spark] dtenedor opened a new pull request, #40996: [SPARK-43313][SQL] Adding missing column DEFAULT values for MERGE INSERT actions - posted by "dtenedor (via GitHub)" <gi...@apache.org> on 2023/04/28 19:17:28 UTC, 0 replies.
- [GitHub] [spark] grundprinzip commented on pull request #40827: [SPARK-42585][CONNECT] Streaming of local relations - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/28 19:38:22 UTC, 0 replies.
- [GitHub] [spark] grundprinzip commented on a diff in pull request #40827: [SPARK-42585][CONNECT] Streaming of local relations - posted by "grundprinzip (via GitHub)" <gi...@apache.org> on 2023/04/28 19:42:27 UTC, 1 reply.
- [GitHub] [spark] gengliangwang commented on a diff in pull request #40996: [SPARK-43313][SQL] Adding missing column DEFAULT values for MERGE INSERT actions - posted by "gengliangwang (via GitHub)" <gi...@apache.org> on 2023/04/28 20:10:45 UTC, 2 replies.
- [GitHub] [spark] dtenedor commented on a diff in pull request #40996: [SPARK-43313][SQL] Adding missing column DEFAULT values for MERGE INSERT actions - posted by "dtenedor (via GitHub)" <gi...@apache.org> on 2023/04/28 21:48:10 UTC, 1 reply.
- [GitHub] [spark] zhenlineo opened a new pull request, #40997: [SPARK-43321][Connect] Dataset#Joinwith - posted by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/04/28 23:23:36 UTC, 0 replies.
- [GitHub] [spark] dtenedor commented on a diff in pull request #40615: [SPARK-16484][SQL] Add support for Datasketches HllSketch - posted by "dtenedor (via GitHub)" <gi...@apache.org> on 2023/04/28 23:56:54 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #39605: when spark job had ran in k8s is finished ,it register to shutdown ho… - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/29 00:18:21 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] commented on pull request #38388: [SPARK-40909][SQL] Reuse the broadcast exchange for bloom filter - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/29 00:18:23 UTC, 0 replies.
- [GitHub] [spark] ueshin commented on pull request #40988: [SPARK-41971][SQL][PYTHON] Add a config for pandas conversion how to handle struct types - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/29 00:51:03 UTC, 0 replies.
- [GitHub] [spark] ueshin opened a new pull request, #40998: [SPARK-43323][SQL][PYTHON] Fix DataFrame.toPandas with Arrow enabled to handle exceptions properly - posted by "ueshin (via GitHub)" <gi...@apache.org> on 2023/04/29 01:45:48 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40937: [SPARK-42940][SS][CONNECT] Improve session management for streaming queries - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/29 02:56:51 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon commented on a diff in pull request #40998: [SPARK-43323][SQL][PYTHON] Fix DataFrame.toPandas with Arrow enabled to handle exceptions properly - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/29 03:02:43 UTC, 0 replies.
- [GitHub] [spark] HyukjinKwon closed pull request #40907: [SPARK-43270][PYTHON] Implement `__dir__()` in `pyspark.sql.dataframe.DataFrame` to include columns - posted by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/04/29 03:03:51 UTC, 0 replies.
- [GitHub] [spark] cometta commented on pull request #39306: [SPARK-41781][K8S] Add the ability to create pvc before creating driver/executor pod - posted by "cometta (via GitHub)" <gi...@apache.org> on 2023/04/29 03:44:18 UTC, 0 replies.
- [GitHub] [spark] mridulm closed pull request #40911: [SPARK-43237][CORE] Handle null exception message in event log - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/29 04:14:46 UTC, 0 replies.
- [GitHub] [spark] mridulm commented on a diff in pull request #40949: [DRAFT][SPARK-23607][CORE] Use HDFS extended attributes to store application summary information in SHS - posted by "mridulm (via GitHub)" <gi...@apache.org> on 2023/04/29 06:49:16 UTC, 0 replies.
- [GitHub] [spark] MaxGekk commented on a diff in pull request #40991: [Spark-42330] Assign name to _LEGACY_ERROR_TEMP_2175: RULE_ID_NOT_FOUND - posted by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/04/29 09:39:33 UTC, 0 replies.
- [GitHub] [spark] Gelerion commented on pull request #40615: [SPARK-16484][SQL] Add support for Datasketches HllSketch - posted by "Gelerion (via GitHub)" <gi...@apache.org> on 2023/04/29 11:54:34 UTC, 0 replies.
- [GitHub] [spark] kori73 commented on a diff in pull request #40991: [Spark-42330] Assign name to _LEGACY_ERROR_TEMP_2175: RULE_ID_NOT_FOUND - posted by "kori73 (via GitHub)" <gi...@apache.org> on 2023/04/29 12:24:08 UTC, 0 replies.
- [GitHub] [spark] neshkeev commented on pull request #40688: [SPARK-43021][SQL] `CoalesceBucketsInJoin` not work when using AQE - posted by "neshkeev (via GitHub)" <gi...@apache.org> on 2023/04/29 13:04:23 UTC, 0 replies.
- [GitHub] [spark] srowen commented on pull request #40995: [SPARK-43320][SQL][HIVE] Directly call Hive 2.3.9 API - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/29 14:38:22 UTC, 0 replies.
- [GitHub] [spark] srowen closed pull request #40995: [SPARK-43320][SQL][HIVE] Directly call Hive 2.3.9 API - posted by "srowen (via GitHub)" <gi...@apache.org> on 2023/04/29 14:38:23 UTC, 0 replies.
- [GitHub] [spark] blcksrx opened a new pull request, #40999: [SPARK-43325] regexp_extract_all DataFrame API - posted by "blcksrx (via GitHub)" <gi...@apache.org> on 2023/04/29 16:16:40 UTC, 0 replies.
- [GitHub] [spark] pang-wu commented on a diff in pull request #40686: [SPARK-43051][CONNECTOR] Add option to materialize zero values for fields without presence information - posted by "pang-wu (via GitHub)" <gi...@apache.org> on 2023/04/29 19:45:05 UTC, 6 replies.
- [GitHub] [spark] justaparth commented on a diff in pull request #40686: [SPARK-43051][CONNECTOR] Add option to materialize zero values for fields without presence information (proto3 scalars) - posted by "justaparth (via GitHub)" <gi...@apache.org> on 2023/04/29 23:43:44 UTC, 5 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #39605: when spark job had ran in k8s is finished ,it register to shutdown ho… - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/30 00:20:14 UTC, 0 replies.
- [GitHub] [spark] github-actions[bot] closed pull request #38388: [SPARK-40909][SQL] Reuse the broadcast exchange for bloom filter - posted by "github-actions[bot] (via GitHub)" <gi...@apache.org> on 2023/04/30 00:20:15 UTC, 0 replies.
- [GitHub] [spark] rangadi commented on a diff in pull request #40686: [SPARK-43051][CONNECTOR] Add option to materialize zero values for fields without presence information (proto3 scalars) - posted by "rangadi (via GitHub)" <gi...@apache.org> on 2023/04/30 02:52:22 UTC, 7 replies.
- [GitHub] [spark] pang-wu commented on a diff in pull request #40686: [SPARK-43051][CONNECTOR] Add option to materialize zero values for fields without presence information (proto3 scalars) - posted by "pang-wu (via GitHub)" <gi...@apache.org> on 2023/04/30 03:53:48 UTC, 6 replies.
- [GitHub] [spark] zzzzming95 opened a new pull request, #41000: [SPARK-43327] Trigger `committer.setupJob` before plan execute in `FileFormatWriter#write` - posted by "zzzzming95 (via GitHub)" <gi...@apache.org> on 2023/04/30 07:16:57 UTC, 0 replies.
- [GitHub] [spark] WeichenXu123 closed pull request #40724: [SPARK-43081] [ML] [CONNECT] Add torch distributor data loader that loads data from spark partition data - posted by "WeichenXu123 (via GitHub)" <gi...@apache.org> on 2023/04/30 11:54:10 UTC, 0 replies.