dev@spark.apache.org, 2018-05

- Identifying specific persisted DataFrames via getPersistentRDDs() - posted by Nicholas Chammas <ni...@gmail.com> on 2018/05/01 03:17:08 UTC, 4 replies.
- [build system] jenkins master unreachable, build system currently down - posted by shane knapp <sk...@berkeley.edu> on 2018/05/01 03:27:35 UTC, 4 replies.
- UnresolvedException: Invalid call to dataType on unresolved object - posted by 880f0464 <88...@protonmail.com> on 2018/05/01 11:14:47 UTC, 0 replies.
- spark.python.worker.reuse not working as expected - posted by 880f0464 <88...@protonmail.com> on 2018/05/01 11:17:21 UTC, 0 replies.
- PySpark.sql.filter not performing as it should - posted by 880f0464 <88...@protonmail.com> on 2018/05/01 11:19:07 UTC, 0 replies.
- org.apache.spark.shuffle.FetchFailedException: Too large frame: - posted by Pralabh Kumar <pr...@gmail.com> on 2018/05/01 11:21:01 UTC, 3 replies.
- ApacheCon North America 2018 schedule is now live. - posted by Rich Bowen <rb...@apache.org> on 2018/05/01 12:36:05 UTC, 0 replies.
- Re: Datasource API V2 and checkpointing - posted by Ryan Blue <rb...@netflix.com.INVALID> on 2018/05/01 17:26:14 UTC, 2 replies.
- Re: Sorting on a streaming dataframe - posted by Hemant Bhanawat <he...@gmail.com> on 2018/05/02 05:45:08 UTC, 0 replies.
- Custom datasource as a wrapper for existing ones? - posted by jwozniak <ja...@cern.ch> on 2018/05/02 16:49:07 UTC, 6 replies.
- [build system] meet your build engineer @ spark ai summit SF 2018 - posted by shane knapp <sk...@berkeley.edu> on 2018/05/02 18:11:25 UTC, 1 reply.
- AccumulatorV2 vs AccumulableParam (V1) - posted by Sergey Zhemzhitsky <sz...@gmail.com> on 2018/05/02 22:20:35 UTC, 2 replies.
- SparkR test failures in PR builder - posted by Joseph Bradley <jo...@databricks.com> on 2018/05/02 22:31:47 UTC, 3 replies.
- [Structured streaming, V2] commit on ContinuousReader - posted by Jiří Syrový <sy...@gmail.com> on 2018/05/03 17:43:37 UTC, 1 reply.
- Design for continuous processing shuffle - posted by Joseph Torres <jo...@databricks.com> on 2018/05/04 18:27:34 UTC, 1 reply.
- Spark UI Source Code - posted by Anshi Shrivastava <an...@exadatum.com> on 2018/05/07 08:44:02 UTC, 2 replies.
- Integrating ML/DL frameworks with Spark - posted by Reynold Xin <rx...@databricks.com> on 2018/05/08 00:37:17 UTC, 15 replies.
- Documenting the various DataFrame/SQL join types - posted by Nicholas Chammas <ni...@gmail.com> on 2018/05/08 13:13:35 UTC, 2 replies.
- eager execution and debuggability - posted by Reynold Xin <rx...@databricks.com> on 2018/05/08 18:52:40 UTC, 15 replies.
- [DISCUSS] Spark SQL internal data: InternalRow or UnsafeRow? - posted by Ryan Blue <rb...@netflix.com.INVALID> on 2018/05/08 20:22:00 UTC, 4 replies.
- Revisiting Online serving of Spark models? - posted by Holden Karau <ho...@pigscanfly.ca> on 2018/05/09 14:18:30 UTC, 19 replies.
- Problem with Spark Master shutting down when zookeeper leader is shutdown - posted by agateaaa <ag...@gmail.com> on 2018/05/09 20:50:02 UTC, 0 replies.
- Time for 2.3.1? - posted by Marcelo Vanzin <va...@cloudera.com> on 2018/05/10 18:09:52 UTC, 7 replies.
- REMINDER: Apache EU Roadshow 2018 schedule announced! - posted by sh...@apache.org on 2018/05/11 12:13:36 UTC, 0 replies.
- Re: Possible SPIP to improve matrix and vector column type support - posted by Leif Walsh <le...@gmail.com> on 2018/05/12 22:44:51 UTC, 0 replies.
- Build timeout -- continuous-integration/appveyor/pr — AppVeyor build failed - posted by Ilan Filonenko <if...@cornell.edu> on 2018/05/14 00:45:14 UTC, 3 replies.
- parser error? - posted by Reynold Xin <rx...@databricks.com> on 2018/05/14 05:38:35 UTC, 3 replies.
- InMemoryTableScanExec.inputRDD and buffers (RDD[CachedBatch]) - posted by Jacek Laskowski <ja...@japila.pl> on 2018/05/14 11:30:32 UTC, 0 replies.
- Re: Sort-merge join improvement - posted by Petar Zecevic <pe...@gmail.com> on 2018/05/15 08:56:41 UTC, 1 reply.
- Preventing predicate pushdown - posted by Tomasz Gawęda <to...@outlook.com> on 2018/05/15 12:33:09 UTC, 2 replies.
- [VOTE] Spark 2.3.1 (RC1) - posted by Marcelo Vanzin <va...@cloudera.com> on 2018/05/15 21:00:33 UTC, 18 replies.
- [DISCUSS] PySpark Window UDF - posted by Li Jin <ic...@gmail.com> on 2018/05/16 15:34:52 UTC, 0 replies.
- Re: Running lint-java during PR builds? - posted by Hyukjin Kwon <gu...@gmail.com> on 2018/05/21 05:09:07 UTC, 9 replies.
- Repeated FileSourceScanExec.metrics from ColumnarBatchScan.metrics - posted by Jacek Laskowski <ja...@japila.pl> on 2018/05/22 17:17:05 UTC, 0 replies.
- [VOTE] Spark 2.3.1 (RC2) - posted by Marcelo Vanzin <va...@cloudera.com> on 2018/05/22 19:45:13 UTC, 8 replies.
- ML Pipelines in R - posted by Hossein <fa...@gmail.com> on 2018/05/22 22:23:10 UTC, 1 reply.
- Design proposal for streaming APIs in data source V2 - posted by Joseph Torres <jo...@databricks.com> on 2018/05/24 17:43:35 UTC, 0 replies.
- Spark version for Mesos 0.27.0 - posted by Thodoris Zois <zo...@ics.forth.gr> on 2018/05/25 11:29:36 UTC, 6 replies.
- SparkR was removed from CRAN on 2018-05-01 - posted by Hossein <fa...@gmail.com> on 2018/05/25 17:58:42 UTC, 6 replies.
- [SQL] Understanding RewriteCorrelatedScalarSubquery optimization (and TreeNode.transform) - posted by Jacek Laskowski <ja...@japila.pl> on 2018/05/27 19:43:24 UTC, 1 reply.
- [SQL] Two ScalarSubquery expressions?! Could we have ScalarSubqueryExec instead? - posted by Jacek Laskowski <ja...@japila.pl> on 2018/05/27 20:04:30 UTC, 0 replies.
- Live migration of Spark Streaming job from one configuration to another - posted by Khaled Zaouk <kh...@gmail.com> on 2018/05/29 08:58:40 UTC, 0 replies.
- FYI - posted by eric xu <ka...@hotmail.com> on 2018/05/30 05:17:18 UTC, 1 reply.
- unsubscribe - posted by Hadrien Chicault <ch...@gmail.com> on 2018/05/30 05:48:56 UTC, 0 replies.
- Closing IPC connection - posted by Arun Hive <ar...@yahoo.com.INVALID> on 2018/05/30 17:55:45 UTC, 0 replies.
- Re: Unable to alter partition. The transaction for alter partition did not commit successfully. - posted by Arun Hive <ar...@yahoo.com.INVALID> on 2018/05/30 17:58:24 UTC, 1 reply.
- [SQL] Purpose of RuntimeReplaceable unevaluable unary expressions? - posted by Jacek Laskowski <ja...@japila.pl> on 2018/05/30 18:09:29 UTC, 2 replies.
- [VOTE] SPIP ML Pipelines in R - posted by Hossein <fa...@gmail.com> on 2018/05/30 21:03:03 UTC, 3 replies.
- Spark on Kubernetes plan for 2.4 - posted by Yinan Li <li...@gmail.com> on 2018/05/30 22:58:36 UTC, 0 replies.
- Re: MatrixUDT and VectorUDT in Spark ML - posted by Dongjin Lee <do...@apache.org> on 2018/05/31 02:39:51 UTC, 1 reply.
- Feedback on first commit + jira issue I opened - posted by "Long, Andrew" <lo...@amazon.com> on 2018/05/31 16:44:58 UTC, 1 reply.
- [Spark SQL Discuss] Better support for Partitioning and Bucketing when used together - posted by pnpranavrao <pn...@gmail.com> on 2018/05/31 17:04:19 UTC, 0 replies.
- REMINDER: Apache EU Roadshow 2018 in Berlin is less than 2 weeks away! - posted by sh...@apache.org on 2018/05/31 20:51:47 UTC, 0 replies.