user@spark.apache.org, 2019-01

- structure streaming dataframe/dataset join (Java) - posted by Mann Du <ma...@gmail.com> on 2019/01/01 01:39:02 UTC, 0 replies.
- Re: Corrupt record handling in spark structured streaming and from_json function - posted by Colin Williams <co...@gmail.com> on 2019/01/01 01:49:17 UTC, 0 replies.
- Re: Do spark-submit overwrite the Spark session created manually? - posted by Neo Chien <se...@gmail.com> on 2019/01/01 07:43:43 UTC, 0 replies.
- Back pressure not working on streaming - posted by JF Chen <da...@gmail.com> on 2019/01/02 03:02:42 UTC, 3 replies.
- Python - posted by Gourav Sengupta <go...@gmail.com> on 2019/01/02 07:35:18 UTC, 2 replies.
- Re: Questions about caching - posted by Gourav Sengupta <go...@gmail.com> on 2019/01/02 07:40:28 UTC, 0 replies.
- [spark-ml] How to write a Spark Application correctly? - posted by Pola Yao <po...@gmail.com> on 2019/01/02 16:27:02 UTC, 0 replies.
- Re: Powered By Spark - posted by Mann Du <ma...@gmail.com> on 2019/01/02 17:29:36 UTC, 0 replies.
- How to reissue a delegated token after max lifetime passes for a spark streaming application on a Kerberized cluster - posted by Ali Nazemian <al...@gmail.com> on 2019/01/03 01:09:34 UTC, 4 replies.
- Using Spark as an ETL tool for moving data from Hive tables to BigQuery - posted by Mich Talebzadeh <mi...@gmail.com> on 2019/01/03 10:02:10 UTC, 0 replies.
- Re: Spark jdbc postgres numeric array - posted by Takeshi Yamamuro <li...@gmail.com> on 2019/01/03 13:04:54 UTC, 2 replies.
- [Spark cluster standalone v2.4.0] - problems with reverse proxy functionnality regarding submitted applications in cluster mode and the spark history server ui - posted by Cheikh_SOW <ch...@live.fr> on 2019/01/03 13:21:48 UTC, 0 replies.
- Question regarding kryo and java encoders in datasets - posted by Devender Yadav <de...@impetus.co.in> on 2019/01/04 06:48:30 UTC, 0 replies.
- Can an UDF return a custom class other than case class? - posted by em...@yeikel.com on 2019/01/07 05:42:41 UTC, 3 replies.
- Writing RDDs to HDFS is empty - posted by Jian Lee <fy...@163.com> on 2019/01/07 08:46:42 UTC, 3 replies.
- Re: Spark Kinesis Connector SSL issue - posted by shzshi <sh...@discover.com> on 2019/01/07 10:42:48 UTC, 4 replies.
- Parquet file number of columns - posted by Gourav Sengupta <go...@gmail.com> on 2019/01/07 13:11:51 UTC, 1 replies.
- [Spark-ml]Error in training ML models: Missing an output location for shuffle xxx - posted by Pola Yao <po...@gmail.com> on 2019/01/08 05:29:13 UTC, 0 replies.
- Performance Issue - posted by Tzahi File <tz...@ironsrc.com> on 2019/01/08 14:09:24 UTC, 8 replies.
- Is it possible to rate limit an UDP? - posted by em...@yeikel.com on 2019/01/08 23:21:27 UTC, 4 replies.
- [Spark SQL] Failure Scenarios involving JDBC and SQL databases - posted by Ramon Tuason <Ra...@microsoft.com.INVALID> on 2019/01/09 03:28:28 UTC, 0 replies.
- P-values logistic regression - posted by Simon Dirmeier <si...@web.de> on 2019/01/09 08:02:05 UTC, 0 replies.
- Reading as Parquet a directory created by Spark Structured Streaming - problems - posted by Phillip Henry <lo...@gmail.com> on 2019/01/09 08:45:43 UTC, 2 replies.
- Troubleshooting Spark OOM - posted by William Shen <wi...@marinsoftware.com> on 2019/01/09 18:18:25 UTC, 4 replies.
- Spark ML with null labels - posted by Patrick McCarthy <pm...@dstillery.com.INVALID> on 2019/01/10 15:53:06 UTC, 2 replies.
- Remote Data Read Time - posted by swastik mittal <sm...@ncsu.edu> on 2019/01/10 21:26:33 UTC, 0 replies.
- Re: spark2.4 arrow enabled true，error log not returned - posted by Bryan Cutler <cu...@gmail.com> on 2019/01/11 01:04:58 UTC, 1 replies.
- [VOTE][RESULT] Spark 2.2.3 (RC1) - posted by Dongjoon Hyun <do...@gmail.com> on 2019/01/11 20:01:14 UTC, 0 replies.
- State of datasource api v2 - posted by Vladimir Prus <vl...@gmail.com> on 2019/01/14 08:48:16 UTC, 1 replies.
- Unsubscribe - posted by "Liu, Jialin" <ji...@illinois.edu> on 2019/01/14 21:55:56 UTC, 13 replies.
- [ANNOUNCE] Announcing Apache Spark 2.2.3 - posted by Dongjoon Hyun <do...@gmail.com> on 2019/01/15 07:47:35 UTC, 4 replies.
- SparkSql query on a port and process queries - posted by Soheil Pourbafrani <so...@gmail.com> on 2019/01/15 10:11:47 UTC, 0 replies.
- SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms - posted by Xiangrui Meng <me...@gmail.com> on 2019/01/15 16:52:44 UTC, 8 replies.
- cache table vs. parquet table performance - posted by Tomas Bartalos <to...@gmail.com> on 2019/01/15 18:20:56 UTC, 4 replies.
- DFS Pregel performance vs simple Java DFS implementation - posted by daveb <da...@hotmail.com> on 2019/01/15 18:21:03 UTC, 0 replies.
- dataset best practice question - posted by Mohit Jaggi <mo...@gmail.com> on 2019/01/15 19:30:57 UTC, 2 replies.
- How to force-quit a Spark application? - posted by Pola Yao <po...@gmail.com> on 2019/01/15 21:32:44 UTC, 7 replies.
- How thriftserver load data - posted by Soheil Pourbafrani <so...@gmail.com> on 2019/01/16 08:15:28 UTC, 0 replies.
- How to unsubscribe??? - posted by Junior Alvarez <ju...@ericsson.com> on 2019/01/16 08:22:17 UTC, 2 replies.
- [Spark SQL]: how is “Exchange hashpartitioning” working in spark - posted by nkx <ma...@p3-group.com> on 2019/01/16 11:26:27 UTC, 0 replies.
- Subscribe - posted by Vasu Devan <va...@gmail.com> on 2019/01/16 22:56:35 UTC, 0 replies.
- How to know that a partition is ready when using Structured Streaming - posted by Wayne Guo <gu...@gmail.com> on 2019/01/17 03:36:19 UTC, 0 replies.
- UDF error with Spark 2.4 on scala 2.12 - posted by Andrés Ivaldi <ia...@gmail.com> on 2019/01/17 18:27:01 UTC, 0 replies.
- Question about RDD pipe - posted by Mkal <di...@hotmail.com> on 2019/01/17 22:09:25 UTC, 2 replies.
- [SPARK ON K8]: How do you configure executors to use the keytab inside their image on Kubernetes? - posted by pokemonmaster9505 <ka...@gmail.com> on 2019/01/18 11:55:06 UTC, 0 replies.
- Rdd pipe Subprocess exit code - posted by Mkal <di...@hotmail.com> on 2019/01/18 20:13:20 UTC, 0 replies.
- Spark on Yarn, is it possible to manually blacklist nodes before running spark job? - posted by Serega Sheypak <se...@gmail.com> on 2019/01/18 23:20:43 UTC, 11 replies.
- Persist Dataframe to HDFS considering HDFS Block Size. - posted by Shivam Sharma <28...@gmail.com> on 2019/01/19 15:42:55 UTC, 5 replies.
- userClassPath first fails - posted by Moein Hosseini <mo...@gmail.com> on 2019/01/21 08:21:55 UTC, 0 replies.
- How to Overwrite a saved PySpark ML Model - posted by Aakash Basu <aa...@gmail.com> on 2019/01/21 11:44:27 UTC, 1 replies.
- Increase time for Spark Job to be in Accept mode in Yarn - posted by Chetan Khatri <ch...@gmail.com> on 2019/01/22 10:38:29 UTC, 2 replies.
- Spark UI History server on Kubernetes - posted by Battini Lakshman <ba...@gmail.com> on 2019/01/22 12:32:18 UTC, 3 replies.
- I have trained a ML model, now what? - posted by Riccardo Ferrari <fe...@gmail.com> on 2019/01/22 16:07:17 UTC, 4 replies.
- Spark Core InBox.scala has error - posted by kaishen <ki...@163.com> on 2019/01/23 01:50:16 UTC, 2 replies.
- Local Storage Encryption - Spark ioEncryption - posted by "Sinha, Breeta (Nokia - IN/Bangalore)" <br...@nokia.com> on 2019/01/23 05:28:43 UTC, 0 replies.
- How to sleep Spark job - posted by Soheil Pourbafrani <so...@gmail.com> on 2019/01/23 05:56:10 UTC, 4 replies.
- dropping unused data from a stream - posted by Paul Tremblay <pa...@gmail.com> on 2019/01/23 07:17:12 UTC, 0 replies.
- How to query on Cassandra and load results in Spark dataframe - posted by Soheil Pourbafrani <so...@gmail.com> on 2019/01/23 07:43:50 UTC, 1 replies.
- How to get all input tables of a SPARK SQL 'select' statement - posted by lu...@china-inv.cn on 2019/01/23 09:37:27 UTC, 4 replies.
- Customizing Spark ThriftServer - posted by Soheil Pourbafrani <so...@gmail.com> on 2019/01/23 09:53:01 UTC, 1 replies.
- How to optimize iterative data processing in spark application - posted by Federico D'Ambrosio <fe...@gmail.com> on 2019/01/23 10:16:37 UTC, 0 replies.
- Please add Singapore Spark meetup to Community page... thank you! - posted by Arseny Chernov <ar...@gmail.com> on 2019/01/23 14:23:25 UTC, 0 replies.
- Spark Stateful Streaming - add counter column - posted by Femi Anthony <fe...@gmail.com> on 2019/01/23 15:06:12 UTC, 0 replies.
- Create all the combinations of a groupBy - posted by Pierremalliard <pi...@capgemini.com> on 2019/01/23 17:17:29 UTC, 2 replies.
- Re: SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms - posted by Alastair Green <al...@neotechnology.com> on 2019/01/23 21:04:56 UTC, 0 replies.
- unsubscribe - posted by Irtiza Ali <ia...@an10.io> on 2019/01/24 04:17:50 UTC, 12 replies.
- spark-submit: Warning: Skip remote jar hdfs - posted by Neo Chien <se...@gmail.com> on 2019/01/24 05:24:18 UTC, 0 replies.
- Reply: Re: How to get all input tables of a SPARK SQL 'select' statement - posted by lu...@china-inv.cn on 2019/01/24 09:21:14 UTC, 1 replies.
- Reading compacted Kafka topic is slow - posted by Tomas Bartalos <to...@gmail.com> on 2019/01/24 11:55:22 UTC, 1 replies.
- Structured streaming from Kafka by timestamp - posted by Tomas Bartalos <to...@gmail.com> on 2019/01/24 17:37:56 UTC, 2 replies.
- Does --py-files place the files on the PYTHONPATH of executor? - posted by thinkdoom2 <th...@gmail.com> on 2019/01/25 01:52:27 UTC, 0 replies.
- Spark job got stuck and no active tasks - posted by Pola Yao <po...@gmail.com> on 2019/01/25 06:04:13 UTC, 3 replies.
- [PySpark] Revisiting PySpark type annotations - posted by Maciej Szymkiewicz <ms...@gmail.com> on 2019/01/25 15:46:16 UTC, 1 replies.
- CfP: LASCAR 2019 - Workshop on Large Scale RDF Analytics || @ESWC 2019 || 2nd – 6th June 2019 || Portorož, Slovenia - posted by Gezim Sejdiu <g....@gmail.com> on 2019/01/28 08:10:48 UTC, 0 replies.
- Silly Spark SQL query - posted by Aakash Basu <aa...@gmail.com> on 2019/01/28 12:11:40 UTC, 1 replies.
- CVE-2018-11760: Apache Spark local privilege escalation vulnerability - posted by Imran Rashid <ir...@apache.org> on 2019/01/28 19:08:44 UTC, 1 replies.
- Reply: Re: Re: How to get all input tables of a SPARK SQL 'select' statement - posted by lu...@china-inv.cn on 2019/01/29 05:38:16 UTC, 0 replies.
- What is the recommended way to store records that don't meet a filter? - posted by em...@yeikel.com on 2019/01/29 06:13:46 UTC, 0 replies.
- Spark Kubernetes Architecture: Deployments vs Pods that create Pods - posted by WILSON Frank <Fr...@uk.thalesgroup.com> on 2019/01/29 13:53:02 UTC, 2 replies.
- How to avoid copying hadoop conf to submit on yarn - posted by Yann Moisan <ya...@gmail.com> on 2019/01/29 15:53:38 UTC, 0 replies.
- Apply Kmeans in partitions - posted by dimitris plakas <di...@gmail.com> on 2019/01/30 14:30:12 UTC, 1 replies.
- - posted by Daniel O' Shaughnessy <da...@gmail.com> on 2019/01/30 22:42:49 UTC, 2 replies.
- Survey on Data Stream Processing - posted by Alexandre Strapacao Guedes Vianna <as...@cin.ufpe.br> on 2019/01/31 14:03:10 UTC, 0 replies.
- Driver OOM does not shut down Spark Context - posted by Bryan Jeffrey <br...@gmail.com> on 2019/01/31 15:01:38 UTC, 0 replies.
- Please stop asking to unsubscribe - posted by Andrew Melo <an...@gmail.com> on 2019/01/31 15:31:09 UTC, 0 replies.
- Exactly-Once delivery with Structured Streaming and Kafka - posted by William Briggs <wr...@gmail.com> on 2019/01/31 17:14:17 UTC, 0 replies.
- Aws - posted by Pedro Tuero <tu...@gmail.com> on 2019/01/31 20:23:26 UTC, 1 replies.
- Fwd: Spark driver pod scheduling fails on auto scaled node - posted by "Prudhvi Chennuru (CONT)" <pr...@capitalone.com> on 2019/01/31 21:50:53 UTC, 2 replies.