- Re: Convert RDD[Iterrable[MyCaseClass]] to RDD[MyCaseClass] - posted by Chris Teoh <ch...@gmail.com> on 2018/12/01 10:09:30 UTC, 4 replies.
- unsubscribe - posted by "Kappaganthu, Sivaram (CORP)" <Si...@ADP.com> on 2018/12/01 16:48:03 UTC, 2 replies.
- PicklingError - Can't pickle py4j.protocol.Py4JJavaError - it's not the same object - posted by Abdeali Kothari <ab...@gmail.com> on 2018/12/02 12:53:23 UTC, 0 replies.
- "failed to get records for spark-executor after polling for ***" error - posted by JF Chen <da...@gmail.com> on 2018/12/03 09:32:01 UTC, 1 replies.
- Executor launched but no tasks is submitted - posted by "Chang.Wu" <58...@qq.com> on 2018/12/03 13:11:40 UTC, 0 replies.
- Parallel read parquet file, write to postgresql - posted by James Starks <su...@protonmail.com.INVALID> on 2018/12/03 13:40:41 UTC, 1 replies.
- Using spark and mesos container with host_path volume - posted by Antoine DUBOIS <an...@cc.in2p3.fr> on 2018/12/03 15:44:29 UTC, 0 replies.
- How to preserve event order per key in Structured Streaming Repartitioning By Key? - posted by pmatpadi <pm...@gmail.com> on 2018/12/03 22:22:30 UTC, 1 replies.
- Spark Structured streaming - dropDuplicates with watermark - posted by Nirmal Manoharan <ni...@gmail.com> on 2018/12/04 05:00:23 UTC, 0 replies.
- Re: Job hangs in blocked task in final parquet write stage - posted by Conrad Lee <co...@parsely.com> on 2018/12/04 08:45:39 UTC, 1 replies.
- Unsubscribe - posted by GmailLiang <li...@gmail.com> on 2018/12/04 12:41:37 UTC, 0 replies.
- Recommended Node Usage - posted by Hans Fischer <ma...@hans-fischer.com> on 2018/12/04 20:17:03 UTC, 0 replies.
- [Spark Structured Streaming] Dynamically changing maxOffsetsPerTrigger - posted by subramgr <su...@gmail.com> on 2018/12/04 21:44:23 UTC, 0 replies.
- [ANNOUNCE] Apache Bahir 2.3.0 Released - posted by Luciano Resende <lu...@gmail.com> on 2018/12/04 22:57:14 UTC, 0 replies.
- [ANNOUNCE] Apache Bahir 2.3.1 Released - posted by Luciano Resende <lu...@gmail.com> on 2018/12/04 22:57:22 UTC, 0 replies.
- [ANNOUNCE] Apache Bahir 2.3.2 Released - posted by Luciano Resende <lu...@gmail.com> on 2018/12/04 22:57:26 UTC, 0 replies.
- OData compliant API for Spark - posted by Affan Syed <as...@an10.io> on 2018/12/05 05:14:34 UTC, 2 replies.
- how to change temp directory when spark write data ? - posted by JF Chen <da...@gmail.com> on 2018/12/05 08:11:54 UTC, 2 replies.
- How to track batch jobs in spark ? - posted by kant kodali <ka...@gmail.com> on 2018/12/05 21:41:56 UTC, 5 replies.
- Spark Streaming job is missing Streaming tab from the UI on Ambari - posted by Alchemist <al...@gmail.com> on 2018/12/06 03:54:44 UTC, 0 replies.
- Join happening after watermark time - posted by Abhijeet Kumar <ab...@sentienz.com> on 2018/12/06 08:41:45 UTC, 0 replies.
- how to register UDF when scala code invoke python - posted by "mengmeng.meng@mathartsys.com" <me...@mathartsys.com> on 2018/12/06 09:04:48 UTC, 0 replies.
- How to fix spark streaming missing tab - posted by Alchemist <al...@gmail.com> on 2018/12/06 13:20:19 UTC, 1 replies.
- Spark Core - Embed in other application - posted by sparkuser99 <pr...@gmail.com> on 2018/12/07 00:23:49 UTC, 1 replies.
- Spark Structured Streaming - DF shows only one column with list of byte array - posted by salemi <al...@udo.edu> on 2018/12/07 03:05:08 UTC, 0 replies.
- In the future, will Spark support capacity scheduler in standalone mode? - posted by conner <mi...@gmail.com> on 2018/12/07 07:36:23 UTC, 0 replies.
- how to register UDF when scala code invoke python - posted by 朱 婧迪 <ji...@outlook.com> on 2018/12/07 08:56:40 UTC, 0 replies.
- Run SQL on files directly - posted by David Markovitz <Du...@microsoft.com.INVALID> on 2018/12/08 17:39:32 UTC, 2 replies.
- Identifying cause of exception in PySpark - posted by Abdeali Kothari <ab...@gmail.com> on 2018/12/10 03:40:07 UTC, 0 replies.
- Why does join use rows that were sent after watermark of 20 seconds? - posted by Abhijeet Kumar <ab...@sentienz.com> on 2018/12/10 11:53:40 UTC, 2 replies.
- Spark Sql group by less performant - posted by lsn24 <le...@gmail.com> on 2018/12/11 00:28:20 UTC, 2 replies.
- SGD for pyspark - posted by Chunpeng Wang <cp...@gmail.com> on 2018/12/11 16:29:53 UTC, 0 replies.
- Questions about caching - posted by Andrew Melo <an...@gmail.com> on 2018/12/11 17:13:33 UTC, 2 replies.
- Spark version performance - posted by Andrés Ivaldi <ia...@gmail.com> on 2018/12/12 01:57:24 UTC, 0 replies.
- How to set Spark Streaming batch start time? - posted by JF Chen <da...@gmail.com> on 2018/12/12 02:00:01 UTC, 1 replies.
- Kalman filter with spark - posted by Laurent Thiebaud <la...@gisaia.com> on 2018/12/13 12:56:17 UTC, 0 replies.
- Problem running Spark on Kubernetes: Certificate error - posted by Steven Stetzler <st...@gmail.com> on 2018/12/13 21:48:51 UTC, 2 replies.
- how to generate a larg dataset paralleled - posted by lk_spark <lk...@163.com> on 2018/12/14 02:38:14 UTC, 4 replies.
- SANSA 0.5 (Scalable Semantic Analytics Stack) Released - posted by Gezim Sejdiu <g....@gmail.com> on 2018/12/14 08:28:27 UTC, 0 replies.
- Re: Driver Memory taken up by BlockManager - posted by "Davide.Mandrini" <da...@gmail.com> on 2018/12/14 10:19:52 UTC, 0 replies.
- Continuous Processing roadmap - posted by albamoro <al...@gmail.com> on 2018/12/14 11:07:54 UTC, 0 replies.
- Structured Streaming on Kubernetes Performance - posted by Kalvin Chau <wo...@gmail.com> on 2018/12/14 17:33:41 UTC, 0 replies.
- spark2.4 arrow enabled true,error log not returned - posted by 李斌松 <li...@gmail.com> on 2018/12/15 06:39:10 UTC, 0 replies.
- Maven dependecy problem about spark-streaming-kafka_2.11:1.6.3 - posted by big data <bi...@outlook.com> on 2018/12/17 07:03:54 UTC, 0 replies.
- Mllib / kalman - posted by Laurent Thiebaud <la...@gisaia.com> on 2018/12/17 13:59:35 UTC, 1 replies.
- Spark App Write nothing on HDFS - posted by Soheil Pourbafrani <so...@gmail.com> on 2018/12/17 18:08:58 UTC, 0 replies.
- Need help with SparkSQL Query - posted by Nikhil Goyal <no...@gmail.com> on 2018/12/17 21:03:31 UTC, 2 replies.
- Spark 2.2.1 - Operation not allowed: alter table replace columns - posted by Nirav Patel <np...@xactlycorp.com> on 2018/12/17 23:10:16 UTC, 1 replies.
- How to clean up logs-dirs and local-dirs of running spark streaming in yarn cluster mode - posted by shyla deshpande <de...@gmail.com> on 2018/12/18 00:42:55 UTC, 4 replies.
- How to update structured streaming apps gracefully - posted by Yuta Morisawa <yu...@kddi-research.jp> on 2018/12/18 02:56:11 UTC, 5 replies.
- State size on joining two streams - posted by Alexander Chermenin <a....@gmail.com> on 2018/12/18 08:58:13 UTC, 0 replies.
- Spark Scala reading from Google Cloud BigQuery table throws error - posted by Mich Talebzadeh <mi...@gmail.com> on 2018/12/18 10:26:15 UTC, 4 replies.
- Add column value in the dataset on the basis of a condition - posted by Devender Yadav <de...@impetus.co.in> on 2018/12/18 13:47:56 UTC, 3 replies.
- Re: [Apache Beam] Custom DataSourceV2 instanciation: parameters passing and Encoders - posted by Etienne Chauchot <ec...@apache.org> on 2018/12/18 16:09:01 UTC, 0 replies.
- Read Time from a remote data source - posted by swastik mittal <sm...@ncsu.edu> on 2018/12/18 20:20:52 UTC, 3 replies.
- Dataset experimental interfaces - posted by Andrew Old <an...@gmail.com> on 2018/12/18 20:54:44 UTC, 0 replies.
- Multiple sessions in one application? - posted by Jean Georges Perrin <jg...@jgp.net> on 2018/12/19 11:12:58 UTC, 2 replies.
- Spark Kafka Streaming with Offset Gaps - posted by Rishabh Pugalia <ri...@gmail.com> on 2018/12/19 11:40:49 UTC, 1 replies.
- [Spark Core] Support for parquet column indexes - posted by Kamil Krzysztof Krynicki <ka...@cern.ch> on 2018/12/19 13:09:46 UTC, 0 replies.
- - posted by Daniel O' Shaughnessy <da...@gmail.com> on 2018/12/19 13:59:19 UTC, 0 replies.
- Re: question about barrier execution mode in Spark 2.4.0 - posted by Xiangrui Meng <me...@gmail.com> on 2018/12/19 17:21:26 UTC, 0 replies.
- Fwd: Train multiple machine learning models in parallel - posted by Pola Yao <po...@gmail.com> on 2018/12/19 23:55:27 UTC, 0 replies.
- [Spark SQL]use zstd, No enum constant parquet.hadoop.metadata.CompressionCodecName.ZSTD - posted by 李斌松 <li...@gmail.com> on 2018/12/20 03:38:00 UTC, 2 replies.
- Spark not working with Hadoop 4mc compression - posted by Abhijeet Kumar <ab...@sentienz.com> on 2018/12/20 06:03:28 UTC, 1 replies.
- [SPARK SQL] Difference between 'Hive on spark' and Spark SQL - posted by lu...@china-inv.cn on 2018/12/20 07:17:01 UTC, 1 replies.
- Re: Spark job on dataproc failing with Exception in thread "main" java.lang.NoSuchMethodError: com.googl - posted by Mich Talebzadeh <mi...@gmail.com> on 2018/12/20 10:40:12 UTC, 1 replies.
- [Spark cluster standalone v2.4.0] - problems with reverse proxy functionnality regarding submitted applications in cluster mode and the spark history server ui - posted by Cheikh_SOW <ch...@live.fr> on 2018/12/20 16:42:43 UTC, 0 replies.
- Custom Metric Sink on Executor Always ClassNotFound - posted by prosp4300 <pr...@163.com> on 2018/12/20 21:47:57 UTC, 3 replies.
- running updates using SPARK - posted by Gourav Sengupta <go...@gmail.com> on 2018/12/20 22:05:54 UTC, 4 replies.
- Connection issue with AWS S3 from PySpark 2.3.1 - posted by Aakash Basu <aa...@gmail.com> on 2018/12/21 06:28:33 UTC, 11 replies.
- Spark 2 - How to order keys in sparse vector (K-means)? - posted by ddebarbieux <dd...@norsys.fr> on 2018/12/21 14:15:54 UTC, 0 replies.
- Spark executors exceeding heap space allocated - posted by Akshay Mendole <ak...@gmail.com> on 2018/12/21 15:47:09 UTC, 0 replies.
- java.lang.NumberFormatException: Not a version: 9 after I add spark.cassandra.connection.host property to Spark Interpreter - posted by shyla deshpande <de...@gmail.com> on 2018/12/21 20:22:00 UTC, 0 replies.
- Powered By Spark - posted by Ascot Moss <as...@gmail.com> on 2018/12/22 08:13:11 UTC, 0 replies.
- Async action in Dataframe - posted by JiaTao Tao <ta...@gmail.com> on 2018/12/22 08:47:41 UTC, 2 replies.
- About LocalProperty in sqlConf - posted by JiaTao Tao <ta...@gmail.com> on 2018/12/22 09:16:09 UTC, 0 replies.
- Connecting to Cassandra from Zeppelin on EMR cluster - posted by shyla deshpande <de...@gmail.com> on 2018/12/22 16:02:29 UTC, 1 replies.
- Getting FileNotFoundException and LeaseExpired Exception while writing a df to hdfs path - posted by Gaurav Gupta <gg...@gmail.com> on 2018/12/24 20:04:03 UTC, 0 replies.
- Packaging kafka certificates in uber jar - posted by Colin Williams <co...@gmail.com> on 2018/12/24 20:29:05 UTC, 2 replies.
- spark application takes significant some time to succeed even after all jobs are completed - posted by Akshay Mendole <ak...@gmail.com> on 2018/12/25 11:51:54 UTC, 3 replies.
- Tuning G1GC params for aggressive garbage collection? - posted by Akshay Mendole <ak...@gmail.com> on 2018/12/25 11:57:55 UTC, 2 replies.
- Spark Dataset transformations for time based events - posted by Debajyoti Roy <ne...@gmail.com> on 2018/12/26 07:34:29 UTC, 0 replies.
- Corrupt record handling in spark structured streaming and from_json function - posted by Colin Williams <co...@gmail.com> on 2018/12/26 21:55:13 UTC, 2 replies.
- jdbc spark streaming - posted by Nicolas Paris <ni...@riseup.net> on 2018/12/27 22:52:21 UTC, 3 replies.
- How do you set POSIX rlimit on mesos - posted by FengYu Cao <ca...@gmail.com> on 2018/12/28 03:04:40 UTC, 0 replies.
- What are the alternatives to nested DataFrames? - posted by em...@yeikel.com on 2018/12/28 07:40:36 UTC, 6 replies.
- Spark Kinesis Connector SSL issue - posted by Shashikant Bangera <sh...@discover.com> on 2018/12/28 14:13:58 UTC, 0 replies.
- [spark-sql] Hive failing on insert empty array into parquet table - posted by 李斌松 <li...@gmail.com> on 2018/12/29 08:08:25 UTC, 1 replies.
- Postgres Read JDBC with COPY TO STDOUT - posted by Nicolas Paris <ni...@riseup.net> on 2018/12/29 12:06:00 UTC, 1 replies.
- Count() not working on streaming dataframe/structured streaming - posted by Ritesh Shah <RS...@TechMahindra.com> on 2018/12/30 05:17:32 UTC, 0 replies.
- p-values logistic regression - posted by Simon Dirmeier <si...@web.de> on 2018/12/30 14:37:03 UTC, 0 replies.
- Python 3.x - posted by Gourav Sengupta <go...@gmail.com> on 2018/12/30 15:25:00 UTC, 0 replies.
- Spark jdbc postgres numeric array - posted by Alexey <al...@i.ua> on 2018/12/31 15:13:40 UTC, 0 replies.
- Do spark-submit overwrite the Spark session created manually? - posted by em...@yeikel.com on 2018/12/31 21:57:10 UTC, 0 replies.