- NullPointerException when scanning HBase table - posted by Huiliang Zhang <zh...@gmail.com> on 2018/05/01 02:05:02 UTC, 0 replies.
- Re: Spark launcher listener not getting invoked k8s Spark 2.3 - posted by Marcelo Vanzin <va...@cloudera.com> on 2018/05/01 03:17:26 UTC, 0 replies.
- Filter one dataset based on values from another - posted by lsn24 <le...@gmail.com> on 2018/05/01 04:02:40 UTC, 2 replies.
- UnresolvedException: Invalid call to dataType on unresolved object - posted by 880f0464 <88...@protonmail.com> on 2018/05/01 11:14:47 UTC, 0 replies.
- spark.python.worker.reuse not working as expected - posted by 880f0464 <88...@protonmail.com> on 2018/05/01 11:17:21 UTC, 0 replies.
- PySpark.sql.filter not performing as it should - posted by 880f0464 <88...@protonmail.com> on 2018/05/01 11:19:07 UTC, 0 replies.
- org.apache.spark.shuffle.FetchFailedException: Too large frame: - posted by Pralabh Kumar <pr...@gmail.com> on 2018/05/01 11:21:01 UTC, 3 replies.
- all calculations finished, but "VCores Used" value remains at its max - posted by Valery Khamenya <kh...@gmail.com> on 2018/05/01 11:30:24 UTC, 1 replies.
- Re: Dataframe vs dataset - posted by "Lalwani, Jayesh" <Ja...@capitalone.com> on 2018/05/01 12:27:29 UTC, 1 replies.
- ApacheCon North America 2018 schedule is now live. - posted by Rich Bowen <rb...@apache.org> on 2018/05/01 12:36:05 UTC, 0 replies.
- smarter way to "forget" DataFrame definition and stick to its values - posted by Valery Khamenya <kh...@gmail.com> on 2018/05/01 13:16:23 UTC, 1 replies.
- Fast Unit Tests - posted by marcos rebelo <ol...@gmail.com> on 2018/05/01 15:25:01 UTC, 2 replies.
- Re: [EXT] [Spark 2.x Core] .collect() size limit - posted by klrmowse <kl...@gmail.com> on 2018/05/01 15:49:36 UTC, 0 replies.
- keep getting empty table while using saveAsTable() to save DataFrame as table - posted by nicholasl <yo...@gmail.com> on 2018/05/01 18:51:08 UTC, 0 replies.
- Poor performance reading Hive table made of sequence files - posted by Patrick McCarthy <pm...@dstillery.com> on 2018/05/01 20:36:00 UTC, 0 replies.
- [Spark Streaming]: Does DStream workload run over Spark SQL engine? - posted by Khaled Zaouk <kh...@gmail.com> on 2018/05/02 08:51:16 UTC, 1 replies.
- [Spark scheduling] Spark schedules single task although rdd has 48 partitions? - posted by Paul Borgmans <pa...@asml.com> on 2018/05/02 09:31:51 UTC, 0 replies.
- spark.executor.extraJavaOptions inside application code - posted by Agostino Calamita <ag...@gmail.com> on 2018/05/02 10:59:38 UTC, 1 replies.
- what is the query language used for graphX? - posted by kant kodali <ka...@gmail.com> on 2018/05/02 11:00:23 UTC, 0 replies.
- Dataset Caching and Unpersisting - posted by Daniele Foroni <da...@gmail.com> on 2018/05/02 11:19:21 UTC, 0 replies.
- Re: ML Linear and Logistic Regression - Poor Performance - posted by Irving Duran <ir...@gmail.com> on 2018/05/02 14:15:08 UTC, 0 replies.
- (no subject) - posted by Filippo Balicchia <fb...@gmail.com> on 2018/05/02 15:03:14 UTC, 5 replies.
- Re: Problem in persisting file in S3 using Spark: xxx file does not exist Exception - posted by Paul Tremblay <pa...@gmail.com> on 2018/05/02 15:56:57 UTC, 1 replies.
- Running apps over a VPN - posted by Christopher Piggott <cp...@gmail.com> on 2018/05/02 15:57:37 UTC, 0 replies.
- Re: Uncaught exception in thread heartbeat-receiver-event-loop-thread - posted by ccherng <cc...@snapchat.com> on 2018/05/02 20:34:07 UTC, 1 replies.
- ConcurrentModificationException - posted by ccherng <cc...@snapchat.com> on 2018/05/02 20:47:20 UTC, 0 replies.
- AccumulatorV2 vs AccumulableParam (V1) - posted by Sergey Zhemzhitsky <sz...@gmail.com> on 2018/05/02 22:20:35 UTC, 2 replies.
- MappingException - org.apache.spark.mllib.classification.LogisticRegressionModel.load - posted by Mina Aslani <as...@gmail.com> on 2018/05/03 03:05:28 UTC, 0 replies.
- question on collect_list or say aggregations in general in structured streaming 2.3.0 - posted by kant kodali <ka...@gmail.com> on 2018/05/03 08:24:33 UTC, 3 replies.
- native-lzo library not available - posted by Fawze Abujaber <fa...@gmail.com> on 2018/05/03 12:06:15 UTC, 3 replies.
- Pickling Keras models for use in UDFs - posted by erp12 <ed...@gmail.com> on 2018/05/03 15:26:57 UTC, 2 replies.
- Re: Read or save specific blocks of a file - posted by Thodoris Zois <zo...@ics.forth.gr> on 2018/05/03 15:46:00 UTC, 1 replies.
- [Structured streaming, V2] commit on ContinuousReader - posted by Jiří Syrový <sy...@gmail.com> on 2018/05/03 17:43:37 UTC, 0 replies.
- SparkContext taking time after adding jars and asking yarn for resources - posted by neeravsalaria <ne...@gmail.com> on 2018/05/04 09:01:52 UTC, 0 replies.
- I cannot use spark 2.3.0 and kafka 0.9? - posted by kant kodali <ka...@gmail.com> on 2018/05/04 09:02:28 UTC, 1 replies.
- [pyspark] Read multiple files parallely into a single dataframe - posted by Shuporno Choudhury <sh...@gmail.com> on 2018/05/04 09:38:26 UTC, 1 replies.
- Free Column Reference with $ - posted by Christopher Piggott <cp...@gmail.com> on 2018/05/04 14:10:40 UTC, 1 replies.
- Fwd: Unintelligible warning arose out of the blue. - posted by Tomas Zubiri <tz...@prokarma.com> on 2018/05/04 20:15:49 UTC, 1 replies.
- Advice on multiple streaming job - posted by Dhaval Modi <dh...@gmail.com> on 2018/05/05 15:09:53 UTC, 7 replies.
- help needed in performance improvement of spark structured streaming - posted by amit kumar singh <am...@gmail.com> on 2018/05/05 16:20:34 UTC, 1 replies.
- Spark with HBase on Spark Runtime 2.2.1 - posted by SparkUser6 <al...@gmail.com> on 2018/05/05 23:12:39 UTC, 0 replies.
- Unable to Connect to Apache Phoenix From Spark - posted by SparkUser6 <al...@gmail.com> on 2018/05/06 01:40:01 UTC, 0 replies.
- stage blocked sometimes - posted by 付涛 <78...@qq.com> on 2018/05/06 14:22:37 UTC, 0 replies.
- [beginner][StructuredStreaming] Null pointer exception - possible serialization errors. - posted by karthikjay <as...@gmail.com> on 2018/05/07 00:17:19 UTC, 0 replies.
- Watch Zookeeper in Spark Closure - posted by 王 纯超 <wa...@outlook.com> on 2018/05/07 08:38:18 UTC, 0 replies.
- Re: Spark UI Source Code - posted by Marcelo Vanzin <va...@cloudera.com> on 2018/05/07 16:20:54 UTC, 1 replies.
- Guava dependency issue - posted by Stephen Boesch <ja...@gmail.com> on 2018/05/07 17:30:17 UTC, 3 replies.
- Spark 2.3.0 DataFrame.write.parquet() behavior change from 2.2.0 - posted by Victor Tso-Guillen <vt...@paxata.com> on 2018/05/07 19:18:16 UTC, 2 replies.
- Best place to persist offsets into Zookeeper - posted by ravidspark <ra...@gmail.com> on 2018/05/07 23:53:05 UTC, 0 replies.
- Error submitting Spark Job in yarn-cluster mode on EMR - posted by SparkUser6 <al...@gmail.com> on 2018/05/08 09:14:55 UTC, 1 replies.
- Help Required - Unable to run spark-submit on YARN client mode - posted by Debabrata Ghosh <ma...@gmail.com> on 2018/05/08 11:35:48 UTC, 1 replies.
- Spark 2.3.0 Structured Streaming Kafka Timestamp - posted by Yuta Morisawa <yu...@kddi-research.jp> on 2018/05/09 07:14:36 UTC, 2 replies.
- Spark 2.3.0 --files vs. addFile() - posted by Marius <m....@gmail.com> on 2018/05/09 07:51:04 UTC, 1 replies.
- Malformed URL Exception when connecting to Phoenix to Spark - posted by Alchemist <al...@gmail.com> on 2018/05/09 12:38:52 UTC, 0 replies.
- Invalid Spark URL: spark://HeartbeatReceiver@hostname - posted by Serkan TAS <Se...@enerjisa.com> on 2018/05/09 14:06:53 UTC, 0 replies.
- Fwd: Array[Double] two times slower than DenseVector - posted by David Ignjić <ig...@gmail.com> on 2018/05/09 14:23:56 UTC, 0 replies.
- Livy Failed error on Yarn with Spark - posted by Chetan Khatri <ch...@gmail.com> on 2018/05/09 20:18:19 UTC, 1 replies.
- Problem with Spark Master shutting down when zookeeper leader is shutdown - posted by agateaaa <ag...@gmail.com> on 2018/05/09 20:50:02 UTC, 0 replies.
- [Structured-Streaming][Beginner] Out of order messages with Spark kafka readstream from a specific partition - posted by karthikjay <as...@gmail.com> on 2018/05/10 00:05:02 UTC, 1 replies.
- Making spark streaming application single threaded - posted by ravidspark <ra...@gmail.com> on 2018/05/10 00:10:59 UTC, 0 replies.
- AWS credentials needed while trying to read a model from S3 in Spark - posted by Mina Aslani <as...@gmail.com> on 2018/05/10 01:34:04 UTC, 1 replies.
- [Spark] Supporting python 3.5? - posted by Irving Duran <ir...@gmail.com> on 2018/05/10 13:08:10 UTC, 1 replies.
- Accumulator guarantees - posted by Sergey Zhemzhitsky <sz...@gmail.com> on 2018/05/10 19:24:00 UTC, 1 replies.
- java.lang.NullPointerException - posted by Mina Aslani <as...@gmail.com> on 2018/05/11 02:14:13 UTC, 0 replies.
- UDTF registration fails for hiveEnabled SQLContext - posted by Mick Davies <Mi...@gmail.com> on 2018/05/11 08:22:31 UTC, 5 replies.
- Oozie with spark 2.3 in Kubernetes - posted by purna pradeep <pu...@gmail.com> on 2018/05/11 18:18:24 UTC, 0 replies.
- SPARK SQL: returns null for a column, while HIVE query returns data for the same column - posted by ARAVIND ARUMUGHAM Sethurathnam <ar...@gmail.com> on 2018/05/11 19:24:42 UTC, 1 replies.
- ordered ingestion not guaranteed - posted by ravidspark <ra...@gmail.com> on 2018/05/11 21:25:47 UTC, 2 replies.
- Dataset error with Encoder - posted by Masf <ma...@gmail.com> on 2018/05/12 13:17:20 UTC, 0 replies.
- Spark Structured Streaming is giving error “org.apache.spark.sql.AnalysisException: Inner join between two streaming DataFrames/Datasets is not supported;” - posted by ThomasThomas <th...@gmail.com> on 2018/05/12 14:57:33 UTC, 5 replies.
- Having issues when running spark with s3 - posted by Shivam Sharma <28...@gmail.com> on 2018/05/12 20:40:06 UTC, 0 replies.
- Measure performance time in some spark transformations. - posted by Guillermo Ortiz Fernández <gu...@gmail.com> on 2018/05/12 22:31:30 UTC, 1 replies.
- [Arrow][Dremio] - posted by xmehaut <xa...@gmail.com> on 2018/05/14 04:53:21 UTC, 5 replies.
- assertion failed: Beginning offset 34242088 is after the ending offset 34242084 for topic partition 2. You either provided an invalid fromOffset, or the Kafka topic has been damaged - posted by ravidspark <ra...@gmail.com> on 2018/05/14 16:08:13 UTC, 0 replies.
- How to use StringIndexer for multiple input /output columns in Spark Java - posted by Mina Aslani <as...@gmail.com> on 2018/05/14 20:30:50 UTC, 4 replies.
- spark sql StackOverflow - posted by onmstester onmstester <on...@zoho.com> on 2018/05/15 06:32:46 UTC, 3 replies.
- Spark streaming with kafka input stuck in (Re-)joining group because of group rebalancing - posted by JF Chen <da...@gmail.com> on 2018/05/15 08:15:28 UTC, 0 replies.
- What to consider when implementing a custom streaming sink? - posted by kant kodali <ka...@gmail.com> on 2018/05/15 10:05:42 UTC, 0 replies.
- Scala's Seq:* equivalent in java - posted by onmstester onmstester <on...@zoho.com> on 2018/05/15 10:29:39 UTC, 1 replies.
- Spark structured streaming aggregation within microbatch - posted by Koert Kuipers <ko...@tresata.com> on 2018/05/15 12:20:43 UTC, 0 replies.
- Structured Streaming, Reading and Updating a variable - posted by Martin Engen <Ma...@outlook.com> on 2018/05/15 13:23:29 UTC, 3 replies.
- OneHotEncoderEstimator - java.lang.NoSuchMethodError: org.apache.spark.sql.Dataset.withColumns - posted by Mina Aslani <as...@gmail.com> on 2018/05/15 14:58:13 UTC, 2 replies.
- Sklearn model in pyspark prediction - posted by HARSH TAKKAR <ta...@gmail.com> on 2018/05/15 16:38:59 UTC, 0 replies.
- java.lang.IllegalArgumentException: requirement failed: BLAS.dot(x: Vector, y:Vector) was given Vectors with non-matching sizes - posted by Mina Aslani <as...@gmail.com> on 2018/05/15 17:53:36 UTC, 0 replies.
- Continuous Processing mode behaves differently from Batch mode - posted by Yuta Morisawa <yu...@kddi-research.jp> on 2018/05/16 00:38:48 UTC, 2 replies.
- [structured-streaming][kafka] Will the Kafka readstream timeout after connections.max.idle.ms 540000 ms ? - posted by karthikjay <as...@gmail.com> on 2018/05/16 01:25:52 UTC, 1 replies.
- Interest in adding ability to request GPU's to the spark client? - posted by Daniel Galvez <dt...@gmail.com> on 2018/05/16 05:58:06 UTC, 0 replies.
- [Java] impact of java 10 on spark dev - posted by xmehaut <xa...@gmail.com> on 2018/05/16 06:20:42 UTC, 1 replies.
- OOM: Structured Streaming aggregation state not cleaned up properly - posted by weand <an...@gmail.com> on 2018/05/16 09:55:37 UTC, 1 replies.
- DataFrame save as table operation is failing when the child column names contain special characters - posted by abhijeet bedagkar <qa...@gmail.com> on 2018/05/16 12:43:21 UTC, 1 replies.
- Structured Streaming Job stops abruptly and No errors logged - posted by prudhviraj202m <pr...@gmail.com> on 2018/05/16 17:32:10 UTC, 0 replies.
- Unsubscribe - posted by varma dantuluri <dv...@gmail.com> on 2018/05/16 20:39:30 UTC, 0 replies.
- Submit many spark applications - posted by Shiyuan <gs...@gmail.com> on 2018/05/16 20:45:20 UTC, 11 replies.
- PySpark Structured Streaming - using previous iteration computed results in current iteration - posted by Ofer Eliassaf <of...@gmail.com> on 2018/05/17 05:18:37 UTC, 0 replies.
- Spark Jobs ends when assignment not found for Kafka Partition - posted by Biplob Biswas <re...@gmail.com> on 2018/05/17 08:28:44 UTC, 0 replies.
- Snappy file compatible problem with spark - posted by JF Chen <da...@gmail.com> on 2018/05/17 08:59:32 UTC, 1 replies.
- [structured-streaming] foreachPartition alternative in structured streaming. - posted by karthikjay <as...@gmail.com> on 2018/05/17 20:30:16 UTC, 0 replies.
- Getting Data From Hbase using Spark is Extremely Slow - posted by SparkUser6 <al...@gmail.com> on 2018/05/18 00:21:26 UTC, 0 replies.
- Using Apache Kylin as data source for Spark - posted by ShaoFeng Shi <sh...@apache.org> on 2018/05/18 03:56:20 UTC, 1 replies.
- Exception thrown in awaitResult during application launch in yarn cluster - posted by Shiyuan <gs...@gmail.com> on 2018/05/18 07:12:40 UTC, 0 replies.
- How to Spark can solve this example - posted by Esa Heikkinen <es...@student.tut.fi> on 2018/05/18 07:20:54 UTC, 3 replies.
- Understanding the results from Spark's KMeans clustering object - posted by shubham <sh...@gmail.com> on 2018/05/18 07:53:20 UTC, 0 replies.
- Spark on YARN in client-mode: do we need 1 vCore for the AM? - posted by peay <pe...@protonmail.com> on 2018/05/18 10:20:27 UTC, 1 replies.
- RDD does not have sc error - posted by Chao Fang <cf...@aliyun.com> on 2018/05/18 15:37:35 UTC, 0 replies.
- Spark job terminated without any errors - posted by karthikjay <as...@gmail.com> on 2018/05/18 20:44:37 UTC, 0 replies.
- XGBoost on PySpark - posted by Aakash Basu <aa...@gmail.com> on 2018/05/19 06:51:06 UTC, 1 replies.
- Spark is not evenly distributing data - posted by SparkUser6 <al...@gmail.com> on 2018/05/19 19:52:03 UTC, 1 replies.
- Re: OOM: Structured Streaming aggregation state not cleaned up properly - posted by Ted Yu <yu...@gmail.com> on 2018/05/19 20:40:54 UTC, 6 replies.
- Spark UNEVENLY distributing data - posted by Alchemist <al...@gmail.com> on 2018/05/19 22:40:07 UTC, 1 replies.
- Does Spark shows logical or physical plan when executing job on the yarn cluster - posted by giri ar <gi...@gmail.com> on 2018/05/20 07:21:48 UTC, 1 replies.
- is it possible to create one KafkaDirectStream (Dstream) per topic? - posted by kant kodali <ka...@gmail.com> on 2018/05/20 22:24:39 UTC, 1 replies.
- Re: [Spark2.1] SparkStreaming to Cassandra performance problem - posted by Saulo Sobreiro <sa...@outlook.pt> on 2018/05/21 02:34:20 UTC, 3 replies.
- How to skip nonexistent file when read files with spark? - posted by JF Chen <da...@gmail.com> on 2018/05/21 03:30:33 UTC, 6 replies.
- Executors slow down when running on the same node - posted by Javier Pareja <pa...@gmail.com> on 2018/05/21 08:53:40 UTC, 0 replies.
- Adding jars - posted by Malveeka Bhandari <ma...@gmail.com> on 2018/05/21 09:23:57 UTC, 5 replies.
- testing frameworks - posted by Steve Pruitt <bp...@opentext.com> on 2018/05/21 12:24:40 UTC, 4 replies.
- help in copying data from one azure subscription to another azure subscription - posted by amit kumar singh <am...@gmail.com> on 2018/05/21 12:59:04 UTC, 1 replies.
- Spark horizontal scaling is not supported in which cluster mode? Ask - posted by unk1102 <um...@gmail.com> on 2018/05/21 16:29:34 UTC, 1 replies.
- Spark Worker Re-register to Master - posted by "sushil.chaudhary" <su...@capitalone.com> on 2018/05/21 19:04:50 UTC, 2 replies.
- Encounter 'Could not find or load main class' error when submitting spark job on kubernetes - posted by Makoto Hashimoto <to...@gmail.com> on 2018/05/22 07:45:33 UTC, 1 replies.
- problem with saving RandomForestClassifier model - Spark Java - posted by Donni Khan <pr...@googlemail.com> on 2018/05/22 09:08:19 UTC, 0 replies.
- [structured-streaming]How to reset Kafka offset in readStream and read from beginning - posted by karthikjay <as...@gmail.com> on 2018/05/22 14:24:45 UTC, 3 replies.
- Spark driver pod eviction Kubernetes - posted by purna pradeep <pu...@gmail.com> on 2018/05/22 14:55:18 UTC, 1 replies.
- RE: [EXTERNAL] - Re: testing frameworks - posted by Steve Pruitt <bp...@opentext.com> on 2018/05/22 17:48:44 UTC, 1 replies.
- How to validate orc vectorization is working within spark application? - posted by umargeek <um...@gmail.com> on 2018/05/23 04:59:26 UTC, 0 replies.
- spark sql in-clause problem - posted by onmstester onmstester <on...@zoho.com> on 2018/05/23 05:02:44 UTC, 2 replies.
- Alternative for numpy in Spark MLlib - posted by umargeek <um...@gmail.com> on 2018/05/23 05:04:36 UTC, 1 replies.
- [Beginner][StructuredStreaming] Console sink is not working as expected - posted by karthikjay <as...@gmail.com> on 2018/05/23 05:54:22 UTC, 0 replies.
- [Beginner][StructuredStreaming] Using Spark aggregation - WithWatermark on old data - posted by karthikjay <as...@gmail.com> on 2018/05/23 06:46:04 UTC, 1 replies.
- Bulk / Fast Read and Write with MSSQL Server and Spark - posted by Chetan Khatri <ch...@gmail.com> on 2018/05/23 11:46:59 UTC, 8 replies.
- Spark driver pod garbage collection - posted by purna pradeep <pu...@gmail.com> on 2018/05/23 15:33:56 UTC, 1 replies.
- Cannot make Spark to honour the spark.jars.ivySettings config - posted by Bruno Aranda <ba...@apache.org> on 2018/05/23 18:16:44 UTC, 0 replies.
- CMR: An open-source Data acquisition API for Spark is available - posted by Thomas Fuller <th...@coherentlogic.com> on 2018/05/23 19:06:38 UTC, 0 replies.
- PySpark API on top of Apache Arrow - posted by Corey Nolet <cj...@gmail.com> on 2018/05/23 20:30:08 UTC, 3 replies.
- Write data from Hbase using Spark Failing with NPE - posted by Alchemist <al...@gmail.com> on 2018/05/24 03:47:59 UTC, 0 replies.
- Time series data - posted by amin mohebbi <am...@yahoo.com.INVALID> on 2018/05/24 06:49:52 UTC, 2 replies.
- Positive log-likelihood with Gaussian mixture - posted by Simon Dirmeier <si...@web.de> on 2018/05/24 07:19:43 UTC, 3 replies.
- Streaming : WAL ignored - posted by Walid Lezzar <wa...@gmail.com> on 2018/05/24 11:01:59 UTC, 0 replies.
- re: help with streaming batch interval question needed - posted by Peter Liu <pe...@gmail.com> on 2018/05/24 20:14:47 UTC, 2 replies.
- Why Spark JDBC Writing in a sequential order - posted by Yong Zhang <ja...@hotmail.com> on 2018/05/25 14:42:40 UTC, 2 replies.
- Databricks 1/2 day certification course at Spark Summit - posted by Sumona Routh <su...@gmail.com> on 2018/05/25 17:34:01 UTC, 0 replies.
- [Query] Weight of evidence on Spark - posted by Aakash Basu <aa...@gmail.com> on 2018/05/25 17:42:23 UTC, 0 replies.
- what defines dataset partition number in spark sql - posted by 崔苗 <cu...@danale.com> on 2018/05/26 06:24:09 UTC, 0 replies.
- Spark 2.3 Memory Leak on Executor - posted by Aakash Basu <aa...@gmail.com> on 2018/05/26 12:43:22 UTC, 0 replies.
- Spark 2.3 Tree Error - posted by Aakash Basu <aa...@gmail.com> on 2018/05/26 12:46:00 UTC, 3 replies.
- Silly question on Dropping Temp Table - posted by Aakash Basu <aa...@gmail.com> on 2018/05/26 15:26:53 UTC, 2 replies.
- Aggregation of Streaming UI Statistics for multiple jobs - posted by skmishra <si...@gmail.com> on 2018/05/27 04:48:06 UTC, 1 replies.
- Spark AsyncEventQueue doubt - posted by Aakash Basu <aa...@gmail.com> on 2018/05/27 07:45:54 UTC, 1 replies.
- Big data visualization - posted by amin mohebbi <am...@yahoo.com.INVALID> on 2018/05/28 00:17:41 UTC, 1 replies.
- Execution model in Spark - posted by Esa Heikkinen <es...@student.tut.fi> on 2018/05/28 07:19:30 UTC, 0 replies.
- Name error when writing data as orc - posted by JF Chen <da...@gmail.com> on 2018/05/28 09:49:49 UTC, 0 replies.
- Error on fetching mass data from cassandra using SparkSQL - posted by Soheil Pourbafrani <so...@gmail.com> on 2018/05/28 12:22:41 UTC, 0 replies.
- trying to understand structured streaming aggregation with watermark and append outputmode - posted by Koert Kuipers <ko...@tresata.com> on 2018/05/28 22:16:49 UTC, 3 replies.
- Pandas UDF for PySpark error. Big Dataset - posted by Traku traku <tr...@gmail.com> on 2018/05/28 23:22:23 UTC, 1 replies.
- GroupBy in Spark / Scala without Agg functions - posted by Chetan Khatri <ch...@gmail.com> on 2018/05/29 18:21:39 UTC, 5 replies.
- Spark 2.3 error on kubernetes - posted by "Mamillapalli, Purna Pradeep" <Pu...@capitalone.com> on 2018/05/29 22:13:18 UTC, 1 replies.
- Spark 2.3 error on Kubernetes - posted by purna pradeep <pu...@gmail.com> on 2018/05/29 22:18:49 UTC, 3 replies.
- Re: GroupBy in Spark / Scala without Agg functions - posted by Linyuxin <li...@huawei.com> on 2018/05/30 00:51:23 UTC, 1 replies.
- Data is not getting written in sorted format on target oracle table through SPARK - posted by abhijeet bedagkar <qa...@gmail.com> on 2018/05/30 09:29:04 UTC, 1 replies.
- Apache Spark is not working as expected - posted by remil <re...@gmail.com> on 2018/05/30 13:04:57 UTC, 0 replies.
- Blockmgr directories intermittently not being cleaned up - posted by Jeff Frylings <je...@oracle.com> on 2018/05/30 15:49:44 UTC, 2 replies.
- Thrift server not exposing temp tables (spark.sql.hive.thriftServer.singleSession=true) - posted by Daniel Haviv <da...@gmail.com> on 2018/05/30 16:13:39 UTC, 0 replies.
- Error while creating table with space with/without partition - posted by abhijeet bedagkar <qa...@gmail.com> on 2018/05/30 16:15:23 UTC, 0 replies.
- Closing IPC connection - posted by Arun Hive <ar...@yahoo.com.INVALID> on 2018/05/30 17:55:45 UTC, 0 replies.
- Re: Unable to alter partition. The transaction for alter partition did not commit successfully. - posted by Arun Hive <ar...@yahoo.com.INVALID> on 2018/05/30 17:58:24 UTC, 1 replies.
- Partition column appname not found in schema StructType(....) - posted by JF Chen <da...@gmail.com> on 2018/05/31 02:50:36 UTC, 0 replies.
- How can we group by messages coming in per batch of structured streaming - posted by amit kumar singh <am...@gmail.com> on 2018/05/31 02:52:34 UTC, 0 replies.
- Limit on the number of Jobs per Application - posted by Jeremy Davis <je...@speakeasy.net> on 2018/05/31 05:11:15 UTC, 0 replies.
- [PySpark Pipeline XGboost] How to use XGboost in PySpark Pipeline - posted by Daniel Du <yu...@usc.edu> on 2018/05/31 08:25:40 UTC, 0 replies.
- Fastest way to drop useless columns - posted by ju...@free.fr on 2018/05/31 08:34:51 UTC, 4 replies.
- [Suggestions needed] Weight of Evidence PySpark - posted by Aakash Basu <aa...@gmail.com> on 2018/05/31 09:50:45 UTC, 0 replies.
- [Help] PySpark Dynamic mean calculation - posted by Aakash Basu <aa...@gmail.com> on 2018/05/31 10:10:49 UTC, 1 replies.
- Apache Spark Installation error - posted by Remil Mohanan <re...@gmail.com> on 2018/05/31 13:59:58 UTC, 1 replies.
- Is Spark DataFrame limit function action or transformation? - posted by unk1102 <um...@gmail.com> on 2018/05/31 17:53:39 UTC, 0 replies.
- REMINDER: Apache EU Roadshow 2018 in Berlin is less than 2 weeks away! - posted by sh...@apache.org on 2018/05/31 20:54:32 UTC, 0 replies.
- Spark Task Failure due to OOM and subsequently task finishes - posted by sparknewbie1 <ka...@gmail.com> on 2018/05/31 22:21:06 UTC, 0 replies.