user@spark.apache.org, 2017-11

- Spark application fail wit numRecords error - posted by Serkan TAS <Se...@enerjisa.com> on 2017/11/01 06:10:52 UTC, 2 replies.
- Logistic regression in Spark TestCase - posted by cjn <19...@qq.com> on 2017/11/01 07:05:06 UTC, 0 replies.
- Re: Read parquet files as buckets - posted by Michael Artz <mi...@gmail.com> on 2017/11/01 12:41:14 UTC, 0 replies.
- Announcing Spark on Kubernetes release 0.5.0 - posted by Yinan Li <li...@gmail.com> on 2017/11/01 16:49:33 UTC, 0 replies.
- Writing custom Structured Streaming receiver - posted by Daniel Haviv <da...@gmail.com> on 2017/11/01 17:45:03 UTC, 4 replies.
- Fwd: Dose pyspark supports python3.6？ - posted by Jun Shi <ju...@gmail.com> on 2017/11/02 02:54:17 UTC, 2 replies.
- Spark as ETL, was: Re: Dose pyspark supports python3.6？ - posted by JG Perrin <jp...@lumeris.com> on 2017/11/02 13:38:07 UTC, 0 replies.
- Re: share datasets across multiple spark-streaming applications for lookup - posted by JG Perrin <jp...@lumeris.com> on 2017/11/02 13:44:55 UTC, 0 replies.
- Getting Message From Structured Streaming Format Kafka - posted by Daniel de Oliveira Mantovani <da...@gmail.com> on 2017/11/02 15:36:34 UTC, 1 replies.
- Change the owner of hdfs file being saved - posted by Sunita Arvind <su...@gmail.com> on 2017/11/02 16:35:41 UTC, 0 replies.
- Re: Chaining Spark Streaming Jobs - posted by Sunita Arvind <su...@gmail.com> on 2017/11/02 16:54:14 UTC, 0 replies.
- Re: How to get the data url - posted by Jean Georges Perrin <jp...@lumeris.com> on 2017/11/03 10:48:51 UTC, 1 replies.
- Re: Hi all, - posted by Jean Georges Perrin <jp...@lumeris.com> on 2017/11/03 10:48:52 UTC, 1 replies.
- Re: Regarding column partitioning IDs and names as per hierarchical level SparkSQL - posted by Jean Georges Perrin <jp...@lumeris.com> on 2017/11/03 10:48:55 UTC, 1 replies.
- pyspark configuration with Juyter - posted by anudeep <an...@gmail.com> on 2017/11/03 11:31:03 UTC, 3 replies.
- unable to run spark streaming example - posted by Imran Rajjad <ra...@gmail.com> on 2017/11/03 13:51:03 UTC, 0 replies.
- spark-avro aliases incompatible - posted by Gaspar Muñoz <gm...@datiobd.com> on 2017/11/05 09:03:01 UTC, 4 replies.
- Re: Hive From Spark: Jdbc VS sparkContext - posted by Nicolas Paris <ni...@gmail.com> on 2017/11/05 12:57:56 UTC, 9 replies.
- Building Spark with hive 1.1.0 - posted by HARSH TAKKAR <ta...@gmail.com> on 2017/11/06 12:44:23 UTC, 0 replies.
- pySpark driver memory limit - posted by Nicolas Paris <ni...@gmail.com> on 2017/11/06 18:56:42 UTC, 2 replies.
- A pyspark sql query - posted by paulgureghian <pa...@att.net> on 2017/11/06 20:48:05 UTC, 0 replies.
- Re: Structured Stream equivalent of reduceByKey - posted by Michael Armbrust <mi...@databricks.com> on 2017/11/06 21:24:44 UTC, 0 replies.
- Which predicate pushdown work or does not work with Parquet? - posted by Manuel Vonthron <mv...@mnubo.com> on 2017/11/07 00:29:29 UTC, 0 replies.
- Re: Programmatically get status of job (WAITING/RUNNING) - posted by bsikander <be...@gmail.com> on 2017/11/07 09:48:43 UTC, 3 replies.
- Please remove me! - posted by x x <go...@yahoo.com.INVALID> on 2017/11/07 10:03:06 UTC, 0 replies.
- Does Random Forest in spark ML supports multi label classification in scala - posted by HARSH TAKKAR <ta...@gmail.com> on 2017/11/07 12:30:06 UTC, 1 replies.
- How to branch a Stream / have multiple Sinks / do multiple Queries on one Stream - posted by Jürgen Albersdorfer <Ju...@zweiradteile.net> on 2017/11/07 16:07:02 UTC, 0 replies.
- Spark REST API - posted by Paul Corley <pa...@ignitionone.com> on 2017/11/07 17:24:19 UTC, 0 replies.
- Stopping a Spark Streaming Context gracefully - posted by Bryan Jeffrey <br...@gmail.com> on 2017/11/07 18:36:24 UTC, 0 replies.
- Measure executor idle time - posted by samar kumar <sa...@gmail.com> on 2017/11/08 09:09:38 UTC, 0 replies.
- [Spark Structured Streaming] Changing partitions of (flat)MapGroupsWithState - posted by Teemu Heikkilä <te...@emblica.fi> on 2017/11/08 15:34:11 UTC, 1 replies.
- Spark http: Not showing completed apps - posted by purna pradeep <pu...@gmail.com> on 2017/11/09 01:33:28 UTC, 0 replies.
- spark job paused(active stages finished) - posted by "bingli3@iflytek.com" <bi...@iflytek.com> on 2017/11/09 03:37:23 UTC, 2 replies.
- Spark UI not showing completed applications - posted by purna pradeep <pu...@gmail.com> on 2017/11/09 04:13:39 UTC, 1 replies.
- Why the merge method in StructType is private - posted by sathy <sa...@gmail.com> on 2017/11/09 06:31:22 UTC, 0 replies.
- Testing spark e-mail list - posted by David Hodeffi <Da...@niceactimize.com> on 2017/11/09 08:56:58 UTC, 0 replies.
- Does the builtin hive jars talk of spark to HiveMetaStore(2.1) without any issues? - posted by yaooqinn <ya...@gmail.com> on 2017/11/09 10:10:09 UTC, 1 replies.
- Spark SQL - Truncate Day / Hour - posted by David Hodefi <da...@gmail.com> on 2017/11/09 11:05:49 UTC, 3 replies.
- Can we pass the Calcite streaming sql queries to spark sql? - posted by kant kodali <ka...@gmail.com> on 2017/11/09 19:50:38 UTC, 1 replies.
- Spark Streaming in Spark 2.1 with Kafka 0.9 - posted by KhajaAsmath Mohammed <md...@gmail.com> on 2017/11/09 22:28:37 UTC, 0 replies.
- Compression during shuffle writes - posted by Bahubali Jain <ba...@gmail.com> on 2017/11/10 03:54:36 UTC, 0 replies.
- Enable hql on the JDBC thrift server - posted by Arnaud Wolf <ar...@gmail.com> on 2017/11/10 07:08:04 UTC, 1 replies.
- Generate windows on processing time in Spark Structured Streaming - posted by wangsan <wa...@163.com> on 2017/11/10 09:52:37 UTC, 1 replies.
- spark-stream memory table global? - posted by Imran Rajjad <ra...@gmail.com> on 2017/11/10 12:56:04 UTC, 1 replies.
- Spark Streaming Kafka - posted by Frank Staszak <fs...@gmail.com> on 2017/11/10 16:31:40 UTC, 0 replies.
- Parquet files from spark not readable in Cascading - posted by Vikas Gandham <Vi...@maxpoint.com> on 2017/11/10 17:49:59 UTC, 3 replies.
- Spark-avro 4.0.0 is released - posted by Gengliang Wang <ge...@databricks.com> on 2017/11/11 01:23:58 UTC, 0 replies.
- Unsubscribe - posted by Evandro Cataruzzo <ec...@gmail.com> on 2017/11/11 11:05:20 UTC, 0 replies.
- Spark based Data Warehouse - posted by ashish rawat <dc...@gmail.com> on 2017/11/12 07:21:14 UTC, 16 replies.
- Possible "split brain" situation - posted by Gimantha Bandara <gi...@wso2.com> on 2017/11/13 07:09:20 UTC, 0 replies.
- Use of Accumulators - posted by Kedarnath Dixit <ke...@persistent.com> on 2017/11/13 16:58:40 UTC, 0 replies.
- Reload some static data during struct streaming - posted by spark receiver <sp...@gmail.com> on 2017/11/13 21:21:25 UTC, 2 replies.
- Spark 2.2 Structured Streaming + Kinesis - posted by Benjamin Kim <bb...@gmail.com> on 2017/11/13 23:15:32 UTC, 2 replies.
- Databricks Serverless - posted by Benjamin Kim <bb...@gmail.com> on 2017/11/13 23:18:25 UTC, 1 replies.
- Process large JSON file without causing OOM - posted by Alec Swan <al...@gmail.com> on 2017/11/14 00:22:58 UTC, 9 replies.
- Re: Use of Accumulators - posted by vaquar khan <va...@gmail.com> on 2017/11/14 05:46:17 UTC, 4 replies.
- Measuring cluster utilization of a streaming job - posted by Nadeem Lalani <na...@gmail.com> on 2017/11/14 12:54:58 UTC, 1 replies.
- Spark Structured Streaming + Kafka - posted by Agostino Calamita <ag...@gmail.com> on 2017/11/14 13:34:04 UTC, 0 replies.
- Executor not getting added SparkUI & Spark Eventlog in deploymode:cluster - posted by "Mamillapalli, Purna Pradeep" <Pu...@capitalone.com> on 2017/11/14 16:02:17 UTC, 0 replies.
- Spark Streaming fails with unable to get records after polling for 512 ms - posted by jkagitala <jk...@gmail.com> on 2017/11/15 02:56:39 UTC, 3 replies.
- spark strucured csv file stream not detecting new files - posted by Imran Rajjad <ra...@gmail.com> on 2017/11/15 13:47:24 UTC, 0 replies.
- Restart Spark Streaming after deployment - posted by KhajaAsmath Mohammed <md...@gmail.com> on 2017/11/15 16:18:13 UTC, 1 replies.
- [Spark Core]: S3a with Openstack swift object storage not using credentials provided in sparkConf - posted by Marius <m....@gmail.com> on 2017/11/15 20:48:51 UTC, 0 replies.
- Access to Applications metrics - posted by Nick Dimiduk <nd...@gmail.com> on 2017/11/15 22:28:22 UTC, 0 replies.
- [SPARK: org.apache.spark.util.TaskCompletionListenerException] - posted by Romeo Valencia <Ro...@clarivate.com> on 2017/11/15 23:06:23 UTC, 0 replies.
- Processing a splittable file from a single executor - posted by Jeroen Miller <bl...@gmail.com> on 2017/11/16 09:12:56 UTC, 1 replies.
- Apache Spark Downloads Page Error - posted by rjsullivan <rj...@gmail.com> on 2017/11/16 17:54:02 UTC, 0 replies.
- [ML] Spark Package Release: Deep Learning Pipelines 0.2.0 - posted by Siddharth Murching <si...@databricks.com> on 2017/11/16 21:29:06 UTC, 0 replies.
- Spark Streaming Job completed without executing next batches - posted by KhajaAsmath Mohammed <md...@gmail.com> on 2017/11/17 04:01:49 UTC, 1 replies.
- Multiple transformations without recalculating or caching - posted by Fernando Pereira <fe...@gmail.com> on 2017/11/17 10:02:35 UTC, 3 replies.
- Struct Type - posted by KhajaAsmath Mohammed <md...@gmail.com> on 2017/11/17 17:06:21 UTC, 1 replies.
- Union of streaming dataframes - posted by "Lalwani, Jayesh" <Ja...@capitalone.com> on 2017/11/17 17:55:13 UTC, 0 replies.
- Spark Streaming in Wait mode - posted by KhajaAsmath Mohammed <md...@gmail.com> on 2017/11/17 22:23:50 UTC, 0 replies.
- History server and non-HDFS filesystems - posted by Paul Mackles <pa...@loopr.com> on 2017/11/17 23:15:29 UTC, 0 replies.
- SpecificColumnarIterator has grown past JVM limit of 0xFFF - posted by "Md. Rezaul Karim" <re...@insight-centre.org> on 2017/11/18 00:43:46 UTC, 0 replies.
- Weight column values not used in Binary Logistic Regression Summary - posted by Stephen Boesch <ja...@gmail.com> on 2017/11/18 14:53:42 UTC, 0 replies.
- Spark 2.1.2 Spark Streaming checkpoint interval not respected - posted by Shing Hing Man <ma...@yahoo.com.INVALID> on 2017/11/18 15:25:10 UTC, 0 replies.
- Kryo not registered class - posted by Angel Francisco Orta <an...@gmail.com> on 2017/11/19 20:24:21 UTC, 1 replies.
- [Spark SQL]: DataFrame schema resulting in NullPointerException - posted by Chitral Verma <ch...@gmail.com> on 2017/11/19 23:08:48 UTC, 0 replies.
- spark streaming part files in hive partition - posted by KhajaAsmath Mohammed <md...@gmail.com> on 2017/11/19 23:36:03 UTC, 1 replies.
- Dynamic data ingestion into SparkSQL - Interesting question - posted by Aakash Basu <aa...@gmail.com> on 2017/11/20 13:28:33 UTC, 3 replies.
- Re: How to print plan of Structured Streaming DataFrame - posted by "Shixiong(Ryan) Zhu" <sh...@databricks.com> on 2017/11/20 18:57:14 UTC, 0 replies.
- Re: Writing files to s3 with out temporary directory - posted by Jim Carroll <ji...@gmail.com> on 2017/11/20 19:48:17 UTC, 7 replies.
- PySpark 2.2.0, Kafka 0.10 DataFrames - posted by salemi <al...@udo.edu> on 2017/11/20 23:07:47 UTC, 3 replies.
- Long running Spark Job Status on Remote Submission - posted by Harsh Choudhary <sh...@gmail.com> on 2017/11/21 07:29:18 UTC, 0 replies.
- Spark Writing to parquet directory : java.io.IOException: Disk quota exceeded - posted by Chetan Khatri <ch...@gmail.com> on 2017/11/21 10:06:46 UTC, 2 replies.
- Parquet Filter pushdown not working and statistics are not generating for any column with Spark 1.6 CDH 5.7 - posted by Rabin Banerjee <de...@gmail.com> on 2017/11/21 15:29:03 UTC, 0 replies.
- Re: Spark/Parquet/Statistics question - posted by Rabin Banerjee <de...@gmail.com> on 2017/11/21 16:08:37 UTC, 0 replies.
- Custom Data Source for getting data from Rest based services - posted by Sourav Mazumder <so...@gmail.com> on 2017/11/21 17:07:37 UTC, 4 replies.
- What do you pay attention to when validating Spark jobs? - posted by Holden Karau <ho...@pigscanfly.ca> on 2017/11/21 23:34:52 UTC, 1 replies.
- Caching dataframes and overwrite - posted by Michael Artz <mi...@gmail.com> on 2017/11/22 01:40:05 UTC, 0 replies.
- unsubscribe - posted by 韩盼 <pa...@thinkingdata.cn> on 2017/11/22 01:53:22 UTC, 0 replies.
- Spark Stremaing Hive Dynamic Partitions Issue - posted by KhajaAsmath Mohammed <md...@gmail.com> on 2017/11/22 16:06:23 UTC, 0 replies.
- does "Deep Learning Pipelines" scale out linearly? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2017/11/22 18:02:43 UTC, 0 replies.
- newbie: how to partition data on file system. What are best practices? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2017/11/22 18:21:02 UTC, 0 replies.
- Spark Streaming Kerberos Issue - posted by KhajaAsmath Mohammed <md...@gmail.com> on 2017/11/22 18:25:44 UTC, 6 replies.
- build spark source code - posted by Michael Artz <mi...@gmail.com> on 2017/11/23 02:34:29 UTC, 1 replies.
- SparkSQL not support CharType - posted by 163 <he...@163.com> on 2017/11/23 03:09:52 UTC, 2 replies.
- Re: does "Deep Learning Pipelines" scale out linearly? - posted by Nick Pentreath <ni...@gmail.com> on 2017/11/23 05:12:04 UTC, 1 replies.
- Spark Streaming Kinesis Missing Records - posted by Richard Moorhead <ri...@c2fo.com> on 2017/11/24 20:36:07 UTC, 0 replies.
- [Spark streaming] No assigned partition error during seek - posted by venks61176 <me...@gmail.com> on 2017/11/25 01:39:16 UTC, 1 replies.
- NLTK with Spark Streaming - posted by ashish rawat <dc...@gmail.com> on 2017/11/26 07:01:02 UTC, 4 replies.
- [Spark ML] Compatibility between features and models - posted by Ming Ma <mi...@hotmail.com> on 2017/11/27 06:23:13 UTC, 0 replies.
- Loading a large parquet file how much memory do I need - posted by Alexander Czech <al...@googlemail.com> on 2017/11/27 09:56:29 UTC, 7 replies.
- Cosine Similarity between documents - Rows - posted by Donni Khan <pr...@googlemail.com> on 2017/11/27 12:27:21 UTC, 1 replies.
- [Spark R]: dapply only works for very small datasets - posted by "Kunft, Andreas" <an...@tu-berlin.de> on 2017/11/27 18:27:33 UTC, 4 replies.
- Using MatrixFactorizationModel as a feature extractor - posted by Corey Nolet <cj...@gmail.com> on 2017/11/27 20:08:12 UTC, 1 replies.
- How to kill a query job when using spark thrift-server? - posted by 张万新 <ke...@gmail.com> on 2017/11/28 06:24:28 UTC, 0 replies.
- Spark Data Frame. PreSorded partitions - posted by Николай Ижиков <ni...@gmail.com> on 2017/11/28 15:40:12 UTC, 1 replies.
- Structured Streaming: emitted record count - posted by aravias <as...@homeaway.com> on 2017/11/28 18:23:31 UTC, 0 replies.
- [Structured Streaming] Continuous Processing Mode plan? - posted by "Marchant, Hayden " <ha...@citi.com.INVALID> on 2017/11/29 08:24:37 UTC, 0 replies.
- JDK1.8 for spark workers - posted by KhajaAsmath Mohammed <md...@gmail.com> on 2017/11/29 15:54:25 UTC, 2 replies.
- DataFrame joins with Spark-Java - posted by sushma spark <su...@gmail.com> on 2017/11/30 02:08:32 UTC, 1 replies.
- spark2.2 org.apache.spark.sql.catalyst.errors.package$TreeNodeException - posted by starstar <f1...@163.com> on 2017/11/30 05:57:48 UTC, 0 replies.
- Kafka version support - posted by Raghavendra Pandey <ra...@gmail.com> on 2017/11/30 06:17:42 UTC, 3 replies.
- [Spark ML] : Implement the Conjugate Gradient method for ALS - posted by Nate Wendt <na...@curalate.com> on 2017/11/30 20:03:34 UTC, 0 replies.