- Default date formats in CSV and JSON: see SPARK-16216 - posted by Sean Owen <so...@cloudera.com> on 2016/08/01 14:07:17 UTC, 0 replies.
- Re: sampling operation for DStream - posted by Martin Le <ma...@gmail.com> on 2016/08/01 16:24:52 UTC, 3 replies.
- [MLlib] Term Frequency in TF-IDF seems incorrect - posted by Hao Ren <in...@gmail.com> on 2016/08/01 22:29:23 UTC, 2 replies.
- What happens in Dataset limit followed by rdd - posted by Maciej Szymkiewicz <ms...@gmail.com> on 2016/08/02 01:08:58 UTC, 4 replies.
- Re: tpcds for spark2.0 - posted by kevin <ki...@gmail.com> on 2016/08/02 01:49:55 UTC, 0 replies.
- Re: Testing --supervise flag - posted by Noorul Islam Kamal Malmiyoda <no...@noorul.com> on 2016/08/02 12:07:53 UTC, 0 replies.
- Graph edge type pattern matching in GraphX - posted by "Ulanov, Alexander" <al...@hpe.com> on 2016/08/02 12:29:10 UTC, 0 replies.
- AccumulatorV2 += operator - posted by Bryan Cutler <cu...@gmail.com> on 2016/08/02 20:46:31 UTC, 3 replies.
- SQL Based Authorization for SparkSQL - posted by 马晓宇 <hz...@corp.netease.com> on 2016/08/03 01:40:03 UTC, 1 reply.
- Spark on yarn, only 1 or 2 vcores getting allocated to the containers getting created. - posted by satyajit vegesna <sa...@gmail.com> on 2016/08/03 04:11:10 UTC, 1 reply.
- Spark SQL and Kryo registration - posted by Olivier Girardot <o....@lateral-thoughts.com> on 2016/08/03 08:49:50 UTC, 4 replies.
- How does MapWithStateRDD distribute the data - posted by Soumitra Johri <so...@gmail.com> on 2016/08/03 14:42:30 UTC, 1 reply.
- Quick question about hive-exec 1.2.1.spark2 - posted by Tao Li <tl...@hortonworks.com> on 2016/08/03 22:30:31 UTC, 0 replies.
- Source API requires unbounded distributed storage? - posted by Fred Reiss <fr...@gmail.com> on 2016/08/04 23:38:32 UTC, 2 replies.
- Inquery about Spark's behaviour for configurations in Hadoop configuration instance via read/write.options() - posted by Hyukjin Kwon <gu...@gmail.com> on 2016/08/05 01:26:21 UTC, 0 replies.
- We don't use ASF Jenkins builds, right? - posted by Sean Owen <so...@cloudera.com> on 2016/08/05 04:15:57 UTC, 1 reply.
- PySpark: Make persist() return a context manager - posted by Nicholas Chammas <ni...@gmail.com> on 2016/08/05 04:56:01 UTC, 5 replies.
- Result code of whole stage codegen - posted by Maciej Bryński <ma...@brynski.pl> on 2016/08/05 07:55:27 UTC, 2 replies.
- Apache Arrow data in buffer to RDD/DataFrame/Dataset? - posted by "jpivarski@gmail.com" <jp...@gmail.com> on 2016/08/05 19:40:33 UTC, 8 replies.
- Spark requires sysctl tuning? Servers unresponsive - posted by Ruslan Dautkhanov <da...@gmail.com> on 2016/08/06 06:29:40 UTC, 0 replies.
- [SPARK-2.0][SQL] UDF containing non-serializable object does not work as expected - posted by Hao Ren <in...@gmail.com> on 2016/08/07 21:31:33 UTC, 5 replies.
- Kafka Support new topic subscriptions without requiring restart of the streaming context - posted by "r7raul1984@163.com" <r7...@163.com> on 2016/08/08 06:12:21 UTC, 1 reply.
- Spark 2.0 sql module empty columns in result over parquet tables - posted by ekass <ev...@cslab.ece.ntua.gr> on 2016/08/08 11:14:13 UTC, 0 replies.
- Welcoming Felix Cheung as a committer - posted by Matei Zaharia <ma...@gmail.com> on 2016/08/08 18:15:00 UTC, 20 replies.
- Scaling partitioned Hive table support - posted by Michael Allman <mi...@videoamp.com> on 2016/08/08 18:53:51 UTC, 3 replies.
- SASL Support - posted by Michael Gummelt <mg...@mesosphere.io> on 2016/08/08 22:48:52 UTC, 1 reply.
- Re: Spark 2.0.1 / 2.1.0 on Maven - posted by Chris Fregly <ch...@fregly.com> on 2016/08/09 18:52:41 UTC, 4 replies.
- Get data from CSV files to feed SparkML library methods - posted by Minudika Malshan <mi...@gmail.com> on 2016/08/10 11:16:10 UTC, 2 replies.
- Use cases around image/video processing in spark - posted by Deepak Sharma <de...@gmail.com> on 2016/08/10 15:20:39 UTC, 1 reply.
- Serving Spark ML models via a regular Python web app - posted by Nicholas Chammas <ni...@gmail.com> on 2016/08/11 02:50:05 UTC, 7 replies.
- Sorting within partitions is not maintained in parquet? - posted by Jason Moore <Ja...@quantium.com.au> on 2016/08/11 06:23:22 UTC, 2 replies.
- Who controls 'databricks-jenkins'? - posted by Sean Owen <so...@cloudera.com> on 2016/08/11 08:54:14 UTC, 0 replies.
- Spark hangs after OOM in Serializer - posted by mikhainin <be...@narod.ru> on 2016/08/15 13:39:55 UTC, 0 replies.
- How to resolve the SparkExecption : Size exceeds Integer.MAX_VALUE - posted by Minudika Malshan <mi...@gmail.com> on 2016/08/15 16:46:51 UTC, 1 reply.
- Number of tasks on executors become negative after executor failures - posted by Rachana Srivastava <Ra...@markmonitor.com> on 2016/08/15 19:13:58 UTC, 0 replies.
- Structured Streaming with Kafka sources/sinks - posted by "Guo, Chenzhao" <ch...@intel.com> on 2016/08/16 02:12:25 UTC, 9 replies.
- Resultant RDD after a group by query always returns 200 partitions - posted by Niranda Perera <ni...@gmail.com> on 2016/08/16 06:01:28 UTC, 2 replies.
- Executors go OOM when using JDBC relation provider - posted by Niranda Perera <ni...@gmail.com> on 2016/08/16 13:26:48 UTC, 0 replies.
- GraphFrames 0.2.0 released - posted by Tim Hunter <ti...@databricks.com> on 2016/08/16 16:32:55 UTC, 4 replies.
- [master] ERROR RetryingHMSHandler: AlreadyExistsException(message:Database default already exists) - posted by Jacek Laskowski <ja...@japila.pl> on 2016/08/17 02:33:28 UTC, 3 replies.
- Spark R - Loading Third Party R Library in YARN Executors - posted by Senthil Kumar <se...@gmail.com> on 2016/08/17 09:23:29 UTC, 2 replies.
- How is mapped LogicalPlan to RDDs eventually if ever? How about Dataset? - posted by Jacek Laskowski <ja...@japila.pl> on 2016/08/17 22:00:07 UTC, 0 replies.
- Re: Aggregations with scala pairs - posted by Olivier Girardot <o....@lateral-thoughts.com> on 2016/08/18 06:32:10 UTC, 1 reply.
- Found a typo in Catalyst's exception and want to write a test -- help needed - posted by Jacek Laskowski <ja...@japila.pl> on 2016/08/18 06:46:42 UTC, 1 reply.
- How to convert spark data-frame to datasets? - posted by Minudika Malshan <mi...@gmail.com> on 2016/08/18 14:59:24 UTC, 1 reply.
- Setting YARN executors' JAVA_HOME - posted by Ryan Williams <ry...@gmail.com> on 2016/08/18 17:49:35 UTC, 2 replies.
- Early Draft Structured Streaming Machine Learning - posted by Holden Karau <ho...@pigscanfly.ca> on 2016/08/18 19:33:37 UTC, 0 replies.
- Parquet partitioning / appends - posted by Jeremy Smith <je...@acorns.com> on 2016/08/18 20:01:01 UTC, 0 replies.
- Re: RFC: Remote "HBaseTest" from examples? - posted by Ignacio Zendejas <iz...@node.io> on 2016/08/18 20:43:48 UTC, 0 replies.
- Persisting PySpark ML Pipelines that include custom Transformers - posted by Nicholas Chammas <ni...@gmail.com> on 2016/08/19 18:28:52 UTC, 2 replies.
- Java 8 - posted by Timur Shenkao <ts...@timshenkao.su> on 2016/08/20 10:41:41 UTC, 1 reply.
- Broadcast Variable Life Cycle - posted by Jerry Lam <ch...@gmail.com> on 2016/08/21 17:07:53 UTC, 6 replies.
- Why is isStreaming naming-inconsistent with analyzed and resolved in LogicalPlan? - posted by Jacek Laskowski <ja...@japila.pl> on 2016/08/22 09:02:48 UTC, 0 replies.
- Analyzer.resolver a duplicate of CatalystConf.resolver? - posted by Jacek Laskowski <ja...@japila.pl> on 2016/08/22 09:28:03 UTC, 0 replies.
- critical bugs to be fixed in Spark 2.0.1? - posted by Reynold Xin <rx...@databricks.com> on 2016/08/22 19:14:43 UTC, 2 replies.
- Why can't a Transformer have multiple output columns? - posted by Nicholas Chammas <ni...@gmail.com> on 2016/08/23 14:15:08 UTC, 2 replies.
- Fwd: Anyone else having trouble with replicated off heap RDD persistence? - posted by Michael Allman <mi...@videoamp.com> on 2016/08/23 16:37:06 UTC, 3 replies.
- Serialization troubles with mutable.LinkedHashMap - posted by Rahul Palamuttam <ra...@gmail.com> on 2016/08/23 17:51:31 UTC, 0 replies.
- How do we process/scale variable size batches in Apache Spark Streaming - posted by Rachana Srivastava <Ra...@markmonitor.com> on 2016/08/23 22:20:53 UTC, 0 replies.
- is the Lineage of RDD stored as a byte code in memory or a file? - posted by kant kodali <ka...@gmail.com> on 2016/08/24 01:00:55 UTC, 3 replies.
- Re: Spark dev-setup - posted by Nishadi Kirielle <nd...@gmail.com> on 2016/08/24 06:10:54 UTC, 5 replies.
- spread out of executors with Spark on Mesos - posted by Sun Rui <su...@163.com> on 2016/08/24 08:45:53 UTC, 0 replies.
- Re: [discuss] separate API annotation into two components: InterfaceAudience & InterfaceStability - posted by Tom Graves <tg...@yahoo.com.INVALID> on 2016/08/24 14:28:36 UTC, 1 reply.
- Spark 1.x/2.x qualifiers in downstream artifact names - posted by Michael Heuer <he...@gmail.com> on 2016/08/24 16:41:26 UTC, 5 replies.
- Tree for SQL Query - posted by Maciej Bryński <ma...@brynski.pl> on 2016/08/24 19:31:17 UTC, 3 replies.
- quick question - posted by kant kodali <ka...@gmail.com> on 2016/08/24 20:49:43 UTC, 0 replies.
- StateStore with DStreams - posted by Matt Smith <ma...@gmail.com> on 2016/08/25 01:01:38 UTC, 0 replies.
- Spark streaming get RDD within the sliding window - posted by "Ulanov, Alexander" <al...@hpe.com> on 2016/08/25 01:31:29 UTC, 0 replies.
- Spark Kerberos proxy user - posted by Abel Rincón <ga...@gmail.com> on 2016/08/25 10:10:30 UTC, 2 replies.
- Latest Release of Receiver based Kafka Consumer for Spark Streaming. - posted by Dibyendu Bhattacharya <di...@gmail.com> on 2016/08/25 11:33:41 UTC, 0 replies.
- Spark 2.0 history server summary page gets stuck at "loading history summary" with 10K+ application history - posted by wgtmac <us...@gmail.com> on 2016/08/25 20:47:12 UTC, 0 replies.
- Assembly build on spark 2.0.0 - posted by Srikanth Sampath <ss...@gmail.com> on 2016/08/26 13:59:19 UTC, 4 replies.
- Re: Insert non-null values from dataframe - posted by Russell Spitzer <ru...@gmail.com> on 2016/08/26 15:38:09 UTC, 0 replies.
- Mesos is now a maven module - posted by Michael Gummelt <mg...@mesosphere.io> on 2016/08/26 20:20:33 UTC, 12 replies.
- Performance of loading parquet files into case classes in Spark - posted by Julien Dumazert <ju...@gmail.com> on 2016/08/27 13:27:12 UTC, 5 replies.
- Cache'ing performance - posted by Maciej Bryński <ma...@brynski.pl> on 2016/08/27 20:39:16 UTC, 3 replies.
- Spark 2.0 and Yarn - posted by Srikanth Sampath <ss...@gmail.com> on 2016/08/28 13:59:32 UTC, 1 reply.
- spark roadmap - posted by Denis Bolshakov <bo...@gmail.com> on 2016/08/29 08:23:29 UTC, 1 reply.
- Remaining folders in .sparkStaging directory after app was killed - posted by Artur Sukhenko <ar...@gmail.com> on 2016/08/29 16:06:04 UTC, 0 replies.
- [build system] jenkins wedged itself this weekend, just restarted - posted by shane knapp <sk...@berkeley.edu> on 2016/08/29 16:36:47 UTC, 1 reply.
- Real time streaming in Spark - posted by Tomasz Gawęda <to...@outlook.com> on 2016/08/29 20:13:07 UTC, 1 reply.
- KMeans calls takeSample() twice? - posted by gsamaras <ge...@gmail.com> on 2016/08/29 21:34:53 UTC, 9 replies.
- Saving less data to improve Pregel performance in GraphX? - posted by Fang Zhang <fa...@gmail.com> on 2016/08/30 01:46:55 UTC, 0 replies.
- Inconsistency for nullvalue handling CSV: see SPARK-16462, SPARK-16460, SPARK-15144, SPARK-17290 and SPARK-16903 - posted by Hyukjin Kwon <gu...@gmail.com> on 2016/08/30 02:55:24 UTC, 1 reply.
- What are the names of the network protocols used between Spark Driver, Master and Workers? - posted by kant kodali <ka...@gmail.com> on 2016/08/30 04:22:53 UTC, 1 reply.
- 3Ps for Datasets not available?! (=Parquet Predicate Pushdown) - posted by Jacek Laskowski <ja...@japila.pl> on 2016/08/30 08:20:03 UTC, 2 replies.
- Reynold on vacation next two weeks - posted by Reynold Xin <rx...@databricks.com> on 2016/08/30 08:21:20 UTC, 1 reply.
- Spark 2.0.1 fails for provided hadoop - posted by Rishi Mishra <rm...@snappydata.io> on 2016/08/30 13:02:02 UTC, 0 replies.
- ApacheCon Seville CFP closes September 9th - posted by Rich Bowen <rb...@apache.org> on 2016/08/30 15:03:41 UTC, 0 replies.
- Re: How to check for No of Records per partition in Dataframe - posted by vsr <vs...@cloudera.com> on 2016/08/30 22:31:41 UTC, 0 replies.
- Re: Model abstract class in spark ml - posted by Mohit Jaggi <mo...@gmail.com> on 2016/08/30 23:51:30 UTC, 5 replies.
- dev-subscribe@spark.apache.org - posted by huanqinghappy <hu...@aliyun.com> on 2016/08/31 02:39:51 UTC, 0 replies.
- Questions about bucketing in Spark - posted by Tejas Patil <te...@gmail.com> on 2016/08/31 18:04:49 UTC, 0 replies.