- Kafka Direct Stream - posted by Udit Mehta <um...@groupon.com> on 2015/10/01 00:02:30 UTC, 9 replies.
- Worker node timeout exception - posted by markluk <ma...@juicero.com> on 2015/10/01 01:32:59 UTC, 2 replies.
- Re: Submitting with --deploy-mode cluster: uploading the jar - posted by Christophe Schmitz <co...@gmail.com> on 2015/10/01 02:36:25 UTC, 2 replies.
- Re: Problem understanding spark word count execution - posted by Nicolae Marasoiu <ni...@adswizz.com> on 2015/10/01 06:57:35 UTC, 8 replies.
- Re: What is the best way to submit multiple tasks? - posted by Shixiong Zhu <zs...@gmail.com> on 2015/10/01 08:21:27 UTC, 0 replies.
- Re: Spark Streaming Standalone 1.5 - Stage cancelled because SparkContext was shut down - posted by Shixiong Zhu <zs...@gmail.com> on 2015/10/01 08:28:23 UTC, 0 replies.
- Re: [cache eviction] partition recomputation in big lineage RDDs - posted by Hemant Bhanawat <he...@gmail.com> on 2015/10/01 09:21:14 UTC, 0 replies.
- calling persist would cause java.util.NoSuchElementException: key not found: - posted by Eyad Sibai <ey...@gmail.com> on 2015/10/01 11:05:58 UTC, 0 replies.
- Calrification on Spark-Hadoop Configuration - posted by Vinoth Sankar <vi...@gmail.com> on 2015/10/01 13:52:27 UTC, 1 replies.
- How to connect HadoopHA from spark - posted by Vinoth Sankar <vi...@gmail.com> on 2015/10/01 14:22:46 UTC, 2 replies.
- automatic start of streaming job on failure on YARN - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/10/01 15:30:24 UTC, 4 replies.
- Re: Lost leader exception in Kafka Direct for Streaming - posted by Cody Koeninger <co...@koeninger.org> on 2015/10/01 16:18:14 UTC, 3 replies.
- Decision Tree Model - posted by hishamm <hi...@unige.ch> on 2015/10/01 16:20:16 UTC, 0 replies.
- Re: Deploying spark-streaming application on production - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/10/01 16:49:43 UTC, 0 replies.
- Pyspark: "Error: No main class set in JAR; please specify one with --class" - posted by YaoPau <jo...@gmail.com> on 2015/10/01 17:56:22 UTC, 2 replies.
- How to access lost executor log file - posted by Lan Jiang <lj...@gmail.com> on 2015/10/01 19:30:00 UTC, 3 replies.
- Getting spark application driver ID programmatically - posted by Snehal Nagmote <na...@gmail.com> on 2015/10/01 19:44:33 UTC, 1 replies.
- Accumulator of rows? - posted by Sa...@wellsfargo.com on 2015/10/01 20:44:50 UTC, 2 replies.
- OOM error in Spark worker - posted by varun sharma <va...@gmail.com> on 2015/10/01 21:05:15 UTC, 0 replies.
- Re: Hive permanent functions are not available in Spark SQL - posted by Yin Huai <yh...@databricks.com> on 2015/10/01 21:27:50 UTC, 2 replies.
- Shuffle Write v/s Shuffle Read - posted by Kartik Mathur <ka...@bluedata.com> on 2015/10/01 21:36:03 UTC, 2 replies.
- "java.io.IOException: Filesystem closed" on executors - posted by Lan Jiang <lj...@gmail.com> on 2015/10/01 21:41:54 UTC, 3 replies.
- Java REST custom receiver - posted by Pavol Loffay <pl...@redhat.com> on 2015/10/01 21:58:13 UTC, 1 replies.
- spark.streaming.kafka.maxRatePerPartition for direct stream - posted by Sourabh Chandak <so...@gmail.com> on 2015/10/01 22:39:15 UTC, 5 replies.
- python version in spark-submit - posted by roy <rp...@njit.edu> on 2015/10/01 22:56:18 UTC, 1 replies.
- SparkSQL: Reading data from hdfs and storing into multiple paths - posted by haridass saisriram <ha...@gmail.com> on 2015/10/01 23:11:08 UTC, 1 replies.
- Call Site - Spark Context - posted by Sandip Mehta <sa...@gmail.com> on 2015/10/02 00:06:18 UTC, 2 replies.
- How to save DataFrame as a Table in Hbase? - posted by unk1102 <um...@gmail.com> on 2015/10/02 00:15:56 UTC, 1 replies.
- Re: Standalone Scala Project - posted by Robineast <Ro...@xense.co.uk> on 2015/10/02 00:28:55 UTC, 0 replies.
- How to Set Retry Policy in Spark - posted by Renxia Wang <re...@gmail.com> on 2015/10/02 00:42:03 UTC, 1 replies.
- Spark cluster - use machine name in WorkerID, not IP address - posted by markluk <ma...@juicero.com> on 2015/10/02 00:48:04 UTC, 1 replies.
- Re: spark.mesos.coarse impacts memory performance on mesos - posted by Utkarsh Sengar <ut...@gmail.com> on 2015/10/02 01:05:29 UTC, 3 replies.
- Re: Spark streaming job filling a lot of data in local spark nodes - posted by swetha kasireddy <sw...@gmail.com> on 2015/10/02 02:59:46 UTC, 1 replies.
- [ANNOUNCE] Announcing Spark 1.5.1 - posted by Reynold Xin <rx...@databricks.com> on 2015/10/02 04:42:31 UTC, 0 replies.
- spark-submit --packages using different resolver - posted by Jerry Lam <ch...@gmail.com> on 2015/10/02 05:58:20 UTC, 3 replies.
- Checkpointing is super slow - posted by Sourabh Chandak <so...@gmail.com> on 2015/10/02 08:17:09 UTC, 6 replies.
- Re: calling persist would cause java.util.NoSuchElementException: key not found: - posted by Shixiong Zhu <zs...@gmail.com> on 2015/10/02 08:36:14 UTC, 0 replies.
- Addition of Meetup Group - Sydney, Mebourne Australia - posted by Andy Huang <an...@servian.com.au> on 2015/10/02 08:57:04 UTC, 0 replies.
- saveAsTextFile creates an empty folder in HDFS - posted by jarias <ja...@elrocin.es> on 2015/10/02 10:21:00 UTC, 3 replies.
- How to use registered Hive UDF in Spark DataFrame? - posted by unk1102 <um...@gmail.com> on 2015/10/02 13:25:26 UTC, 4 replies.
- Compute Real-time Visualizations using spark streaming - posted by Sureshv <su...@transerainc.com> on 2015/10/02 13:47:35 UTC, 1 replies.
- Fwd: Add row IDs column to data frame - posted by Josh Levy-Kramer <jo...@starcount.com> on 2015/10/02 15:33:11 UTC, 0 replies.
- from groupBy return a DataFrame without aggregation? - posted by Sa...@wellsfargo.com on 2015/10/02 17:32:11 UTC, 1 replies.
- Spark Streaming over YARN - posted by ni...@free.fr on 2015/10/02 17:40:47 UTC, 11 replies.
- Re: HDFS small file generation problem - posted by ni...@free.fr on 2015/10/02 17:48:09 UTC, 8 replies.
- are functions deserialized once per task? - posted by Michael Albert <m_...@yahoo.com.INVALID> on 2015/10/02 18:33:21 UTC, 0 replies.
- Re: Adding the values in a column of a dataframe - posted by sethah <sh...@us.ibm.com> on 2015/10/02 20:03:43 UTC, 0 replies.
- Weird Spark Dispatcher Offers? - posted by Alan Braithwaite <al...@cloudflare.com> on 2015/10/02 20:34:37 UTC, 8 replies.
- how to broadcast huge lookup table? - posted by Sa...@wellsfargo.com on 2015/10/02 20:50:01 UTC, 3 replies.
- No plan for broadcastHint - posted by Swapnil Shinde <sw...@gmail.com> on 2015/10/02 20:57:44 UTC, 0 replies.
- Reading JSON in Pyspark throws scala.MatchError - posted by balajikvijayan <ba...@gmail.com> on 2015/10/02 22:42:32 UTC, 5 replies.
- How does FAIR job scheduler work in Standalone cluster mode? - posted by Jacek Laskowski <ja...@japila.pl> on 2015/10/02 23:22:47 UTC, 3 replies.
- Re: how to get Application ID from Submission ID or Driver ID programmatically - posted by firemonk9 <dh...@gmail.com> on 2015/10/02 23:41:13 UTC, 0 replies.
- Re: Scala Limitation - Case Class definition with more than 22 arguments - posted by satish chandra j <js...@gmail.com> on 2015/10/03 06:01:40 UTC, 1 replies.
- Re: Limiting number of cores per job in multi-threaded driver. - posted by Philip Weaver <ph...@gmail.com> on 2015/10/03 06:02:50 UTC, 5 replies.
- Re: Contribution in Apche Spark - posted by Chintan Bhatt <ch...@charusat.ac.in> on 2015/10/03 08:56:12 UTC, 2 replies.
- Re: Hive ORC Malformed while loading into spark data frame - posted by Umesh Kacha <um...@gmail.com> on 2015/10/03 11:18:26 UTC, 1 replies.
- How to optimize group by query fired using hiveContext.sql? - posted by unk1102 <um...@gmail.com> on 2015/10/03 12:19:46 UTC, 0 replies.
- RE : Re: HDFS small file generation problem - posted by nibiau <ni...@free.fr> on 2015/10/03 13:15:21 UTC, 3 replies.
- Re: How to optimize group by query fired using hiveContext.sql? - posted by Alex Rovner <al...@magnetic.com> on 2015/10/03 14:57:07 UTC, 5 replies.
- Can we using Spark Streaming to stream data from Hive table partitions? - posted by unk1102 <um...@gmail.com> on 2015/10/03 15:01:09 UTC, 0 replies.
- performance difference between Thrift server and SparkSQL? - posted by Jeff Thompson <je...@gmail.com> on 2015/10/03 20:08:31 UTC, 2 replies.
- WARN NettyRpcEndpointRef: Error sending message [message = Heartbeat(driver,... - posted by Jacek Laskowski <ja...@japila.pl> on 2015/10/03 20:27:58 UTC, 2 replies.
- Q: optimal way to calculate aggregates on a stream - posted by igor <ig...@4friends.od.ua> on 2015/10/03 22:24:41 UTC, 0 replies.
- How to make sense of Spark log entries - posted by jeff saremi <je...@hotmail.com> on 2015/10/04 03:25:51 UTC, 1 replies.
- Can't determine cause of spark driver crash - posted by adamsky <ad...@gmail.com> on 2015/10/04 03:42:25 UTC, 0 replies.
- preferredNodeLocationData, SPARK-8949, and SparkContext - a leftover? - posted by Jacek Laskowski <ja...@japila.pl> on 2015/10/04 05:36:26 UTC, 2 replies.
- Re: laziness in textFile reading from HDFS? - posted by Matt Narrell <ma...@gmail.com> on 2015/10/04 06:50:12 UTC, 7 replies.
- Examples module not building in intellij - posted by Stephen Boesch <ja...@gmail.com> on 2015/10/04 08:06:35 UTC, 2 replies.
- ml.Pipeline without train step - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2015/10/04 11:34:55 UTC, 0 replies.
- java.lang.OutOfMemoryError: GC overhead limit exceeded - posted by t_ras <ma...@netvision.net.il> on 2015/10/04 13:26:07 UTC, 1 replies.
- Mini projects for spark novice - posted by Rahul Jeevanandam <ra...@incture.com> on 2015/10/04 16:06:14 UTC, 1 replies.
- Enriching df.write.jdbc - posted by Kapil Raaj <ca...@gmail.com> on 2015/10/04 16:19:51 UTC, 0 replies.
- Secondary Sorting in Spark - posted by Bill Bejeck <bb...@gmail.com> on 2015/10/04 20:41:16 UTC, 5 replies.
- spark-ec2 config files. - posted by Renato Perini <re...@gmail.com> on 2015/10/05 01:56:57 UTC, 2 replies.
- Usage of transform for code reuse between Streaming and Batch job affects the performance ? - posted by swetha <sw...@gmail.com> on 2015/10/05 02:59:09 UTC, 1 replies.
- Spark SQL with Hive error: "Conf non-local session path expected to be non-null;" - posted by YaoPau <jo...@gmail.com> on 2015/10/05 04:41:48 UTC, 1 replies.
- Spark 1.5.0 Error on startup - posted by Julius Fernandes <ju...@gmail.com> on 2015/10/05 05:26:52 UTC, 2 replies.
- How to install a Spark Package? - posted by jeff saremi <je...@hotmail.com> on 2015/10/05 05:55:30 UTC, 2 replies.
- String operation in filter with a special character - posted by Hemminger Jeff <je...@atware.co.jp> on 2015/10/05 07:59:55 UTC, 2 replies.
- K-Means seems biased to one center - posted by Justin Pihony <ju...@gmail.com> on 2015/10/05 08:00:11 UTC, 0 replies.
- OutOfMemoryError - posted by Ramkumar V <ra...@gmail.com> on 2015/10/05 08:56:02 UTC, 4 replies.
- looking for HDP users - posted by Tamas Szuromi <ta...@odigeo.com> on 2015/10/05 09:23:50 UTC, 0 replies.
- Graphx hangs and crashes on EdgeRDD creation - posted by William Saar <Wi...@king.com> on 2015/10/05 10:14:19 UTC, 1 replies.
- Re: Store DStreams into Hive using Hive Streaming - posted by Krzysztof Zarzycki <k....@gmail.com> on 2015/10/05 11:21:30 UTC, 2 replies.
- Job on Yarn not using all given capacity ends up failing - posted by Cesar Berezowski <ce...@adaltas.com> on 2015/10/05 12:56:17 UTC, 0 replies.
- StructType has more rows, than corresponding Row has objects. - posted by Eugene Morozov <ev...@gmail.com> on 2015/10/05 13:28:13 UTC, 2 replies.
- Spark handling parallel requests - posted by ta...@yahoo.com.INVALID on 2015/10/05 15:16:51 UTC, 7 replies.
- [Spark on YARN] Multiple Auxiliary Shuffle Service Versions - posted by Andreas Fritzler <an...@gmail.com> on 2015/10/05 15:22:42 UTC, 10 replies.
- Spark context on thrift server - posted by Younes Naguib <Yo...@tritondigital.com> on 2015/10/05 15:38:27 UTC, 1 replies.
- Error: could not find function "includePackage" - posted by "jayendra.parsai@yahoo.in" <ja...@yahoo.in> on 2015/10/05 15:46:53 UTC, 2 replies.
- DStream Transformation to save JSON in Cassandra 2.1 - posted by "Prateek ." <pr...@aricent.com> on 2015/10/05 16:14:11 UTC, 3 replies.
- save checkpoint during dataframe row iteration - posted by Justin Permar <ju...@weather.com> on 2015/10/05 16:59:45 UTC, 0 replies.
- Utility for PySpark DataFrames - smartframes - posted by Don Drake <do...@gmail.com> on 2015/10/05 17:35:41 UTC, 0 replies.
- Spark on YARN using Java 1.8 fails - posted by mvle <mv...@us.ibm.com> on 2015/10/05 17:41:21 UTC, 2 replies.
- GraphX: How can I tell if 2 nodes are connected? - posted by Dino Fancellu <di...@felstar.com> on 2015/10/05 17:51:14 UTC, 6 replies.
- How to change verbosity level and redirect verbosity to file? - posted by Sa...@wellsfargo.com on 2015/10/05 18:12:48 UTC, 1 replies.
- Custom RDD for Proprietary MPP database - posted by VJ Anand <vj...@sankia.com> on 2015/10/05 18:15:33 UTC, 0 replies.
- Broadcast var is null - posted by dpristin <dp...@gmail.com> on 2015/10/05 18:23:04 UTC, 7 replies.
- Exception: "You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt assembly" - posted by cherah30 <ah...@gmail.com> on 2015/10/05 18:25:54 UTC, 2 replies.
- Spark Survey Results 2015 are now available - posted by Denny Lee <de...@gmail.com> on 2015/10/05 18:54:47 UTC, 0 replies.
- Where to put import sqlContext.implicits._ to be able to work on DataFrames in another file? - posted by Kristina Rogale Plazonic <kp...@gmail.com> on 2015/10/05 19:05:19 UTC, 0 replies.
- Spark metrics cpu/memory - posted by gtanguy <g....@gmail.com> on 2015/10/05 19:19:19 UTC, 0 replies.
- Pyspark 1.5.1: Error when using findSynonyms after loading Word2VecModel - posted by evg952 <ed...@gmail.com> on 2015/10/05 19:51:28 UTC, 0 replies.
- Building RDD for a Custom MPP Database - posted by VJ <vj...@sankia.com> on 2015/10/05 19:53:27 UTC, 1 replies.
- RDD of ImmutableList - posted by Jakub Dubovsky <sp...@seznam.cz> on 2015/10/05 20:04:51 UTC, 6 replies.
- Re: "Method json([class java.util.HashMap]) does not exist" when reading JSON on PySpark - posted by Fernando Paladini <fn...@gmail.com> on 2015/10/05 20:23:26 UTC, 4 replies.
- Please help: Processes with HiveContext slower in cluster - posted by Sa...@wellsfargo.com on 2015/10/05 20:24:38 UTC, 0 replies.
- Re: Lookup / Access of master data in spark streaming - posted by Olivier Girardot <o....@lateral-thoughts.com> on 2015/10/05 22:40:40 UTC, 2 replies.
- Spark SQL "SELECT ... LIMIT" scans the entire Hive table? - posted by YaoPau <jo...@gmail.com> on 2015/10/05 22:53:26 UTC, 1 replies.
- Writing UDF with variable number of arguments - posted by tridib <tr...@live.com> on 2015/10/05 23:26:40 UTC, 0 replies.
- Streaming Performance w/ UpdateStateByKey - posted by Jeff Nadler <jn...@srcginc.com> on 2015/10/05 23:28:32 UTC, 2 replies.
- save DF to JDBC - posted by Ruslan Dautkhanov <da...@gmail.com> on 2015/10/05 23:44:20 UTC, 3 replies.
- Re: Spark thrift service and Hive impersonation. - posted by Jagat Singh <ja...@gmail.com> on 2015/10/05 23:51:51 UTC, 1 replies.
- ERROR: "Size exceeds Integer.MAX_VALUE" Spark 1.5 - posted by Muhammad Ahsan <mu...@gmail.com> on 2015/10/06 00:12:48 UTC, 1 replies.
- RE: No space left on device when running graphx job - posted by Jack Yang <ji...@uow.edu.au> on 2015/10/06 00:43:48 UTC, 0 replies.
- question on make multiple external calls within each partition - posted by Chen Song <ch...@gmail.com> on 2015/10/06 02:35:42 UTC, 3 replies.
- How can I disable logging when running local[*]? - posted by Jeff Jones <jj...@adaptivebiotech.com> on 2015/10/06 05:19:26 UTC, 7 replies.
- does KafkaCluster can be public ? - posted by Erwan ALLAIN <ea...@gmail.com> on 2015/10/06 11:46:07 UTC, 8 replies.
- unresolved dependency: org.apache.spark#spark-streaming_2.10;1.5.0: not found - posted by shahab <sh...@gmail.com> on 2015/10/06 11:50:38 UTC, 2 replies.
- GenericMutableRow and Row mismatch on Spark 1.5? - posted by Ophir Cohen <op...@gmail.com> on 2015/10/06 11:51:11 UTC, 7 replies.
- Enabling kryo serialization slows down machine learning app. - posted by "fede.sc" <fe...@gmail.com> on 2015/10/06 14:02:09 UTC, 0 replies.
- Re: extracting the top 100 values from an rdd and save it as text file - posted by gtanguy <g....@gmail.com> on 2015/10/06 14:19:52 UTC, 0 replies.
- Re: SparkR Error in sparkR.init(master=“local”) in RStudio - posted by akhandeshi <am...@gmail.com> on 2015/10/06 14:20:48 UTC, 10 replies.
- compatibility issue with Jersey2 - posted by oggie <go...@gmail.com> on 2015/10/06 14:57:44 UTC, 7 replies.
- Trying PCA on spark but serialization is error thrown - posted by Simon Hebert <si...@gmail.com> on 2015/10/06 17:12:02 UTC, 1 replies.
- ORC files created by Spark job can't be accessed using hive table - posted by unk1102 <um...@gmail.com> on 2015/10/06 17:51:37 UTC, 4 replies.
- Spark 1.3.1 on Yarn not using all given capacity - posted by czoo <ce...@adaltas.com> on 2015/10/06 18:05:48 UTC, 2 replies.
- API to run spark Jobs - posted by shahid qadri <sh...@icloud.com> on 2015/10/06 18:37:32 UTC, 4 replies.
- Weird performance pattern of Spark Streaming (1.4.1) + direct Kafka - posted by Gerard Maas <ge...@gmail.com> on 2015/10/06 18:45:25 UTC, 11 replies.
- Spark cache memory storage - posted by Lan Jiang <lj...@gmail.com> on 2015/10/06 21:15:14 UTC, 0 replies.
- How to avoid Spark shuffle spill memory? - posted by unk1102 <um...@gmail.com> on 2015/10/06 21:19:22 UTC, 1 replies.
- Re: 1.5 Build Errors - posted by Benjamin Zaitlen <qu...@gmail.com> on 2015/10/06 22:08:29 UTC, 0 replies.
- Help with big data operation performance - posted by Sa...@wellsfargo.com on 2015/10/06 23:05:26 UTC, 0 replies.
- Does feature parity exist between Scala and Python on Spark - posted by dant <da...@gmail.com> on 2015/10/06 23:14:21 UTC, 1 replies.
- Does feature parity exist between Spark and PySpark - posted by dant <da...@gmail.com> on 2015/10/07 00:15:11 UTC, 8 replies.
- Deep learning example using spark - posted by Angel Angel <ar...@gmail.com> on 2015/10/07 05:20:08 UTC, 0 replies.
- Re: Notification on Spark Streaming job failure - posted by Vikram Kone <vi...@gmail.com> on 2015/10/07 06:37:49 UTC, 3 replies.
- Re: Spark job workflow engine recommendations - posted by Vikram Kone <vi...@gmail.com> on 2015/10/07 06:40:54 UTC, 3 replies.
- Help needed to reproduce bug - posted by pnpritchard <ni...@falkonry.com> on 2015/10/07 07:18:03 UTC, 1 replies.
- What is the difference between ml.classification.LogisticRegression and mllib.classification.LogisticRegressionWithLBFGS - posted by YiZhi Liu <ja...@gmail.com> on 2015/10/07 08:47:57 UTC, 4 replies.
- Model exports PMML (Random Forest) - posted by Yasemin Kaya <go...@gmail.com> on 2015/10/07 09:51:47 UTC, 0 replies.
- spark multi tenancy - posted by Dominik Fries <do...@woodmark.de> on 2015/10/07 10:26:38 UTC, 4 replies.
- ClassCastException while reading data from HDFS through Spark - posted by Vinoth Sankar <vi...@gmail.com> on 2015/10/07 11:11:09 UTC, 1 replies.
- hiveContext sql number of tasks - posted by patcharee <Pa...@uni.no> on 2015/10/07 12:34:37 UTC, 1 replies.
- What happens in the master or slave launch ? - posted by Camelia Elena Ciolac <ca...@chalmers.se> on 2015/10/07 15:32:14 UTC, 1 replies.
- Temp files are not removed when done (Mesos) - posted by Alexei Bakanov <ru...@gmail.com> on 2015/10/07 16:14:13 UTC, 2 replies.
- This post has NOT been accepted by the mailing list yet. - posted by akhandeshi <am...@gmail.com> on 2015/10/07 17:10:32 UTC, 1 replies.
- spark performance non-linear response - posted by Yadid Ayzenberg <ya...@media.mit.edu> on 2015/10/07 17:26:24 UTC, 8 replies.
- Graceful shutdown drops processing in Spark Streaming - posted by Michal Čizmazia <mi...@gmail.com> on 2015/10/07 17:33:39 UTC, 2 replies.
- DataFrame with bean class - posted by VJ <vj...@sankia.com> on 2015/10/07 18:02:09 UTC, 0 replies.
- Optimal way to avoid processing null returns in Spark Scala - posted by swetha <sw...@gmail.com> on 2015/10/07 18:42:38 UTC, 1 replies.
- Spark standalone hangup during shuffle flatMap or explode in cluster - posted by Sa...@wellsfargo.com on 2015/10/07 19:23:20 UTC, 2 replies.
- Parquet file size - posted by Younes Naguib <Yo...@tritondigital.com> on 2015/10/07 20:07:11 UTC, 7 replies.
- Spark Streaming: Doing operation in Receiver vs RDD - posted by emiretsk <eu...@gmail.com> on 2015/10/07 21:55:17 UTC, 2 replies.
- Fwd: multiple count distinct in SQL/DataFrame? - posted by Reynold Xin <rx...@databricks.com> on 2015/10/07 22:43:35 UTC, 0 replies.
- Re: Asking about the trend of increasing latency, hbase spikes. - posted by Ted Yu <yu...@gmail.com> on 2015/10/08 02:38:06 UTC, 0 replies.
- Re: SparkSQL: First query execution is always slower than subsequent queries - posted by Michael Armbrust <mi...@databricks.com> on 2015/10/08 03:47:50 UTC, 1 replies.
- Is coalesce smart while merging partitions? - posted by Cesar Flores <ce...@gmail.com> on 2015/10/08 04:00:56 UTC, 2 replies.
- Default size of a datatype in SparkSQL - posted by vivek bhaskar <vi...@gmail.com> on 2015/10/08 07:11:55 UTC, 1 replies.
- Running Spark in Yarn-client mode - posted by Sushrut Ikhar <su...@gmail.com> on 2015/10/08 07:23:10 UTC, 2 replies.
- Build Failure - posted by shahid qadri <sh...@icloud.com> on 2015/10/08 09:55:30 UTC, 3 replies.
- sql query orc slow - posted by patcharee <Pa...@uni.no> on 2015/10/08 10:43:00 UTC, 10 replies.
- Spark ganglia jClassNotFoundException: org.apache.spark.metrics.sink.GangliaSink - posted by gtanguy <g....@gmail.com> on 2015/10/08 10:54:01 UTC, 0 replies.
- How can I read file from HDFS i sparkR from RStudio - posted by Amit Behera <am...@gmail.com> on 2015/10/08 11:58:39 UTC, 1 replies.
- Best practises to clean up RDDs for old applications - posted by Jens Rantil <je...@tink.se> on 2015/10/08 12:43:06 UTC, 0 replies.
- Example of updateStateByKey with initial RDD? - posted by Bryan <br...@gmail.com> on 2015/10/08 12:58:44 UTC, 4 replies.
- Spark 1.5.1 standalone cluster - wrong Akka remoting config? - posted by baraky <ba...@gmail.com> on 2015/10/08 13:04:03 UTC, 2 replies.
- Launching EC2 instances with Spark compiled for Scala 2.11 - posted by Theodore Vasiloudis <th...@gmail.com> on 2015/10/08 14:28:13 UTC, 1 replies.
- Using a variable (a column name) in an IF statement in Spark SQL - posted by Maheshakya Wijewardena <ma...@wso2.com> on 2015/10/08 15:13:22 UTC, 3 replies.
- RowNumber in HiveContext returns null or negative values - posted by Sa...@wellsfargo.com on 2015/10/08 16:25:17 UTC, 5 replies.
- Dataframes - sole data structure for parallel computations? - posted by "Tracewski, Lukasz " <lu...@credit-suisse.com> on 2015/10/08 16:40:58 UTC, 2 replies.
- JDBC thrift server - posted by Younes Naguib <Yo...@tritondigital.com> on 2015/10/08 16:46:00 UTC, 2 replies.
- How to register udf with Any or generic Type in spark - posted by dugasani jcreddy <jc...@yahoo.com.INVALID> on 2015/10/08 17:59:05 UTC, 1 replies.
- Error executing using alternating least square - posted by haridass saisriram <ha...@gmail.com> on 2015/10/08 19:03:06 UTC, 0 replies.
- Using Sqark SQL mapping over an RDD - posted by "Afshartous, Nick" <na...@turbine.com> on 2015/10/08 19:10:29 UTC, 3 replies.
- Applicative logs on Yarn - posted by ni...@free.fr on 2015/10/08 19:37:19 UTC, 1 replies.
- How to increase Spark partitions for the DataFrame? - posted by unk1102 <um...@gmail.com> on 2015/10/08 20:13:02 UTC, 5 replies.
- failed spark job reports on YARN as successful - posted by Lan Jiang <lj...@gmail.com> on 2015/10/08 20:16:33 UTC, 0 replies.
- Why dataframe.persist(StorageLevels.MEMORY_AND_DISK_SER) hangs for long time? - posted by unk1102 <um...@gmail.com> on 2015/10/08 20:27:40 UTC, 3 replies.
- Insert via HiveContext is slow - posted by Daniel Haviv <da...@veracity-group.com> on 2015/10/08 20:51:59 UTC, 4 replies.
- Re: Spark REST Job server feedback? - posted by Tim Smith <se...@gmail.com> on 2015/10/08 21:39:06 UTC, 0 replies.
- Streaming DirectKafka assertion errors - posted by Roman Garcia <ro...@gmail.com> on 2015/10/08 21:51:44 UTC, 2 replies.
- ValueError: can not serialize object larger than 2G - posted by XIANDI <zx...@hotmail.com> on 2015/10/08 21:56:06 UTC, 3 replies.
- unsubscribe - posted by Jürgen Fey <ju...@matchinguu.com> on 2015/10/08 23:10:14 UTC, 5 replies.
- Unexplained sleep time - posted by yael aharon <ya...@gmail.com> on 2015/10/09 00:03:10 UTC, 1 replies.
- Re: "Too many open files" exception on reduceByKey - posted by Tian Zhang <tz...@yahoo.com> on 2015/10/09 00:22:35 UTC, 3 replies.
- Architecture for a Spark batch job. - posted by Renato Perini <re...@gmail.com> on 2015/10/09 01:58:30 UTC, 0 replies.
- [Spark 1.5] Kinesis receivers not starting - posted by Bharath Mukkati <sp...@gmail.com> on 2015/10/09 03:11:34 UTC, 1 replies.
- Error in load hbase on spark - posted by Roy Wang <ro...@163.com> on 2015/10/09 04:29:30 UTC, 6 replies.
- error in sparkSQL 1.5 using count(1) in nested queries - posted by Jeff Thompson <je...@gmail.com> on 2015/10/09 05:47:50 UTC, 1 replies.
- weird issue with sqlContext.createDataFrame - pyspark 1.3.1 - posted by ping yan <sh...@gmail.com> on 2015/10/09 07:28:17 UTC, 2 replies.
- Fixed writer version as version1 for Parquet as wring a Parquet file. - posted by Hyukjin Kwon <gu...@gmail.com> on 2015/10/09 08:04:37 UTC, 1 replies.
- Cache in Spark - posted by vinod kumar <vi...@gmail.com> on 2015/10/09 10:47:43 UTC, 3 replies.
- Different partition number of GroupByKey leads different result - posted by Devin Huang <ho...@163.com> on 2015/10/09 11:05:21 UTC, 5 replies.
- run “dev/mima” error in spark1.4.1 - posted by wangxiaojing <u9...@gmail.com> on 2015/10/09 11:56:51 UTC, 0 replies.
- Datastore or DB for spark - posted by Rahul Jeevanandam <ra...@incture.com> on 2015/10/09 12:10:59 UTC, 6 replies.
- spark-submit hive connection through spark Initial job has not accepted any resources - posted by vinayak <vi...@tcs.com> on 2015/10/09 12:32:47 UTC, 2 replies.
- Kafka streaming "at least once" semantics - posted by bitborn <an...@ave81.com> on 2015/10/09 13:34:04 UTC, 4 replies.
- ExecutorLostFailure when working with RDDs - posted by Ivan Héda <iv...@gmail.com> on 2015/10/09 15:13:47 UTC, 1 replies.
- Kafka and Spark combination - posted by Nikhil Gs <gs...@gmail.com> on 2015/10/09 15:17:54 UTC, 2 replies.
- Streaming Application Unable to get Stream from Kafka - posted by "Prateek ." <pr...@aricent.com> on 2015/10/09 15:25:10 UTC, 0 replies.
- How to handle the UUID in Spark 1.3.1 - posted by java8964 <ja...@hotmail.com> on 2015/10/09 16:28:08 UTC, 3 replies.
- RE: Streaming Application Unable to get Stream from Kafka - posted by "Prateek ." <pr...@aricent.com> on 2015/10/09 16:34:52 UTC, 0 replies.
- Create hashmap using two RDD's - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/10/09 16:59:33 UTC, 6 replies.
- Issue with the class generated from avro schema - posted by alberskib <al...@gmail.com> on 2015/10/09 18:05:52 UTC, 3 replies.
- Best storage format for intermediate process - posted by Sa...@wellsfargo.com on 2015/10/09 20:25:47 UTC, 1 replies.
- updateStateByKey and Partitioner - posted by Tian Zhang <tz...@yahoo.com> on 2015/10/09 20:57:27 UTC, 0 replies.
- How to calculate percentile of a column of DataFrame? - posted by unk1102 <um...@gmail.com> on 2015/10/09 21:01:59 UTC, 29 replies.
- How to tune unavoidable group by query? - posted by unk1102 <um...@gmail.com> on 2015/10/09 21:07:16 UTC, 0 replies.
- How to compile Spark with customized Hadoop? - posted by Dogtail L <sp...@gmail.com> on 2015/10/09 21:10:38 UTC, 4 replies.
- Spark checkpoint restore failure due to s3 consistency issue - posted by Spark Newbie <sp...@gmail.com> on 2015/10/09 23:13:07 UTC, 3 replies.
- Question about GraphX connected-components - posted by John Lilley <jo...@redpoint.net> on 2015/10/09 23:13:47 UTC, 2 replies.
- Cannot connect to standalone spark cluster - posted by ekraffmiller <el...@gmail.com> on 2015/10/09 23:37:34 UTC, 1 replies.
- SQLcontext changing String field to Long - posted by Abhisheks <sm...@gmail.com> on 2015/10/10 01:55:11 UTC, 4 replies.
- akka.event.Logging$LoggerInitializationException - posted by lu...@sina.com on 2015/10/10 05:39:30 UTC, 0 replies.
- Re: Streaming Application Unable to get Stream from Kafka - posted by Terry Hoo <hu...@gmail.com> on 2015/10/10 06:20:39 UTC, 1 replies.
- Jar is cached in yarn-cluster mode? - posted by Rex Xiong <by...@gmail.com> on 2015/10/10 07:53:44 UTC, 0 replies.
- Re: Constant Spark execution time with different # of slaves - posted by Robineast <Ro...@xense.co.uk> on 2015/10/10 15:58:06 UTC, 0 replies.
- DataFrame Explode for ArrayBuffer[Any] - posted by Eugene Morozov <ev...@gmail.com> on 2015/10/10 16:06:00 UTC, 0 replies.
- Re: Spark GraphaX - posted by Robineast <Ro...@xense.co.uk> on 2015/10/10 16:26:35 UTC, 0 replies.
- Re: Checkpointing in Iterative Graph Computation - posted by Robineast <Ro...@xense.co.uk> on 2015/10/10 16:29:37 UTC, 1 replies.
- How StorageLevel, CacheManager and checkpointing influence computing RDD partitions? - posted by Jacek Laskowski <ja...@japila.pl> on 2015/10/10 16:37:39 UTC, 0 replies.
- updateStateByKey and stack overflow - posted by Tian Zhang <tz...@yahoo.com> on 2015/10/10 20:37:19 UTC, 1 replies.
- Join Order Optimization - posted by Raajay <ra...@gmail.com> on 2015/10/11 03:21:43 UTC, 5 replies.
- Re: Best practices to call small spark jobs as part of REST api - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/10/11 10:32:41 UTC, 2 replies.
- Why Spark Stream job stops producing outputs after a while? - posted by Uthayan Suthakar <ut...@gmail.com> on 2015/10/11 19:39:15 UTC, 3 replies.
- Hive with apache spark - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/10/11 19:42:37 UTC, 1 replies.
- Saprk 1.5 - How to join 3 RDDs in a SQL DF? - posted by Subhajit Purkayastha <sp...@p3si.net> on 2015/10/11 20:57:31 UTC, 4 replies.
- Handling expirying state in UDF - posted by brightsparc <br...@gmail.com> on 2015/10/12 02:14:27 UTC, 1 replies.
- yarn-cluster mode throwing NullPointerException - posted by Rachana Srivastava <Ra...@markmonitor.com> on 2015/10/12 05:49:52 UTC, 1 replies.
- Spark retrying task indefinietly - posted by Amit Singh Hora <ho...@gmail.com> on 2015/10/12 08:05:22 UTC, 1 replies.
- Re: Configuring Spark for reduceByKey on on massive data sets - posted by hotdog <li...@163.com> on 2015/10/12 08:30:03 UTC, 0 replies.
- TaskMemoryManager. cleanUpAllAllocatedMemory -> Memory leaks ??? - posted by Lei Wu <wu...@gmail.com> on 2015/10/12 09:28:49 UTC, 2 replies.
- Define new stage in pipeline - posted by Nethaji Chandrasiri <ne...@wso2.com> on 2015/10/12 10:21:43 UTC, 1 replies.
- how to use SharedSparkContext - posted by Fengdong Yu <fe...@everstring.com> on 2015/10/12 10:48:11 UTC, 2 replies.
- SQLContext within foreachRDD - posted by Daniel Haviv <da...@veracity-group.com> on 2015/10/12 11:52:37 UTC, 2 replies.
- "dynamically" sort a large collection? - posted by Yifan LI <ia...@gmail.com> on 2015/10/12 12:03:58 UTC, 3 replies.
- Data skipped while writing Spark Streaming output to HDFS - posted by Sathiskumar <sa...@gmail.com> on 2015/10/12 12:25:07 UTC, 1 replies.
- Is there any better way of writing this code - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/10/12 13:28:33 UTC, 1 replies.
- What is the abstraction for a Worker process in Spark code - posted by Muhammad Haseeb Javed <11...@seecs.edu.pk> on 2015/10/12 14:12:22 UTC, 1 replies.
- read from hive tables and write back to hive - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/10/12 15:10:16 UTC, 0 replies.
- pagination spark sq - posted by Ravisankar Mani <rr...@gmail.com> on 2015/10/12 16:05:05 UTC, 1 replies.
- Creating Custom Receiver for Spark Streaming - posted by Something Something <ma...@gmail.com> on 2015/10/12 17:47:57 UTC, 1 replies.
- Does Spark use more memory than MapReduce? - posted by YaoPau <jo...@gmail.com> on 2015/10/12 18:52:16 UTC, 2 replies.
- Can't create UDF's in spark 1.5 while running using the hive thrift service - posted by Trystan Leftwich <tr...@atscale.com> on 2015/10/12 19:01:59 UTC, 0 replies.
- Spark job is running infinitely - posted by Saurav Sinha <sa...@gmail.com> on 2015/10/12 19:07:23 UTC, 6 replies.
- Spark UI consuming lots of memory - posted by pnpritchard <ni...@falkonry.com> on 2015/10/12 20:01:54 UTC, 7 replies.
- OutOfMemoryError OOM ByteArrayOutputStream.hugeCapacity - posted by Alexander Pivovarov <ap...@gmail.com> on 2015/10/12 21:50:08 UTC, 0 replies.
- Re: why would a spark Job fail without throwing run-time exceptions? - posted by pnpritchard <ni...@falkonry.com> on 2015/10/12 22:06:04 UTC, 0 replies.
- Storing object in spark streaming - posted by Something Something <ma...@gmail.com> on 2015/10/12 23:03:47 UTC, 1 replies.
- Dev Setup for Python/Scala Packages - posted by bsowell <bs...@gmail.com> on 2015/10/12 23:22:09 UTC, 0 replies.
- Re: ClassCastException when use spark1.5.1 - posted by pnpritchard <ni...@falkonry.com> on 2015/10/12 23:22:44 UTC, 0 replies.
- Calculate Hierarchical root as new column - posted by epheatt <er...@gmail.com> on 2015/10/13 00:08:32 UTC, 0 replies.
- Problem installing Spark on Windows 8 - posted by Marco Mistroni <mm...@gmail.com> on 2015/10/13 00:11:58 UTC, 6 replies.
- Spark Streaming Latency in practice - posted by xweb <as...@gmail.com> on 2015/10/13 00:35:26 UTC, 1 replies.
- DEBUG level log in receivers and executors - posted by Spark Newbie <sp...@gmail.com> on 2015/10/13 01:09:33 UTC, 1 replies.
- Cannot get spark-streaming_2.10-1.5.0.pom from the maven repository - posted by y <yo...@gmail.com> on 2015/10/13 05:09:30 UTC, 6 replies.
- Install via directions in "Learning Spark". Exception when running bin/pyspark - posted by David Bess <da...@sbcglobal.net> on 2015/10/13 05:44:02 UTC, 2 replies.
- How to add transformations as pipeline Stages ? - posted by Nethaji Chandrasiri <ne...@wso2.com> on 2015/10/13 07:50:40 UTC, 0 replies.
- Spark DataFrame GroupBy into List - posted by SLiZn Liu <sl...@gmail.com> on 2015/10/13 08:08:33 UTC, 8 replies.
- writing to hive - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/10/13 09:29:37 UTC, 1 replies.
- Re: HiveThriftServer not registering with Zookeeper - posted by Xiaoyu Wang <wa...@jd.com> on 2015/10/13 10:35:58 UTC, 2 replies.
- an problem about zippartition - posted by 张仪yf1 <zh...@hikvision.com> on 2015/10/13 13:09:11 UTC, 2 replies.
- Why is the Spark Web GUI failing with JavaScript "Uncaught SyntaxError"? - posted by Joshua Fox <jo...@twiggle.com> on 2015/10/13 14:17:41 UTC, 6 replies.
- localhost webui port - posted by "Langston, Jim" <Ji...@dynatrace.com> on 2015/10/13 14:47:58 UTC, 1 replies.
- How can I use dynamic resource allocation option in spark-jobserver? - posted by "정유선 (JUNG YOUSUN)" <je...@sk.com> on 2015/10/13 15:21:31 UTC, 0 replies.
- Spark shuffle service does not work in stand alone - posted by Sa...@wellsfargo.com on 2015/10/13 16:23:03 UTC, 6 replies.
- Why is my spark executor is terminated? - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2015/10/13 16:34:19 UTC, 3 replies.
- Conf setting for Java Spark - posted by Ramkumar V <ra...@gmail.com> on 2015/10/13 19:50:18 UTC, 0 replies.
- Changing application log level in standalone cluster - posted by Tom Graves <tg...@yahoo.com.INVALID> on 2015/10/13 20:00:44 UTC, 1 replies.
- Generated ORC files cause NPE in Hive - posted by Daniel Haviv <da...@veracity-group.com> on 2015/10/13 20:14:29 UTC, 1 replies.
- Machine learning with spark (book code example error) - posted by Zsombor Egyed <eg...@starschema.net> on 2015/10/13 20:31:49 UTC, 2 replies.
- TTL for saveAsObjectFile() - posted by antoniosi <an...@gmail.com> on 2015/10/13 22:07:20 UTC, 1 replies.
- Re: SPARK SQL Error - posted by pnpritchard <ni...@falkonry.com> on 2015/10/13 22:41:32 UTC, 5 replies.
- Any plans to support Spark Streaming within an interactive shell? - posted by YaoPau <jo...@gmail.com> on 2015/10/13 22:47:49 UTC, 1 replies.
- Fwd: Problem about cannot open shared object file - posted by 赵夏 <zh...@gmail.com> on 2015/10/13 23:42:18 UTC, 0 replies.
- Spark 1.5 java.net.ConnectException: Connection refused - posted by Spark Newbie <sp...@gmail.com> on 2015/10/13 23:47:13 UTC, 6 replies.
- Announcement: Hackathon at Netherlands Cancer Institute next week - posted by Kees van Bochove <ke...@thehyve.nl> on 2015/10/13 23:55:22 UTC, 0 replies.
- Building with SBT and Scala 2.11 - posted by Jakob Odersky <jo...@gmail.com> on 2015/10/14 02:53:33 UTC, 5 replies.
- Fwd: [Streaming] join events in last 10 minutes - posted by Daniel Li <da...@gmail.com> on 2015/10/14 03:29:17 UTC, 0 replies.
- When does python program started in pyspark - posted by canan chen <cc...@gmail.com> on 2015/10/14 04:50:47 UTC, 1 replies.
- OutOfMemoryError When Reading Many json Files - posted by SLiZn Liu <sl...@gmail.com> on 2015/10/14 06:18:47 UTC, 2 replies.
- Running in cluster mode causes native library linking to fail - posted by Bernardo Vecchia Stein <be...@gmail.com> on 2015/10/14 06:44:51 UTC, 10 replies.
- [SQL] Memory leak with spark streaming and spark sql in spark 1.5.1 - posted by Terry Hoo <hu...@gmail.com> on 2015/10/14 10:07:39 UTC, 2 replies.
- java.io.InvalidClassException using spark1.4.1 for Terasort - posted by Shreeharsha G Neelakantachar <sh...@in.ibm.com> on 2015/10/14 10:42:04 UTC, 1 replies.
- Re: graphx - mutable? - posted by rohit13k <ro...@gmail.com> on 2015/10/14 11:03:17 UTC, 0 replies.
- EdgeTriplet showing two versions of the same vertex - posted by rohit13k <ro...@gmail.com> on 2015/10/14 11:13:57 UTC, 0 replies.
- spark sql OOM - posted by Andy Zhao <an...@gmail.com> on 2015/10/14 11:36:10 UTC, 4 replies.
- Node affinity for Kafka-Direct Stream - posted by Gerard Maas <ge...@gmail.com> on 2015/10/14 11:38:02 UTC, 9 replies.
- stability of Spark 1.4.1 with Python 3 versions - posted by "shoira.mukhsinova@bnpparibasfortis.com" <sh...@bnpparibasfortis.com> on 2015/10/14 13:06:42 UTC, 1 replies.
- Fwd: Partition Column in JDBCRDD or Datasource API - posted by satish chandra j <js...@gmail.com> on 2015/10/14 13:15:25 UTC, 0 replies.
- spark streaming filestream API - posted by "Chandra Mohan, Ananda Vel Murugan" <An...@honeywell.com> on 2015/10/14 13:42:44 UTC, 3 replies.
- thriftserver: access temp dataframe from in-memory of spark-shell - posted by Sa...@wellsfargo.com on 2015/10/14 16:10:25 UTC, 1 replies.
- NullPointerException when adding to accumulator - posted by "Sela, Amit" <AN...@paypal.com.INVALID> on 2015/10/14 16:30:32 UTC, 7 replies.
- Dynamic partitioning pruning - posted by Younes Naguib <Yo...@tritondigital.com> on 2015/10/14 16:50:10 UTC, 0 replies.
- Question about data frame partitioning in Spark 1.3.0 - posted by Cesar Flores <ce...@gmail.com> on 2015/10/14 17:45:11 UTC, 3 replies.
- Get *document*-topic distribution from PySpark LDA model? - posted by moustachio <re...@gmail.com> on 2015/10/14 18:02:32 UTC, 0 replies.
- Programmatically connect to remote YARN in yarn-client mode - posted by Florian Kaspar <fl...@onelogic.de> on 2015/10/14 19:01:54 UTC, 3 replies.
- Reusing Spark Functions - posted by "Starch, Michael D (398M)" <Mi...@jpl.nasa.gov> on 2015/10/14 19:18:31 UTC, 1 replies.
- Strange spark problems among different versions - posted by xia zhao <zh...@gmail.com> on 2015/10/14 20:11:15 UTC, 0 replies.
- If you use Spark 1.5 and disabled Tungsten mode ... - posted by Reynold Xin <rx...@databricks.com> on 2015/10/14 21:00:37 UTC, 3 replies.
- spark-shell :javap fails with complaint about JAVA_HOME, but it is set correctly - posted by Robert Dodier <ro...@gmail.com> on 2015/10/14 22:19:56 UTC, 1 replies.
- spark-avro 2.0.1 generates strange schema (spark-avro 1.0.0 is fine) - posted by Alex Nastetsky <al...@vervemobile.com> on 2015/10/14 22:38:12 UTC, 2 replies.
- PySpark - Hive Context Does Not Return Results but SQL Context Does for Similar Query. - posted by "charles.drotar" <ch...@capitalone.com> on 2015/10/14 22:38:33 UTC, 2 replies.
- Spark streaming checkpoint against s3 - posted by Tian Zhang <tz...@yahoo.com> on 2015/10/14 22:41:58 UTC, 2 replies.
- Spark 1.5.1 ClassNotFoundException in cluster mode. - posted by Renato Perini <re...@gmail.com> on 2015/10/14 23:05:28 UTC, 2 replies.
- IPv6 regression in Spark 1.5.1 - posted by Thomas Dudziak <to...@gmail.com> on 2015/10/14 23:40:30 UTC, 1 replies.
- Spark Master Dying saying TimeoutException - posted by Kartik Mathur <ka...@bluedata.com> on 2015/10/15 00:11:21 UTC, 2 replies.
- Application not found in Spark historyserver in yarn-client mode - posted by Anfernee Xu <an...@gmail.com> on 2015/10/15 01:20:12 UTC, 2 replies.
- dataframes and numPartitions - posted by Alex Nastetsky <al...@vervemobile.com> on 2015/10/15 05:14:03 UTC, 2 replies.
- Sensitivity analysis using Spark MlLib - posted by Sourav Mazumder <so...@gmail.com> on 2015/10/15 05:37:38 UTC, 0 replies.
- Spark 1.5 Streaming and Kinesis - posted by Phil Kallos <ph...@gmail.com> on 2015/10/15 07:49:54 UTC, 12 replies.
- Re: Spark on Mesos / Executor Memory - posted by Bharath Ravi Kumar <re...@gmail.com> on 2015/10/15 08:49:21 UTC, 4 replies.
- (Unknown) - posted by Lei Wu <wu...@gmail.com> on 2015/10/15 10:34:32 UTC, 1 replies.
- Design doc for Spark task scheduling - posted by Lei Wu <wu...@gmail.com> on 2015/10/15 10:38:28 UTC, 0 replies.
- PMML export for LinearRegressionModel - posted by Fazlan Nazeem <fa...@wso2.com> on 2015/10/15 11:10:32 UTC, 0 replies.
- How VectorIndexer works in Spark ML pipelines - posted by VISHNU SUBRAMANIAN <jo...@gmail.com> on 2015/10/15 11:14:53 UTC, 1 replies.
- Rdd Partitions issue - posted by Renu Yadav <yr...@gmail.com> on 2015/10/15 12:37:26 UTC, 0 replies.
- org.apache.spark.sql.AnalysisException with registerTempTable - posted by Yusuf Can Gürkan <yu...@useinsider.com> on 2015/10/15 13:22:04 UTC, 1 replies.
- Best practices to handle corrupted records - posted by Antonio Murgia <an...@studio.unibo.it> on 2015/10/15 14:28:41 UTC, 6 replies.
- Fwd: Get the previous state string - posted by Yogesh Vyas <in...@gmail.com> on 2015/10/15 14:41:45 UTC, 1 replies.
- How to specify the numFeatures in HashingTF - posted by Jianguo Li <fl...@gmail.com> on 2015/10/15 17:46:13 UTC, 1 replies.
- Re: - posted by Dirceu Semighini Filho <di...@gmail.com> on 2015/10/15 18:39:52 UTC, 0 replies.
- word2vec cosineSimilarity - posted by Arthur Chan <ar...@gmail.com> on 2015/10/15 18:57:53 UTC, 0 replies.
- SQL Context error in 1.5.1 - any work around ? - posted by Sourav Mazumder <so...@gmail.com> on 2015/10/15 18:59:02 UTC, 1 replies.
- Complex transformation on a dataframe column - posted by Hao Wang <bi...@gmail.com> on 2015/10/15 19:04:08 UTC, 1 replies.
- Can I convert RDD[My_OWN_JAVA_CLASS] to DataFrame in Spark 1.3.x? - posted by java8964 <ja...@hotmail.com> on 2015/10/15 19:26:45 UTC, 1 replies.
- Spark 1.5.1 ThriftServer - posted by Dirceu Semighini Filho <di...@gmail.com> on 2015/10/15 19:41:10 UTC, 0 replies.
- Spark SQL running totals - posted by Stefan Panayotov <sp...@msn.com> on 2015/10/15 19:48:56 UTC, 7 replies.
- multiple pyspark instances simultaneously (same time) - posted by "jeff.sadowski@gmail.com" <je...@gmail.com> on 2015/10/15 20:01:29 UTC, 4 replies.
- s3a file system and spark deployment mode - posted by Scott Reynolds <sr...@twilio.com> on 2015/10/15 20:04:34 UTC, 6 replies.
- How to enable Spark mesos docker executor? - posted by Klaus Ma <kl...@cguru.net> on 2015/10/16 03:28:13 UTC, 2 replies.
- [Spark ML] How to extends MLlib's optimization algorithm - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/10/16 04:37:50 UTC, 0 replies.
- Get list of Strings from its Previous State - posted by Yogesh Vyas <in...@gmail.com> on 2015/10/16 05:53:47 UTC, 0 replies.
- Is Feature Transformations supported by Spark export to PMML - posted by Weiwei Zhang <wz...@dons.usfca.edu> on 2015/10/16 07:40:39 UTC, 0 replies.
- Get the previous state string in Spark streaming - posted by "Chandra Mohan, Ananda Vel Murugan" <An...@honeywell.com> on 2015/10/16 07:40:41 UTC, 4 replies.
- Turn off logs in spark-sql shell - posted by Muhammad Ahsan <mu...@gmail.com> on 2015/10/16 07:56:01 UTC, 1 replies.
- Streaming of COAP Resources - posted by Sadaf <sa...@platalytics.com> on 2015/10/16 08:38:48 UTC, 1 replies.
- Ensuring eager evaluation inside mapPartitions - posted by alberskib <al...@gmail.com> on 2015/10/16 11:47:48 UTC, 2 replies.
- Re: filtering reversed tuples - posted by Gylfi <gy...@berkeley.edu> on 2015/10/16 12:39:27 UTC, 0 replies.
- HBase Spark Streaming giving error after restore - posted by Amit Singh Hora <ho...@gmail.com> on 2015/10/16 15:02:32 UTC, 4 replies.
- issue of tableau connect to spark sql 1.5 - posted by "Wangfei (X)" <wa...@huawei.com> on 2015/10/16 16:33:13 UTC, 0 replies.
- HTTP 500 if try to access Spark UI in yarn-cluster (only) - posted by Sebastian YEPES FERNANDEZ <sy...@gmail.com> on 2015/10/16 16:39:16 UTC, 0 replies.
- Convert SchemaRDD to RDD - posted by satish chandra j <js...@gmail.com> on 2015/10/16 16:41:35 UTC, 3 replies.
- Compiling spark 1.5.1 fails with scala.reflect.internal.Types$TypeError: bad symbolic reference. - posted by Simon Hafner <re...@gmail.com> on 2015/10/16 16:54:48 UTC, 0 replies.
- Issue of jar dependency in yarn-cluster mode - posted by Rex Xiong <by...@gmail.com> on 2015/10/16 16:57:20 UTC, 1 replies.
- Accessing HDFS HA from spark job (UnknownHostException error) - posted by kyarovoy <ky...@gmail.com> on 2015/10/16 17:58:37 UTC, 0 replies.
- Dynamic partition pruning - posted by Younes Naguib <Yo...@tritondigital.com> on 2015/10/16 19:20:36 UTC, 4 replies.
- PySpark + Streaming + DataFrames - posted by Jason White <ja...@shopify.com> on 2015/10/16 20:03:21 UTC, 4 replies.
- Problems w/YARN Spark Streaming app reading from Kafka - posted by Robert Towne <Ro...@WebTrends.com> on 2015/10/16 20:52:24 UTC, 1 replies.
- How to put an object in cache for ever in Streaming - posted by swetha <sw...@gmail.com> on 2015/10/16 21:02:19 UTC, 3 replies.
- Problem of RDD in calculation - posted by ChengBo <Ch...@huawei.com> on 2015/10/16 21:10:59 UTC, 3 replies.
- Question of RDD in calculation - posted by Shepherd <Ch...@huawei.com> on 2015/10/16 21:14:40 UTC, 0 replies.
- Clustering KMeans error in 1.5.1 - posted by robin_up <ro...@gmail.com> on 2015/10/16 22:36:04 UTC, 0 replies.
- In-memory computing and cache() in Spark - posted by Jia Zhan <zh...@gmail.com> on 2015/10/16 23:02:31 UTC, 5 replies.
- How to speed up reading from file? - posted by Sa...@wellsfargo.com on 2015/10/16 23:08:34 UTC, 1 replies.
- Multiple joins in Spark - posted by Shyam Parimal Katti <sp...@nyu.edu> on 2015/10/17 02:01:17 UTC, 4 replies.
- Location preferences in pyspark? - posted by Philip Weaver <ph...@gmail.com> on 2015/10/17 02:42:03 UTC, 2 replies.
- driver ClassNotFoundException when MySQL JDBC exceptions are thrown on executor - posted by Hurshal Patel <hp...@gmail.com> on 2015/10/17 03:03:20 UTC, 2 replies.
- How to have Single reference of a class in Spark Streaming? - posted by swetha <sw...@gmail.com> on 2015/10/17 03:05:54 UTC, 1 replies.
- repartition vs partitionby - posted by shahid qadri <sh...@icloud.com> on 2015/10/17 09:32:45 UTC, 4 replies.
- PySpark: breakdown application execution time and fine-tuning - posted by saluc <sa...@usi.ch> on 2015/10/17 12:10:39 UTC, 0 replies.
- can I use Spark as alternative for gem fire cache ? - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/10/17 15:28:13 UTC, 1 replies.
- Output println info in LogMessage Info ? - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/10/17 20:40:36 UTC, 1 replies.
- Spark Streaming scheduler delay VS driver.cores - posted by Adrian Tanase <at...@adobe.com> on 2015/10/17 21:58:18 UTC, 1 replies.
- Should I convert json into parquet? - posted by Gavin Yue <yu...@gmail.com> on 2015/10/17 23:07:45 UTC, 3 replies.
- Checkpointing calls the job twice? - posted by jatinganhotra <ja...@gmail.com> on 2015/10/18 05:38:37 UTC, 0 replies.
- callUdf("percentile_approx",col("mycol"),lit(0.25)) does not compile spark 1.5.1 source but it does work in spark 1.5.1 bin - posted by unk1102 <um...@gmail.com> on 2015/10/18 09:10:44 UTC, 4 replies.
- Spark Streaming - use the data in different jobs - posted by Oded Maimon <od...@scene53.com> on 2015/10/18 13:49:11 UTC, 2 replies.
- REST api to avoid spark context creation - posted by anshu shukla <an...@gmail.com> on 2015/10/18 14:13:07 UTC, 1 replies.
- No suitable Constructor found while compiling - posted by VJ Anand <vj...@sankia.com> on 2015/10/18 15:39:14 UTC, 1 replies.
- our spark gotchas report while creating batch pipeline - posted by "igor.berman" <ig...@gmail.com> on 2015/10/18 17:51:12 UTC, 2 replies.
- Pass spark partition explicitly ? - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/10/18 19:56:59 UTC, 2 replies.
- Indexing Support - posted by Mustafa Elbehery <el...@gmail.com> on 2015/10/18 23:16:27 UTC, 2 replies.
- Spark SQL Thriftserver and Hive UDF in Production - posted by ReeceRobinson <Re...@TheRobinsons.gen.nz> on 2015/10/19 05:04:35 UTC, 3 replies.
- pyspark groupbykey throwing error: unpack requires a string argument of length 4 - posted by fahad shah <sf...@gmail.com> on 2015/10/19 07:42:52 UTC, 4 replies.
- master die and worker registration failed with duplicated worker id - posted by ZhuGe <tc...@outlook.com> on 2015/10/19 09:43:17 UTC, 0 replies.
- best way to generate per key auto increment numerals after sorting - posted by fahad shah <sf...@gmail.com> on 2015/10/19 10:11:38 UTC, 2 replies.
- Re: Incrementally add/remove vertices in GraphX - posted by mas <ma...@gmail.com> on 2015/10/19 12:36:39 UTC, 0 replies.
- SHUFFLE in PARTITIONBY or shuffle in general - posted by shahid ashraf <sh...@trialx.com> on 2015/10/19 13:16:38 UTC, 1 replies.
- Is one batch created by Streaming Context always equal to one RDD? - posted by vaibhavrtk <va...@gmail.com> on 2015/10/19 13:39:38 UTC, 1 replies.
- How to take user jars precedence over Spark jars - posted by YiZhi Liu <ja...@gmail.com> on 2015/10/19 14:07:57 UTC, 3 replies.
- Spark executor on Mesos - how to set effective user id? - posted by Eugene Chepurniy <eu...@zoomdata.com> on 2015/10/19 14:14:05 UTC, 2 replies.
- Issue in spark batches - posted by varun sharma <va...@gmail.com> on 2015/10/19 14:48:07 UTC, 6 replies.
- spark streaming failing to replicate blocks - posted by Eugen Cepoi <ce...@gmail.com> on 2015/10/19 14:51:11 UTC, 5 replies.
- Re: How does shuffle work in spark ? - posted by shahid <sh...@trialx.com> on 2015/10/19 15:54:08 UTC, 2 replies.
- [Spark MLlib] How to apply spark ml given models for questions with general background - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/10/19 16:46:40 UTC, 0 replies.
- flattening a JSON data structure - posted by nunomrc <nu...@rightster.com> on 2015/10/19 17:08:32 UTC, 2 replies.
- k-prototypes in MLLib? - posted by Fernando Velasco <fe...@gmail.com> on 2015/10/19 17:38:04 UTC, 0 replies.
- pyspark: results differ based on whether persist() has been called - posted by peay2 <pe...@yahoo.fr> on 2015/10/19 18:04:52 UTC, 1 replies.
- How to calculate row by row and output results in Spark - posted by Shepherd <Ch...@huawei.com> on 2015/10/19 19:35:42 UTC, 2 replies.
- Storing Compressed data in HDFS into Spark - posted by ahaider3 <ah...@hawk.iit.edu> on 2015/10/19 20:13:45 UTC, 3 replies.
- writing avro parquet - posted by Alex Nastetsky <al...@vervemobile.com> on 2015/10/19 20:14:34 UTC, 1 replies.
- new 1.5.1 behavior - exception on executor throws ClassNotFound on driver - posted by gbop <li...@gmail.com> on 2015/10/19 20:15:53 UTC, 6 replies.
- Differentiate Spark streaming in event logs - posted by franklyn <fr...@gmail.com> on 2015/10/19 20:47:15 UTC, 1 replies.
- Spark SQL Exception: Conf non-local session path expected to be non-null - posted by YaoPau <jo...@gmail.com> on 2015/10/19 22:08:37 UTC, 5 replies.
- Dynamic Allocation & Spark Streaming - posted by robert towne <bi...@gmail.com> on 2015/10/19 22:13:42 UTC, 2 replies.
- Spark ML/MLib newbie question - posted by George Paulson <of...@cluemail.com> on 2015/10/19 23:02:50 UTC, 0 replies.
- Concurrency/Multiple Users - posted by GooniesNeverSayDie <rs...@progress.com> on 2015/10/19 23:13:24 UTC, 1 replies.
- serialization error - posted by daze5112 <da...@ato.gov.au> on 2015/10/20 00:01:04 UTC, 2 replies.
- java TwitterUtils.createStream() how create "user stream" ??? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/10/20 01:10:42 UTC, 1 replies.
- Multiple Spark Streaming Jobs on Single Master - posted by Augustus Hong <au...@branchmetrics.io> on 2015/10/20 01:26:27 UTC, 4 replies.
- Filter RDD - posted by Shepherd <Ch...@huawei.com> on 2015/10/20 01:27:05 UTC, 2 replies.
- Succinct experience - posted by Younes Naguib <Yo...@tritondigital.com> on 2015/10/20 01:35:22 UTC, 0 replies.
- why the Rating(user: Int, product: Int, rating: Double)(in MLlib's ALS), the 'user' and 'product' must be Int? - posted by futureage <li...@gmail.com> on 2015/10/20 05:58:04 UTC, 0 replies.
- Spark opening too many connections with zookeeper - posted by Amit Singh Hora <ho...@gmail.com> on 2015/10/20 09:32:20 UTC, 8 replies.
- [spark1.5.1] HiveQl.parse throws org.apache.spark.sql.AnalysisException: null - posted by Ayoub <be...@gmail.com> on 2015/10/20 11:25:14 UTC, 4 replies.
- difference between rdd.collect().toMap to rdd.collectAsMap() ? - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/10/20 11:35:11 UTC, 1 replies.
- Re: can I use Spark as alternative for gem fire cache ? - posted by Deenar Toraskar <de...@gmail.com> on 2015/10/20 11:42:57 UTC, 1 replies.
- Is there a way to create multiple streams in spark streaming? - posted by LinQili <li...@outlook.com> on 2015/10/20 12:20:14 UTC, 1 replies.
- Ahhhh... Spark creates >30000 partitions... What can I do? - posted by t3l <t3...@threelights.de> on 2015/10/20 14:00:03 UTC, 6 replies.
- Concurrency issue in Streams of data - posted by Priya Ch <le...@gmail.com> on 2015/10/20 14:15:02 UTC, 0 replies.
- spark straggle task - posted by "Triones,Deng(vip.com)" <tr...@vipshop.com> on 2015/10/20 14:20:09 UTC, 1 replies.
- Re: JdbcRDD Constructor - posted by satish chandra j <js...@gmail.com> on 2015/10/20 16:19:20 UTC, 1 replies.
- hbase refguide URL - posted by Ted Yu <yu...@gmail.com> on 2015/10/20 16:38:15 UTC, 0 replies.
- Top 10 count - posted by Carol McDonald <cm...@maprtech.com> on 2015/10/20 16:56:42 UTC, 3 replies.
- Partition for each executor - posted by t3l <t3...@threelights.de> on 2015/10/20 17:13:24 UTC, 1 replies.
- Hive custom transform scripts in Spark? - posted by wuyangjack <v-...@microsoft.com> on 2015/10/20 17:21:24 UTC, 3 replies.
- Spark SQL: Preserving Dataframe Schema - posted by Jerry Lam <ch...@gmail.com> on 2015/10/20 17:24:54 UTC, 5 replies.
- Spark: How to find similar text title - posted by Ascot Moss <as...@gmail.com> on 2015/10/20 17:39:06 UTC, 1 replies.
- Using spark in cluster mode - posted by masoom alam <ma...@wanclouds.net> on 2015/10/20 17:48:26 UTC, 1 replies.
- Can not subscribe to mailing list - posted by "jeff.sadowski@gmail.com" <je...@gmail.com> on 2015/10/20 17:48:49 UTC, 2 replies.
- mailing list subscription - posted by Jeff Sadowski <je...@gmail.com> on 2015/10/20 18:22:22 UTC, 1 replies.
- Incremental load of RDD from HDFS? - posted by Chris Spagnoli <cs...@clouddatavisualizer.com> on 2015/10/20 18:23:21 UTC, 2 replies.
- How to change the compression format when using SequenceFileOutputFormat with Spark - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/10/20 19:52:42 UTC, 1 replies.
- Preemption with Spark on Yarn - posted by "surbhi.mungre" <mu...@gmail.com> on 2015/10/20 20:28:13 UTC, 0 replies.
- hive thriftserver and fair scheduling - posted by Sadhan Sood <sa...@gmail.com> on 2015/10/20 20:55:55 UTC, 2 replies.
- Get statistic result from RDD - posted by Shepherd <Ch...@huawei.com> on 2015/10/20 22:33:39 UTC, 5 replies.
- Re: Problem building Spark - posted by Ted Yu <yu...@gmail.com> on 2015/10/20 23:11:33 UTC, 0 replies.
- Whether Spark is appropriate for our use case. - posted by Aliaksei Tsyvunchyk <at...@exadel.com> on 2015/10/20 23:29:22 UTC, 1 replies.
- SF Spark Office Hours Experiment - Friday Afternoon - posted by Holden Karau <ho...@pigscanfly.ca> on 2015/10/21 00:55:28 UTC, 4 replies.
- spark-shell (1.5.1) not starting cleanly on Windows. - posted by Renato Perini <re...@gmail.com> on 2015/10/21 01:36:21 UTC, 1 replies.
- How to distinguish columns when joining DataFrames with shared parent? - posted by Isabelle Phan <nl...@gmail.com> on 2015/10/21 02:23:08 UTC, 6 replies.
- Job spilling to disk and memory in Spark Streaming - posted by swetha <sw...@gmail.com> on 2015/10/21 02:59:08 UTC, 0 replies.
- Spark_1.5.1_on_HortonWorks - posted by Ajay Chander <it...@gmail.com> on 2015/10/21 06:05:31 UTC, 12 replies.
- Can we split partition - posted by shahid <sh...@trialx.com> on 2015/10/21 06:57:02 UTC, 1 replies.
- How to get Histogram of all columns in a large CSV / RDD[Array[double]] ? - posted by "DEVAN M.S." <ms...@gmail.com> on 2015/10/21 07:08:49 UTC, 0 replies.
- Spark job stuck at DataFrameReader.json() method - posted by "Bae, Jae Hyeon" <me...@gmail.com> on 2015/10/21 08:45:52 UTC, 2 replies.
- Getting info from DecisionTreeClassificationModel - posted by rake <ra...@randykerber.com> on 2015/10/21 09:04:49 UTC, 2 replies.
- Re: Job spilling to disk and memory in Spark Streaming - posted by Tathagata Das <td...@databricks.com> on 2015/10/21 09:36:57 UTC, 1 replies.
- How to use And Operator in filter (PySpark) - posted by Jeff Zhang <zj...@gmail.com> on 2015/10/21 12:11:06 UTC, 0 replies.
- Spark-Testing-Base Q/A - posted by Mark Vervuurt <m....@gmail.com> on 2015/10/21 12:16:30 UTC, 4 replies.
- Spark on Yarn - posted by Raghuveer Chanda <ra...@gmail.com> on 2015/10/21 12:33:11 UTC, 4 replies.
- Problem with applying Multivariate Gaussian Model - posted by Eyal Sharon <ey...@scene53.com> on 2015/10/21 12:35:20 UTC, 0 replies.
- Inner Joins on Cassandra RDDs - posted by Priya Ch <le...@gmail.com> on 2015/10/21 14:07:28 UTC, 0 replies.
- spark SQL thrift server - support for more features via jdbc (table catalog) - posted by rkrist <rk...@vub.sk> on 2015/10/21 15:22:30 UTC, 0 replies.
- Can we add an unsubscribe link in the footer of every email? - posted by Nicholas Chammas <ni...@gmail.com> on 2015/10/21 16:38:56 UTC, 1 replies.
- Spark 1.5.1 with Hive 0.13.1 - posted by Sébastien Rainville <se...@gmail.com> on 2015/10/21 17:18:22 UTC, 0 replies.
- How to check whether the RDD is empty or not - posted by diplomatic Guru <di...@gmail.com> on 2015/10/21 19:00:25 UTC, 4 replies.
- [Spark Streaming] Design Patterns forEachRDD - posted by Nipun Arora <ni...@gmail.com> on 2015/10/21 19:55:05 UTC, 2 replies.
- dataframe average error: Float does not take parameters - posted by Carol McDonald <cm...@maprtech.com> on 2015/10/21 20:12:47 UTC, 2 replies.
- Mapping to multiple groups in Apache Spark - posted by Jeffrey Richley <je...@gmail.com> on 2015/10/21 20:23:23 UTC, 1 replies.
- Kafka Streaming and Filtering > 3000 partitons - posted by Dave Ariens <da...@blackberry.com> on 2015/10/21 20:50:39 UTC, 4 replies.
- Slow activation using Spark Streaming's new receiver scheduling mechanism - posted by "Budde, Adam" <bu...@amazon.com> on 2015/10/21 21:15:19 UTC, 0 replies.
- Distributed caching of a file in Spark Streaming - posted by swetha <sw...@gmail.com> on 2015/10/21 22:05:48 UTC, 0 replies.
- spark streaming 1.5.1 uses very old version of twitter4j - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/10/21 22:11:20 UTC, 1 replies.
- Poor use cases for Spark - posted by Ben Thompson <t....@gmail.com> on 2015/10/21 22:27:37 UTC, 1 replies.
- --jars option not working for spark on Mesos in cluster mode - posted by Virag Kothari <vi...@streamsets.com> on 2015/10/21 23:14:19 UTC, 0 replies.
- java.util.NoSuchElementException: key not found error - posted by Sourav Mazumder <so...@gmail.com> on 2015/10/22 01:40:05 UTC, 1 replies.
- problems with spark 1.5.1 streaming TwitterUtils.createStream() - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/10/22 01:56:26 UTC, 0 replies.
- Sporadic error after moving from kafka receiver to kafka direct stream - posted by Conor Fennell <co...@altocloud.com> on 2015/10/22 02:23:43 UTC, 4 replies.
- Spark_sql - posted by Ajay Chander <it...@gmail.com> on 2015/10/22 03:32:59 UTC, 1 replies.
- how to use Trees and ensembles: class probabilities - posted by "r7raul1984@163.com" <r7...@163.com> on 2015/10/22 04:58:21 UTC, 0 replies.
- Analyzing consecutive elements - posted by Sampo Niskanen <sa...@wellmo.com> on 2015/10/22 08:35:45 UTC, 6 replies.
- Request for submitting Spark jobs in code purely, without jar - posted by 陈宇航 <yu...@foxmail.com> on 2015/10/22 08:43:19 UTC, 1 replies.
- Spark Streaming Stateful information - posted by Arttii <a....@reply.de> on 2015/10/22 11:54:03 UTC, 1 replies.
- Accessing external Kerberised resources from Spark executors in Yarn client/cluster mode - posted by Deenar Toraskar <de...@gmail.com> on 2015/10/22 13:59:20 UTC, 1 replies.
- Save RandomForest Model from ML package - posted by Sebastian Kuepers <se...@publicispixelpark.de> on 2015/10/22 14:33:45 UTC, 2 replies.
- Re: Error in starting Spark Streaming Context - posted by Tiago Albineli Motta <ti...@gmail.com> on 2015/10/22 15:22:58 UTC, 1 replies.
- Spark 1.5.1+Hadoop2.6 .. unable to write to S3 (HADOOP-12420) - posted by Ashish Shrowty <as...@gmail.com> on 2015/10/22 16:12:42 UTC, 3 replies.
- [Spark Streaming] How do we reset the updateStateByKey values. - posted by Uthayan Suthakar <ut...@gmail.com> on 2015/10/22 17:06:49 UTC, 5 replies.
- Spark groupby and agg inconsistent and missing data - posted by Sa...@wellsfargo.com on 2015/10/22 17:27:29 UTC, 3 replies.
- [jira] Ankit shared "SPARK-11213: Documentation for remote spark Submit for R Scripts from 1.5 on CDH 5.4" with you - posted by "Ankit (JIRA)" <ji...@apache.org> on 2015/10/22 17:39:27 UTC, 1 replies.
- Spark SQL: Issues with using DirectParquetOutputCommitter with APPEND mode and OVERWRITE mode - posted by Jerry Lam <ch...@gmail.com> on 2015/10/22 18:00:59 UTC, 0 replies.
- Spark 1.5 on CDH 5.4.0 - posted by Deenar Toraskar <de...@gmail.com> on 2015/10/22 18:04:00 UTC, 3 replies.
- Python worker exited unexpectedly (crashed) - posted by shahid <sh...@trialx.com> on 2015/10/22 18:37:27 UTC, 0 replies.
- Maven Repository Hosting for Spark SQL 1.5.1 - posted by William Li <a-...@expedia.com> on 2015/10/22 18:48:37 UTC, 4 replies.
- Re: Large number of conf broadcasts - posted by Koert Kuipers <ko...@tresata.com> on 2015/10/22 19:03:58 UTC, 4 replies.
- "java.io.IOException: Connection reset by peer" thrown on the resource manager when launching Spark on Yarn - posted by PashMic <pa...@gmail.com> on 2015/10/22 19:21:48 UTC, 0 replies.
- Running 2 spark application in parallel - posted by Suman Somasundar <su...@oracle.com> on 2015/10/22 20:20:50 UTC, 2 replies.
- spark.deploy.zookeeper.url - posted by Michal Čizmazia <mi...@gmail.com> on 2015/10/22 20:55:31 UTC, 0 replies.
- Fwd: sqlContext load by offset - posted by Kayode Odeyemi <dr...@gmail.com> on 2015/10/22 21:31:13 UTC, 1 replies.
- Re: How to avoid executor time out on yarn spark while dealing with large shuffle skewed data? - posted by Anubhav Agarwal <an...@gmail.com> on 2015/10/22 21:39:17 UTC, 0 replies.
- [SPARK STREAMING] polling based operation instead of event based operation - posted by Nipun Arora <ni...@gmail.com> on 2015/10/22 22:48:33 UTC, 2 replies.
- Spark YARN Shuffle service wire compatibility - posted by Jong Wook Kim <jo...@nyu.edu> on 2015/10/22 23:05:11 UTC, 0 replies.
- Getting ClassNotFoundException: scala.Some on Spark 1.5.x - posted by Babar Tareen <ba...@gmail.com> on 2015/10/22 23:47:14 UTC, 0 replies.
- Spark issue running jar on Linux vs Windows - posted by Michael Lewis <le...@icloud.com> on 2015/10/23 01:39:41 UTC, 3 replies.
- Best way to use Spark UDFs via Hive (Spark Thrift Server) - posted by Dave Moyers <da...@icloud.com> on 2015/10/23 02:15:41 UTC, 1 replies.
- Save to parquet files failed - posted by Ram VISWANADHA <ra...@dailymotion.com> on 2015/10/23 04:15:34 UTC, 1 replies.
- Saving RDDs in Tachyon - posted by mark <ma...@googlemail.com> on 2015/10/23 04:27:19 UTC, 1 replies.
- Saving offset while reading from kafka - posted by Ramkumar V <ra...@gmail.com> on 2015/10/23 05:07:02 UTC, 1 replies.
- (SOLVED) Ahhhh... Spark creates >30000 partitions... What can I do? - posted by t3l <t3...@threelights.de> on 2015/10/23 06:13:01 UTC, 0 replies.
- Whether Spark will use disk when the memory is not enough on MEMORY_ONLY Storage Level - posted by JoneZhang <jo...@gmail.com> on 2015/10/23 06:22:42 UTC, 2 replies.
- How to restart a failed Spark Streaming Application automatically in client mode on YARN - posted by y <yo...@gmail.com> on 2015/10/23 07:04:01 UTC, 1 replies.
- How to close connection in mapPartitions? - posted by Bin Wang <wb...@gmail.com> on 2015/10/23 08:16:08 UTC, 3 replies.
- How to get inverse Matrix / RDD or how to solve linear system of equations - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/10/23 11:19:10 UTC, 2 replies.
- Strange problem of SparkLauncher - posted by 陈宇航 <yu...@foxmail.com> on 2015/10/23 11:28:28 UTC, 0 replies.
- Running many small Spark jobs repeatedly - posted by Stephan Kepser <st...@codecentric.de> on 2015/10/23 12:00:48 UTC, 0 replies.
- java.lang.NegativeArraySizeException? as iterating a big RDD - posted by Yifan LI <ia...@gmail.com> on 2015/10/23 12:24:19 UTC, 3 replies.
- NoSuchMethodException : com.google.common.io.ByteStreams.limit - posted by jinhong lu <lu...@gmail.com> on 2015/10/23 13:10:00 UTC, 1 replies.
- Re: Cannot start REPL shell since 1.4.0 - posted by emlyn <em...@swiftkey.com> on 2015/10/23 13:12:10 UTC, 3 replies.
- How to set memory for SparkR with master="local[*]" - posted by Matej Holec <ho...@gmail.com> on 2015/10/23 13:43:38 UTC, 2 replies.
- Maven build failed (Spark master) - posted by Kayode Odeyemi <dr...@gmail.com> on 2015/10/23 15:14:32 UTC, 15 replies.
- I don't understand what this sentence means."7.1 GB of 7 GB physical memory used" - posted by JoneZhang <jo...@gmail.com> on 2015/10/23 15:46:02 UTC, 1 replies.
- Re: Unable to build Spark 1.5, is build broken or can anyone successfully build? - posted by Robineast <Ro...@xense.co.uk> on 2015/10/23 15:54:41 UTC, 0 replies.
- How does Spark coordinate with Tachyon wrt data locality - posted by "Kinsella, Shane" <Sh...@Aspect.com> on 2015/10/23 17:15:01 UTC, 1 replies.
- Improve parquet write speed to HDFS and spark.sql.execution.id is already set ERROR - posted by Anubhav Agarwal <an...@gmail.com> on 2015/10/23 17:25:07 UTC, 1 replies.
- Stream are not serializable - posted by crakjie <wa...@hotmail.fr> on 2015/10/23 18:49:04 UTC, 2 replies.
- Huge shuffle data size - posted by pratik khadloya <ti...@gmail.com> on 2015/10/23 20:10:27 UTC, 4 replies.
- spark.python.worker.memory Discontinuity - posted by Connor Zanin <cn...@udel.edu> on 2015/10/23 20:16:36 UTC, 0 replies.
- Spark error:- Not a valid DFS File name - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/10/23 20:40:02 UTC, 3 replies.
- How to implement zipWithIndex as a UDF? - posted by Benyi Wang <be...@gmail.com> on 2015/10/23 21:10:42 UTC, 2 replies.
- Spark cant ORC files properly using 1.5.1 hadoop 2.6 - posted by unk1102 <um...@gmail.com> on 2015/10/23 23:08:26 UTC, 0 replies.
- Spark Streaming: how to StreamingContext.queueStream - posted by Anfernee Xu <an...@gmail.com> on 2015/10/24 01:13:10 UTC, 0 replies.
- "Failed to bind to" error with spark-shell on CDH5 and YARN - posted by Lin Zhao <li...@exabeam.com> on 2015/10/24 01:46:27 UTC, 3 replies.
- get host from rdd map - posted by weoccc <we...@gmail.com> on 2015/10/24 02:16:40 UTC, 3 replies.
- streaming.twitter.TwitterUtils what is the best way to save twitter status to HDFS? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/10/24 03:35:34 UTC, 0 replies.
- [SPARK-9776]Another instance of Derby may have already booted the database #8947 - posted by "Ge, Yao (Y.)" <yg...@ford.com> on 2015/10/24 03:36:04 UTC, 0 replies.
- question about HadoopFsRelation - posted by Koert Kuipers <ko...@tresata.com> on 2015/10/24 06:23:27 UTC, 3 replies.
- Unable to use saveAsSequenceFile - posted by Amit Singh Hora <ho...@gmail.com> on 2015/10/24 13:34:38 UTC, 0 replies.
- Contributing Receiver based Low Level Kafka Consumer from Spark-Packages to Apache Spark Project - posted by Dibyendu Bhattacharya <di...@gmail.com> on 2015/10/24 17:09:37 UTC, 0 replies.
- java how to configure streaming.dstream.DStream<> saveAsTextFiles() to work with hdfs? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/10/24 22:27:00 UTC, 0 replies.
- spark inner join - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/10/24 22:31:51 UTC, 0 replies.
- [SPARK STREAMING] Concurrent operations in spark streaming - posted by Nipun Arora <ni...@gmail.com> on 2015/10/24 23:08:33 UTC, 4 replies.
- [Spark SQL]: Spark Job Hangs on the refresh method when saving over 1 million files - posted by Jerry Lam <ch...@gmail.com> on 2015/10/25 02:35:33 UTC, 8 replies.
- Newbie Help for spark compilation problem - posted by Bilinmek Istemiyor <be...@gmail.com> on 2015/10/25 15:56:37 UTC, 7 replies.
- SparkR in yarn-client mode needs sparkr.zip - posted by Ram Venkatesh <ve...@gmail.com> on 2015/10/25 16:29:24 UTC, 4 replies.
- Spark scala REPL - Unable to create sqlContext - posted by Yao <yg...@ford.com> on 2015/10/25 16:42:46 UTC, 4 replies.
- [SPARK MLLIB] could not understand the wrong and inscrutable result of Linear Regression codes - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/10/25 18:14:30 UTC, 8 replies.
- Error building Spark on Windows with sbt - posted by Richard Eggert <ri...@gmail.com> on 2015/10/25 20:38:11 UTC, 5 replies.
- [Yarn-Client]Can not access SparkUI - posted by Earthson <Ea...@gmail.com> on 2015/10/26 07:21:20 UTC, 3 replies.
- Re: Loading Files from HDFS Incurs Network Communication - posted by Sean Owen <so...@cloudera.com> on 2015/10/26 10:00:08 UTC, 12 replies.
- Accumulators internals and reliability - posted by "Sela, Amit" <AN...@paypal.com.INVALID> on 2015/10/26 10:13:45 UTC, 1 replies.
- Re: Accumulators internals and reliability - posted by Fengdong Yu <fe...@everstring.com> on 2015/10/26 11:55:36 UTC, 0 replies.
- Concurrent execution of actions within a driver - posted by praveen S <my...@gmail.com> on 2015/10/26 12:26:59 UTC, 3 replies.
- Spark 1.5.1 hadoop 2.4 does not clear hive staging files after job finishes - posted by unk1102 <um...@gmail.com> on 2015/10/26 13:08:13 UTC, 0 replies.
- Error Compiling Spark 1.4.1 w/ Scala 2.11 & Hive Support - posted by Bryan Jeffrey <br...@gmail.com> on 2015/10/26 14:01:29 UTC, 3 replies.
- HiveContext ignores ("skip.header.line.count"="1") - posted by Daniel Haviv <da...@veracity-group.com> on 2015/10/26 14:32:42 UTC, 3 replies.
- Re: Anyone feels sparkSQL in spark1.5.1 very slow? - posted by filthysocks <js...@uos.de> on 2015/10/26 14:35:09 UTC, 1 replies.
- Loading binary files from NFS share - posted by Kayode Odeyemi <dr...@gmail.com> on 2015/10/26 14:58:34 UTC, 2 replies.
- correct and fast way to stop streaming application - posted by Krot Viacheslav <kr...@gmail.com> on 2015/10/26 16:28:22 UTC, 5 replies.
- Problem with make-distribution.sh - posted by Yana Kadiyska <ya...@gmail.com> on 2015/10/26 16:34:30 UTC, 5 replies.
- Spark Streaming: how to use StreamingContext.queueStream with existing RDD - posted by Anfernee Xu <an...@gmail.com> on 2015/10/26 17:16:34 UTC, 1 replies.
- rdd conversion - posted by Yasemin Kaya <go...@gmail.com> on 2015/10/26 17:40:02 UTC, 2 replies.
- Broadcast table - posted by Younes Naguib <Yo...@tritondigital.com> on 2015/10/26 19:17:10 UTC, 2 replies.
- Spark Implementation of XGBoost - posted by Meihua Wu <ro...@gmail.com> on 2015/10/26 19:42:53 UTC, 7 replies.
- Custom function to operate on Dataframe Window - posted by aaryabhatta <ra...@gmail.com> on 2015/10/26 20:03:28 UTC, 0 replies.
- Kryo makes String data invalid - posted by Sa...@wellsfargo.com on 2015/10/26 20:17:22 UTC, 0 replies.
- Dynamic Resource Allocation with Spark Streaming (Standalone Cluster, Spark 1.5.1) - posted by Matthias Niehoff <ma...@codecentric.de> on 2015/10/26 21:00:33 UTC, 3 replies.
- LARGE COLLECT - posted by shahid qadri <sh...@icloud.com> on 2015/10/26 21:27:54 UTC, 0 replies.
- Submitting Spark Applications - Do I need to leave ports open? - posted by markluk <ma...@juicero.com> on 2015/10/26 21:35:48 UTC, 0 replies.
- Spark with business rules - posted by danilo <da...@gmail.com> on 2015/10/26 23:12:42 UTC, 2 replies.
- Results change in group by operation - posted by Sa...@wellsfargo.com on 2015/10/26 23:40:17 UTC, 0 replies.
- Joining large data sets - posted by Bryan <br...@gmail.com> on 2015/10/27 00:13:46 UTC, 0 replies.
- spark to hbase - posted by jinhong lu <lu...@gmail.com> on 2015/10/27 10:22:59 UTC, 7 replies.
- There is any way to write from spark to HBase CDH4? - posted by avivb <av...@taykey.com> on 2015/10/27 10:36:23 UTC, 6 replies.
- Re: spark to hbase - posted by "fightfate@163.com" <fi...@163.com> on 2015/10/27 10:42:12 UTC, 0 replies.
- Separate all values from Iterable - posted by Shams ul Haque <sh...@cashcare.in> on 2015/10/27 11:50:21 UTC, 1 replies.
- is it proper to make RDD as function parameter in the codes - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/10/27 11:50:28 UTC, 0 replies.
- SPARKONHBase checkpointing issue - posted by Amit Singh Hora <ho...@gmail.com> on 2015/10/27 12:53:28 UTC, 3 replies.
- specify yarn-client for --master from a laptop - posted by xiaohe lan <zo...@gmail.com> on 2015/10/27 13:19:20 UTC, 0 replies.
- get directory names that are affected by sc.textFile("path/to/dir/*/*/*.js") - posted by Նարեկ Գալստեան <ng...@gmail.com> on 2015/10/27 14:48:33 UTC, 2 replies.
- doc building process hangs on Failed to load class “org.slf4j.impl.StaticLoggerBinder” - posted by Alex Luya <al...@gmail.com> on 2015/10/27 15:09:07 UTC, 1 replies.
- [Spark Streaming] Why are some uncached RDDs are growing? - posted by diplomatic Guru <di...@gmail.com> on 2015/10/27 16:05:46 UTC, 1 replies.
- Spark SQL Persistent Table - joda DateTime Compatability - posted by Bryan Jeffrey <br...@gmail.com> on 2015/10/27 16:33:03 UTC, 1 replies.
- SparkSQL on hive error - posted by Anand Nalya <an...@gmail.com> on 2015/10/27 16:35:09 UTC, 1 replies.
- Using Hadoop Custom Input format in Spark - posted by "Balachandar R.A." <ba...@gmail.com> on 2015/10/27 17:53:31 UTC, 2 replies.
- [Spark-SQL]: Unable to propagate hadoop configuration after SparkContext is initialized - posted by Jerry Lam <ch...@gmail.com> on 2015/10/27 18:43:39 UTC, 6 replies.
- [Spark Streaming] Connect to Database only once at the start of Streaming job - posted by Uthayan Suthakar <ut...@gmail.com> on 2015/10/27 20:02:35 UTC, 3 replies.
- How to increase active job count to make spark job faster? - posted by unk1102 <um...@gmail.com> on 2015/10/27 20:08:30 UTC, 0 replies.
- Does using Custom Partitioner before calling reduceByKey improve performance? - posted by swetha <sw...@gmail.com> on 2015/10/27 21:20:59 UTC, 5 replies.
- Spark Core Transitive Dependencies - posted by Furkan KAMACI <fu...@gmail.com> on 2015/10/27 23:20:58 UTC, 3 replies.
- expected Kinesis checkpoint behavior when driver restarts - posted by Hster Geguri <hs...@gmail.com> on 2015/10/28 00:09:09 UTC, 1 replies.
- Berkeley DB storage for Spark - posted by Mina <se...@yahoo.com> on 2015/10/28 01:06:27 UTC, 0 replies.
- python.worker.memory parameter - posted by Connor Zanin <cn...@udel.edu> on 2015/10/28 02:29:38 UTC, 0 replies.
- How to catch error during Spark job? - posted by Isabelle Phan <nl...@gmail.com> on 2015/10/28 02:40:34 UTC, 0 replies.
- Filter applied on merged Parquet schema with new column fails. - posted by Hyukjin Kwon <gu...@gmail.com> on 2015/10/28 03:11:29 UTC, 1 replies.
- --jars option using hdfs jars cannot effect when spark standlone deploymode with cluster - posted by "ouruia@cnsuning.com" <ou...@cnsuning.com> on 2015/10/28 04:10:50 UTC, 0 replies.
- org.apache.spark.shuffle.FetchFailedException: Failed to connect to ..... on worker failure - posted by kundan kumar <ii...@gmail.com> on 2015/10/28 07:30:43 UTC, 0 replies.
- How to check whether my Spark Jobs are parallelized or not - posted by Vinoth Sankar <vi...@gmail.com> on 2015/10/28 10:35:32 UTC, 0 replies.
- Mllib explain feature for tree ensembles - posted by Eugen Cepoi <ce...@gmail.com> on 2015/10/28 11:29:39 UTC, 2 replies.
- Prevent partitions from moving - posted by t3l <t3...@threelights.de> on 2015/10/28 12:35:51 UTC, 0 replies.
- SparkR 1.5.1 ClassCastException when working with CSV files - posted by rporcio <rp...@gmail.com> on 2015/10/28 12:46:20 UTC, 1 replies.
- Why is no predicate pushdown performed, when using Hive (HiveThriftServer2) ? - posted by Martin Senne <ma...@googlemail.com> on 2015/10/28 13:32:42 UTC, 0 replies.
- How do I parallize Spark Jobs at Executor Level. - posted by Vinoth Sankar <vi...@gmail.com> on 2015/10/28 13:49:06 UTC, 5 replies.
- nested select is not working in spark sql - posted by Kishor Bachhav <kb...@pivotal.io> on 2015/10/28 13:52:50 UTC, 2 replies.
- Building spark-1.5.x and MQTT - posted by Bob Corsaro <rc...@gmail.com> on 2015/10/28 14:19:43 UTC, 4 replies.
- Apache Spark on Raspberry Pi Cluster with Docker - posted by Mark Bonnekessel <ma...@mailbox.org> on 2015/10/28 14:20:27 UTC, 1 replies.
- Hive Version - posted by Bryan Jeffrey <br...@gmail.com> on 2015/10/28 14:26:18 UTC, 1 replies.
- No way to supply hive-site.xml in yarn client mode? - posted by Zoltan Fedor <zo...@gmail.com> on 2015/10/28 15:28:03 UTC, 17 replies.
- Spark -- Writing to Partitioned Persistent Table - posted by Bryan Jeffrey <br...@gmail.com> on 2015/10/28 15:41:53 UTC, 11 replies.
- Inconsistent Persistence of DataFrames in Spark 1.5 - posted by Colin Alstad <co...@pokitdok.com> on 2015/10/28 16:38:52 UTC, 2 replies.
- Spark/Kafka Streaming Job Gets Stuck - posted by "Afshartous, Nick" <na...@turbine.com> on 2015/10/28 16:45:15 UTC, 6 replies.
- SparkSQL: What is the cost of DataFrame.registerTempTable(String)? Can I have multiple tables referencing to the same DataFrame? - posted by Anfernee Xu <an...@gmail.com> on 2015/10/28 18:17:50 UTC, 1 replies.
- newbie trouble submitting java app to AWS cluster I created using spark-ec2 script from spark-1.5.1-bin-hadoop2.6 distribution - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/10/28 22:37:45 UTC, 2 replies.
- Parquet Schema Projection - posted by elosev <el...@amazon.com> on 2015/10/28 22:44:08 UTC, 0 replies.
- Collect Column as Array in Grouped DataFrame - posted by saurfang <fo...@outlook.com> on 2015/10/29 02:19:08 UTC, 1 replies.
- NullPointerException when cache DataFrame in Java (Spark1.5.1) - posted by "Zhang, Jingyu" <ji...@news.com.au> on 2015/10/29 03:33:09 UTC, 3 replies.
- Re: newbie trouble submitting java app to AWS cluster I created using spark-ec2 script from spark-1.5.1-bin-hadoop2.6 distribution - posted by Sabarish Sasidharan <sa...@manthan.com> on 2015/10/29 04:44:35 UTC, 1 replies.
- Need more tasks in KafkaDirectStream - posted by varun sharma <va...@gmail.com> on 2015/10/29 07:27:49 UTC, 4 replies.
- Spark standalone: zookeeper timeout configuration - posted by zedar <za...@gmail.com> on 2015/10/29 11:11:20 UTC, 0 replies.
- Packaging a jar for a jdbc connection using sbt assembly and scala. - posted by "dean.wood" <de...@sparkol.com> on 2015/10/29 11:34:27 UTC, 3 replies.
- Mock Cassandra DB Connection in Unit Testing - posted by Priya Ch <le...@gmail.com> on 2015/10/29 12:27:47 UTC, 4 replies.
- Pivot Data in Spark and Scala - posted by Ascot Moss <as...@gmail.com> on 2015/10/29 12:29:41 UTC, 6 replies.
- [Spark] java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE - posted by Yifan LI <ia...@gmail.com> on 2015/10/29 12:52:15 UTC, 1 replies.
- Exception while reading from kafka stream - posted by Ramkumar V <ra...@gmail.com> on 2015/10/29 13:44:10 UTC, 13 replies.
- spark-1.5.1 application detail ui url - posted by carlilek <ca...@janelia.hhmi.org> on 2015/10/29 13:45:20 UTC, 1 replies.
- [SPARK STREAMING ] Sending data to ElasticSearch - posted by Nipun Arora <ni...@gmail.com> on 2015/10/29 15:49:08 UTC, 0 replies.
- submitting custom metrics.properties file - posted by Radu Brumariu <br...@gmail.com> on 2015/10/29 16:58:36 UTC, 0 replies.
- How to properly read the first number lines of file into a RDD - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/10/29 17:21:31 UTC, 0 replies.
- SPARK SQL- Parquet projection pushdown for nested data - posted by Sadhan Sood <sa...@gmail.com> on 2015/10/29 18:00:35 UTC, 2 replies.
- Loading dataframes to vertica database - posted by spakle <vi...@gmail.com> on 2015/10/29 18:37:10 UTC, 0 replies.
- RDD's filter() or using 'where' condition in SparkSQL - posted by Anfernee Xu <an...@gmail.com> on 2015/10/29 18:51:53 UTC, 3 replies.
- Exception in thread "main" java.lang.IllegalArgumentException: Positive number of slices required - posted by Jerry Wong <je...@gmail.com> on 2015/10/29 19:45:47 UTC, 0 replies.
- Aster Functions equivalent in spark : cfilter, npath and sessionize - posted by didier vila <sp...@hotmail.com> on 2015/10/29 20:16:37 UTC, 1 replies.
- sparkR 1.5.1 batch yarn-client mode failing on daemon.R not found - posted by tstewart <st...@yahoo.com> on 2015/10/29 20:17:19 UTC, 1 replies.
- Running FPGrowth over a JavaPairRDD? - posted by Fernando Paladini <fn...@gmail.com> on 2015/10/29 22:30:23 UTC, 1 replies.
- Maintaining overall cumulative data in Spark Streaming - posted by Sandeep Giri <sa...@knowbigdata.com> on 2015/10/29 23:08:42 UTC, 4 replies.
- Issue on spark.driver.maxResultSize - posted by karthik kadiyam <ka...@gmail.com> on 2015/10/30 00:01:19 UTC, 0 replies.
- Save data to different S3 - posted by William Li <a-...@expedia.com> on 2015/10/30 00:55:46 UTC, 4 replies.
- SparkLauncher is blocked until main process is killed. - posted by 陈宇航 <yu...@foxmail.com> on 2015/10/30 02:54:22 UTC, 1 replies.
- issue with spark.driver.maxResultSize parameter in spark 1.3 - posted by karthik kadiyam <ka...@gmail.com> on 2015/10/30 03:03:19 UTC, 2 replies.
- Re: SparkLauncher is blocked until main process is killed. - posted by Ted Yu <yu...@gmail.com> on 2015/10/30 03:11:17 UTC, 1 replies.
- Re: [Spark] java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE - posted by Deng Ching-Mallete <oc...@apache.org> on 2015/10/30 03:13:03 UTC, 1 replies.
- Re: SparkLauncher is blocked until main process is killed. - posted by 陈宇航 <yu...@foxmail.com> on 2015/10/30 03:56:46 UTC, 3 replies.
- Spark 1.5.1 Dynamic Resource Allocation - posted by tstewart <st...@yahoo.com> on 2015/10/30 04:00:06 UTC, 1 replies.
- Spark 1.5.1 Build Failure - posted by Raghuveer Chanda <ra...@gmail.com> on 2015/10/30 07:34:34 UTC, 3 replies.
- Best practises - posted by Deepak Sharma <de...@gmail.com> on 2015/10/30 11:53:03 UTC, 0 replies.
- Re: Best practises - posted by huangzheng <11...@qq.com> on 2015/10/30 12:05:08 UTC, 0 replies.
- Issue of Hive parquet partitioned table schema mismatch - posted by Rex Xiong <by...@gmail.com> on 2015/10/30 12:05:11 UTC, 3 replies.
- Spark Streaming (1.5.0) flaky when recovering from checkpoint - posted by "David P. Kleinschmidt" <da...@kleinschmidt.name> on 2015/10/30 13:44:46 UTC, 0 replies.
- heap memory - posted by Younes Naguib <Yo...@tritondigital.com> on 2015/10/30 14:46:14 UTC, 0 replies.
- Caching causes later actions to get stuck - posted by Sampo Niskanen <sa...@wellmo.com> on 2015/10/30 15:57:03 UTC, 0 replies.
- SparkR job with >200 tasks hangs when calling from web server - posted by rporcio <rp...@gmail.com> on 2015/10/30 16:09:03 UTC, 0 replies.
- Pulling data from a secured SQL database - posted by Thomas Ginter <th...@utah.edu> on 2015/10/30 18:49:25 UTC, 3 replies.
- how to merge two dataframes - posted by Yana Kadiyska <ya...@gmail.com> on 2015/10/30 20:11:15 UTC, 4 replies.
- Using model saved by MLlib with out creating spark context - posted by vijuks <vi...@gmail.com> on 2015/10/30 21:33:04 UTC, 0 replies.
- RE: Spark tunning increase number of active tasks - posted by "YI, XIAOCHUAN" <xy...@att.com> on 2015/10/30 22:11:16 UTC, 3 replies.
- key not found: sportingpulse.com in Spark SQL 1.5.0 - posted by "Zhang, Jingyu" <ji...@news.com.au> on 2015/10/30 22:57:56 UTC, 5 replies.
- Stack overflow error caused by long lineage RDD created after many recursions - posted by Panos Str <st...@gmail.com> on 2015/10/30 23:10:53 UTC, 1 replies.
- Very slow performance on very small record counts - posted by "Young, Matthew T" <ma...@intel.com> on 2015/10/30 23:38:13 UTC, 0 replies.
- Extending Spark ML LogisticRegression Object - posted by njoshi <ni...@teamaol.com> on 2015/10/31 00:32:00 UTC, 0 replies.
- foreachPartition - posted by Alex Nastetsky <al...@vervemobile.com> on 2015/10/31 00:42:11 UTC, 2 replies.
- CompositeInputFormat in Spark - posted by Alex Nastetsky <al...@vervemobile.com> on 2015/10/31 04:53:34 UTC, 0 replies.
- Performance issues in SSSP using GraphX - posted by Khaled Ammar <kh...@gmail.com> on 2015/10/31 05:01:37 UTC, 0 replies.
- Assign unique link ID - posted by Sarath Chandra <sa...@algofusiontech.com> on 2015/10/31 08:44:06 UTC, 3 replies.
- Programatically create RDDs based on input - posted by amit tewari <am...@gmail.com> on 2015/10/31 13:09:15 UTC, 3 replies.
- job hangs when using pipe() with reduceByKey() - posted by hotdog <li...@163.com> on 2015/10/31 13:18:40 UTC, 2 replies.
- How to lookup by a key in an RDD - posted by swetha <sw...@gmail.com> on 2015/10/31 17:04:07 UTC, 1 replies.
- Sorry, but Nabble and ML suck - posted by Martin Senne <ma...@googlemail.com> on 2015/10/31 17:19:50 UTC, 4 replies.
- Why does predicate pushdown not work on HiveContext (concrete HiveThriftServer2) ? - posted by Martin Senne <ma...@googlemail.com> on 2015/10/31 17:50:54 UTC, 0 replies.