user@spark.apache.org, 2015-03

- Re: Unable to find org.apache.spark.sql.catalyst.ScalaReflection class - posted by Ashish Nigam <as...@gmail.com> on 2015/03/01 00:06:32 UTC, 13 replies.
- Re: Tools to manage workflows on Spark - posted by Ashish Nigam <as...@gmail.com> on 2015/03/01 00:07:31 UTC, 11 replies.
- Re: Is there any Sparse Matrix implementation in Spark/MLib? - posted by Joseph Bradley <jo...@databricks.com> on 2015/03/01 01:25:39 UTC, 3 replies.
- Re: Some questions after playing a little with the new ml.Pipeline. - posted by Joseph Bradley <jo...@databricks.com> on 2015/03/01 01:32:20 UTC, 11 replies.
- Re: Reg. Difference in Performance - posted by Joseph Bradley <jo...@databricks.com> on 2015/03/01 01:34:42 UTC, 1 replies.
- Re: Accumulator in SparkUI for streaming - posted by Tim Smith <se...@gmail.com> on 2015/03/01 03:55:31 UTC, 0 replies.
- Connection pool in workers - posted by "A.K.M. Ashrafuzzaman" <as...@gmail.com> on 2015/03/01 07:40:22 UTC, 3 replies.
- RE: Spark SQL Stackoverflow error - posted by Jishnu Prathap <ji...@wipro.com> on 2015/03/01 09:03:50 UTC, 2 replies.
- Re: Scalable JDBCRDD - posted by Cody Koeninger <co...@koeninger.org> on 2015/03/01 09:09:01 UTC, 6 replies.
- Submitting jobs on Spark EC2 cluster: class not found, even if it's on CLASSPATH - posted by olegshirokikh <ol...@solver.com> on 2015/03/01 09:39:55 UTC, 0 replies.
- Spark Streaming testing strategies - posted by Marcin Kuthan <ma...@gmail.com> on 2015/03/01 10:13:49 UTC, 4 replies.
- Re: Number of cores per executor on Spark Standalone - posted by Deborah Siegel <de...@gmail.com> on 2015/03/01 10:58:56 UTC, 0 replies.
- Re: Columnar-Oriented RDDs - posted by Night Wolf <ni...@gmail.com> on 2015/03/01 12:33:35 UTC, 1 replies.
- Store Spark data into hive table - posted by tarek_abouzeid <ta...@yahoo.com> on 2015/03/01 15:51:25 UTC, 0 replies.
- unsafe memory access in spark 1.2.1 - posted by "Zalzberg, Idan (Agoda)" <Id...@agoda.com> on 2015/03/01 16:03:20 UTC, 5 replies.
- Re: Problem getting program to run on 15TB input - posted by Arun Luthra <ar...@gmail.com> on 2015/03/01 18:56:38 UTC, 1 replies.
- Pushing data from AWS Kinesis -> Spark Streaming -> AWS Redshift - posted by Mike Trienis <mi...@orcsol.com> on 2015/03/01 20:06:55 UTC, 1 replies.
- Re: Streaming scheduling delay - posted by Josh J <jo...@gmail.com> on 2015/03/01 23:03:03 UTC, 0 replies.
- Column Similarities using DIMSUM fails with GC overhead limit exceeded - posted by Sabarish Sasidharan <sa...@manthan.com> on 2015/03/02 00:31:45 UTC, 8 replies.
- RE: Is SPARK_CLASSPATH really deprecated? - posted by Taeyun Kim <ta...@innowireless.com> on 2015/03/02 01:05:46 UTC, 1 replies.
- documentation - graphx-programming-guide error? - posted by Deborah Siegel <de...@gmail.com> on 2015/03/02 08:12:10 UTC, 1 replies.
- Is SQLContext thread-safe? - posted by Haopu Wang <HW...@qilinsoft.com> on 2015/03/02 09:43:21 UTC, 5 replies.
- Re: SparkSQL Timestamp query failure - posted by anu <an...@gmail.com> on 2015/03/02 10:37:50 UTC, 2 replies.
- Architecture of Apache Spark SQL - posted by dubey_a <Ab...@xoriant.com> on 2015/03/02 10:48:53 UTC, 2 replies.
- SQL Queries running on Schema RDD's in Spark SQL - posted by dubey_a <Ab...@xoriant.com> on 2015/03/02 11:00:24 UTC, 0 replies.
- Performance tuning in Spark SQL. - posted by dubey_a <Ab...@xoriant.com> on 2015/03/02 11:01:32 UTC, 3 replies.
- Best practices for query creation in Spark SQL. - posted by dubey_a <Ab...@xoriant.com> on 2015/03/02 11:03:25 UTC, 1 replies.
- Re: Number of Executors per worker process - posted by Spico Florin <sp...@gmail.com> on 2015/03/02 11:05:09 UTC, 0 replies.
- Combiners in Spark - posted by Guillermo Ortiz <ko...@gmail.com> on 2015/03/02 11:55:52 UTC, 1 replies.
- GraphX path traversal - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2015/03/02 12:17:50 UTC, 7 replies.
- multiple sparkcontexts and streamingcontexts - posted by jamborta <ja...@gmail.com> on 2015/03/02 15:43:33 UTC, 7 replies.
- Re: Store DStreams into Hive using Hive Streaming - posted by tarek_abouzeid <ta...@yahoo.com> on 2015/03/02 15:47:30 UTC, 0 replies.
- Re: SparkSQL production readiness - posted by Daniel Siegmann <da...@teamaol.com> on 2015/03/02 16:29:22 UTC, 2 replies.
- Re: java.util.NoSuchElementException: key not found: - posted by Rok Roskar <ro...@gmail.com> on 2015/03/02 16:34:42 UTC, 0 replies.
- Re: bitten by spark.yarn.executor.memoryOverhead - posted by Ryan Williams <ry...@gmail.com> on 2015/03/02 17:36:37 UTC, 2 replies.
- Re: Upgrade to Spark 1.2.1 using Guava - posted by Pat Ferrel <pa...@occamsmachete.com> on 2015/03/02 17:52:03 UTC, 0 replies.
- Re: Is SparkSQL optimizer aware of the needed data after the query? - posted by Michael Armbrust <mi...@databricks.com> on 2015/03/02 18:30:30 UTC, 0 replies.
- Issues reading in Json file with spark sql - posted by kpeng1 <kp...@gmail.com> on 2015/03/02 20:28:15 UTC, 2 replies.
- Re: What joda-time dependency does spark submit use/need? - posted by Su She <su...@gmail.com> on 2015/03/02 20:37:34 UTC, 0 replies.
- External Data Source in Spark - posted by "Addanki, Santosh Kumar" <sa...@sap.com> on 2015/03/02 20:41:52 UTC, 3 replies.
- Executing hive query from Spark code - posted by nitinkak001 <ni...@gmail.com> on 2015/03/02 21:37:05 UTC, 4 replies.
- Dataframe v/s SparkSQL - posted by Manoj Samel <ma...@gmail.com> on 2015/03/02 21:50:19 UTC, 1 replies.
- Re: Workaround for spark 1.2.X roaringbitmap kryo problem? - posted by Arun Luthra <ar...@gmail.com> on 2015/03/02 21:55:11 UTC, 6 replies.
- JavaRDD method ambiguous after upgrading to Java 8 - posted by btiernay <bt...@hotmail.com> on 2015/03/02 23:03:08 UTC, 2 replies.
- Spark Error: Cause was: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@localhost:7077 - posted by Krishnanand Khambadkone <kk...@yahoo.com.INVALID> on 2015/03/02 23:06:43 UTC, 7 replies.
- RDD partitions per executor in Cassandra Spark Connector - posted by "Rumph, Frens Jan" <ma...@frensjan.nl> on 2015/03/02 23:08:29 UTC, 2 replies.
- Spark UI and running spark-submit with --master yarn - posted by Anupama Joshi <an...@gmail.com> on 2015/03/03 00:39:30 UTC, 6 replies.
- Problems running version 1.3.0-rc1 - posted by Yiannis Gkoufas <jo...@gmail.com> on 2015/03/03 03:13:19 UTC, 1 replies.
- Re: throughput in the web console? - posted by Saiph Kappa <sa...@gmail.com> on 2015/03/03 03:47:34 UTC, 1 replies.
- LBGFS optimizer performace - posted by Gustavo Enrique Salazar Torres <gs...@ime.usp.br> on 2015/03/03 05:39:30 UTC, 8 replies.
- how to clean shuffle write each iteration - posted by lisendong <li...@163.com> on 2015/03/03 07:33:20 UTC, 3 replies.
- Exception while select into table. - posted by LinQili <li...@outlook.com> on 2015/03/03 07:36:20 UTC, 3 replies.
- Spark SQL Thrift Server start exception : java.lang.ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory - posted by fanooos <de...@gmail.com> on 2015/03/03 07:49:32 UTC, 3 replies.
- Supporting Hive features in Spark SQL Thrift JDBC server - posted by shahab <sh...@gmail.com> on 2015/03/03 08:51:53 UTC, 13 replies.
- gc time too long when using mllib als - posted by lisendong <li...@163.com> on 2015/03/03 09:56:11 UTC, 2 replies.
- SparkSQL, executing an "OR" - posted by Guillermo Ortiz <ko...@gmail.com> on 2015/03/03 10:13:32 UTC, 2 replies.
- Return jobid for a hive query? - posted by Rex Xiong <by...@gmail.com> on 2015/03/03 10:30:27 UTC, 0 replies.
- Re: One of the executor not getting StopExecutor message - posted by twinkle sachdeva <tw...@gmail.com> on 2015/03/03 11:08:25 UTC, 1 replies.
- Is the RDD's Partitions determined before hand ? - posted by Jeff Zhang <zj...@gmail.com> on 2015/03/03 11:21:43 UTC, 6 replies.
- insert Hive table with RDD - posted by patcharee <Pa...@uni.no> on 2015/03/03 12:08:57 UTC, 3 replies.
- delay between removing the block manager of an executor, and marking that as lost - posted by twinkle sachdeva <tw...@gmail.com> on 2015/03/03 12:16:06 UTC, 1 replies.
- java.lang.IncompatibleClassChangeError when using PrunedFilteredScan - posted by taoewang <ta...@sequoiadb.com> on 2015/03/03 12:38:45 UTC, 2 replies.
- Re: RDDs - posted by "Kartheek.R" <ka...@gmail.com> on 2015/03/03 13:00:12 UTC, 3 replies.
- LATERAL VIEW explode requests the full schema - posted by matthes <ma...@web.de> on 2015/03/03 13:36:03 UTC, 1 replies.
- spark.local.dir leads to "Job cancelled because SparkContext was shut down" - posted by lisendong <li...@163.com> on 2015/03/03 14:15:56 UTC, 1 replies.
- (Unknown) - posted by shahab <sh...@gmail.com> on 2015/03/03 14:48:18 UTC, 3 replies.
- Can not query TempTable registered by SQL Context using HiveContext - posted by shahab <sh...@gmail.com> on 2015/03/03 14:52:22 UTC, 2 replies.
- Why can't Spark Streaming recover from the checkpoint directory when using a third party library for processingmulti-line JSON? - posted by Emre Sevinc <em...@gmail.com> on 2015/03/03 15:36:03 UTC, 3 replies.
- PRNG in Scala - posted by Vijayasarathy Kannan <kv...@vt.edu> on 2015/03/03 16:08:57 UTC, 2 replies.
- Re: Shared Drivers - posted by John Omernik <jo...@omernik.com> on 2015/03/03 16:10:26 UTC, 1 replies.
- Re: On app upgrade, restore sliding window data. - posted by Matus Faro <ma...@kik.com> on 2015/03/03 16:23:29 UTC, 0 replies.
- Re: Running Spark jobs via oozie - posted by nitinkak001 <ni...@gmail.com> on 2015/03/03 16:46:00 UTC, 1 replies.
- Re: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved attributes: pyspark on yarn - posted by Gustavo Enrique Salazar Torres <gs...@ime.usp.br> on 2015/03/03 17:38:29 UTC, 1 replies.
- Resource manager UI for Spark applications - posted by Rohini joshi <ro...@gmail.com> on 2015/03/03 17:45:57 UTC, 7 replies.
- Issue with yarn cluster - hangs in accepted state. - posted by abhi <ab...@gmail.com> on 2015/03/03 17:51:24 UTC, 3 replies.
- Solve least square problem of the form min norm(A x - b)^2^ + lambda * n * norm(x)^2 ? - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2015/03/03 18:01:22 UTC, 12 replies.
- Issue using S3 bucket from Spark 1.2.1 with hadoop 2.4 - posted by Ankur Srivastava <an...@gmail.com> on 2015/03/03 18:44:33 UTC, 2 replies.
- Spark Monitoring UI for Hadoop Yarn Cluster - posted by Srini Karri <sk...@gmail.com> on 2015/03/03 18:47:11 UTC, 6 replies.
- Re: Having lots of FetchFailedException in join - posted by Jianshi Huang <ji...@gmail.com> on 2015/03/03 20:03:17 UTC, 16 replies.
- UnsatisfiedLinkError related to libgfortran when running MLLIB code on RHEL 5.8 - posted by Prashant Sharma <sh...@gmail.com> on 2015/03/03 20:21:05 UTC, 1 replies.
- Why different numbers of partitions give different results for the same computation on the same dataset? - posted by Saiph Kappa <sa...@gmail.com> on 2015/03/03 20:32:06 UTC, 2 replies.
- dynamically change receiver for a spark stream - posted by Islem <is...@yahoo.fr> on 2015/03/03 21:44:03 UTC, 0 replies.
- Does sc.newAPIHadoopFile support multiple directories (or nested directories)? - posted by "S. Zhou" <my...@yahoo.com.INVALID> on 2015/03/03 22:59:34 UTC, 8 replies.
- java.lang.NoClassDefFoundError: org/apache/spark/streaming/kafka/KafkaUtils - posted by Krishnanand Khambadkone <kk...@yahoo.com.INVALID> on 2015/03/03 23:36:12 UTC, 0 replies.
- Spark sql results can't be printed out to system console from spark streaming application - posted by Cui Lin <cu...@hds.com> on 2015/03/04 00:55:05 UTC, 2 replies.
- ImportError: No module named iter ... (on CDH5 v1.2.0+cdh5.3.2+369-1.cdh5.3.2.p0.17.el6.noarch) ... - posted by "subscriptions@prismalytics.io" <su...@prismalytics.io> on 2015/03/04 01:21:34 UTC, 2 replies.
- TreeNodeException: Unresolved attributes - posted by Anusha Shamanur <an...@gmail.com> on 2015/03/04 01:45:27 UTC, 3 replies.
- Spark Streaming Switchover Time - posted by "Nastooh Avessta (navesta)" <na...@cisco.com> on 2015/03/04 06:57:38 UTC, 6 replies.
- how to save Word2VecModel - posted by anupamme <me...@gmail.com> on 2015/03/04 07:16:29 UTC, 2 replies.
- Connecting a PHP/Java applications to Spark SQL Thrift Server - posted by fanooos <de...@gmail.com> on 2015/03/04 08:15:08 UTC, 3 replies.
- Re: Parallel execution of JavaDStream/JavaPairDStream - posted by Jishnu Prathap <ji...@wipro.com> on 2015/03/04 09:02:58 UTC, 0 replies.
- Re: - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/03/04 09:34:59 UTC, 2 replies.
- Spark RDD Python, Numpy Shape command - posted by rui li <ru...@googlemail.com> on 2015/03/04 09:59:56 UTC, 0 replies.
- scala.Double vs java.lang.Double in RDD - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2015/03/04 10:17:41 UTC, 2 replies.
- spark master shut down suddenly - posted by lisendong <li...@163.com> on 2015/03/04 10:40:48 UTC, 4 replies.
- Spark Streaming and SchemaRDD usage - posted by Haopu Wang <HW...@qilinsoft.com> on 2015/03/04 11:04:24 UTC, 0 replies.
- Is FileInputDStream returned by fileStream method a reliable receiver? - posted by Emre Sevinc <em...@gmail.com> on 2015/03/04 11:14:22 UTC, 1 replies.
- Unable to submit spark job to mesos cluster - posted by Sarath Chandra <sa...@algofusiontech.com> on 2015/03/04 12:38:38 UTC, 3 replies.
- Re: Speed Benchmark - posted by Guillaume Guy <gu...@gmail.com> on 2015/03/04 12:52:22 UTC, 1 replies.
- Does SparkSQL support "..... having count (fieldname)" in SQL statement? - posted by shahab <sh...@gmail.com> on 2015/03/04 13:22:40 UTC, 3 replies.
- Save and read parquet from the same path - posted by Karlson <ks...@siberie.de> on 2015/03/04 16:14:41 UTC, 1 replies.
- Re: Nested Case Classes (Found and Required Same) - posted by Bojan Kostic <bl...@gmail.com> on 2015/03/04 16:51:12 UTC, 0 replies.
- Error communicating with MapOutputTracker - posted by Thomas Gerber <th...@radius.com> on 2015/03/04 17:15:08 UTC, 3 replies.
- Re: issue Running Spark Job on Yarn Cluster - posted by sachin Singh <sa...@gmail.com> on 2015/03/04 18:21:58 UTC, 1 replies.
- Does anyone integrate HBASE on Spark - posted by sandeep vura <sa...@gmail.com> on 2015/03/04 18:51:29 UTC, 1 replies.
- Passing around SparkContext with in the Driver - posted by kpeng1 <kp...@gmail.com> on 2015/03/04 19:09:11 UTC, 1 replies.
- Spark logs in standalone clusters - posted by Thomas Gerber <th...@radius.com> on 2015/03/04 19:21:47 UTC, 0 replies.
- spark sql median and standard deviation - posted by tridib <tr...@live.com> on 2015/03/04 19:51:03 UTC, 1 replies.
- configure number of cached partition in memory on SparkSQL - posted by Judy Nash <ju...@exchange.microsoft.com> on 2015/03/04 20:51:58 UTC, 0 replies.
- Integer column in schema RDD from parquet being considered as string - posted by gtinside <gt...@gmail.com> on 2015/03/04 21:34:18 UTC, 1 replies.
- Spark SQL Static Analysis - posted by Justin Pihony <ju...@gmail.com> on 2015/03/04 21:53:48 UTC, 2 replies.
- Driver disassociated - posted by Thomas Gerber <th...@radius.com> on 2015/03/04 22:39:53 UTC, 5 replies.
- distribution of receivers in spark streaming - posted by Du Li <li...@yahoo-inc.com.INVALID> on 2015/03/05 00:34:06 UTC, 5 replies.
- RDD coalesce or repartition by #records or #bytes? - posted by Du Li <li...@yahoo-inc.com.INVALID> on 2015/03/05 00:47:53 UTC, 1 replies.
- Issues with maven dependencies for version 1.2.0 but not version 1.1.0 - posted by kpeng1 <kp...@gmail.com> on 2015/03/05 00:49:59 UTC, 8 replies.
- In the HA master mode, how to identify the alive master? - posted by Xuelin Cao <xu...@gmail.com> on 2015/03/05 02:32:25 UTC, 0 replies.
- Extra output from Spark run - posted by cjwang <cj...@cjwang.us> on 2015/03/05 03:11:25 UTC, 2 replies.
- how to update als in mllib? - posted by lisendong <li...@163.com> on 2015/03/05 04:07:28 UTC, 0 replies.
- Re: Where can I find more information about the R interface for Spark? - posted by haopu <hw...@qilinsoft.com> on 2015/03/05 04:14:03 UTC, 0 replies.
- Re: Where can I find more information about the R interface forSpark? - posted by 鹰 <98...@qq.com> on 2015/03/05 04:19:26 UTC, 2 replies.
- Unable to Read/Write Avro RDD on cluster. - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/03/05 05:25:27 UTC, 0 replies.
- How to parse Json formatted Kafka message in spark streaming - posted by Cui Lin <cu...@hds.com> on 2015/03/05 06:43:15 UTC, 7 replies.
- using log4j2 with spark - posted by Lior Chaga <li...@taboola.com> on 2015/03/05 08:50:21 UTC, 1 replies.
- Re: Which OutputCommitter to use for S3? - posted by Pei-Lun Lee <pl...@appier.com> on 2015/03/05 09:28:40 UTC, 3 replies.
- Connection PHP application to Spark Sql thrift server - posted by fanooos <de...@gmail.com> on 2015/03/05 09:56:32 UTC, 2 replies.
- Re: Identify the performance bottleneck from hardware prospective - posted by davidkl <da...@hotmail.com> on 2015/03/05 10:39:03 UTC, 2 replies.
- Managing permissions when saving as text file - posted by didmar <ma...@gmail.com> on 2015/03/05 11:33:35 UTC, 2 replies.
- Nullpointer Exception on broadcast variables (YARN Cluster mode) - posted by samriddhac <sa...@gmail.com> on 2015/03/05 11:37:16 UTC, 0 replies.
- Map task in Trident. - posted by Vladimir Protsenko <pr...@gmail.com> on 2015/03/05 12:15:53 UTC, 0 replies.
- spark-shell --master yarn-client fail on Windows - posted by Xi Shen <da...@gmail.com> on 2015/03/05 14:02:18 UTC, 0 replies.
- Spark with data on NFS v HDFS - posted by Ashish Mukherjee <as...@gmail.com> on 2015/03/05 14:58:22 UTC, 1 replies.
- Partitioning Dataset and Using Reduce in Apache Spark - posted by raggy <ra...@gmail.com> on 2015/03/05 15:45:59 UTC, 3 replies.
- Spark Build with Hadoop 2.6, yarn - encounter java.lang.NoClassDefFoundError: org/codehaus/jackson/map/deser/std/StdDeserializer - posted by Todd Nist <ts...@gmail.com> on 2015/03/05 19:04:15 UTC, 12 replies.
- IncompatibleClassChangeError - posted by ey-chih chow <ey...@hotmail.com> on 2015/03/05 19:31:58 UTC, 1 replies.
- Spark v1.2.1 failing under BigTop build in External Flume Sink (due to missing Netty library) - posted by "Kelly, Jonathan" <jo...@amazon.com> on 2015/03/05 19:39:38 UTC, 3 replies.
- Training Random Forest - posted by drarse <dr...@gmail.com> on 2015/03/05 21:31:56 UTC, 1 replies.
- RE: spark standalone with multiple executors in one work node - posted by Judy Nash <ju...@exchange.microsoft.com> on 2015/03/05 22:59:40 UTC, 0 replies.
- Question about the spark assembly deployed to the cluster with the ec2 scripts - posted by Darin McBeath <dd...@yahoo.com.INVALID> on 2015/03/05 23:14:50 UTC, 0 replies.
- Spark Streaming - Duration 1s not matching reality - posted by eleroy <el...@msn.com> on 2015/03/06 00:10:18 UTC, 1 replies.
- Writing to S3 and retrieving folder names - posted by Mike Trienis <mi...@orcsol.com> on 2015/03/06 00:33:21 UTC, 1 replies.
- Building Spark 1.3 for Scala 2.11 using Maven - posted by Night Wolf <ni...@gmail.com> on 2015/03/06 01:46:09 UTC, 4 replies.
- Spark code development practice - posted by Xi Shen <da...@gmail.com> on 2015/03/06 02:19:00 UTC, 5 replies.
- spark-ec2 script problems - posted by roni <ro...@gmail.com> on 2015/03/06 02:44:54 UTC, 1 replies.
- SparkSQL JSON array support - posted by Justin Pihony <ju...@gmail.com> on 2015/03/06 03:11:10 UTC, 1 replies.
- Construct model matrix from SchemaRDD automatically - posted by Wush Wu <wu...@bridgewell.com> on 2015/03/06 05:43:47 UTC, 1 replies.
- why my YoungGen GC takes so long time? - posted by lisendong <li...@163.com> on 2015/03/06 07:17:27 UTC, 0 replies.
- spark-stream programme failed on yarn-client - posted by fenghaixiong <98...@qq.com> on 2015/03/06 08:39:15 UTC, 2 replies.
- Compile Spark with Maven & Zinc Scala Plugin - posted by Night Wolf <ni...@gmail.com> on 2015/03/06 08:51:45 UTC, 3 replies.
- Re: Compile Spark with Maven & Zinc Scala Plugin - posted by 鹰 <98...@qq.com> on 2015/03/06 08:56:32 UTC, 0 replies.
- Store the shuffled files in memory using Tachyon - posted by sara mustafa <en...@gmail.com> on 2015/03/06 09:25:41 UTC, 0 replies.
- No overwrite flag for saveAsXXFile - posted by Jeff Zhang <zj...@gmail.com> on 2015/03/06 11:14:19 UTC, 5 replies.
- Optimizing SQL Query - posted by anu <an...@gmail.com> on 2015/03/06 13:07:25 UTC, 2 replies.
- [SPARK-SQL] How to pass parameter when running hql script using cli? - posted by James <al...@gmail.com> on 2015/03/06 13:20:28 UTC, 2 replies.
- Using 1.3.0 client jars with 1.2.1 assembly in yarn-cluster mode - posted by Zsolt Tóth <to...@gmail.com> on 2015/03/06 15:08:03 UTC, 1 replies.
- Spark-SQL and Hive - is Hive required? - posted by Edmon Begoli <eb...@gmail.com> on 2015/03/06 15:31:47 UTC, 6 replies.
- Data Frame types - posted by Cesar Flores <ce...@gmail.com> on 2015/03/06 16:22:16 UTC, 2 replies.
- spark-sorted, or secondary sort and streaming reduce for spark - posted by Koert Kuipers <ko...@tresata.com> on 2015/03/06 17:53:11 UTC, 2 replies.
- Re: Visualize Spark Job - posted by Phuoc Do <ph...@vida.io> on 2015/03/06 19:13:50 UTC, 0 replies.
- SparkSQL supports hive "insert overwrite directory"? - posted by ogoh <ok...@gmail.com> on 2015/03/06 19:18:28 UTC, 0 replies.
- takeSample triggers 2 jobs - posted by Rares Vernica <rv...@gmail.com> on 2015/03/06 19:37:03 UTC, 1 replies.
- Help with transformWith in SparkStreaming - posted by Laeeq Ahmed <la...@yahoo.com.INVALID> on 2015/03/06 20:06:23 UTC, 2 replies.
- HiveContext test, "Spark Context did not initialize after waiting 10000ms" - posted by nitinkak001 <ni...@gmail.com> on 2015/03/06 23:47:15 UTC, 1 replies.
- Spark streaming and executor object reusage - posted by Jean-Pascal Billaud <jp...@tellapart.com> on 2015/03/07 02:32:00 UTC, 4 replies.
- How to reuse a ML trained model? - posted by Xi Shen <da...@gmail.com> on 2015/03/07 11:10:45 UTC, 7 replies.
- MLlib/kmeans newbie question(s) - posted by Pierce Lamb <ri...@gmail.com> on 2015/03/08 00:20:32 UTC, 1 replies.
- Re: distcp on ec2 standalone spark cluster - posted by roni <ro...@gmail.com> on 2015/03/08 03:02:59 UTC, 1 replies.
- Bulk insert strategy - posted by "A.K.M. Ashrafuzzaman" <as...@gmail.com> on 2015/03/08 07:54:09 UTC, 2 replies.
- A way to share RDD directly using Tachyon? - posted by Yijie Shen <he...@gmail.com> on 2015/03/08 11:29:36 UTC, 1 replies.
- Using sparkContext in inside a map function - posted by danielil <da...@veracity-group.com> on 2015/03/08 17:14:46 UTC, 1 replies.
- using sparkContext from within a map function (from spark streaming app) - posted by Daniel Haviv <da...@veracity-group.com> on 2015/03/08 17:22:29 UTC, 1 replies.
- Can't cache RDD of collaborative filtering on MLlib - posted by Yuichiro Sakamoto <ks...@muc.biglobe.ne.jp> on 2015/03/08 17:43:58 UTC, 4 replies.
- Re: Python script runs fine in local mode, errors in other modes - posted by Davies Liu <da...@databricks.com> on 2015/03/08 22:13:50 UTC, 0 replies.
- General Purpose Spark Cluster Hardware Requirements? - posted by Nasir Khan <na...@gmail.com> on 2015/03/08 22:29:16 UTC, 3 replies.
- A strange problem in spark sql join - posted by "Dai, Kevin" <yu...@ebay.com> on 2015/03/09 07:15:32 UTC, 0 replies.
- How to use the TF-IDF model? - posted by Xi Shen <da...@gmail.com> on 2015/03/09 07:39:53 UTC, 1 replies.
- what are the types of tasks when running ALS iterations - posted by lisendong <li...@163.com> on 2015/03/09 07:43:51 UTC, 1 replies.
- How to load my ML model? - posted by Xi Shen <da...@gmail.com> on 2015/03/09 07:54:22 UTC, 1 replies.
- Re: A strange problem in spark sql join - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/03/09 08:02:48 UTC, 1 replies.
- Re: No executors allocated on yarn with latest master branch - posted by Sandy Ryza <sa...@cloudera.com> on 2015/03/09 08:25:15 UTC, 0 replies.
- Ensuring data locality when opening files - posted by Daniel Haviv <da...@veracity-group.com> on 2015/03/09 08:46:15 UTC, 0 replies.
- How to build Spark and run examples using Intellij ? - posted by MEETHU MATHEW <me...@yahoo.co.in> on 2015/03/09 11:25:15 UTC, 0 replies.
- Is there any problem in having a long opened connection to spark sql thrift server - posted by fanooos <de...@gmail.com> on 2015/03/09 11:41:45 UTC, 1 replies.
- issue creating spark context with CDH 5.3.1 - posted by sachin Singh <sa...@gmail.com> on 2015/03/09 13:33:35 UTC, 2 replies.
- failure to display logs on YARN UI with log aggregation on - posted by rok <ro...@gmail.com> on 2015/03/09 15:29:57 UTC, 1 replies.
- How to preserve/preset partition information when load time series data? - posted by Shuai Zheng <sz...@gmail.com> on 2015/03/09 17:18:52 UTC, 2 replies.
- Read Parquet file from scala directly - posted by Shuai Zheng <sz...@gmail.com> on 2015/03/09 17:25:33 UTC, 2 replies.
- java.lang.RuntimeException: Couldn't find function Some - posted by Patcharee Thongtra <Pa...@uni.no> on 2015/03/09 18:03:40 UTC, 0 replies.
- GraphX Snapshot Partitioning - posted by Matthew Bucci <mr...@gmail.com> on 2015/03/09 18:21:16 UTC, 3 replies.
- saveAsTextFile extremely slow near finish - posted by mingweili0x <ml...@spokeo.com> on 2015/03/09 18:31:59 UTC, 3 replies.
- Spark with Spring - posted by Tarun Garg <bi...@live.com> on 2015/03/09 18:48:27 UTC, 1 replies.
- distcp problems on ec2 standalone spark cluster - posted by roni <ro...@gmail.com> on 2015/03/09 19:17:39 UTC, 0 replies.
- Spark Streaming input data source list - posted by Cui Lin <cu...@hds.com> on 2015/03/09 19:37:09 UTC, 4 replies.
- From Spark web ui, how to prove the parquet column pruning working - posted by java8964 <ja...@hotmail.com> on 2015/03/09 20:15:15 UTC, 1 replies.
- error on training with logistic regression sgd - posted by Peng Xia <sp...@gmail.com> on 2015/03/09 20:54:21 UTC, 1 replies.
- Joining data using Latitude, Longitude - posted by Ankur Srivastava <an...@gmail.com> on 2015/03/09 20:56:56 UTC, 6 replies.
- sc.textFile() on windows cannot access UNC path - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2015/03/09 22:09:38 UTC, 8 replies.
- Top, takeOrdered, sortByKey - posted by Saba Sehrish <ss...@fnal.gov> on 2015/03/09 22:21:39 UTC, 2 replies.
- yarn + spark deployment issues (high memory consumption and task hung) - posted by pranavkrs <pr...@yahoo.com> on 2015/03/09 22:22:29 UTC, 0 replies.
- Process time series RDD after sortByKey - posted by Shuai Zheng <sz...@gmail.com> on 2015/03/09 23:41:28 UTC, 0 replies.
- sparse vector operations in Python - posted by "Daniel, Ronald (ELS-SDG)" <R....@elsevier.com> on 2015/03/10 00:21:02 UTC, 1 replies.
- Re: Process time series RDD after sortByKey - posted by Zhan Zhang <zz...@hortonworks.com> on 2015/03/10 01:46:46 UTC, 5 replies.
- Spark History server default conf values - posted by Srini Karri <sk...@gmail.com> on 2015/03/10 05:36:48 UTC, 2 replies.
- Top rows per group - posted by Moss <rh...@gmail.com> on 2015/03/10 05:43:20 UTC, 1 replies.
- ANSI Standard Supported by the Spark-SQL - posted by Ravindra <ra...@gmail.com> on 2015/03/10 07:16:19 UTC, 3 replies.
- Registering custom UDAFs with HiveConetxt in SparkSQL, how? - posted by shahab <sh...@gmail.com> on 2015/03/10 10:44:24 UTC, 6 replies.
- Spark-on-YARN architecture - posted by Harika <ma...@gmail.com> on 2015/03/10 11:06:56 UTC, 3 replies.
- [SparkSQL] Reuse HiveContext to different Hive warehouse? - posted by Haopu Wang <HW...@qilinsoft.com> on 2015/03/10 11:37:34 UTC, 3 replies.
- Does any one know how to deploy a custom UDAF jar file in SparkSQL? - posted by shahab <sh...@gmail.com> on 2015/03/10 13:48:15 UTC, 1 replies.
- Pyspark not using all cores - posted by htailor <he...@live.co.uk> on 2015/03/10 16:30:46 UTC, 1 replies.
- Re: Setting up Spark with YARN on EC2 cluster - posted by roni <ro...@gmail.com> on 2015/03/10 18:27:27 UTC, 2 replies.
- Compilation error - posted by Mohit Anchlia <mo...@gmail.com> on 2015/03/10 18:54:16 UTC, 11 replies.
- ec2 persistent-hdfs with ebs using spot instances - posted by Deborah Siegel <de...@gmail.com> on 2015/03/10 20:17:58 UTC, 0 replies.
- How to pass parameter to spark-shell when choose client mode --master yarn-client - posted by Shuai Zheng <sz...@gmail.com> on 2015/03/10 20:19:39 UTC, 0 replies.
- java.io.InvalidClassException: org.apache.spark.rdd.PairRDDFunctions; local class incompatible: stream classdesc - posted by Manas Kar <ma...@gmail.com> on 2015/03/10 21:05:30 UTC, 1 replies.
- Writing wide parquet file in Spark SQL - posted by kpeng1 <kp...@gmail.com> on 2015/03/10 21:13:08 UTC, 2 replies.
- Temp directory used by spark-submit - posted by Justin Yip <yi...@prediction.io> on 2015/03/10 21:17:24 UTC, 1 replies.
- Compilation error on JavaPairDStream - posted by Mohit Anchlia <mo...@gmail.com> on 2015/03/10 21:32:42 UTC, 2 replies.
- Spark 1.3 SQL Type Parser Changes? - posted by Nitay Joffe <ni...@actioniq.co> on 2015/03/10 21:51:03 UTC, 2 replies.
- SchemaRDD: SQL Queries vs Language Integrated Queries - posted by Cesar Flores <ce...@gmail.com> on 2015/03/10 22:13:31 UTC, 4 replies.
- Hadoop Map vs Spark stream Map - posted by Mohit Anchlia <mo...@gmail.com> on 2015/03/10 23:04:16 UTC, 0 replies.
- SQL with Spark Streaming - posted by Mohit Anchlia <mo...@gmail.com> on 2015/03/11 01:12:15 UTC, 7 replies.
- Numbering RDD members Sequentially - posted by Steve Lewis <lo...@gmail.com> on 2015/03/11 01:31:14 UTC, 2 replies.
- Why spark master consumes 100% CPU when we kill a spark streaming app? - posted by Xuelin Cao <xu...@gmail.com> on 2015/03/11 03:10:12 UTC, 2 replies.
- Is it possible to use windows service to start and stop spark standalone cluster - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2015/03/11 04:47:35 UTC, 0 replies.
- Re: Is it possible to use windows service to start and stop spark standalone cluster - posted by Silvio Fiorito <si...@granturing.com> on 2015/03/11 05:39:59 UTC, 1 replies.
- S3 SubFolder Write Issues - posted by cpalm3 <cp...@gmail.com> on 2015/03/11 05:45:07 UTC, 3 replies.
- SocketTextStream not working from messages sent from other host - posted by Cui Lin <cu...@hds.com> on 2015/03/11 07:06:41 UTC, 1 replies.
- Split in Java - posted by Samarth Bhargav <ma...@gmail.com> on 2015/03/11 07:15:39 UTC, 0 replies.
- Spark fpg large basket - posted by Sean Barzilay <se...@gmail.com> on 2015/03/11 07:48:53 UTC, 6 replies.
- How to set per-user spark.local.dir? - posted by Jianshi Huang <ji...@gmail.com> on 2015/03/11 08:14:45 UTC, 4 replies.
- Example of partitionBy in pyspark - posted by Stephen Boesch <ja...@gmail.com> on 2015/03/11 09:41:56 UTC, 1 replies.
- skewed outer join with spark 1.2.0 - memory consumption - posted by Marcin Cylke <ma...@ext.allegro.pl> on 2015/03/11 11:19:56 UTC, 1 replies.
- "Timed out while stopping the job generator" plus subsequent failures - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2015/03/11 11:43:37 UTC, 3 replies.
- hbase sql query - posted by Udbhav Agarwal <ud...@syncoms.com> on 2015/03/11 12:16:43 UTC, 6 replies.
- Running Spark from Scala source files other than main file - posted by Aung Kyaw Htet <ak...@gmail.com> on 2015/03/11 12:53:46 UTC, 1 replies.
- Unable to saveToCassandra while cassandraTable works fine - posted by "Tiwari, Tarun" <Ta...@Kronos.com> on 2015/03/11 13:42:48 UTC, 1 replies.
- PairRDD serialization exception - posted by Manas Kar <ma...@gmail.com> on 2015/03/11 14:40:09 UTC, 5 replies.
- Define exception handling on lazy elements? - posted by Michal Klos <mi...@gmail.com> on 2015/03/11 14:51:02 UTC, 4 replies.
- SVM questions (data splitting, SVM parameters) - posted by Natalia Connolly <na...@gmail.com> on 2015/03/11 16:18:14 UTC, 1 replies.
- bad symbolic reference. A signature in SparkContext.class refers to term conf in value org.apache.hadoop which is not available - posted by Patcharee Thongtra <Pa...@uni.no> on 2015/03/11 16:37:23 UTC, 1 replies.
- Read parquet folders recursively - posted by Masf <ma...@gmail.com> on 2015/03/11 17:15:13 UTC, 4 replies.
- Architecture Documentation - posted by Mohit Anchlia <mo...@gmail.com> on 2015/03/11 17:48:19 UTC, 1 replies.
- Scaling problem in RandomForest? - posted by insperatum <in...@gmail.com> on 2015/03/11 18:00:03 UTC, 1 replies.
- Spark Streaming recover from Checkpoint with Spark SQL - posted by Marius Soutier <mp...@gmail.com> on 2015/03/11 18:35:18 UTC, 3 replies.
- Re: How to read from hdfs using spark-shell in Intel hadoop? - posted by Arush Kharbanda <ar...@sigmoidanalytics.com> on 2015/03/11 19:43:32 UTC, 0 replies.
- Getting incorrect weights for LinearRegression - posted by "EcoMotto Inc." <ec...@gmail.com> on 2015/03/11 19:59:01 UTC, 2 replies.
- Writing to a single file from multiple executors - posted by SamyaMaiti <sa...@gmail.com> on 2015/03/11 21:00:26 UTC, 3 replies.
- Which strategy is used for broadcast variables? - posted by Tom <th...@gmail.com> on 2015/03/11 21:57:05 UTC, 4 replies.
- Spark SQL using Hive metastore - posted by Grandl Robert <rg...@yahoo.com.INVALID> on 2015/03/11 22:06:54 UTC, 2 replies.
- SVD transform of large matrix with MLlib - posted by sergunok <se...@gmail.com> on 2015/03/11 22:33:47 UTC, 1 replies.
- can spark take advantage of ordered data? - posted by Jonathan Coveney <jc...@gmail.com> on 2015/03/11 22:38:04 UTC, 2 replies.
- PySpark: Python 2.7 cluster installation script (with Numpy, IPython, etc) - posted by Sebastián Ramírez <se...@senseta.com> on 2015/03/11 22:42:06 UTC, 0 replies.
- Re: Is it possible to use windows service to start and stop spark standalone cluster - posted by Yana Kadiyska <ya...@gmail.com> on 2015/03/11 22:44:28 UTC, 0 replies.
- JavaSparkContext - jarOfClass or jarOfObject dont work - posted by Nirav Patel <np...@xactlycorp.com> on 2015/03/12 01:14:43 UTC, 0 replies.
- Re: How to use more executors - posted by Du Li <li...@yahoo-inc.com.INVALID> on 2015/03/12 01:42:32 UTC, 3 replies.
- StreamingListener - posted by Corey Nolet <cj...@gmail.com> on 2015/03/12 02:18:59 UTC, 1 replies.
- Re: Taking a lot of time to write a ~500MB data into files/cassandra - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/03/12 07:08:58 UTC, 0 replies.
- Re: Unable to read files In Yarn Mode of Spark Streaming ? - posted by Prannoy <pr...@sigmoidanalytics.com> on 2015/03/12 08:32:45 UTC, 2 replies.
- connecting spark application with SAP hana - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/03/12 08:45:52 UTC, 1 replies.
- Using Neo4j with Apache Spark - posted by d34th4ck3r <ga...@gmail.com> on 2015/03/12 08:48:12 UTC, 11 replies.
- Re: Is there a limit to the number of RDDs in a Spark context? - posted by Juan Rodríguez Hortalá <ju...@gmail.com> on 2015/03/12 13:03:58 UTC, 0 replies.
- zzzzzzz - posted by zhuhuatong <zh...@126.com> on 2015/03/12 14:32:23 UTC, 0 replies.
- spark sql performance - posted by Udbhav Agarwal <ud...@syncoms.com> on 2015/03/12 15:02:31 UTC, 11 replies.
- Which is more efficient : first join three RDDs and then do filtering or vice versa? - posted by shahab <sh...@gmail.com> on 2015/03/12 16:04:36 UTC, 2 replies.
- run spark standalone mode - posted by Grandl Robert <rg...@yahoo.com.INVALID> on 2015/03/12 16:33:16 UTC, 2 replies.
- SPARKQL Join partitioner - posted by gtanguy <g....@gmail.com> on 2015/03/12 17:22:45 UTC, 0 replies.
- How to consider HTML files in Spark - posted by yh18190 <yh...@gmail.com> on 2015/03/12 17:26:34 UTC, 1 replies.
- KafkaUtils and specifying a specific partition - posted by ColinMc <co...@shiftenergy.com> on 2015/03/12 17:58:08 UTC, 5 replies.
- Jackson-core-asl conflict with Spark - posted by Uthayan Suthakar <ut...@gmail.com> on 2015/03/12 17:58:22 UTC, 5 replies.
- Efficient Top count in each window - posted by Laeeq Ahmed <la...@yahoo.com.INVALID> on 2015/03/12 19:06:22 UTC, 1 replies.
- Re: Stand-alone Spark on windows - posted by Arush Kharbanda <ar...@sigmoidanalytics.com> on 2015/03/12 19:06:27 UTC, 0 replies.
- Re: can not submit job to spark in windows - posted by Arush Kharbanda <ar...@sigmoidanalytics.com> on 2015/03/12 19:46:10 UTC, 1 replies.
- AWS SDK HttpClient version conflict (spark.files.userClassPathFirst not working) - posted by Adam Lewandowski <ad...@gmail.com> on 2015/03/12 19:50:31 UTC, 2 replies.
- Handling worker batch processing during driver shutdown - posted by Jose Fernandez <jf...@sdl.com> on 2015/03/12 20:27:54 UTC, 5 replies.
- spark sql writing in avro - posted by kpeng1 <kp...@gmail.com> on 2015/03/13 00:05:28 UTC, 5 replies.
- Limit # of parallel parquet decompresses - posted by ankits <an...@gmail.com> on 2015/03/13 00:07:04 UTC, 0 replies.
- Error running rdd.first on hadoop - posted by "Lau, Kawing (GE Global Research)" <ka...@ge.com> on 2015/03/13 00:45:41 UTC, 1 replies.
- Logistic Regression displays ERRORs - posted by cjwang <cj...@cjwang.us> on 2015/03/13 01:46:25 UTC, 1 replies.
- repartitionAndSortWithinPartitions and mapPartitions and sort order - posted by Darin McBeath <dd...@yahoo.com.INVALID> on 2015/03/13 02:11:34 UTC, 0 replies.
- Unable to stop Worker in standalone mode by sbin/stop-all.sh - posted by sequoiadb <ma...@sequoiadb.com> on 2015/03/13 02:18:19 UTC, 3 replies.
- Re: NegativeArraySizeException when doing joins on skewed data - posted by Soila Pertet Kavulya <sk...@gmail.com> on 2015/03/13 02:24:41 UTC, 0 replies.
- Support for skewed joins in Spark - posted by Soila Pertet Kavulya <sk...@gmail.com> on 2015/03/13 02:37:38 UTC, 2 replies.
- SV: Pyspark Hbase scan. - posted by Castberg, René Christian <Re...@dnvgl.com> on 2015/03/13 07:13:41 UTC, 1 replies.
- set up spark cluster with heterogeneous hardware - posted by Du Li <li...@yahoo-inc.com.INVALID> on 2015/03/13 07:30:30 UTC, 1 replies.
- Explanation on the Hive in the Spark assembly - posted by "bit1129@163.com" <bi...@163.com> on 2015/03/13 09:24:09 UTC, 5 replies.
- No assemblies found in assembly/target/scala-2.10 - posted by Patcharee Thongtra <Pa...@uni.no> on 2015/03/13 09:26:53 UTC, 1 replies.
- Spark SQL. Cast to Bigint - posted by Masf <ma...@gmail.com> on 2015/03/13 09:48:53 UTC, 2 replies.
- How to do sparse vector product in Spark? - posted by Xi Shen <da...@gmail.com> on 2015/03/13 09:49:39 UTC, 2 replies.
- RDD to InputStream - posted by Ayoub <be...@gmail.com> on 2015/03/13 10:54:19 UTC, 4 replies.
- Using rdd methods with Dstream - posted by Laeeq Ahmed <la...@yahoo.com.INVALID> on 2015/03/13 11:41:46 UTC, 6 replies.
- Visualizing the DAG of a Spark application - posted by t1ny <wb...@gmail.com> on 2015/03/13 11:42:02 UTC, 2 replies.
- Pyspark saveAsTextFile exceptions - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2015/03/13 12:51:04 UTC, 0 replies.
- [GRAPHX] could not process graph with 230M edges - posted by Hlib Mykhailenko <hl...@inria.fr> on 2015/03/13 16:09:28 UTC, 1 replies.
- Errors in SPARK - posted by sandeep vura <sa...@gmail.com> on 2015/03/13 16:30:21 UTC, 4 replies.
- Lots of fetch failures on saveAsNewAPIHadoopDataset PairRDDFunctions - posted by freedafeng <fr...@yahoo.com> on 2015/03/13 16:42:07 UTC, 0 replies.
- Workflow layer for Spark - posted by Karthikeyan Muthukumar <mk...@gmail.com> on 2015/03/13 16:46:21 UTC, 2 replies.
- Hanging tasks in spark 1.2.1 while working with 1.1.1 - posted by Eugen Cepoi <ce...@gmail.com> on 2015/03/13 17:45:05 UTC, 6 replies.
- [ANNOUNCE] Announcing Spark 1.3! - posted by Patrick Wendell <pw...@gmail.com> on 2015/03/13 18:00:14 UTC, 2 replies.
- jar conflict with Spark default packaging - posted by Shuai Zheng <sz...@gmail.com> on 2015/03/13 18:04:06 UTC, 1 replies.
- How do I alter the combination of keys that exit the Spark shell? - posted by Adamantios Corais <ad...@gmail.com> on 2015/03/13 18:29:55 UTC, 5 replies.
- serialization stackoverflow error during reduce on nested objects - posted by ilaxes <il...@hotmail.com> on 2015/03/13 18:52:29 UTC, 2 replies.
- Date and decimal datatype not working - posted by "BASAK, ANANDA" <ab...@att.com> on 2015/03/13 19:23:47 UTC, 8 replies.
- Any way to find out feature importance in Spark SVM? - posted by Natalia Connolly <na...@gmail.com> on 2015/03/13 19:35:40 UTC, 1 replies.
- how to print RDD by key into file with groupByKey - posted by Adrian Mocanu <am...@verticalscope.com> on 2015/03/13 19:58:54 UTC, 1 replies.
- Problem connecting to HBase - posted by HARIPRIYA AYYALASOMAYAJULA <ah...@gmail.com> on 2015/03/13 20:18:24 UTC, 6 replies.
- spark flume tryOrIOException NoSuchMethodError - posted by jaredtims <ja...@yahoo.com> on 2015/03/13 22:06:02 UTC, 0 replies.
- Partitioning - posted by Mohit Anchlia <mo...@gmail.com> on 2015/03/13 22:26:21 UTC, 4 replies.
- Unable to connect - posted by Mohit Anchlia <mo...@gmail.com> on 2015/03/13 22:41:49 UTC, 1 replies.
- LogisticRegressionWithLBFGS shows ERRORs - posted by cjwang <cj...@cjwang.us> on 2015/03/13 22:41:53 UTC, 4 replies.
- Spark on HDFS vs. Lustre vs. other file systems - formal research and performance evaluation - posted by Edmon Begoli <eb...@gmail.com> on 2015/03/13 23:06:38 UTC, 0 replies.
- org.apache.spark.SparkException Error sending message - posted by Chen Song <ch...@gmail.com> on 2015/03/13 23:38:31 UTC, 1 replies.
- Spark will process _temporary folder on S3 is very slow and always cause failure - posted by Shuai Zheng <sz...@gmail.com> on 2015/03/13 23:51:00 UTC, 5 replies.
- Need Advice about reading lots of text files - posted by Pat Ferrel <pa...@occamsmachete.com> on 2015/03/14 00:06:52 UTC, 8 replies.
- Spark SQL 1.3 max operation giving wrong results - posted by gtinside <gt...@gmail.com> on 2015/03/14 00:12:42 UTC, 1 replies.
- Loading in json with spark sql - posted by kpeng1 <kp...@gmail.com> on 2015/03/14 00:45:26 UTC, 2 replies.
- building all modules in spark by mvn - posted by sequoiadb <ma...@sequoiadb.com> on 2015/03/14 01:57:58 UTC, 2 replies.
- Upgrade from Spark 1.1.0 to 1.1.1+ Issues - posted by EH <ea...@gmail.com> on 2015/03/14 02:17:46 UTC, 8 replies.
- spark there is no space on the disk - posted by Peng Xia <sp...@gmail.com> on 2015/03/14 03:10:15 UTC, 7 replies.
- How does Spark honor data locality when allocating computing resources for an application - posted by "bit1129@163.com" <bi...@163.com> on 2015/03/14 03:41:23 UTC, 2 replies.
- Aggregation of distributed datasets - posted by raggy <ra...@gmail.com> on 2015/03/14 05:56:15 UTC, 0 replies.
- Please help me understand TF-IDF Vector structure - posted by Xi Shen <da...@gmail.com> on 2015/03/14 08:05:07 UTC, 1 replies.
- Streaming linear regression example question - posted by Margus Roo <ma...@roo.ee> on 2015/03/14 08:05:28 UTC, 3 replies.
- deploying Spark on standalone cluster - posted by sara mustafa <en...@gmail.com> on 2015/03/14 08:13:05 UTC, 3 replies.
- How to avoid using some nodes while running a spark program on yarn - posted by James <al...@gmail.com> on 2015/03/14 09:49:19 UTC, 4 replies.
- Spark Release 1.3.0 DataFrame API - posted by David Mitchell <jd...@gmail.com> on 2015/03/14 16:32:57 UTC, 5 replies.
- Pausing/throttling spark/spark-streaming application - posted by tulinski <to...@gmail.com> on 2015/03/14 17:59:23 UTC, 0 replies.
- Spark and HBase join issue - posted by francexo83 <fr...@gmail.com> on 2015/03/14 18:52:17 UTC, 1 replies.
- Bug in "Spark SQL and Dataframes" : "Inferring the Schema Using Reflection"? - posted by Dean Arnold <re...@gmail.com> on 2015/03/14 19:55:06 UTC, 1 replies.
- How to create data frame from an avro file in Spark 1.3.0 - posted by Shing Hing Man <ma...@yahoo.com.INVALID> on 2015/03/14 21:17:25 UTC, 0 replies.
- Bug in Streaming files? - posted by Justin Pihony <ju...@gmail.com> on 2015/03/14 21:18:24 UTC, 1 replies.
- Re: How to create data frame from an avro file in Spark 1.3.0 - posted by Michael Armbrust <mi...@databricks.com> on 2015/03/14 21:58:17 UTC, 0 replies.
- order preservation with RDDs - posted by "kian.ho" <hu...@gmail.com> on 2015/03/15 04:51:06 UTC, 2 replies.
- 1.3 release - posted by Eric Friedman <er...@gmail.com> on 2015/03/15 07:22:08 UTC, 6 replies.
- Null Pointer Exception due to mapVertices function in GraphX - posted by James <al...@gmail.com> on 2015/03/15 08:54:47 UTC, 0 replies.
- Re: Spark Streaming on Yarn Input from Flume - posted by tarek_abouzeid <ta...@yahoo.com> on 2015/03/15 10:26:26 UTC, 0 replies.
- Software stack for Recommendation engine with spark mlib - posted by Shashidhar Rao <ra...@gmail.com> on 2015/03/15 11:45:37 UTC, 5 replies.
- [Spark SQL]: Convert JavaSchemaRDD back to JavaRDD of a specific class - posted by Renato Marroquín Mogrovejo <re...@gmail.com> on 2015/03/15 15:22:53 UTC, 1 replies.
- Saving Dstream into a single file - posted by tarek_abouzeid <ta...@yahoo.com> on 2015/03/15 15:31:58 UTC, 2 replies.
- Submitting spark application using Yarn Rest API - posted by Srini Karri <sk...@gmail.com> on 2015/03/15 18:14:35 UTC, 0 replies.
- Re: Running spark function on parquet without sql - posted by Cheng Lian <li...@gmail.com> on 2015/03/15 18:35:12 UTC, 0 replies.
- Benchmarks of 'Hive on Tez' vs 'Hive on Spark' vs Spark SQL - posted by Slim Baltagi <sb...@gmail.com> on 2015/03/15 19:00:11 UTC, 0 replies.
- Slides of my talk in LA: 'Spark or Hadoop: is it an either-or proposition?' - posted by Slim Baltagi <sb...@gmail.com> on 2015/03/15 19:04:31 UTC, 1 replies.
- Re: Spark 1.2 – How to change Default (Random) port …. - posted by Shailesh Birari <sb...@gmail.com> on 2015/03/16 03:42:46 UTC, 0 replies.
- Building spark over specified tachyon - posted by "fightfate@163.com" <fi...@163.com> on 2015/03/16 04:01:00 UTC, 4 replies.
- Input validation for LogisticRegressionWithSGD - posted by Rohit U <rj...@gmail.com> on 2015/03/16 04:51:06 UTC, 2 replies.
- Re: Trouble launching application that reads files - posted by "robert.tunney" <ro...@gmail.com> on 2015/03/16 05:17:10 UTC, 0 replies.
- k-means hang without error/warning - posted by Xi Shen <da...@gmail.com> on 2015/03/16 05:25:12 UTC, 4 replies.
- Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient - posted by sandeep vura <sa...@gmail.com> on 2015/03/16 05:51:57 UTC, 10 replies.
- Running Scala Word Count Using Maven - posted by Su She <su...@gmail.com> on 2015/03/16 06:20:32 UTC, 2 replies.
- Spark Streaming with compressed xml files - posted by Vijay Innamuri <vi...@gmail.com> on 2015/03/16 06:58:20 UTC, 3 replies.
- why generateJob is a private API? - posted by madhu phatak <ph...@gmail.com> on 2015/03/16 07:14:49 UTC, 2 replies.
- Question about Spark Streaming Receiver Failure - posted by Jun Yang <ya...@gmail.com> on 2015/03/16 08:10:47 UTC, 8 replies.
- Does spark-1.3.0 support the analytic functions defined in Hive, such as row_number, rank - posted by hseagle <hs...@gmail.com> on 2015/03/16 08:14:31 UTC, 1 replies.
- How to set Spark executor memory? - posted by Xi Shen <da...@gmail.com> on 2015/03/16 08:22:45 UTC, 15 replies.
- start-slave.sh failed with ssh port other than 22 - posted by ZhuGe <tc...@outlook.com> on 2015/03/16 09:07:49 UTC, 1 replies.
- Processing of text file in large gzip archive - posted by sergunok <se...@gmail.com> on 2015/03/16 09:09:05 UTC, 3 replies.
- unable to access spark @ spark://debian:7077 - posted by Ralph Bergmann <ra...@dasralph.de> on 2015/03/16 09:14:13 UTC, 4 replies.
- MappedStream vs Transform API - posted by madhu phatak <ph...@gmail.com> on 2015/03/16 09:31:55 UTC, 7 replies.
- Handling fatal errors of executors and decommission datanodes - posted by Jianshi Huang <ji...@gmail.com> on 2015/03/16 10:36:38 UTC, 4 replies.
- Re: How to pass parameters to a spark-jobserver Scala class? - posted by Sasi <sa...@gmail.com> on 2015/03/16 11:34:57 UTC, 0 replies.
- Parquet and repartition - posted by Masf <ma...@gmail.com> on 2015/03/16 12:11:04 UTC, 2 replies.
- Error when using multiple python files spark-submit - posted by poiuytrez <gu...@databerries.com> on 2015/03/16 12:11:40 UTC, 2 replies.
- Can I start multiple executors in local mode? - posted by Xi Shen <da...@gmail.com> on 2015/03/16 12:46:49 UTC, 2 replies.
- Re: configure number of cached partition in memory on SparkSQL - posted by Cheng Lian <li...@gmail.com> on 2015/03/16 13:41:03 UTC, 1 replies.
- Iterative Algorithms with Spark Streaming - posted by Alex Minnaar <am...@verticalscope.com> on 2015/03/16 13:57:12 UTC, 1 replies.
- Re: Spark 1.3 createDataframe error with pandas df - posted by kevindahl <ke...@gmail.com> on 2015/03/16 14:23:23 UTC, 1 replies.
- insert hive partitioned table - posted by patcharee <Pa...@uni.no> on 2015/03/16 14:59:08 UTC, 3 replies.
- HDP 2.2 AM abort : Unable to find ExecutorLauncher class - posted by Bharath Ravi Kumar <re...@gmail.com> on 2015/03/16 15:13:01 UTC, 13 replies.
- Priority queue in spark - posted by abhi <ab...@gmail.com> on 2015/03/16 15:45:50 UTC, 7 replies.
- RDD to DataFrame for using ALS under org.apache.spark.ml.recommendation.ALS - posted by jaykatukuri <jk...@apple.com> on 2015/03/16 17:08:33 UTC, 4 replies.
- ClassNotFoundException - posted by Ralph Bergmann <ra...@dasralph.de> on 2015/03/16 17:10:04 UTC, 2 replies.
- [SPARK-3638 ] java.lang.NoSuchMethodError: org.apache.http.impl.conn.DefaultClientConnectionOperator. - posted by Shuai Zheng <sz...@gmail.com> on 2015/03/16 17:38:45 UTC, 3 replies.
- Any IRC channel on Spark? - posted by Feng Lin <lf...@gmail.com> on 2015/03/16 18:16:28 UTC, 2 replies.
- Basic GraphX deployment and usage question - posted by Khaled Ammar <kh...@gmail.com> on 2015/03/16 18:21:20 UTC, 1 replies.
- Creating a hive table on top of a parquet file written out by spark - posted by kpeng1 <kp...@gmail.com> on 2015/03/16 18:55:22 UTC, 1 replies.
- problems with spark-streaming-kinesis-asl and "sbt assembly" ("different file contents found") - posted by "Kelly, Jonathan" <jo...@amazon.com> on 2015/03/16 19:30:36 UTC, 4 replies.
- sqlContext.parquetFile doesn't work with s3n in version 1.3.0 - posted by Shuai Zheng <sz...@gmail.com> on 2015/03/16 19:46:57 UTC, 3 replies.
- partitionBy not working w HashPartitioner - posted by Adrian Mocanu <am...@verticalscope.com> on 2015/03/16 19:50:53 UTC, 0 replies.
- Re: What is best way to run spark job in "yarn-cluster" mode from java program(servlet container) and NOT using spark-submit command. - posted by rrussell25 <rr...@gmail.com> on 2015/03/16 20:57:19 UTC, 0 replies.
- Querying JSON in Spark SQL - posted by Fatma Ozcan <fa...@gmail.com> on 2015/03/16 21:47:37 UTC, 1 replies.
- question regarding the dependency DAG in Spark - posted by Grandl Robert <rg...@yahoo.com.INVALID> on 2015/03/16 21:58:38 UTC, 0 replies.
- Spark @ EC2: Futures timed out & Ask timed out - posted by Otis Gospodnetic <ot...@gmail.com> on 2015/03/16 22:56:09 UTC, 2 replies.
- Can LBFGS be used on streaming data? - posted by "EcoMotto Inc." <ec...@gmail.com> on 2015/03/16 23:19:23 UTC, 6 replies.
- Re: Using TF-IDF from MLlib - posted by Joseph Bradley <jo...@databricks.com> on 2015/03/17 01:14:26 UTC, 3 replies.
- Garbage stats in Random Forest leaf node? - posted by cjwang <cj...@cjwang.us> on 2015/03/17 01:19:43 UTC, 2 replies.
- Suggestion for user logging - posted by Xi Shen <da...@gmail.com> on 2015/03/17 01:26:07 UTC, 0 replies.
- version conflict common-net - posted by Jacob Abraham <ab...@gmail.com> on 2015/03/17 01:33:10 UTC, 4 replies.
- Spark from S3 very slow - posted by Pere Kyle <pe...@whisper.sh> on 2015/03/17 01:47:41 UTC, 0 replies.
- Iterate over contents of schemaRDD loaded from parquet file to extract timestamp - posted by anu <an...@gmail.com> on 2015/03/17 05:43:09 UTC, 1 replies.
- Hive on Spark with Spark as a service on CDH5.2 - posted by anu <an...@gmail.com> on 2015/03/17 07:05:42 UTC, 1 replies.
- Re: What is best way to run spark job in "yarn-cluster" mode from java program(servlet container) and NOT using spark-submit command. - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/03/17 08:19:07 UTC, 2 replies.
- build spark 1.3.0 on windows 7. - posted by Ahmed Nawar <ah...@gmail.com> on 2015/03/17 09:54:45 UTC, 0 replies.
- IllegalAccessError in GraphX (Spark 1.3.0 LDA) - posted by Jeffrey Jedele <je...@gmail.com> on 2015/03/17 10:03:58 UTC, 2 replies.
- Building Spark on Windows WAS: Any IRC channel on Spark? - posted by Ted Yu <yu...@gmail.com> on 2015/03/17 10:14:33 UTC, 3 replies.
- Apache Spark Executor - number of threads - posted by Igor Petrov <ig...@gmail.com> on 2015/03/17 10:36:05 UTC, 2 replies.
- LZO configuration can not affect - posted by 唯我者 <87...@qq.com> on 2015/03/17 10:36:59 UTC, 3 replies.
- TreeNodeException: Unresolved plan found - posted by Ophir Cohen <op...@gmail.com> on 2015/03/17 11:25:02 UTC, 1 replies.
- Spark-submit and multiple files - posted by poiuytrez <gu...@databerries.com> on 2015/03/17 11:29:35 UTC, 4 replies.
- Should I do spark-sql query on HDFS or hive? - posted by 李铖 <li...@gmail.com> on 2015/03/17 11:39:12 UTC, 1 replies.
- Should I do spark-sql query on HDFS or apache hive? - posted by 李铖 <li...@gmail.com> on 2015/03/17 11:41:51 UTC, 3 replies.
- Hive error on partitioned tables - posted by Masf <ma...@gmail.com> on 2015/03/17 11:47:47 UTC, 0 replies.
- GraphX - Correct path traversal order from an Array[Edge[ED]] - posted by bertlhf <be...@analytag.com> on 2015/03/17 12:31:33 UTC, 0 replies.
- Spark SQL UDT Kryo serialization, Unable to find class - posted by zia_kayani <zi...@platalytics.com> on 2015/03/17 13:17:49 UTC, 2 replies.
- HiveContext can't find registered function - posted by Ophir Cohen <op...@gmail.com> on 2015/03/17 14:34:28 UTC, 6 replies.
- Downloading data from url - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/03/17 14:52:42 UTC, 2 replies.
- org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException - posted by fanooos <de...@gmail.com> on 2015/03/17 15:25:47 UTC, 0 replies.
- High GC time - posted by jatinpreet <ja...@gmail.com> on 2015/03/17 16:27:49 UTC, 1 replies.
- Why I didn't see the benefits of using KryoSerializer - posted by java8964 <ja...@hotmail.com> on 2015/03/17 17:01:35 UTC, 3 replies.
- Unable to saveAsParquetFile to HDFS since Spark 1.3.0 - posted by Franz Graf <in...@Locked.de> on 2015/03/17 17:24:09 UTC, 1 replies.
- Spark yarn-client submission example? - posted by Michal Klos <mi...@gmail.com> on 2015/03/17 18:05:35 UTC, 0 replies.
- Set spark.fileserver.uri on private cluster - posted by Rares Vernica <rv...@gmail.com> on 2015/03/17 19:34:54 UTC, 0 replies.
- Question on RDD groupBy and executors - posted by Vijayasarathy Kannan <kv...@vt.edu> on 2015/03/17 20:18:30 UTC, 0 replies.
- Using regular rdd transforms on schemaRDD - posted by kpeng1 <kp...@gmail.com> on 2015/03/17 20:30:09 UTC, 1 replies.
- graceful shutdown not so graceful? - posted by "necro351 ." <ne...@gmail.com> on 2015/03/17 20:31:15 UTC, 0 replies.
- Log4j files per spark job - posted by "Dan H." <dc...@gmail.com> on 2015/03/17 20:34:42 UTC, 0 replies.
- Spark 1.0.2 failover doesnt port running application context to new master - posted by Nirav Patel <np...@xactlycorp.com> on 2015/03/17 20:40:13 UTC, 0 replies.
- shuffle write size - posted by Chen Song <ch...@gmail.com> on 2015/03/17 22:23:13 UTC, 1 replies.
- ML Pipeline question about caching - posted by Cesar Flores <ce...@gmail.com> on 2015/03/17 23:26:52 UTC, 1 replies.
- Idempotent count - posted by Binh Nguyen Van <bi...@gmail.com> on 2015/03/17 23:30:50 UTC, 3 replies.
- StorageLevel: OFF_HEAP - posted by Ranga <sr...@gmail.com> on 2015/03/17 23:45:05 UTC, 8 replies.
- Question on Spark 1.3 SQL External Datasource - posted by Yang Lei <ge...@gmail.com> on 2015/03/17 23:53:49 UTC, 2 replies.
- Memory Settings for local execution context - posted by "Alex Turner (TMS)" <al...@toyota.com> on 2015/03/18 00:46:52 UTC, 0 replies.
- Using Spark with a SOCKS proxy - posted by "Kelly, Jonathan" <jo...@amazon.com> on 2015/03/18 01:15:16 UTC, 1 replies.
- saveAsTable fails to save RDD in Spark SQL 1.3.0 - posted by smoradi <sm...@currenex.com> on 2015/03/18 02:24:31 UTC, 3 replies.
- InvalidAuxServiceException in dynamicAllocation - posted by Sea <26...@qq.com> on 2015/03/18 03:15:18 UTC, 1 replies.
- HIVE SparkSQL - posted by 宫勐 <sh...@gmail.com> on 2015/03/18 04:28:48 UTC, 2 replies.
- [spark-streaming] can shuffle write to disk be disabled? - posted by Darren Hoo <da...@gmail.com> on 2015/03/18 06:39:05 UTC, 9 replies.
- updateStateByKey performance - posted by Nikos Viorres <nv...@gmail.com> on 2015/03/18 06:49:17 UTC, 0 replies.
- Transform a Schema RDD to another Schema RDD with a different schema - posted by anu <an...@gmail.com> on 2015/03/18 06:50:33 UTC, 0 replies.
- updateStateByKey performance / API - posted by Nikos Viorres <nv...@gmail.com> on 2015/03/18 07:09:32 UTC, 0 replies.
- updateStateByKey performance & API - posted by nvrs <nv...@gmail.com> on 2015/03/18 07:12:19 UTC, 3 replies.
- Spark + Kafka - posted by James King <ja...@gmail.com> on 2015/03/18 10:38:05 UTC, 6 replies.
- Re: GraphX: Get edges for a vertex - posted by mas <ma...@gmail.com> on 2015/03/18 10:52:39 UTC, 1 replies.
- Spark SQL weird exception after upgrading from 1.1.1 to 1.2.x - posted by Roberto Coluccio <ro...@gmail.com> on 2015/03/18 11:03:01 UTC, 6 replies.
- Spark Job History Server - posted by patcharee <Pa...@uni.no> on 2015/03/18 11:30:56 UTC, 7 replies.
- Apache Spark ALS recommendations approach - posted by Aram Mkrtchyan <ar...@gmail.com> on 2015/03/18 12:13:57 UTC, 7 replies.
- Integration of Spark1.2.0 cdh4 with Jetty 9.2.10 - posted by sayantini <sa...@gmail.com> on 2015/03/18 12:59:23 UTC, 0 replies.
- sparksql native jdbc driver - posted by sequoiadb <ma...@sequoiadb.com> on 2015/03/18 13:20:55 UTC, 2 replies.
- srcAttr in graph.triplets don't update when the size of graph is huge - posted by "张林(林岳)" <li...@alibaba-inc.com> on 2015/03/18 13:29:40 UTC, 0 replies.
- DataFrame operation on parquet: GC overhead limit exceeded - posted by Yiannis Gkoufas <jo...@gmail.com> on 2015/03/18 14:15:08 UTC, 12 replies.
- Re: Difference among batchDuration, windowDuration, slideDuration - posted by jaredtims <ja...@yahoo.com> on 2015/03/18 15:16:08 UTC, 0 replies.
- Column Similarity using DIMSUM - posted by Manish Gupta 8 <mg...@sapient.com> on 2015/03/18 15:40:35 UTC, 4 replies.
- Did DataFrames break basic SQLContext? - posted by Justin Pihony <ju...@gmail.com> on 2015/03/18 16:20:48 UTC, 3 replies.
- Re: How to get the cached RDD - posted by praveenbalaji <pr...@soundhound.com> on 2015/03/18 17:46:51 UTC, 0 replies.
- mapPartitions - How Does it Works - posted by "ashish.usoni" <as...@gmail.com> on 2015/03/18 18:19:34 UTC, 4 replies.
- Null pointer exception reading Parquet - posted by sprookie <cu...@gmail.com> on 2015/03/18 18:21:53 UTC, 1 replies.
- Database operations on executor nodes - posted by Praveen Balaji <pr...@soundhound.com> on 2015/03/18 18:22:42 UTC, 1 replies.
- [Spark SQL] Elasticsearch-hadoop - exception when creating Temporary table - posted by Todd Nist <ts...@gmail.com> on 2015/03/18 18:47:29 UTC, 0 replies.
- Using a different spark jars than the one on the cluster - posted by jaykatukuri <jk...@apple.com> on 2015/03/18 19:09:58 UTC, 2 replies.
- Spark + HBase + Kerberos - posted by Eric Walk <Er...@perficient.com> on 2015/03/18 19:39:12 UTC, 2 replies.
- Spark Streaming S3 Performance Implications - posted by Mike Trienis <mi...@orcsol.com> on 2015/03/18 19:44:25 UTC, 2 replies.
- RDD pair to pair of RDDs - posted by "Alex Turner (TMS)" <al...@toyota.com> on 2015/03/18 19:48:57 UTC, 0 replies.
- topic modeling using LDA in MLLib - posted by heszak <hz...@collabware.com> on 2015/03/18 21:34:51 UTC, 1 replies.
- RDD ordering after map - posted by sergunok <se...@gmail.com> on 2015/03/18 22:02:46 UTC, 1 replies.
- MEMORY_ONLY vs MEMORY_AND_DISK - posted by sergunok <se...@gmail.com> on 2015/03/18 22:05:21 UTC, 1 replies.
- Does newly-released LDA (Latent Dirichlet Allocation) algorithm supports ngrams? - posted by heszak <hz...@collabware.com> on 2015/03/18 22:37:28 UTC, 1 replies.
- Spark and Morphlines, parallelization, multithreading - posted by dgoldenberg <dg...@gmail.com> on 2015/03/18 23:19:18 UTC, 0 replies.
- Apache Spark User List: people's responses not showing in the browser view - posted by dgoldenberg <dg...@gmail.com> on 2015/03/18 23:21:57 UTC, 10 replies.
- [SQL] Elasticsearch-hadoop, exception creating temporary table - posted by Todd Nist <ts...@gmail.com> on 2015/03/19 00:48:40 UTC, 4 replies.
- iPython Notebook + Spark + Accumulo -- best practice? - posted by davidh <da...@annaisystems.com> on 2015/03/19 01:45:59 UTC, 18 replies.
- saving or visualizing PCA - posted by roni <ro...@gmail.com> on 2015/03/19 02:14:55 UTC, 2 replies.
- SparkSQL 1.3.0 JDBC data source issues - posted by Pei-Lun Lee <pl...@appier.com> on 2015/03/19 05:20:59 UTC, 1 replies.
- Error while Insert data into hive table via spark - posted by Dhimant <dh...@gmail.com> on 2015/03/19 07:12:09 UTC, 0 replies.
- Need some help on the Spark performance on Hadoop Yarn - posted by Yi Ming Huang <hu...@cn.ibm.com> on 2015/03/19 07:44:20 UTC, 0 replies.
- MLlib Spam example gets stuck in Stage X - posted by Su She <su...@gmail.com> on 2015/03/19 07:45:59 UTC, 11 replies.
- RE: Column Similarity using DIMSUM - posted by Manish Gupta 8 <mg...@sapient.com> on 2015/03/19 08:46:09 UTC, 0 replies.
- how to specify multiple masters in sbin/start-slaves.sh script? - posted by sequoiadb <ma...@sequoiadb.com> on 2015/03/19 09:00:15 UTC, 0 replies.
- OutOfMemoryError during reduce tasks - posted by Balazs Meszaros <me...@zhaw.ch> on 2015/03/19 09:09:19 UTC, 0 replies.
- calculating TF-IDF for large 100GB dataset problems - posted by sergunok <se...@gmail.com> on 2015/03/19 13:16:58 UTC, 1 replies.
- Spark 1.2.0 | Spark job fails with MetadataFetchFailedException - posted by Aniket Bhatnagar <an...@gmail.com> on 2015/03/19 13:24:16 UTC, 0 replies.
- Reading a text file into RDD[Char] instead of RDD[String] - posted by Michael Lewis <le...@me.com> on 2015/03/19 13:46:56 UTC, 2 replies.
- Writing Spark Streaming Programs - posted by James King <ja...@gmail.com> on 2015/03/19 15:50:26 UTC, 5 replies.
- JAVA_HOME problem with upgrade to 1.3.0 - posted by "Williams, Ken" <Ke...@windlogics.com> on 2015/03/19 16:59:35 UTC, 3 replies.
- saveAsTable broken in v1.3 DataFrames? - posted by Christian Perez <ch...@svds.com> on 2015/03/19 17:00:12 UTC, 5 replies.
- Load balancing - posted by Mohit Anchlia <mo...@gmail.com> on 2015/03/19 19:02:14 UTC, 4 replies.
- Problems with spark.akka.frameSize - posted by Vijayasarathy Kannan <kv...@vt.edu> on 2015/03/19 19:03:51 UTC, 0 replies.
- Spark Streaming custom receiver for local data - posted by MartijnD <ik...@martijndwars.nl> on 2015/03/19 19:59:04 UTC, 0 replies.
- Issues with SBT and Spark - posted by Vijayasarathy Kannan <kv...@vt.edu> on 2015/03/19 20:17:13 UTC, 2 replies.
- Spark SQL filter DataFrame by date? - posted by kamatsuoka <ke...@gmail.com> on 2015/03/19 20:22:38 UTC, 1 replies.
- FetchFailedException: Adjusted frame length exceeds 2147483647: 12716268407 - discarded - posted by roni <ro...@gmail.com> on 2015/03/19 23:28:54 UTC, 1 replies.
- Cloudant as Spark SQL External Datastore on Spark 1.3.0 - posted by Yang Lei <ge...@gmail.com> on 2015/03/19 23:39:48 UTC, 0 replies.
- Catching InvalidClassException in sc.objectFile - posted by Justin Yip <yi...@prediction.io> on 2015/03/19 23:45:51 UTC, 0 replies.
- Timeout Issues from Spark 1.2.0+ - posted by EH <ea...@gmail.com> on 2015/03/19 23:49:40 UTC, 0 replies.
- Reliable method/tips to solve dependency issues? - posted by Jim Kleckner <ji...@cloudphysics.com> on 2015/03/20 01:56:22 UTC, 0 replies.
- Spark SQL Self join with aggregate - posted by Shailesh Birari <sb...@gmail.com> on 2015/03/20 02:30:45 UTC, 1 replies.
- Re: KMeans with large clusters Java Heap Space - posted by mvsundaresan <mv...@yahoo.com> on 2015/03/20 04:56:15 UTC, 0 replies.
- Spark MLLib KMeans Top Terms - posted by mvsundaresan <mv...@yahoo.com> on 2015/03/20 05:01:10 UTC, 0 replies.
- Launching Spark Cluster Application through IDE - posted by raggy <ra...@gmail.com> on 2015/03/20 05:16:31 UTC, 1 replies.
- Measuer Bytes READ and Peak Memory Usage for Query - posted by anu <an...@gmail.com> on 2015/03/20 07:32:31 UTC, 2 replies.
- Visualizing Spark Streaming data - posted by Harut <ha...@gmail.com> on 2015/03/20 08:43:43 UTC, 6 replies.
- Re: Powered by Spark addition - posted by Ricardo Almeida <ra...@actnowib.com> on 2015/03/20 09:15:03 UTC, 0 replies.
- Clean the shuffle data during iteration - posted by James <al...@gmail.com> on 2015/03/20 09:53:28 UTC, 0 replies.
- Spark 1.2. loses often all executors - posted by mrm <ma...@skimlinks.com> on 2015/03/20 11:21:52 UTC, 4 replies.
- about Partition Index - posted by Long Cheng <pa...@gmail.com> on 2015/03/20 14:13:40 UTC, 0 replies.
- ShuffleBlockFetcherIterator: Failed to get block(s) - posted by Eric Friedman <er...@gmail.com> on 2015/03/20 14:49:53 UTC, 1 replies.
- Accessing AWS S3 in Frankfurt (v4 only - AWS4-HMAC-SHA256) - posted by Ralf Heyde <rh...@hubrick.com> on 2015/03/20 14:53:18 UTC, 3 replies.
- How to handle under-performing nodes in the cluster - posted by Yiannis Gkoufas <jo...@gmail.com> on 2015/03/20 15:35:37 UTC, 2 replies.
- What is the jvm size when start spark-submit through local mode - posted by Shuai Zheng <sz...@gmail.com> on 2015/03/20 15:39:49 UTC, 0 replies.
- Buffering for Socket streams - posted by jamborta <ja...@gmail.com> on 2015/03/20 16:19:36 UTC, 1 replies.
- Re: RDD Blocks skewing to just few executors - posted by Alessandro Lulli <lu...@di.unipi.it> on 2015/03/20 17:32:29 UTC, 1 replies.
- can distinct transform applied on DStream? - posted by Darren Hoo <da...@gmail.com> on 2015/03/20 18:37:00 UTC, 2 replies.
- com.esotericsoftware.kryo.KryoException: java.io.IOException: File too large vs FileNotFoundException (Too many open files) on spark 1.2.1 - posted by Shuai Zheng <sz...@gmail.com> on 2015/03/20 20:28:26 UTC, 3 replies.
- Mailing list schizophrenia? - posted by Jim Kleckner <ji...@cloudphysics.com> on 2015/03/20 20:29:51 UTC, 4 replies.
- Matching Spark application metrics data to App Id - posted by Judy Nash <ju...@exchange.microsoft.com> on 2015/03/20 20:43:52 UTC, 0 replies.
- EC2 cluster created by spark using old HDFS 1.0 - posted by morfious902002 <an...@gmail.com> on 2015/03/20 20:54:36 UTC, 1 replies.
- Spark per app logging - posted by Udit Mehta <um...@groupon.com> on 2015/03/20 21:43:14 UTC, 3 replies.
- Create a Spark cluster with cloudera CDH 5.2 support - posted by morfious902002 <an...@gmail.com> on 2015/03/20 22:27:27 UTC, 1 replies.
- How to check that a dataset is sorted after it has been written out? - posted by Michael Albert <m_...@yahoo.com.INVALID> on 2015/03/20 23:41:16 UTC, 3 replies.
- IPyhon notebook command for spark need to be updated? - posted by cong yue <yu...@gmail.com> on 2015/03/20 23:45:30 UTC, 3 replies.
- Spark 1.3 Dynamic Allocation - Requesting 0 new executor(s) because tasks are backlogged - posted by Manoj Samel <ma...@gmail.com> on 2015/03/21 00:15:47 UTC, 5 replies.
- Re: WebUI on yarn through ssh tunnel affected by AmIpfilter - posted by benbongalon <be...@gmail.com> on 2015/03/21 00:25:56 UTC, 1 replies.
- Spark Streaming Not Reading Messages From Multiple Kafka Topics - posted by EH <ea...@gmail.com> on 2015/03/21 02:27:31 UTC, 1 replies.
- Registring UDF from a different package fails - posted by Ravindra <ra...@gmail.com> on 2015/03/21 03:04:52 UTC, 0 replies.
- About the env of Spark1.2 - posted by tangzilu <zi...@hotmail.com> on 2015/03/21 04:54:40 UTC, 2 replies.
- Filesystem closed Exception - posted by Sea <26...@qq.com> on 2015/03/21 06:35:01 UTC, 1 replies.
- 'nested' RDD problem, advise needed - posted by Michael Lewis <le...@me.com> on 2015/03/21 18:26:56 UTC, 1 replies.
- Model deployment help - posted by Shashidhar Rao <ra...@gmail.com> on 2015/03/21 18:40:35 UTC, 1 replies.
- Spark streaming alerting - posted by Mohit Anchlia <mo...@gmail.com> on 2015/03/21 21:22:35 UTC, 8 replies.
- ArrayIndexOutOfBoundsException in ALS.trainImplicit - posted by Sabarish Sasidharan <sa...@manthan.com> on 2015/03/21 21:23:28 UTC, 1 replies.
- join two DataFrames, same column name - posted by Eric Friedman <er...@gmail.com> on 2015/03/21 23:02:32 UTC, 3 replies.
- netlib-java cannot load native lib in Windows when using spark-submit - posted by Xi Shen <da...@gmail.com> on 2015/03/22 00:58:20 UTC, 5 replies.
- Reducing Spark's logging verbosity - posted by Edmon Begoli <eb...@gmail.com> on 2015/03/22 01:43:47 UTC, 1 replies.
- Error while installing Spark 1.3.0 on local machine - posted by HARIPRIYA AYYALASOMAYAJULA <ah...@gmail.com> on 2015/03/22 01:52:40 UTC, 1 replies.
- How to do nested foreach with RDD - posted by Xi Shen <da...@gmail.com> on 2015/03/22 06:37:48 UTC, 2 replies.
- DataFrame saveAsTable - partitioned tables - posted by "deenar.toraskar" <de...@db.com> on 2015/03/22 08:19:38 UTC, 2 replies.
- Should Spark SQL support retrieve column value from Row by column name? - posted by amghost <zh...@gmail.com> on 2015/03/22 08:40:14 UTC, 2 replies.
- Re: converting DStream[String] into RDD[String] in spark streaming - posted by "deenar.toraskar" <de...@db.com> on 2015/03/22 09:43:42 UTC, 1 replies.
- Spark sql thrift server slower than hive - posted by fanooos <de...@gmail.com> on 2015/03/22 11:38:12 UTC, 2 replies.
- How to use DataFrame with MySQL - posted by gavin zhang <ga...@gmail.com> on 2015/03/22 15:32:21 UTC, 3 replies.
- How Does aggregate work - posted by "ashish.usoni" <as...@gmail.com> on 2015/03/22 16:05:59 UTC, 3 replies.
- How to check that a dataset is sorted after it has been written out? [repost] - posted by Michael Albert <m_...@yahoo.com.INVALID> on 2015/03/22 16:37:48 UTC, 0 replies.
- lower&upperBound not working/spark 1.3 - posted by Marek Wiewiorka <ma...@gmail.com> on 2015/03/22 16:44:44 UTC, 3 replies.
- spark disk-to-disk - posted by Koert Kuipers <ko...@tresata.com> on 2015/03/23 02:03:25 UTC, 6 replies.
- Convert Spark SQL table to RDD in Scala / error: value toFloat is a not a member of Any - posted by Minnow Noir <mi...@gmail.com> on 2015/03/23 02:08:55 UTC, 2 replies.
- SocketTimeout only when launching lots of executors - posted by Tianshuo Deng <td...@twitter.com.INVALID> on 2015/03/23 05:16:11 UTC, 1 replies.
- Cassandra time series + Spark - posted by "Rumph, Frens Jan" <ma...@frensjan.nl> on 2015/03/23 08:04:52 UTC, 0 replies.
- log files of failed task - posted by sergunok <se...@gmail.com> on 2015/03/23 08:23:39 UTC, 1 replies.
- Spark UI tunneling - posted by sergunok <se...@gmail.com> on 2015/03/23 08:42:29 UTC, 3 replies.
- Spark Sql with python udf fail - posted by lonely Feb <lo...@gmail.com> on 2015/03/23 08:43:59 UTC, 5 replies.
- Data/File structure Validation - posted by Ahmed Nawar <ah...@gmail.com> on 2015/03/23 09:48:54 UTC, 3 replies.
- Spark SQL udf(ScalaUdf) is very slow - posted by zzcclp <44...@qq.com> on 2015/03/23 10:10:22 UTC, 2 replies.
- Use pig load function in spark - posted by "Dai, Kevin" <yu...@ebay.com> on 2015/03/23 10:29:25 UTC, 4 replies.
- registerTempTable is not a member of RDD on spark 1.2? - posted by IT CTO <go...@gmail.com> on 2015/03/23 13:25:42 UTC, 4 replies.
- Why doesn't the --conf parameter work in yarn-cluster mode (but works in yarn-client and local)? - posted by Emre Sevinc <em...@gmail.com> on 2015/03/23 13:39:08 UTC, 4 replies.
- Spark RDD mapped to Hbase to be updateable - posted by Siddharth Ubale <si...@syncoms.com> on 2015/03/23 15:22:01 UTC, 0 replies.
- RDD storage in spark steaming - posted by abhi <ab...@gmail.com> on 2015/03/23 15:26:47 UTC, 1 replies.
- Spark error NoClassDefFoundError: org/apache/hadoop/mapred/InputSplit - posted by ", Roy" <rp...@njit.edu> on 2015/03/23 16:10:39 UTC, 3 replies.
- Is yarn-standalone mode deprecated? - posted by nitinkak001 <ni...@gmail.com> on 2015/03/23 16:49:29 UTC, 3 replies.
- Re: PySpark, ResultIterable and taking a list and saving it into different parquet files - posted by chuwiey <be...@gmail.com> on 2015/03/23 17:14:37 UTC, 0 replies.
- Parquet file + increase read parallelism - posted by SamyaMaiti <sa...@gmail.com> on 2015/03/23 18:10:27 UTC, 0 replies.
- SchemaRDD/DataFrame result partitioned according to the underlying datasource partitions - posted by Stephen Boesch <ja...@gmail.com> on 2015/03/23 18:22:20 UTC, 1 replies.
- Re: Write to Parquet File in Python - posted by chuwiey <be...@gmail.com> on 2015/03/23 19:19:38 UTC, 0 replies.
- Strange behavior with PySpark when using Join() and zip() - posted by Ofer Mendelevitch <om...@hortonworks.com> on 2015/03/23 19:27:30 UTC, 3 replies.
- Converting SparkSQL query to Scala query - posted by nishitd <ni...@yahoo.com> on 2015/03/23 19:42:46 UTC, 1 replies.
- newbie quesiton - spark with mesos - posted by Anirudha Jadhav <an...@nyu.edu> on 2015/03/23 19:46:52 UTC, 3 replies.
- Spark-thriftserver Issue - posted by Neil Dev <ne...@gmail.com> on 2015/03/23 20:01:29 UTC, 3 replies.
- Getting around Serializability issues for types not in my control - posted by adelbertc <ad...@gmail.com> on 2015/03/23 20:03:48 UTC, 4 replies.
- SparkEnv - posted by Koert Kuipers <ko...@tresata.com> on 2015/03/23 20:21:47 UTC, 0 replies.
- JDBC DF using DB2 - posted by Jack Arenas <j...@ckarenas.com> on 2015/03/23 20:34:25 UTC, 0 replies.
- Re: JDBC DF using DB2 - posted by Ted Yu <yu...@gmail.com> on 2015/03/23 21:01:05 UTC, 0 replies.
- objectFile uses only java serializer? - posted by Koert Kuipers <ko...@tresata.com> on 2015/03/23 21:14:59 UTC, 1 replies.
- hadoop input/output format advanced control - posted by Koert Kuipers <ko...@tresata.com> on 2015/03/23 21:36:41 UTC, 1 replies.
- Is it possible to use json4s 3.2.11 with Spark 1.3.0? - posted by Alexey Zinoviev <al...@gmail.com> on 2015/03/23 22:12:02 UTC, 6 replies.
- Shuffle Spill Memory and Shuffle Spill Disk - posted by Bijay Pathak <bi...@cloudwick.com> on 2015/03/23 22:29:50 UTC, 1 replies.
- Weird exception in Spark job - posted by nitinkak001 <ni...@gmail.com> on 2015/03/24 00:06:25 UTC, 2 replies.
- GraphX Pregal optimization - posted by Clare Huang <cl...@gmail.com> on 2015/03/24 01:02:55 UTC, 0 replies.
- Invalid ContainerId ... Caused by: java.lang.NumberFormatException: For input string: "e04" - posted by Manoj Samel <ma...@gmail.com> on 2015/03/24 02:32:36 UTC, 6 replies.
- how to cache table with OFF_HEAP storage level in SparkSQL thriftserver - posted by LiuZeshan <li...@qq.com> on 2015/03/24 04:37:36 UTC, 0 replies.
- Hive context datanucleus error - posted by Udit Mehta <um...@groupon.com> on 2015/03/24 05:19:06 UTC, 1 replies.
- diffrence in PCA of MLib vs H2o in R - posted by roni <ro...@gmail.com> on 2015/03/24 07:13:31 UTC, 5 replies.
- Spark SQL: Day of month from Timestamp - posted by Harut Martirosyan <ha...@gmail.com> on 2015/03/24 08:16:15 UTC, 2 replies.
- Question about Data Sources API - posted by Ashish Mukherjee <as...@gmail.com> on 2015/03/24 08:57:24 UTC, 3 replies.
- Spark as a service - posted by Ashish Mukherjee <as...@gmail.com> on 2015/03/24 11:28:44 UTC, 5 replies.
- How to deploy binary dependencies to workers? - posted by Xi Shen <da...@gmail.com> on 2015/03/24 12:13:58 UTC, 7 replies.
- Unable to run Hive queries on Spark - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/03/24 12:47:09 UTC, 0 replies.
- issue while creating spark context - posted by sachin Singh <sa...@gmail.com> on 2015/03/24 12:51:25 UTC, 8 replies.
- Standalone Scheduler VS YARN Performance - posted by Harut Martirosyan <ha...@gmail.com> on 2015/03/24 13:21:50 UTC, 1 replies.
- EC2 Having script run at startup - posted by Theodore Vasiloudis <th...@gmail.com> on 2015/03/24 13:22:33 UTC, 1 replies.
- Optimal solution for getting the header from CSV with Spark - posted by Spico Florin <sp...@gmail.com> on 2015/03/24 15:12:24 UTC, 5 replies.
- How to avoid being killed by YARN node manager ? - posted by Yuichiro Sakamoto <ks...@muc.biglobe.ne.jp> on 2015/03/24 16:49:08 UTC, 2 replies.
- Does HiveContext connect to HiveServer2? - posted by nitinkak001 <ni...@gmail.com> on 2015/03/24 16:58:52 UTC, 3 replies.
- akka.version error - posted by Mohit Anchlia <mo...@gmail.com> on 2015/03/24 18:08:46 UTC, 0 replies.
- What his the ideal method to interact with Spark Cluster from a Cloud App? - posted by Noorul Islam K M <no...@noorul.com> on 2015/03/24 18:27:50 UTC, 1 replies.
- Hadoop 2.5 not listed in Spark 1.4 build page - posted by Manoj Samel <ma...@gmail.com> on 2015/03/24 18:28:08 UTC, 2 replies.
- CombineByKey - Please explain its working - posted by "ashish.usoni" <as...@gmail.com> on 2015/03/24 18:31:13 UTC, 0 replies.
- spark worker on mesos slave | possible networking config issue - posted by Anirudha Jadhav <an...@nyu.edu> on 2015/03/24 18:48:21 UTC, 3 replies.
- Dataframe groupby custom functions (python) - posted by jamborta <ja...@gmail.com> on 2015/03/24 18:49:11 UTC, 1 replies.
- Spark Application Hung - posted by Ashish Rawat <As...@guavus.com> on 2015/03/24 19:41:08 UTC, 1 replies.
- Spark GraphX In Action on documentation page? - posted by Michael Malak <mi...@yahoo.com.INVALID> on 2015/03/24 19:49:31 UTC, 0 replies.
- SparkSQL UDTs with Ordering - posted by Patrick Woody <pa...@gmail.com> on 2015/03/24 20:25:44 UTC, 2 replies.
- java.lang.OutOfMemoryError: unable to create new native thread - posted by Thomas Gerber <th...@radius.com> on 2015/03/24 20:38:43 UTC, 6 replies.
- FAILED SelectChannelConnector@0.0.0.0:4040 java.net.BindException: Address already in use - posted by ", Roy" <rp...@njit.edu> on 2015/03/24 20:43:31 UTC, 3 replies.
- updateStateByKey - Seq[V] order - posted by Adrian Mocanu <am...@verticalscope.com> on 2015/03/24 21:52:56 UTC, 0 replies.
- 1.3 Hadoop File System problem - posted by Jim Carroll <ji...@gmail.com> on 2015/03/25 00:55:49 UTC, 3 replies.
- column expression in left outer join for DataFrame - posted by SK <sk...@gmail.com> on 2015/03/25 01:50:54 UTC, 5 replies.
- filter expression in API document for DataFrame - posted by SK <sk...@gmail.com> on 2015/03/25 03:11:12 UTC, 0 replies.
- Graphx gets slower as the iteration number increases - posted by "orangeprince@foxmail.com" <or...@foxmail.com> on 2015/03/25 03:12:05 UTC, 1 replies.
- Spark Performance -Hive or Hbase? - posted by Siddharth Ubale <si...@syncoms.com> on 2015/03/25 07:55:49 UTC, 0 replies.
- Server IPC version 9 cannot communicate with client version 4 - posted by sandeep vura <sa...@gmail.com> on 2015/03/25 08:22:45 UTC, 9 replies.
- Serialization Problem in Spark Program - posted by donhoff_h <16...@qq.com> on 2015/03/25 08:44:29 UTC, 3 replies.
- Spark Maven Test error - posted by zzcclp <44...@qq.com> on 2015/03/25 09:03:48 UTC, 0 replies.
- OutOfMemoryError when using DataFrame created by Spark SQL - posted by SLiZn Liu <sl...@gmail.com> on 2015/03/25 09:48:19 UTC, 3 replies.
- Exception in thread "main" java.lang.VerifyError: class org.apache.hadoop.yarn.proto.YarnProtos$PriorityProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet; - posted by Canoe <ca...@gmail.com> on 2015/03/25 09:59:06 UTC, 0 replies.
- Explanation streaming-cep-engine with example - posted by Dhimant <dh...@gmail.com> on 2015/03/25 10:04:18 UTC, 0 replies.
- issue while submitting Spark Job as --master yarn-cluster - posted by sachin Singh <sa...@gmail.com> on 2015/03/25 11:06:57 UTC, 0 replies.
- Spark-sql query got exception.Help - posted by 李铖 <li...@gmail.com> on 2015/03/25 11:26:04 UTC, 7 replies.
- Re: issue while submitting Spark Job as --master yarn-cluster - posted by Xi Shen <da...@gmail.com> on 2015/03/25 11:55:31 UTC, 2 replies.
- Spark ML Pipeline inaccessible types - posted by za...@email.cz on 2015/03/25 12:00:40 UTC, 6 replies.
- How to randomise data on spark - posted by critikaled <is...@gmail.com> on 2015/03/25 12:53:05 UTC, 0 replies.
- Using ORC input for mllib algorithms - posted by Zsolt Tóth <to...@gmail.com> on 2015/03/25 13:03:55 UTC, 2 replies.
- How do you write Dataframes to elasticsearch - posted by yamanoj <ma...@gmail.com> on 2015/03/25 13:06:47 UTC, 1 replies.
- NetwrokWordCount + Spark standalone - posted by James King <ja...@gmail.com> on 2015/03/25 14:01:12 UTC, 3 replies.
- JavaKinesisWordCountASLYARN Example not working on EMR - posted by "ankur.jain" <an...@yash.com> on 2015/03/25 14:48:43 UTC, 3 replies.
- foreachRDD execution - posted by Luis Ángel Vicente Sánchez <la...@gmail.com> on 2015/03/25 14:57:57 UTC, 1 replies.
- What are the best options for quickly filtering a DataFrame on a single column? - posted by Stuart Layton <st...@gmail.com> on 2015/03/25 15:41:55 UTC, 3 replies.
- Spark Streaming - Minimizing batch interval - posted by RodrigoB <ro...@aspect.com> on 2015/03/25 15:53:38 UTC, 1 replies.
- Total size of serialized results is bigger than spark.driver.maxResultSize - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2015/03/25 15:58:28 UTC, 0 replies.
- Write Parquet File with spark-streaming with Spark 1.3 - posted by richiesgr <ri...@gmail.com> on 2015/03/25 16:53:51 UTC, 2 replies.
- upgrade from spark 1.2.1 to 1.3 on EC2 cluster and problems - posted by roni <ro...@gmail.com> on 2015/03/25 16:58:04 UTC, 11 replies.
- Recovered state for updateStateByKey and incremental streams processing - posted by Ravi Reddy <ra...@gmail.com> on 2015/03/25 18:09:13 UTC, 0 replies.
- python : Out of memory: Kill process - posted by Eduardo Cusa <ed...@usmediaconsulting.com> on 2015/03/25 18:33:23 UTC, 9 replies.
- Re: Total size of serialized results is bigger than spark.driver.maxResultSize - posted by Denny Lee <de...@gmail.com> on 2015/03/25 18:45:09 UTC, 0 replies.
- Unable to Hive program from Spark Programming Guide (OutOfMemoryError) - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/03/25 18:48:06 UTC, 2 replies.
- Re: OOM for HiveFromSpark example - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/03/25 18:54:28 UTC, 12 replies.
- Re: OutOfMemory : Java heap space error - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/03/25 18:54:38 UTC, 0 replies.
- Can a DataFrame be saved to s3 directly using Parquet? - posted by Stuart Layton <st...@gmail.com> on 2015/03/25 19:59:20 UTC, 2 replies.
- Spark shell never leaves ACCEPTED state in YARN CDH5 - posted by "Khandeshi, Ami" <Am...@fmr.com.INVALID> on 2015/03/25 20:08:50 UTC, 3 replies.
- writing DStream RDDs to the same file - posted by Adrian Mocanu <am...@verticalscope.com> on 2015/03/25 20:49:32 UTC, 1 replies.
- trouble with jdbc df in python - posted by elliott cordo <el...@gmail.com> on 2015/03/25 22:19:50 UTC, 3 replies.
- Exception Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration - posted by varvind <vi...@gmail.com> on 2015/03/25 22:31:24 UTC, 0 replies.
- Cross-compatibility of YARN shuffle service - posted by Matt Cheah <mc...@palantir.com> on 2015/03/25 23:44:31 UTC, 1 replies.
- How to specify the port for AM Actor ... - posted by Manoj Samel <ma...@gmail.com> on 2015/03/25 23:49:48 UTC, 5 replies.
- Re: filter expression in API document for DataFrame - posted by Michael Armbrust <mi...@databricks.com> on 2015/03/26 01:08:40 UTC, 0 replies.
- [SparkSQL] How to calculate stddev on a DataFrame? - posted by Haopu Wang <HW...@qilinsoft.com> on 2015/03/26 03:28:16 UTC, 2 replies.
- The dreaded bradcast error Error: Failed to get broadcast_0_piece0 of broadcast_0 - posted by rkgurram <rk...@gmail.com> on 2015/03/26 03:50:40 UTC, 0 replies.
- SparkSQL overwrite parquet file does not generate _common_metadata - posted by Pei-Lun Lee <pl...@appier.com> on 2015/03/26 05:48:08 UTC, 6 replies.
- How to troubleshoot server.TransportChannelHandler Exception - posted by Xi Shen <da...@gmail.com> on 2015/03/26 06:28:26 UTC, 2 replies.
- Can I call aggregate UDF in DataFrame? - posted by Haopu Wang <HW...@qilinsoft.com> on 2015/03/26 08:37:17 UTC, 0 replies.
- Hive Table not from from Spark SQL - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/03/26 08:56:36 UTC, 8 replies.
- Missing an output location for shuffle. : ( - posted by 李铖 <li...@gmail.com> on 2015/03/26 11:20:14 UTC, 3 replies.
- Handling Big data for interactive BI tools - posted by kundan kumar <ii...@gmail.com> on 2015/03/26 11:26:39 UTC, 6 replies.
- Column not found in schema when querying partitioned table - posted by Jon Chase <jo...@gmail.com> on 2015/03/26 11:29:22 UTC, 2 replies.
- Port configuration for BlockManagerId - posted by Manish Gupta 8 <mg...@sapient.com> on 2015/03/26 11:38:37 UTC, 1 replies.
- Windowing and Analytics Functions in Spark SQL - posted by Masf <ma...@gmail.com> on 2015/03/26 12:09:25 UTC, 4 replies.
- Why k-means cluster hang for a long time? - posted by Xi Shen <da...@gmail.com> on 2015/03/26 13:09:07 UTC, 8 replies.
- Why executor encourage OutOfMemoryException: Java heap space - posted by sergunok <se...@gmail.com> on 2015/03/26 13:13:17 UTC, 0 replies.
- Spark-core and guava - posted by Stevo Slavić <ss...@gmail.com> on 2015/03/26 13:24:51 UTC, 2 replies.
- Which RDD operations preserve ordering? - posted by sergunok <se...@gmail.com> on 2015/03/26 13:58:26 UTC, 1 replies.
- Spark-1.3.0 UI shows 0 cores in completed applications tab - posted by MEETHU MATHEW <me...@yahoo.co.in> on 2015/03/26 13:58:52 UTC, 1 replies.
- Populating a HashMap from a GraphX connectedComponents graph - posted by Bob DuCharme <bo...@snee.com> on 2015/03/26 14:24:22 UTC, 0 replies.
- RDD equivalent of HBase Scan - posted by Stuart Layton <st...@gmail.com> on 2015/03/26 14:46:06 UTC, 3 replies.
- Spark log shows only this line repeated: RecurringTimer - JobGenerator] DEBUG o.a.s.streaming.util.RecurringTimer - Callback for JobGenerator called at time X - posted by Adrian Mocanu <am...@verticalscope.com> on 2015/03/26 14:55:26 UTC, 1 replies.
- Recreating the Mesos/Spark paper's experiments - posted by Hans van den Bogert <ha...@gmail.com> on 2015/03/26 15:09:25 UTC, 1 replies.
- RDD Exception Handling - posted by Kevin Conaway <ke...@zoomdata.com> on 2015/03/26 15:15:30 UTC, 1 replies.
- [Spark Streaming] Disk not being cleaned up during runtime after RDD being processed - posted by NathanMarin <na...@teads.tv> on 2015/03/26 15:40:12 UTC, 5 replies.
- Implicit matrix factorization returning different results between spark 1.2.0 and 1.3.0 - posted by Ravi Mody <rm...@gmail.com> on 2015/03/26 15:56:32 UTC, 5 replies.
- Parallel actions from driver - posted by Aram Mkrtchyan <ar...@gmail.com> on 2015/03/26 16:54:51 UTC, 3 replies.
- How to get rdd count() without double evaluation of the RDD? - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2015/03/26 17:09:13 UTC, 3 replies.
- HQL function Rollup and Cube - posted by Chang Lim <ch...@gmail.com> on 2015/03/26 17:23:30 UTC, 4 replies.
- DataFrame GroupBy - posted by gtanguy <g....@gmail.com> on 2015/03/26 17:24:30 UTC, 0 replies.
- Combining Many RDDs - posted by sparkx <ya...@yang-cs.com> on 2015/03/26 17:40:38 UTC, 7 replies.
- Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel - posted by Ondrej Smola <on...@gmail.com> on 2015/03/26 18:59:10 UTC, 8 replies.
- EsHadoopSerializationException: java.net.SocketTimeoutException: Read timed out - posted by Adrian Mocanu <am...@verticalscope.com> on 2015/03/26 19:09:59 UTC, 1 replies.
- Fuzzy GroupBy - posted by Mihran Shahinian <sl...@gmail.com> on 2015/03/26 21:47:08 UTC, 1 replies.
- Error in creating log directory - posted by pzilaro <pa...@yahoo.com> on 2015/03/26 21:48:46 UTC, 1 replies.
- Spark SQL queries hang forever - posted by Jon Chase <jo...@gmail.com> on 2015/03/26 22:02:06 UTC, 5 replies.
- WordCount example - posted by Mohit Anchlia <mo...@gmail.com> on 2015/03/26 22:38:34 UTC, 4 replies.
- RDD.map does not allowed to preservesPartitioning? - posted by Zhan Zhang <zz...@hortonworks.com> on 2015/03/26 22:44:23 UTC, 5 replies.
- Can't access file in spark, but can in hadoop - posted by Dale Johnson <da...@ebay.com> on 2015/03/26 23:06:16 UTC, 5 replies.
- Building spark 1.2 from source requires more dependencies - posted by Pala M Muthaia <mc...@rocketfuelinc.com> on 2015/03/26 23:36:26 UTC, 3 replies.
- Re: K Means cluster with spark - posted by Xi Shen <da...@gmail.com> on 2015/03/26 23:44:10 UTC, 0 replies.
- Spark History Server : jobs link doesn't open - posted by ", Roy" <rp...@njit.edu> on 2015/03/27 00:27:53 UTC, 2 replies.
- FakeClassTag in Java API - posted by kmader <ke...@gmail.com> on 2015/03/27 01:18:42 UTC, 0 replies.
- Difference behaviour of DateType in SparkSQL between 1.2 and 1.3 - posted by Wush Wu <wu...@bridgewell.com> on 2015/03/27 02:30:12 UTC, 0 replies.
- FetchFailedException during shuffle - posted by Chen Song <ch...@gmail.com> on 2015/03/27 02:45:17 UTC, 1 replies.
- spark-sql throws org.datanucleus.store.rdbms.connectionpool.DatastoreDriverNotFoundException - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/03/27 03:45:42 UTC, 5 replies.
- Spark SQL configurations - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/03/27 03:51:13 UTC, 1 replies.
- k-means can only run on one executor with one thread? - posted by Xi Shen <da...@gmail.com> on 2015/03/27 04:04:45 UTC, 6 replies.
- How to get a top X percent of a distribution represented as RDD - posted by Aung Htet <au...@gmail.com> on 2015/03/27 04:31:21 UTC, 5 replies.
- SparkContext.wholeTextFiles throws not serializable exception - posted by Xi Shen <da...@gmail.com> on 2015/03/27 06:41:37 UTC, 1 replies.
- Add partition support in saveAsParquet - posted by Jianshi Huang <ji...@gmail.com> on 2015/03/27 07:22:54 UTC, 1 replies.
- Can spark sql read existing tables created in hive - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/03/27 07:34:31 UTC, 12 replies.
- saveAsTable with path not working as expected (pyspark + Scala) - posted by Tom Walwyn <tw...@gmail.com> on 2015/03/27 08:45:47 UTC, 3 replies.
- failed to launch workers on spark - posted by mas <ma...@gmail.com> on 2015/03/27 10:04:16 UTC, 1 replies.
- Error in Delete Table - posted by Masf <ma...@gmail.com> on 2015/03/27 10:45:51 UTC, 2 replies.
- Re: Error while querying hive table from spark shell - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/03/27 10:55:36 UTC, 0 replies.
- Spark streaming - posted by jamborta <ja...@gmail.com> on 2015/03/27 11:47:30 UTC, 4 replies.
- Spark SQL "lateral view explode" doesn't work, and unable to save array types to Parquet - posted by Jon Chase <jo...@gmail.com> on 2015/03/27 12:00:24 UTC, 5 replies.
- Decrease In Performance due to Auto Increase of Partitions in Spark - posted by sayantini <sa...@gmail.com> on 2015/03/27 12:09:04 UTC, 1 replies.
- Spark SQL and DataSources API roadmap - posted by Ashish Mukherjee <as...@gmail.com> on 2015/03/27 12:41:21 UTC, 0 replies.
- Checking Data Integrity in Spark - posted by Sathish Kumaran Vairavelu <vs...@gmail.com> on 2015/03/27 12:43:42 UTC, 1 replies.
- saving schemaRDD to cassandra - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/03/27 13:33:35 UTC, 0 replies.
- Shuffle Read and Write - posted by Laeeq Ahmed <la...@yahoo.com.INVALID> on 2015/03/27 15:58:19 UTC, 1 replies.
- RDD collect hangs on large input data - posted by Zsolt Tóth <to...@gmail.com> on 2015/03/27 16:48:14 UTC, 2 replies.
- Python Example sql.py not working in version spark-1.3.0-bin-hadoop2.4 - posted by Peter Mac <Pe...@noaa.gov> on 2015/03/27 17:13:38 UTC, 1 replies.
- JettyUtils.createServletHandler Method not Found? - posted by kmader <ke...@gmail.com> on 2015/03/27 17:50:10 UTC, 2 replies.
- "Could not compute split, block not found" in Spark Streaming Simple Application - posted by Saiph Kappa <sa...@gmail.com> on 2015/03/27 18:09:24 UTC, 1 replies.
- How to avoid the repartitioning in graph construction - posted by Yifan LI <ia...@gmail.com> on 2015/03/27 18:32:43 UTC, 0 replies.
- [Dataframe] Problem with insertIntoJDBC and existing database - posted by Pierre Bailly-Ferry <pb...@talend.com> on 2015/03/27 18:42:53 UTC, 0 replies.
- Spark 1.3 Source - Github and source tar does not seem to match - posted by Manoj Samel <ma...@gmail.com> on 2015/03/27 19:28:30 UTC, 1 replies.
- Single threaded laptop implementation beating a 128 node GraphX cluster on a 1TB data set (128 billion nodes) - What is a use case for GraphX then? when is it worth the cost? - posted by Eran Medan <eh...@gmail.com> on 2015/03/27 19:32:18 UTC, 7 replies.
- spark streaming driver hang - posted by Chen Song <ch...@gmail.com> on 2015/03/27 20:24:30 UTC, 1 replies.
- RDD resiliency -- does it keep state? - posted by Michal Klos <mi...@gmail.com> on 2015/03/27 20:36:37 UTC, 4 replies.
- [spark-sql] What is the right way to represent an “Any” type in Spark SQL? - posted by Eran Medan <eh...@gmail.com> on 2015/03/27 22:31:36 UTC, 2 replies.
- Understanding Spark Memory distribution - posted by Ankur Srivastava <an...@gmail.com> on 2015/03/27 22:52:29 UTC, 7 replies.
- 2 input paths generate 3 partitions - posted by Rares Vernica <rv...@gmail.com> on 2015/03/27 23:12:12 UTC, 3 replies.
- Streaming anomaly detection using ARIMA - posted by Corey Nolet <cj...@gmail.com> on 2015/03/28 02:13:39 UTC, 1 replies.
- Setting a custom loss function for GradientDescent - posted by shmoanne <jl...@eng.ucsd.edu> on 2015/03/28 03:11:58 UTC, 1 replies.
- unable to read avro file - posted by Joanne Contact <jo...@gmail.com> on 2015/03/28 05:38:27 UTC, 1 replies.
- rdd.toDF().saveAsParquetFile("tachyon://host:19998/test") - posted by sud_self <85...@qq.com> on 2015/03/28 06:42:50 UTC, 1 replies.
- Spark - Hive Metastore MySQL driver - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/03/28 08:04:02 UTC, 3 replies.
- Anyone has some simple example with spark-sql with spark 1.3 - posted by Vincent He <vi...@gmail.com> on 2015/03/28 13:08:45 UTC, 7 replies.
- How to add all combinations of items rated by user and difference between the ratings? - posted by anishm <an...@gmail.com> on 2015/03/28 13:09:46 UTC, 0 replies.
- Custom edge partitioning in graphX - posted by arpp <ar...@gmail.com> on 2015/03/28 13:54:12 UTC, 0 replies.
- input size too large | Performance issues with Spark - posted by nsareen <ns...@gmail.com> on 2015/03/28 15:03:40 UTC, 1 replies.
- Re: Spark-submit not working when application jar is in hdfs - posted by rrussell25 <rr...@gmail.com> on 2015/03/28 17:36:19 UTC, 3 replies.
- Re: Why KMeans with mllib is so slow ? - posted by davidshen84 <da...@gmail.com> on 2015/03/29 06:04:17 UTC, 2 replies.
- Untangling dependency issues in spark streaming - posted by Neelesh <ne...@gmail.com> on 2015/03/29 09:10:34 UTC, 2 replies.
- Unable to run NetworkWordCount.java - posted by "mehak.soni" <me...@gmail.com> on 2015/03/29 09:26:09 UTC, 1 replies.
- RDD Persistance synchronization - posted by Harut Martirosyan <ha...@gmail.com> on 2015/03/29 10:07:01 UTC, 3 replies.
- Re: converting DStream[String] into RDD[String] in spark streaming [I] - posted by Deenar Toraskar <de...@gmail.com> on 2015/03/29 10:08:18 UTC, 1 replies.
- kmeans|| in Spark is not real paralleled? - posted by Xi Shen <da...@gmail.com> on 2015/03/29 10:20:30 UTC, 1 replies.
- Does Spark HiveContext supported with JavaSparkContext? - posted by Vincent He <vi...@gmail.com> on 2015/03/29 15:25:22 UTC, 5 replies.
- Build fails on 1.3 Branch - posted by mjhb <sp...@mjhb.com> on 2015/03/29 17:48:50 UTC, 4 replies.
- Arguments/parameters in Spark shell scripts? - posted by Minnow Noir <mi...@gmail.com> on 2015/03/29 20:16:31 UTC, 0 replies.
- Can't run spark-submit with an application jar on a Mesos cluster - posted by seglo <wl...@gmail.com> on 2015/03/29 21:12:42 UTC, 5 replies.
- Running Spark in Local Mode - posted by FreePeter <we...@gmail.com> on 2015/03/29 22:21:42 UTC, 1 replies.
- Task result in Spark Worker Node - posted by raggy <ra...@gmail.com> on 2015/03/30 03:34:50 UTC, 0 replies.
- Pregel API Abstraction for GraphX - posted by Kenny Bastani <ke...@gmail.com> on 2015/03/30 05:29:09 UTC, 0 replies.
- What is the meaning to of 'STATE' in a worker/ an executor? - posted by Niranda Perera <ni...@gmail.com> on 2015/03/30 06:09:20 UTC, 1 replies.
- Fwd: How SparkStreaming output messages to Kafka? - posted by lu...@sina.com on 2015/03/30 07:34:36 UTC, 0 replies.
- java.io.FileNotFoundException when using HDFS in cluster mode - posted by Nick Travers <n....@gmail.com> on 2015/03/30 07:34:46 UTC, 3 replies.
- Re: How SparkStreaming output messages to Kafka? - posted by Saisai Shao <sa...@gmail.com> on 2015/03/30 08:03:31 UTC, 3 replies.
- Re: Re: How SparkStreaming output messages to Kafka? - posted by lu...@sina.com on 2015/03/30 08:58:24 UTC, 0 replies.
- Spark 1.3 build with hive support fails on JLine - posted by Night Wolf <ni...@gmail.com> on 2015/03/30 10:02:26 UTC, 0 replies.
- Spark caching - posted by Renato Marroquín Mogrovejo <re...@gmail.com> on 2015/03/30 10:43:36 UTC, 2 replies.
- Re: Re: Re: How SparkStreaming output messages to Kafka? - posted by lu...@sina.com on 2015/03/30 10:46:40 UTC, 1 replies.
- Re: Re: Re: Re: How SparkStreaming output messages to Kafka? - posted by lu...@sina.com on 2015/03/30 10:58:14 UTC, 0 replies.
- Problem with groupBy and OOM when just writing the group in a file - posted by Mario Pastorelli <ma...@teralytics.ch> on 2015/03/30 11:06:50 UTC, 2 replies.
- why "Shuffle Write" is not zero when everything is cached and there is enough memory? - posted by shahab <sh...@gmail.com> on 2015/03/30 11:15:34 UTC, 6 replies.
- Re: Re: Re: Re: Re: How SparkStreaming output messages to Kafka? - posted by lu...@sina.com on 2015/03/30 11:19:53 UTC, 0 replies.
- Receive on driver program (without serializing) - posted by MartijnD <ik...@martijndwars.nl> on 2015/03/30 11:52:53 UTC, 0 replies.
- Too many open files - posted by Masf <ma...@gmail.com> on 2015/03/30 13:22:18 UTC, 4 replies.
- Re: DataFrame and non-lazy RDD operation - posted by Wail <w....@cces-kacst-mit.org> on 2015/03/30 15:23:12 UTC, 0 replies.
- Re: different result from implicit ALS with explicit ALS - posted by lisendong <li...@163.com> on 2015/03/30 16:27:20 UTC, 5 replies.
- Spark Streaming/Flume display all events - posted by Chong Zhang <ch...@gmail.com> on 2015/03/30 16:36:33 UTC, 1 replies.
- Re: Is it possible to do incremental training using ALSModel (MLlib)? - posted by dvpe <dv...@hotmail.com> on 2015/03/30 17:06:05 UTC, 0 replies.
- Online Realtime Recommendation System - posted by dvpe <dv...@hotmail.com> on 2015/03/30 17:09:12 UTC, 0 replies.
- Re: Job Opportunity in London - posted by Chitturi Padma <le...@gmail.com> on 2015/03/30 17:17:18 UTC, 1 replies.
- actorStream woes - posted by Marius Soutier <mp...@gmail.com> on 2015/03/30 17:41:14 UTC, 0 replies.
- Spark streaming with Kafka, multiple partitions fail, single partition ok - posted by Nicolas Phung <ni...@gmail.com> on 2015/03/30 18:05:24 UTC, 5 replies.
- Cannot run spark-shell "command not found". - posted by vance46 <wa...@purdue.edu> on 2015/03/30 19:34:19 UTC, 2 replies.
- Re: Actor not found - posted by sparkdi <sh...@dubna.us> on 2015/03/30 20:26:22 UTC, 2 replies.
- When will 1.3.1 release? - posted by Shuai Zheng <sz...@gmail.com> on 2015/03/30 21:34:11 UTC, 2 replies.
- Spark and OpenJDK - jar: No such file or directory - posted by "Kelly, Jonathan" <jo...@amazon.com> on 2015/03/30 22:03:29 UTC, 1 replies.
- log4j.properties in jar - posted by Udit Mehta <um...@groupon.com> on 2015/03/30 22:24:49 UTC, 1 replies.
- Spark 1.3.0 Build Failure - posted by ARose <As...@telarix.com> on 2015/03/30 22:34:26 UTC, 1 replies.
- Registering classes with KryoSerializer - posted by Arun Lists <li...@gmail.com> on 2015/03/30 22:59:19 UTC, 0 replies.
- Why is a Spark job faster through Eclipse than Standalone Cluster - posted by rival95 <ri...@hotmail.com> on 2015/03/30 23:10:49 UTC, 1 replies.
- Java and Kryo Serialization, Java.io.OptionalDataException - posted by zia_kayani <zi...@platalytics.com> on 2015/03/30 23:51:38 UTC, 0 replies.
- "Spark-events does not exist" error, while it does with all the req. rights - posted by Tom <th...@gmail.com> on 2015/03/31 00:50:46 UTC, 7 replies.
- Re: Spark 1.3 build with hive support fails - posted by nightwolf <ni...@gmail.com> on 2015/03/31 01:51:10 UTC, 1 replies.
- Re: Spark Streaming - Subroutine not being executed more than once - posted by jhakku <sr...@gmail.com> on 2015/03/31 02:14:56 UTC, 0 replies.
- Spark Streaming on YARN with loss of application master - posted by Matt Narrell <ma...@gmail.com> on 2015/03/31 02:23:34 UTC, 0 replies.
- Task size is large when CombineTextInputFormat is used - posted by Taeyun Kim <ta...@innowireless.com> on 2015/03/31 02:41:24 UTC, 0 replies.
- data frame API, change groupBy result column name - posted by Neal Yin <ne...@workday.com> on 2015/03/31 02:49:46 UTC, 1 replies.
- How to configure SparkUI to use internal ec2 ip - posted by anny9699 <an...@gmail.com> on 2015/03/31 05:48:48 UTC, 5 replies.
- Parquet Hive table become very slow on 1.3? - posted by "Zheng, Xudong" <do...@gmail.com> on 2015/03/31 08:47:51 UTC, 2 replies.
- workers no route to host - posted by ZhuGe <tc...@outlook.com> on 2015/03/31 09:12:30 UTC, 0 replies.
- Broadcasting a parquet file using spark and python - posted by jitesh129 <ji...@gmail.com> on 2015/03/31 12:25:00 UTC, 1 replies.
- Unable to save dataframe with UDT created with sqlContext.createDataFrame - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2015/03/31 13:10:50 UTC, 1 replies.
- refer to dictionary - posted by Peng Xia <sp...@gmail.com> on 2015/03/31 13:43:53 UTC, 2 replies.
- Re: can't union two rdds - posted by roy <rp...@njit.edu> on 2015/03/31 15:16:05 UTC, 1 replies.
- Spark sql query fails with executor lost/ out of memory expection while caching a table - posted by "ankurjain.nitrr" <an...@gmail.com> on 2015/03/31 15:18:37 UTC, 0 replies.
- "Ambiguous references" to a field set in a partitioned table AND the data - posted by Nicolas Fouché <ni...@gmail.com> on 2015/03/31 16:06:47 UTC, 1 replies.
- pyspark error with zip - posted by Charles Hayden <ch...@atigeo.com> on 2015/03/31 17:27:36 UTC, 0 replies.
- Hygienic closures for scala function serialization - posted by Erik Erlandson <ej...@redhat.com> on 2015/03/31 18:05:45 UTC, 0 replies.
- How to setup a Spark Cluter? - posted by bhushansc007 <bh...@gmail.com> on 2015/03/31 18:56:01 UTC, 1 replies.
- SparkSql - java.util.NoSuchElementException: key not found: node when access JSON Array - posted by Todd Nist <ts...@gmail.com> on 2015/03/31 21:26:09 UTC, 0 replies.
- java.io.NotSerializableException: org.apache.hadoop.hbase.client.Result - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/03/31 21:51:02 UTC, 3 replies.
- Using 'fair' scheduler mode - posted by asadrao <as...@microsoft.com> on 2015/03/31 22:19:44 UTC, 0 replies.
- --driver-memory parameter doesn't work for spark-submmit on yarn? - posted by Shuai Zheng <sz...@gmail.com> on 2015/03/31 22:27:16 UTC, 0 replies.
- Query REST web service with Spark? - posted by Minnow Noir <mi...@gmail.com> on 2015/03/31 22:46:46 UTC, 1 replies.
- Did anybody run Spark-perf on powerpc? - posted by Tom <th...@gmail.com> on 2015/03/31 23:02:59 UTC, 0 replies.
- joining multiple parquet files - posted by roni <ro...@gmail.com> on 2015/03/31 23:14:13 UTC, 0 replies.