- Spark sql with Zeppelin, Task not serializable error when I try to cache the spark sql table - posted by shyla deshpande <de...@gmail.com> on 2017/06/01 04:37:59 UTC, 1 replies.
- RE: The following Error seems to happen once in every ten minutes (Spark Structured Streaming)? - posted by Mahesh Sawaiker <ma...@persistent.com> on 2017/06/01 04:56:09 UTC, 0 replies.
- RE: Message getting lost in Kafka + Spark Streaming - posted by Sidney Feiner <si...@startapp.com> on 2017/06/01 05:54:13 UTC, 1 replies.
- Re: An Architecture question on the use of virtualised clusters - posted by Jörn Franke <jo...@gmail.com> on 2017/06/01 07:21:08 UTC, 11 replies.
- Number Of Partitions in RDD - posted by Vikash Pareek <vi...@infoobjects.com> on 2017/06/01 10:28:56 UTC, 5 replies.
- Re: Creating Dataframe by querying Impala - posted by Anubhav Agarwal <an...@gmail.com> on 2017/06/01 16:34:06 UTC, 1 replies.
- statefulStreaming checkpointing too often - posted by David Rosenstrauch <da...@gmail.com> on 2017/06/01 21:54:49 UTC, 1 replies.
- command to get list oin spark 2.0 scala of all persisted rdd's in spark 2.0 scala shell - posted by nancy henry <na...@gmail.com> on 2017/06/02 05:59:28 UTC, 0 replies.
- Spark 2.1 - Infering schema of dataframe after reading json files not during - posted by Aseem Bansal <as...@gmail.com> on 2017/06/02 14:11:58 UTC, 2 replies.
- Spark SQL, formatting timezone in UTC - posted by yohann jardin <yo...@hotmail.com> on 2017/06/02 17:10:51 UTC, 0 replies.
- Parquet Read Speed: Spark SQL vs Parquet MR - posted by Mike Wheeler <ro...@gmail.com> on 2017/06/03 16:31:14 UTC, 0 replies.
- Is there a way to do conditional group by in spark 2.1.1? - posted by kant kodali <ka...@gmail.com> on 2017/06/03 22:00:10 UTC, 4 replies.
- SparkAppHandle.Listener.infoChanged behaviour - posted by Mohammad Tariq <do...@gmail.com> on 2017/06/04 02:16:30 UTC, 2 replies.
- Is there a way to do partial sort in update mode in Spark Structured Streaming? - posted by kant kodali <ka...@gmail.com> on 2017/06/04 04:23:32 UTC, 0 replies.
- What is the easiest way for an application to Query parquet data on HDFS? - posted by kant kodali <ka...@gmail.com> on 2017/06/04 06:29:52 UTC, 8 replies.
- Spark Job is stuck at SUBMITTED when set Driver Memory > Executor Memory - posted by Abdulfattah Safa <fa...@gmail.com> on 2017/06/04 11:46:00 UTC, 3 replies.
- Kafka + Spark Streaming consumer API offsets - posted by Nipun Arora <ni...@gmail.com> on 2017/06/05 14:59:36 UTC, 0 replies.
- Spark Streaming Checkpoint and Exactly Once Guarantee on Kafka Direct Stream - posted by ALunar Beach <al...@gmail.com> on 2017/06/05 18:14:35 UTC, 4 replies.
- Incorrect CAST to TIMESTAMP in Hive compatibility - posted by verbamour <ve...@gmail.com> on 2017/06/05 19:42:04 UTC, 1 replies.
- Edge Node in Spark - posted by Ashok Kumar <as...@yahoo.com.INVALID> on 2017/06/05 20:45:03 UTC, 5 replies.
- Adding header to an rdd before saving to text file - posted by upendra 1991 <up...@yahoo.com.INVALID> on 2017/06/05 21:15:36 UTC, 2 replies.
- Spark Streaming Job Stuck - posted by "Jain, Nishit" <nj...@underarmour.com> on 2017/06/05 21:51:53 UTC, 0 replies.
- Spark on Kubernetes: Birds-of-a-Feather Session 12:50pm 6/6 @ Spark Summit - posted by Erik Erlandson <ee...@redhat.com> on 2017/06/06 00:30:03 UTC, 0 replies.
- Re: Spark Streaming Job Stuck - posted by Tathagata Das <ta...@gmail.com> on 2017/06/06 08:26:39 UTC, 2 replies.
- Java SPI jar reload in Spark - posted by "Jonnas Li(Contractor)" <zh...@envisioncn.com> on 2017/06/06 09:35:54 UTC, 6 replies.
- [Spark Structured Streaming] Exception while using watermark with type of timestamp - posted by Biplob Biswas <re...@gmail.com> on 2017/06/06 10:03:12 UTC, 2 replies.
- a stage can belong to more than one job please? - posted by 萝卜丝炒饭 <14...@qq.com> on 2017/06/06 12:04:26 UTC, 2 replies.
- problem initiating spark context with pyspark - posted by Curtis Burkhalter <cu...@gmail.com> on 2017/06/06 14:45:45 UTC, 7 replies.
- Performance issue when running Spark-1.6.1 in yarn-client mode with Hadoop 2.6.0 - posted by satishjohn <sa...@gmail.com> on 2017/06/06 16:02:57 UTC, 2 replies.
- Exception which using ReduceByKeyAndWindow in Spark Streaming. - posted by SRK <sw...@gmail.com> on 2017/06/06 18:30:56 UTC, 7 replies.
- How to perform clean-up after stateful streaming processes an RDD? - posted by David Rosenstrauch <da...@gmail.com> on 2017/06/06 23:08:30 UTC, 3 replies.
- StructuredStreaming : org.apache.spark.sql.streaming.StreamingQueryException - posted by aravias <as...@homeaway.com> on 2017/06/07 03:40:40 UTC, 0 replies.
- Convert the feature vector to raw data - posted by kundan kumar <ii...@gmail.com> on 2017/06/07 09:00:30 UTC, 3 replies.
- - posted by Patrik Medvedev <pa...@gmail.com> on 2017/06/07 09:07:07 UTC, 0 replies.
- [Spark JDBC] Does spark support read from remote Hive server via JDBC - posted by Patrik Medvedev <pa...@gmail.com> on 2017/06/07 09:15:07 UTC, 8 replies.
- Scala, Python or Java for Spark programming - posted by Mich Talebzadeh <mi...@gmail.com> on 2017/06/07 15:20:16 UTC, 8 replies.
- [CSV] If number of columns of one row bigger than maxcolumns it stop the whole parsing process. - posted by Chanh Le <gi...@gmail.com> on 2017/06/07 15:50:00 UTC, 9 replies.
- user-unsubscribe@spark.apache.org - posted by wi...@gmail.com on 2017/06/07 16:17:45 UTC, 2 replies.
- Re: Question about mllib.recommendation.ALS - posted by Ryan <ry...@gmail.com> on 2017/06/08 03:17:35 UTC, 3 replies.
- Re: good http sync client to be used with spark - posted by Ryan <ry...@gmail.com> on 2017/06/08 03:21:45 UTC, 0 replies.
- Re: Worker node log not showed - posted by Ryan <ry...@gmail.com> on 2017/06/08 03:24:46 UTC, 1 replies.
- Re: No TypeTag Available for String - posted by Ryan <ry...@gmail.com> on 2017/06/08 03:26:15 UTC, 0 replies.
- Read Data From NFS - posted by ayan guha <gu...@gmail.com> on 2017/06/08 05:26:40 UTC, 7 replies.
- SPARK-19547 - posted by "Rastogi, Pankaj" <pa...@verizon.com> on 2017/06/08 05:42:24 UTC, 0 replies.
- [Spark Core] Does spark support read from remote Hive server via JDBC - posted by Даша Ковальчук <da...@gmail.com> on 2017/06/08 08:16:18 UTC, 8 replies.
- [Spark STREAMING]: Can not kill job gracefully on spark standalone cluster - posted by "Mariusz D." <du...@gmail.com> on 2017/06/08 09:51:16 UTC, 0 replies.
- Output of select in non exponential form. - posted by kundan kumar <ii...@gmail.com> on 2017/06/08 10:15:14 UTC, 0 replies.
- unsubscribe - posted by Brindha Sengottaiyan <sb...@yahoo.com.INVALID> on 2017/06/08 19:05:47 UTC, 0 replies.
- Is Structured streaming ready for production usage - posted by SRK <sw...@gmail.com> on 2017/06/08 22:03:27 UTC, 4 replies.
- Need Spark(Scala) Performance Tuning tips - posted by Debabrata Ghosh <ma...@gmail.com> on 2017/06/09 12:50:31 UTC, 3 replies.
- RowMatrix: tallSkinnyQR - posted by Arun <ar...@gmail.com> on 2017/06/09 14:33:00 UTC, 1 replies.
- Re: StructuredStreaming : StreamingQueryException - posted by aravias <as...@homeaway.com> on 2017/06/09 15:21:33 UTC, 1 replies.
- RDD saveAsText and DataFrame write.mode(SaveMode).text(Path) duplicating rows - posted by "Barona, Ricardo" <ri...@intel.com> on 2017/06/09 17:17:56 UTC, 2 replies.
- [jira] Lantao Jin shared "SPARK-21023: Ignore to load default properties file is not a good choice from the perspective of system" with you - posted by "Lantao Jin (JIRA)" <ji...@apache.org> on 2017/06/10 07:46:18 UTC, 0 replies.
- What is the real difference between Kafka streaming and Spark Streaming? - posted by kant kodali <ka...@gmail.com> on 2017/06/11 08:12:03 UTC, 13 replies.
- LibSVM should have just one input file - posted by "darion.yaphet" <fl...@163.com> on 2017/06/12 03:46:35 UTC, 1 replies.
- help with "ERROR server.TransportRequestHandler: Error sending result StreamResponse" - posted by Steve Sun <st...@gmail.com> on 2017/06/12 04:49:12 UTC, 0 replies.
- Use SQL Script to Write Spark SQL Jobs - posted by bo yang <bo...@gmail.com> on 2017/06/12 05:29:50 UTC, 7 replies.
- [How-To] Custom file format as source - posted by OBones <ob...@free.fr> on 2017/06/12 10:01:23 UTC, 3 replies.
- SPARK environment settings issue when deploying a custom distribution - posted by Chanh Le <gi...@gmail.com> on 2017/06/12 11:14:29 UTC, 1 replies.
- Re: [E] Re: Spark Job is stuck at SUBMITTED when set Driver Memory > Executor Memory - posted by "Rastogi, Pankaj" <pa...@verizon.com> on 2017/06/12 16:30:02 UTC, 1 replies.
- Parquet file generated by Spark, but not compatible read by Hive - posted by Yong Zhang <ja...@hotmail.com> on 2017/06/12 21:05:13 UTC, 3 replies.
- broadcast() multiple times the same df. Is it cached ? - posted by matd <ma...@gmail.com> on 2017/06/12 21:14:19 UTC, 0 replies.
- Deciphering spark warning "Truncated the string representation of a plan since it was too large." - posted by Henry M <he...@gmail.com> on 2017/06/12 22:10:47 UTC, 1 replies.
- Can I use ChannelTrafficShapingHandler to control the network read/write speed in shuffle? - posted by Niu Zhaojie <nz...@gmail.com> on 2017/06/13 11:08:34 UTC, 1 replies.
- Assign Custom receiver to a scheduler pool - posted by Rabin Banerjee <de...@gmail.com> on 2017/06/13 11:54:25 UTC, 3 replies.
- how to debug app with cluster mode please? - posted by 萝卜丝炒饭 <14...@qq.com> on 2017/06/13 12:49:28 UTC, 0 replies.
- having trouble using structured streaming with file sink (parquet) - posted by "Mendelson, Assaf" <As...@rsa.com> on 2017/06/13 13:43:11 UTC, 1 replies.
- SANSA 0.2 (Semantic Technologies on top of Spark) Released - posted by Jens Lehmann <je...@cs.uni-bonn.de> on 2017/06/13 16:06:43 UTC, 0 replies.
- Java access to internal representation of DataTypes.DateType - posted by Anton Kravchenko <kr...@gmail.com> on 2017/06/13 16:16:12 UTC, 2 replies.
- Read Local File - posted by Dirceu Semighini Filho <di...@gmail.com> on 2017/06/13 18:02:18 UTC, 2 replies.
- UDF percentile_approx - posted by Andrés Ivaldi <ia...@gmail.com> on 2017/06/13 18:52:35 UTC, 3 replies.
- Spark Streaming Design Suggestion - posted by Shashi Vishwakarma <sh...@gmail.com> on 2017/06/13 20:03:14 UTC, 3 replies.
- Join pushdown on two external tables from the same external source? - posted by drewrobb <dr...@gmail.com> on 2017/06/13 23:18:15 UTC, 0 replies.
- Exception when accessing Spark Web UI in yarn-client mode - posted by satishjohn <sa...@gmail.com> on 2017/06/14 14:15:45 UTC, 0 replies.
- Configurable Task level time outs and task failures - posted by AnilKumar B <ak...@gmail.com> on 2017/06/14 18:01:28 UTC, 0 replies.
- CFP FOR SPARK SUMMIT EUROPE CLOSES FRIDAY - posted by Scott walent <sc...@gmail.com> on 2017/06/14 20:01:17 UTC, 0 replies.
- [MLLib]: Executor OutOfMemory in BlockMatrix Multiplication - posted by Anthony Thomas <ah...@eng.ucsd.edu> on 2017/06/14 23:07:36 UTC, 3 replies.
- Create dataset from dataframe with missing columns - posted by "Tokayer, Jason M." <Ja...@capitalone.com> on 2017/06/14 23:15:24 UTC, 1 replies.
- Create dataset from data frame with missing columns - posted by to...@gmail.com on 2017/06/14 23:45:43 UTC, 0 replies.
- [Spark Sql/ UDFs] Spark and Hive UDFs parity - posted by RD <rd...@gmail.com> on 2017/06/15 04:52:19 UTC, 4 replies.
- the dependence length of RDD, can its size be greater than 1 pleaae? - posted by 萝卜丝炒饭 <14...@qq.com> on 2017/06/15 08:11:27 UTC, 3 replies.
- Repartition vs PartitionBy Help/Understanding needed - posted by Aakash Basu <aa...@gmail.com> on 2017/06/15 09:27:56 UTC, 1 replies.
- Create dataset from dataframe with fewer columns - posted by to...@gmail.com on 2017/06/15 09:54:07 UTC, 0 replies.
- [How-To] Migrating from mllib.tree.DecisionTree to ml.regression.DecisionTreeRegressor - posted by OBones <ob...@free.fr> on 2017/06/15 09:59:55 UTC, 2 replies.
- Re: Create dataset from dataframe with missing columns - posted by Riccardo Ferrari <fe...@gmail.com> on 2017/06/15 10:26:00 UTC, 0 replies.
- Spark don't run all code when is submit to yarn-cluster mode. - posted by Cosmin Posteuca <co...@gmail.com> on 2017/06/15 12:40:16 UTC, 0 replies.
- Nested "struct" fonction call creates a compilation error in Spark SQL - posted by Olivier Girardot <o....@lateral-thoughts.com> on 2017/06/15 13:04:26 UTC, 3 replies.
- Best alternative for Category Type in Spark Dataframe - posted by saatvikshah1994 <sa...@gmail.com> on 2017/06/15 14:19:37 UTC, 10 replies.
- fetching and joining data from two different clusters - posted by Mich Talebzadeh <mi...@gmail.com> on 2017/06/15 16:03:21 UTC, 8 replies.
- [SparkSQL] Escaping a query for a dataframe query - posted by "mark.jenkins4@baesystems.com" <ma...@baesystems.com> on 2017/06/15 16:05:56 UTC, 3 replies.
- Serialization of fastutils reference collections - posted by Leonid Toshchev <lt...@gmail.com> on 2017/06/15 17:11:40 UTC, 0 replies.
- access a broadcasted variable from within ForeachPartitionFunction Java API - posted by Anton Kravchenko <kr...@gmail.com> on 2017/06/15 18:29:26 UTC, 3 replies.
- Re: featureSubsetStrategy parameter for GradientBoostedTreesModel - posted by Pralabh Kumar <pr...@gmail.com> on 2017/06/16 02:30:30 UTC, 0 replies.
- spark-submit: file not found exception occurs - posted by Shupeng Geng <sh...@envisioncn.com> on 2017/06/16 03:14:20 UTC, 0 replies.
- how to call udf with parameters - posted by lk_spark <lk...@163.com> on 2017/06/16 03:51:05 UTC, 6 replies.
- Max number of columns - posted by Jan Holmberg <ja...@perigeum.fi> on 2017/06/16 07:44:49 UTC, 1 replies.
- What is the charting library used by Databricks UI? - posted by kant kodali <ka...@gmail.com> on 2017/06/16 07:55:43 UTC, 2 replies.
- [Error] Python version mismatch in CDH cluster when running pyspark job - posted by Divya Gehlot <di...@gmail.com> on 2017/06/16 08:40:04 UTC, 0 replies.
- Spark SQL within a DStream map function - posted by Mike Hugo <mi...@piragua.com> on 2017/06/16 20:01:53 UTC, 1 replies.
- Error while doing mvn release for spark 2.0.2 using scala 2.10 - posted by Kanagha Kumar <kp...@salesforce.com> on 2017/06/16 21:59:27 UTC, 4 replies.
- Spark-Kafka integration - build failing with sbt - posted by karan alang <ka...@gmail.com> on 2017/06/16 23:51:37 UTC, 4 replies.
- difference between spark-integrated hive and original hive - posted by wuchang <58...@qq.com> on 2017/06/17 09:09:27 UTC, 0 replies.
- Build spark without hive issue, spark-sql doesn't work. - posted by wuchang <58...@qq.com> on 2017/06/17 09:27:38 UTC, 1 replies.
- the scheme in stream reader - posted by 萝卜丝炒饭 <14...@qq.com> on 2017/06/18 07:27:49 UTC, 2 replies.
- Unsubscribe - posted by Palash Gupta <sp...@yahoo.com.INVALID> on 2017/06/18 11:17:37 UTC, 5 replies.
- Does spark support hive table(parquet) column renaming? - posted by 李斌松 <li...@gmail.com> on 2017/06/19 13:19:36 UTC, 0 replies.
- Spark streaming data loss - posted by vasanth kumar <rj...@gmail.com> on 2017/06/19 13:46:27 UTC, 0 replies.
- Stream Processing: how to refresh a loaded dataset periodically - posted by aravias <as...@homeaway.com> on 2017/06/19 17:29:55 UTC, 0 replies.
- spark submit with logs and kerberos - posted by Juan Pablo Briganti <ju...@globant.com> on 2017/06/19 17:39:55 UTC, 0 replies.
- How save streaming aggregations on 'Structured Streams' in parquet format ? - posted by kaniska Mandal <ka...@gmail.com> on 2017/06/19 18:03:45 UTC, 5 replies.
- how many topics spark streaming can handle - posted by Ashok Kumar <as...@yahoo.com.INVALID> on 2017/06/19 19:00:36 UTC, 3 replies.
- Merging multiple Pandas dataframes - posted by saatvikshah1994 <sa...@gmail.com> on 2017/06/19 23:21:42 UTC, 4 replies.
- Flume DStream produces 0 records after HDFS node killed - posted by N B <nb...@gmail.com> on 2017/06/20 01:23:15 UTC, 5 replies.
- the meaning of partition column and bucket column please? - posted by 萝卜丝炒饭 <14...@qq.com> on 2017/06/20 02:00:27 UTC, 0 replies.
- Spark Streaming - Increasing number of executors slows down processing rate - posted by Mal Edwin <ma...@vinadionline.com> on 2017/06/20 03:42:28 UTC, 1 replies.
- Do we anything for Deep Learning in Spark? - posted by Gaurav1809 <ga...@gmail.com> on 2017/06/20 04:36:59 UTC, 3 replies.
- Why my project has this kind of error ? - posted by 张明磊 <zm...@163.com> on 2017/06/20 05:54:34 UTC, 1 replies.
- spark2.1 and kafka0.10 - posted by lk_spark <lk...@163.com> on 2017/06/20 07:04:18 UTC, 0 replies.
- Cassandra querying time stamps - posted by sujeet jog <su...@gmail.com> on 2017/06/20 10:52:02 UTC, 3 replies.
- spark higher order functions - posted by AssafMendelson <as...@rsa.com> on 2017/06/20 14:02:07 UTC, 1 replies.
- Using Spark as a simulator - posted by Esa Heikkinen <es...@student.tut.fi> on 2017/06/20 14:04:19 UTC, 4 replies.
- "Sharing" dataframes... - posted by Jean Georges Perrin <jg...@jgp.net> on 2017/06/20 17:46:36 UTC, 8 replies.
- org.apache.spark.sql.types missing from spark-sql_2.11-2.1.1.jar? - posted by Jean Georges Perrin <jg...@jgp.net> on 2017/06/20 20:14:54 UTC, 2 replies.
- How to bootstrap Spark Kafka direct with the previous state in case of a code upgrade - posted by SRK <sw...@gmail.com> on 2017/06/20 20:54:25 UTC, 0 replies.
- Bizzare diff in behavior between scala REPL and sparkSQL UDF - posted by jeff saremi <je...@hotmail.com> on 2017/06/20 21:48:06 UTC, 1 replies.
- Re: appendix - posted by Wenchen Fan <cl...@gmail.com> on 2017/06/21 02:21:54 UTC, 0 replies.
- Spark 2.1.1 and Hadoop version 2.2 or 2.7? - posted by N B <nb...@gmail.com> on 2017/06/21 05:51:40 UTC, 1 replies.
- Saving RDD as Kryo (broken in 2.1) - posted by Alexander Krasheninnikov <a....@corp.badoo.com> on 2017/06/21 08:39:14 UTC, 2 replies.
- JDBC RDD Timestamp Parsing Issue - posted by Aviral Agarwal <av...@gmail.com> on 2017/06/21 09:06:59 UTC, 4 replies.
- gfortran runtime library for Spark - posted by Saroj C <sa...@tcs.com> on 2017/06/21 09:30:16 UTC, 2 replies.
- Broadcasts & Storage Memory - posted by Bryan Jeffrey <br...@gmail.com> on 2017/06/21 20:43:24 UTC, 3 replies.
- Using YARN w/o HDFS - posted by "Alaa Zubaidi (PDF)" <al...@pdf.com> on 2017/06/21 23:50:02 UTC, 2 replies.
- spark2.1 kafka0.10 - posted by lk_spark <lk...@163.com> on 2017/06/22 03:13:23 UTC, 5 replies.
- Unsubscribe - posted by Anita Tailor <ta...@gmail.com> on 2017/06/22 04:17:12 UTC, 1 replies.
- How does Spark deal with Data Skewness? - posted by Sea aj <sa...@gmail.com> on 2017/06/22 15:58:16 UTC, 0 replies.
- Trouble with PySpark UDFs and SPARK_HOME only on EMR - posted by Nick Chammas <ni...@gmail.com> on 2017/06/22 16:08:17 UTC, 1 replies.
- Spark submit - org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run error - posted by Kanagha Kumar <kp...@salesforce.com> on 2017/06/22 23:24:28 UTC, 0 replies.
- Using Spark with Local File System/NFS - posted by saatvikshah1994 <sa...@gmail.com> on 2017/06/23 00:21:10 UTC, 1 replies.
- A question about rdd transformation - posted by Lionel Luffy <li...@gmail.com> on 2017/06/23 02:56:58 UTC, 1 replies.
- OutOfMemoryError - posted by Tw UxTLi51Nus <Tw...@posteo.co> on 2017/06/23 07:07:00 UTC, 0 replies.
- Spark Memory Optimization - posted by Tw UxTLi51Nus <Tw...@posteo.co> on 2017/06/23 07:32:52 UTC, 0 replies.
- How does HashPartitioner distribute data in Spark? - posted by Vikash Pareek <vi...@infoobjects.com> on 2017/06/23 07:42:05 UTC, 4 replies.
- Question about standalone Spark cluster reading from Kerberosed hadoop - posted by Mu Kong <ko...@gmail.com> on 2017/06/23 09:10:12 UTC, 3 replies.
- Container exited with a non-zero exit code 1 - posted by Link Qian <fa...@outlook.com> on 2017/06/23 09:58:49 UTC, 0 replies.
- HDP 2.5 - Python - Spark-On-Hbase - posted by ayan guha <gu...@gmail.com> on 2017/06/23 12:46:42 UTC, 6 replies.
- Spark job profiler results showing high TCP cpu time - posted by Reth RM <re...@gmail.com> on 2017/06/23 17:46:21 UTC, 3 replies.
- Is there an api in Dataset/Dataframe that does repartitionAndSortWithinPartitions? - posted by Keith Chapman <ke...@gmail.com> on 2017/06/23 21:43:04 UTC, 5 replies.
- Re: Help in Parsing 'Categorical' type of data - posted by Yanbo Liang <yb...@gmail.com> on 2017/06/24 02:42:31 UTC, 0 replies.
- issue about the windows slice of stream - posted by 萝卜丝炒饭 <14...@qq.com> on 2017/06/24 06:51:52 UTC, 2 replies.
- Fwd: Can we access files on Cluster mode - posted by sudhir k <k....@gmail.com> on 2017/06/24 17:30:07 UTC, 7 replies.
- RDD and DataFrame persistent memory usage - posted by Ashok Kumar <as...@yahoo.com.INVALID> on 2017/06/25 07:13:15 UTC, 0 replies.
- Question on Spark code - posted by kant kodali <ka...@gmail.com> on 2017/06/25 09:27:31 UTC, 6 replies.
- Problem in avg function Spark 1.6.3 using spark-shell - posted by Eko Susilo <ek...@gmail.com> on 2017/06/25 11:19:13 UTC, 1 replies.
- Could you please add a book info on Spark website? - posted by "Md. Rezaul Karim" <re...@insight-centre.org> on 2017/06/25 11:33:51 UTC, 2 replies.
- the compile of spark stoped without any hints, would you like help me please? - posted by 萝卜丝炒饭 <14...@qq.com> on 2017/06/25 12:29:29 UTC, 1 replies.
- What is the equivalent of mapPartitions in SpqrkSQL? - posted by jeff saremi <je...@hotmail.com> on 2017/06/25 15:32:07 UTC, 7 replies.
- How to Fill Sparse Data With the Previous Non-Empty Value in SPARQL Dataset - posted by "Carlo.Allocca" <ca...@open.ac.uk> on 2017/06/25 19:37:42 UTC, 1 replies.
- Spark streaming persist to hdfs question - posted by Naveen Madhire <vm...@umail.iu.edu> on 2017/06/26 03:09:54 UTC, 2 replies.
- Meetup in Taiwan - posted by Yang Bryan <ke...@gmail.com> on 2017/06/26 04:26:08 UTC, 0 replies.
- Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve - posted by mckunkel <m....@fz-juelich.de> on 2017/06/26 17:46:38 UTC, 0 replies.
- ZeroMQ Streaming in Spark2.x - posted by Aashish Chaudhary <aa...@kitware.com> on 2017/06/26 18:58:27 UTC, 3 replies.
- Spark Streaming reduceByKeyAndWindow with inverse function seems to iterate over all the keys in the window even though they are not present in the current batch - posted by SRK <sw...@gmail.com> on 2017/06/26 19:53:08 UTC, 0 replies.
- Re: Spark Streaming reduceByKeyAndWindow with inverse function seems to iterate over all the keys in the window even though they are not present in the current batch - posted by Tathagata Das <ta...@gmail.com> on 2017/06/26 21:11:21 UTC, 0 replies.
- Question about Parallel Stages in Spark - posted by satishl <sa...@gmail.com> on 2017/06/27 00:10:25 UTC, 8 replies.
- Re: Spark Streaming reduceByKeyAndWindow with inverse function seems to iterate over all the keys in the window even though they are not present in the current batch - posted by 萝卜丝炒饭 <14...@qq.com> on 2017/06/27 00:52:34 UTC, 0 replies.
- how to mention others in JIRA comment please? - posted by 萝卜丝炒饭 <14...@qq.com> on 2017/06/27 01:56:50 UTC, 2 replies.
- PySpark 2.1.1 Can't Save Model - Permission Denied - posted by John Omernik <jo...@omernik.com> on 2017/06/27 12:47:01 UTC, 1 replies.
- What is the purpose of having RDD.context and RDD.sparkContext at the same time? - posted by Sergey Zhemzhitsky <sz...@gmail.com> on 2017/06/27 12:47:11 UTC, 0 replies.
- proxy on spark UI - posted by "Soheila S." <so...@gmail.com> on 2017/06/27 14:05:20 UTC, 0 replies.
- [ML] Stop conditions for RandomForest - posted by OBones <ob...@free.fr> on 2017/06/27 15:07:22 UTC, 2 replies.
- the function of countByValueAndWindow and foreachRDD in DStream, would you like help me understand it please? - posted by 萝卜丝炒饭 <14...@qq.com> on 2017/06/27 15:08:20 UTC, 0 replies.
- How do I find the time taken by each step in a stage in a Spark Job - posted by SRK <sw...@gmail.com> on 2017/06/27 18:36:30 UTC, 1 replies.
- (Spark-ml) java.util.NosuchElementException: key not found exception on doing prediction and computing test error. - posted by neha nihal <ne...@gmail.com> on 2017/06/27 18:45:19 UTC, 3 replies.
- Spark Encoder with mysql Enum and data truncated Error - posted by mckunkel <m....@fz-juelich.de> on 2017/06/27 19:15:18 UTC, 0 replies.
- IDE for python - posted by Xiaomeng Wan <sh...@gmail.com> on 2017/06/27 21:16:51 UTC, 7 replies.
- Spark standalone , client mode. How do I monitor? - posted by anna stax <an...@gmail.com> on 2017/06/28 00:03:03 UTC, 1 replies.
- How to reduce the amount of data that is getting written to the checkpoint from Spark Streaming - posted by SRK <sw...@gmail.com> on 2017/06/28 00:48:24 UTC, 0 replies.
- How to make sure that Spark Kafka Direct Streaming job maintains the state upon code deployment? - posted by SRK <sw...@gmail.com> on 2017/06/28 05:49:15 UTC, 0 replies.
- Re: [PySpark]: How to store NumPy array into single DataFrame cell efficiently - posted by Nick Pentreath <ni...@gmail.com> on 2017/06/28 10:41:31 UTC, 0 replies.
- How to propagate Non-Empty Value in SPARQL Dataset - posted by carloallocca <c....@samsung.com> on 2017/06/28 13:30:53 UTC, 1 replies.
- using Apache Spark standalone on a server for a class/multiple users, db.lck does not get removed - posted by Robert Kudyba <rk...@fordham.edu> on 2017/06/28 14:55:28 UTC, 1 replies.
- Structured Streaming Questions - posted by Revin Chalil <rc...@expedia.com> on 2017/06/28 17:27:10 UTC, 1 replies.
- Building Kafka 0.10 Source for Structured Streaming Error. - posted by satyajit vegesna <sa...@gmail.com> on 2017/06/28 19:00:38 UTC, 3 replies.
- Spark Project build Issues.(Intellij) - posted by satyajit vegesna <sa...@gmail.com> on 2017/06/29 00:13:21 UTC, 2 replies.
- about broadcast join of base table in spark sql - posted by paleyl <pa...@gmail.com> on 2017/06/29 02:42:43 UTC, 3 replies.
- SparkSQL to read XML Blob data to create multiple rows - posted by "Talap, Amol" <am...@capgemini.com> on 2017/06/29 04:30:13 UTC, 6 replies.
- sqlstream for real time analytics - posted by Mich Talebzadeh <mi...@gmail.com> on 2017/06/29 06:55:57 UTC, 1 replies.
- What's the simplest way to Read Avro records from Kafka to Spark DataSet/DataFrame? - posted by kant kodali <ka...@gmail.com> on 2017/06/29 07:56:17 UTC, 1 replies.
- Understanding how spark share db connections created on driver - posted by salvador <so...@gmail.com> on 2017/06/29 09:52:45 UTC, 4 replies.
- Python Spark for full fledged ETL - posted by upkar_kohli <up...@gmail.com> on 2017/06/29 14:44:15 UTC, 4 replies.
- Spark on yarn logging - posted by John Vines <vi...@apache.org> on 2017/06/29 15:05:42 UTC, 0 replies.
- The stability of Spark Stream Kafka 010 - posted by Martin Peng <we...@gmail.com> on 2017/06/29 16:07:27 UTC, 1 replies.
- spark.pyspark.python is ignored? - posted by Jason White <ja...@shopify.com> on 2017/06/29 16:37:14 UTC, 0 replies.
- Spark querying parquet data partitioned in S3 - posted by fran <fr...@hivehome.com> on 2017/06/29 16:44:00 UTC, 1 replies.
- Interesting Stateful Streaming question - posted by kant kodali <ka...@gmail.com> on 2017/06/29 18:55:00 UTC, 2 replies.
- PySpark working with Generators - posted by saatvikshah1994 <sa...@gmail.com> on 2017/06/29 19:59:08 UTC, 6 replies.
- Spark, S3A, and 503 SlowDown / rate limit issues - posted by Everett Anderson <ev...@nuna.com.INVALID> on 2017/06/29 20:56:52 UTC, 0 replies.
- [Spark ML] LogisticRegressionWithSGD - posted by Kevin Quinn <kf...@gmail.com> on 2017/06/29 21:46:46 UTC, 1 replies.
- Re: Apache Arrow + Spark examples? - posted by Nirav Patel <np...@xactlycorp.com> on 2017/06/30 00:11:08 UTC, 0 replies.
- Re: Does spark support Apache Arrow - posted by Nirav Patel <np...@xactlycorp.com> on 2017/06/30 00:15:47 UTC, 0 replies.
- Project tungsten phase2 - SIMD and columnar in-memory storage - posted by Nirav Patel <np...@xactlycorp.com> on 2017/06/30 01:50:19 UTC, 0 replies.
- spark streaming socket read issue - posted by pradeepbill <pr...@gmail.com> on 2017/06/30 13:47:07 UTC, 1 replies.
- Withcolumn date with sysdate - posted by sudhir k <k....@gmail.com> on 2017/06/30 16:50:41 UTC, 1 replies.