Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2018/10/17 04:55:10 UTC

[GitHub] spark pull request #22746: [SPARK-24499][SQL][DOC] Split the page of sql-pro...

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22746#discussion_r225780740
  
    --- Diff: docs/sql-getting-started.md ---
    @@ -0,0 +1,369 @@
    +---
    +layout: global
    +title: Getting Started
    +displayTitle: Getting Started
    +---
    +
    +* Table of contents
    +{:toc}
    +
    +## Starting Point: SparkSession
    +
    +<div class="codetabs">
    +<div data-lang="scala"  markdown="1">
    +
    +The entry point into all functionality in Spark is the [`SparkSession`](api/scala/index.html#org.apache.spark.sql.SparkSession) class. To create a basic `SparkSession`, just use `SparkSession.builder()`:
    +
    +{% include_example init_session scala/org/apache/spark/examples/sql/SparkSQLExample.scala %}
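    +For reference, a minimal sketch of the builder pattern this example demonstrates (the application name and config key below are illustrative placeholders):
    +
    +```scala
    +import org.apache.spark.sql.SparkSession
    +
    +val spark = SparkSession
    +  .builder()
    +  .appName("Spark SQL basic example")               // illustrative application name
    +  .config("spark.some.config.option", "some-value") // illustrative config key/value
    +  .getOrCreate()
    +
    +// For implicit conversions like converting RDDs to DataFrames
    +import spark.implicits._
    +```
    +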
    +</div>
    +
    +<div data-lang="java" markdown="1">
    +
    +The entry point into all functionality in Spark is the [`SparkSession`](api/java/index.html#org.apache.spark.sql.SparkSession) class. To create a basic `SparkSession`, just use `SparkSession.builder()`:
    +
    +{% include_example init_session java/org/apache/spark/examples/sql/JavaSparkSQLExample.java %}
    +</div>
    +
    +<div data-lang="python"  markdown="1">
    +
    +The entry point into all functionality in Spark is the [`SparkSession`](api/python/pyspark.sql.html#pyspark.sql.SparkSession) class. To create a basic `SparkSession`, just use `SparkSession.builder`:
    +
    +{% include_example init_session python/sql/basic.py %}
    +</div>
    +
    +<div data-lang="r"  markdown="1">
    +
    +The entry point into all functionality in Spark is the [`SparkSession`](api/R/sparkR.session.html) class. To initialize a basic `SparkSession`, just call `sparkR.session()`:
    +
    +{% include_example init_session r/RSparkSQLExample.R %}
    +
    +Note that when invoked for the first time, `sparkR.session()` initializes a global `SparkSession` singleton instance, and subsequent invocations return a reference to that same instance. This way, users only need to initialize the `SparkSession` once; SparkR functions like `read.df` can then access the global instance implicitly, and there is no need to pass the `SparkSession` instance around.
    +</div>
    +</div>
    +
    +`SparkSession` in Spark 2.0 provides built-in support for Hive features, including the ability to
    +write queries in HiveQL, to access Hive UDFs, and to read data from Hive tables.
    +To use these features, you do not need an existing Hive setup.
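    +
    +For example, a Hive-enabled session can be created with `enableHiveSupport()` (a minimal sketch; the application name is illustrative):
    +
    +```scala
    +import org.apache.spark.sql.SparkSession
    +
    +// enableHiveSupport() enables HiveQL queries, Hive UDFs, and reading Hive tables
    +val spark = SparkSession
    +  .builder()
    +  .appName("Hive support example")
    +  .enableHiveSupport()
    +  .getOrCreate()
    +
    +spark.sql("SHOW TABLES").show() // queries are issued through spark.sql(...)
    +```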
    +
    +## Creating DataFrames
    +
    +<div class="codetabs">
    +<div data-lang="scala"  markdown="1">
    +With a `SparkSession`, applications can create DataFrames from an [existing `RDD`](#interoperating-with-rdds),
    +from a Hive table, or from [Spark data sources](#data-sources).
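    +
    +For example, the following sketch creates a DataFrame from a JSON data source (the path refers to a sample file shipped with Spark distributions):
    +
    +```scala
    +val df = spark.read.json("examples/src/main/resources/people.json")
    +
    +// Displays the content of the DataFrame to stdout
    +df.show()
    +```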
    --- End diff --
    
    The link `[Spark data sources](#data-sources)` does not work after this change. Could you fix all the similar cases? Thanks!


---
