Posted to commits@phoenix.apache.org by jm...@apache.org on 2015/10/18 18:26:34 UTC
svn commit: r1709291 - in /phoenix/site: publish/language/datatypes.html
publish/phoenix_spark.html source/src/site/markdown/phoenix_spark.md
Author: jmahonin
Date: Sun Oct 18 16:26:34 2015
New Revision: 1709291
URL: http://svn.apache.org/viewvc?rev=1709291&view=rev
Log:
Update phoenix-spark docs, added 'Why not JDBC?' section
Modified:
phoenix/site/publish/language/datatypes.html
phoenix/site/publish/phoenix_spark.html
phoenix/site/source/src/site/markdown/phoenix_spark.md
Modified: phoenix/site/publish/language/datatypes.html
URL: http://svn.apache.org/viewvc/phoenix/site/publish/language/datatypes.html?rev=1709291&r1=1709290&r2=1709291&view=diff
==============================================================================
--- phoenix/site/publish/language/datatypes.html (original)
+++ phoenix/site/publish/language/datatypes.html Sun Oct 18 16:26:34 2015
@@ -1,7 +1,7 @@
<!DOCTYPE html>
<!--
- Generated by Apache Maven Doxia at 2015-10-05
+ Generated by Apache Maven Doxia at 2015-10-18
Rendered using Reflow Maven Skin 1.1.0 (http://andriusvelykis.github.io/reflow-maven-skin)
-->
<html xml:lang="en" lang="en">
Modified: phoenix/site/publish/phoenix_spark.html
URL: http://svn.apache.org/viewvc/phoenix/site/publish/phoenix_spark.html?rev=1709291&r1=1709290&r2=1709291&view=diff
==============================================================================
--- phoenix/site/publish/phoenix_spark.html (original)
+++ phoenix/site/publish/phoenix_spark.html Sun Oct 18 16:26:34 2015
@@ -1,7 +1,7 @@
<!DOCTYPE html>
<!--
- Generated by Apache Maven Doxia at 2015-10-01
+ Generated by Apache Maven Doxia at 2015-10-18
Rendered using Reflow Maven Skin 1.1.0 (http://andriusvelykis.github.io/reflow-maven-skin)
-->
<html xml:lang="en" lang="en">
@@ -149,10 +149,16 @@
<h4 id="Prerequisites">Prerequisites</h4>
<ul>
<li>Phoenix 4.4.0+</li>
- <li>Spark 1.3.0+</li>
+ <li>Spark 1.3.1+</li>
</ul>
</div>
<div class="section">
+ <h4 id="Why_not_JDBC">Why not JDBC?</h4>
+ <p>Although Spark supports connecting directly to JDBC databases, it's only able to parallelize queries by partitioning on a numeric column. It also requires a known lower bound, upper bound and partition count in order to create split queries.</p>
+ <p>In contrast, the phoenix-spark integration is able to leverage the underlying splits provided by Phoenix in order to retrieve and save data across multiple workers. All that's required is a database URL and a table name. Optional SELECT columns can be given, as well as pushdown predicates for efficient filtering.</p>
+ <p>The choice of which method to use to access Phoenix comes down to each specific use case.</p>
+ </div>
+ <div class="section">
<h4 id="Spark_setup">Spark setup</h4>
<ol style="list-style-type: decimal">
<li>Ensure that all requisite Phoenix / HBase platform dependencies are available on the classpath for the Spark executors and drivers</li>
Modified: phoenix/site/source/src/site/markdown/phoenix_spark.md
URL: http://svn.apache.org/viewvc/phoenix/site/source/src/site/markdown/phoenix_spark.md?rev=1709291&r1=1709290&r2=1709291&view=diff
==============================================================================
--- phoenix/site/source/src/site/markdown/phoenix_spark.md (original)
+++ phoenix/site/source/src/site/markdown/phoenix_spark.md Sun Oct 18 16:26:34 2015
@@ -6,7 +6,21 @@ as RDDs or DataFrames, and enables persi
#### Prerequisites
* Phoenix 4.4.0+
-* Spark 1.3.0+
+* Spark 1.3.1+
+
+#### Why not JDBC?
+
+Although Spark supports connecting directly to JDBC databases, it's only able to parallelize
+queries by partitioning on a numeric column. It also requires a known lower bound, upper bound
+and partition count in order to create split queries.
+
+In contrast, the phoenix-spark integration is able to leverage the underlying splits provided by
+Phoenix in order to retrieve and save data across multiple workers. All that's required is a
+database URL and a table name. Optional SELECT columns can be given, as well as pushdown predicates
+for efficient filtering.
+
+The choice of which method to use to access Phoenix comes down to each specific use case.
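The bounds-and-partition-count requirement described above can be illustrated with a short sketch. The function below is a hypothetical, simplified reconstruction of the kind of range splitting a JDBC data source performs; the function name and the exact boundary handling are illustrative, not Spark's actual implementation:

```python
def split_predicates(column, lower, upper, num_partitions):
    """Derive WHERE-clause predicates that partition a numeric column
    into roughly equal ranges. Simplified sketch only: real JDBC
    partitioning logic differs in edge-case and remainder handling."""
    stride = (upper - lower) // num_partitions
    predicates = []
    for i in range(num_partitions):
        lo = lower + i * stride
        if i == 0:
            # First split is open-ended below to catch values under lower.
            predicates.append(f"{column} < {lo + stride}")
        elif i == num_partitions - 1:
            # Last split is open-ended above to catch values over upper.
            predicates.append(f"{column} >= {lo}")
        else:
            predicates.append(f"{column} >= {lo} AND {column} < {lo + stride}")
    return predicates

# Each predicate becomes one split query, run by a separate worker.
print(split_predicates("ID", 0, 100, 4))
```

This is exactly the metadata (lower bound, upper bound, partition count) that phoenix-spark does not need, since Phoenix already exposes its own table splits.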
+
#### Spark setup