Posted to dev@phoenix.apache.org by "Josh Mahonin (JIRA)" <ji...@apache.org> on 2017/05/02 18:24:04 UTC

[jira] [Comment Edited] (PHOENIX-3814) Unable to connect to Phoenix via Spark

    [ https://issues.apache.org/jira/browse/PHOENIX-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993446#comment-15993446 ] 

Josh Mahonin edited comment on PHOENIX-3814 at 5/2/17 6:23 PM:
---------------------------------------------------------------

Sorry for the delay in responding, was out of town and returned to a flooded house...

On the surface, that exception doesn't seem Spark-specific at all; perhaps there's some sort of mismatch between the HBase/Hadoop JARs within Spark itself. I assume the SYSTEM.MUTEX issue doesn't crop up through any other usage pattern, only through Spark? Also, since there are pre-built Spark packages for multiple Hadoop versions, which one are you using?
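
If it helps isolate things, here's a rough, untested sketch of a plain JDBC check (no Spark involved), run with the Phoenix client JAR on the classpath; the ZooKeeper quorum 'zkhost:2181' is just a placeholder. With 4.10 the first connection also creates/upgrades the SYSTEM tables, so a SYSTEM.MUTEX problem should reproduce here if it isn't actually Spark-specific:

    import java.sql.DriverManager

    object PhoenixJdbcCheck {
      def main(args: Array[String]): Unit = {
        // Phoenix's JDBC URL format is jdbc:phoenix:<zookeeper quorum>;
        // "zkhost:2181" below is only a placeholder.
        val conn = DriverManager.getConnection("jdbc:phoenix:zkhost:2181")
        try {
          // A trivial query against the catalog, just to confirm the
          // connection (and SYSTEM table creation) succeeded.
          val rs = conn.createStatement()
            .executeQuery("SELECT TABLE_NAME FROM SYSTEM.CATALOG LIMIT 1")
          while (rs.next()) println(rs.getString(1))
        } finally {
          conn.close()
        }
      }
    }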

Note that the Spark functionality has only been tested up to Spark 2.0, and they have a habit of breaking things between minor releases. If you're looking for a more stable solution, I would suggest either Spark 1.6 or 2.0 with Phoenix 4.10 (the release is binary compatible with Spark 2.0, but if you compile your own you can specify Spark 1.6 compatibility with the 'spark16' maven profile).
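
For reference, if you go the prebuilt 4.10 route with Spark 2.0, the dependency wiring in sbt would look roughly like the sketch below. The coordinates are from memory, so please double-check them, and the HBase suffix should match your cluster (you're on HBase 1.2.x):

    // build.sbt -- sketch only, verify the exact coordinates in the repo
    scalaVersion := "2.11.8"

    libraryDependencies ++= Seq(
      // Spark 2.0.x line, which the prebuilt phoenix-spark 4.10 artifact targets
      "org.apache.spark" %% "spark-sql" % "2.0.2" % "provided",
      // Phoenix 4.10 built against HBase 1.2 (adjust if your suffix differs)
      "org.apache.phoenix" % "phoenix-spark" % "4.10.0-HBase-1.2"
    )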

Re: SaveModes, PHOENIX-2745 describes a similar issue. At this point only 'Overwrite' is supported, since the DataFrame save() does a blind upsert without checking whether the data is already present. Patches to update this behaviour would be greatly appreciated.
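
Concretely, the only usage the connector expects today looks roughly like this (untested sketch; the table names and zkUrl are placeholders). Because the save is an upsert, "Overwrite" here means existing rows with the same primary key get overwritten rather than the table being truncated:

    import org.apache.spark.sql.{SaveMode, SparkSession}

    object PhoenixSaveExample {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("phoenix-save").getOrCreate()

        // Load from Phoenix; "INPUT_TABLE" and the ZK quorum are placeholders.
        val df = spark.read
          .format("org.apache.phoenix.spark")
          .option("table", "INPUT_TABLE")
          .option("zkUrl", "zkhost:2181")
          .load()

        // Only SaveMode.Overwrite is accepted by the Phoenix data source;
        // under the hood each row is UPSERTed.
        df.write
          .format("org.apache.phoenix.spark")
          .mode(SaveMode.Overwrite)
          .option("table", "OUTPUT_TABLE")
          .option("zkUrl", "zkhost:2181")
          .save()

        spark.stop()
      }
    }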








> Unable to connect to Phoenix via Spark
> --------------------------------------
>
>                 Key: PHOENIX-3814
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3814
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.10.0
>         Environment: Ubuntu 16.04.1, Apache Spark 2.1.0, Hbase 1.2.5, Phoenix 4.10.0
>            Reporter: Wajid Khattak
>
> Please see http://stackoverflow.com/questions/43640864/apache-phoenix-for-spark-not-working



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)