Posted to dev@datalab.apache.org by "Marian Hladun (Jira)" <ji...@apache.org> on 2021/09/27 13:06:00 UTC

[jira] [Commented] (DATALAB-1867) [Branch-515][Azure][RStudio]: Playbook Flight_data_Preparation_R run with error

    [ https://issues.apache.org/jira/browse/DATALAB-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17420724#comment-17420724 ] 

Marian Hladun commented on DATALAB-1867:
----------------------------------------

To run the updated Flight_data_Preparation_R-Spark.r:

 1. Run `devtools::install_github("rstudio/sparklyr")` in the RStudio console.
 2. Unarchive 2008.csv.bz2 in the bucket so that 2008.csv is available there.
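
For step 2, one possible way to unpack the archive is sketched below in base R. It assumes the archive has first been copied to a local path; moving the file to and from the bucket is not covered here, and `decompress_bz2` is a hypothetical helper name, not part of DataLab or sparklyr.

```r
# Hypothetical helper: stream-decompress a local .bz2 file to a plain file.
# Assumes `src` is a local copy of the archive (e.g. 2008.csv.bz2).
decompress_bz2 <- function(src, dest) {
  input <- bzfile(src, open = "rb")
  on.exit(close(input), add = TRUE)
  output <- file(dest, open = "wb")
  on.exit(close(output), add = TRUE)
  repeat {
    # Read up to 1 MiB of decompressed bytes at a time.
    chunk <- readBin(input, what = raw(), n = 1024 * 1024)
    if (length(chunk) == 0) break
    writeBin(chunk, output)
  }
  invisible(dest)
}

# Usage (paths are placeholders):
# decompress_bz2("2008.csv.bz2", "2008.csv")
```

The resulting 2008.csv would then still need to be placed in the bucket by whatever means your environment provides.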

To connect to the Spark cluster via Apache Livy:
 
 1. Create a Spark cluster by adding a compute to your RStudio notebook (press the gear icon -> add compute -> configure your cluster).
 2. Open the RStudio UI, choose the Connections tab in the upper-right corner, press New Connection, and choose Livy.
 3. Change the connection script to:
 library(sparklyr)
 sc <- spark_connect(master = "http://*your-cluster-master-node-ip-here*:8998",
                     version = "3.1.0",
                     method = "livy")
 4. Test the connection and connect to your cluster.
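
The connection script from step 3 can also be wrapped in small functions, which may be handy when the master node IP changes between clusters. This is only a sketch: `livy_url` and `connect_livy` are hypothetical helper names, the address in the usage lines is a placeholder, and the port and Spark version are taken from the script above.

```r
# Hypothetical helper: build the Livy endpoint URL for a master node.
# Port 8998 matches the connection script in step 3.
livy_url <- function(master_ip, port = 8998) {
  stopifnot(is.character(master_ip), length(master_ip) == 1, nzchar(master_ip))
  sprintf("http://%s:%d", master_ip, as.integer(port))
}

# Hypothetical helper: open a Livy connection with sparklyr.
# Requires the sparklyr package and a reachable cluster.
connect_livy <- function(master_ip, spark_version = "3.1.0") {
  sparklyr::spark_connect(
    master  = livy_url(master_ip),
    version = spark_version,
    method  = "livy"
  )
}

# Usage (not run here; "10.0.0.4" is a placeholder address):
# sc <- connect_livy("10.0.0.4")
# sparklyr::spark_connection_is_open(sc)  # check that the session is live
# sparklyr::spark_disconnect(sc)          # close the session when done
```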

> [Branch-515][Azure][RStudio]: Playbook Flight_data_Preparation_R run with error
> -------------------------------------------------------------------------------
>
>                 Key: DATALAB-1867
>                 URL: https://issues.apache.org/jira/browse/DATALAB-1867
>             Project: Apache DataLab
>          Issue Type: Bug
>          Components: DataLab Main
>         Environment: Branch-515
>            Reporter: Vira Vitanska
>            Assignee: Marian Hladun
>            Priority: Trivial
>              Labels: AZURE, Debian, DevOps, Known_issues(release2.4)
>             Fix For: v.2.5.1
>
>         Attachments: image-2020-09-22-13-23-03-877.png
>
>
> *Preconditions:*
> 1. RStudio is created on Azure
> *Steps to reproduce:*
> 1. Run Flight_data_Preparation_R 
> *Actual result:*
> 1. Playbook runs with error:
> Error in read.df(bucket_path("carriers.csv"), "csv", header = "true",  : 
>   could not find function "read.df"
> *Expected result:*
> 1. Playbook runs successfully
>  
>   


