Posted to issues@livy.apache.org by "Marcelo Vanzin (JIRA)" <ji...@apache.org> on 2017/11/29 18:50:00 UTC

[jira] [Resolved] (LIVY-417) Not able to work with dataframes on livy

     [ https://issues.apache.org/jira/browse/LIVY-417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Vanzin resolved LIVY-417.
---------------------------------
    Resolution: Not A Problem

{code}
DataFrame df = client.submit(new SparkJob(sql)).get(); // SparkJob implements livy Job
{code}

You can't do that. A {{DataFrame}} is only valid inside the process that holds the {{SparkContext}} that created it. Serializing it severs the reference to that context, so your local process cannot call methods on the data frame.

You'll have to change your app to keep the {{DataFrame}} on the Livy server side only, and submit a Livy job each time you need to interact with it, instead of shipping the actual data frame back through the Livy client.
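To make the pattern concrete, here is a minimal sketch of that approach using Livy's {{org.apache.livy}} Java API. The {{CountJob}} class, the table name, and the server URI are hypothetical; the point is that the job runs the SQL inside the Livy-managed Spark driver and returns only a small serializable result (a row count), never the {{DataFrame}} itself:

```java
import java.net.URI;

import org.apache.livy.Job;
import org.apache.livy.JobContext;
import org.apache.livy.LivyClient;
import org.apache.livy.LivyClientBuilder;
import org.apache.spark.sql.SparkSession;

// Hypothetical job: the SQL executes on the Livy-managed driver, and only a
// serializable scalar (the row count) travels back to the client process.
class CountJob implements Job<Long> {
    private final String sql;

    CountJob(String sql) { this.sql = sql; }

    @Override
    public Long call(JobContext jc) throws Exception {
        SparkSession spark = jc.sparkSession(); // valid only inside this process
        return spark.sql(sql).count();          // DataFrame never leaves the driver
    }
}

public class Example {
    public static void main(String[] args) throws Exception {
        LivyClient client = new LivyClientBuilder()
            .setURI(new URI("http://livy-server:8998")) // assumed server address
            .build();
        try {
            Long rows = client.submit(new CountJob("SELECT * FROM some_table")).get();
            System.out.println("rows = " + rows);
        } finally {
            client.stop(true);
        }
    }
}
```

Any further interaction with the data (filtering, persisting, aggregating) follows the same shape: a new {{Job}} submitted to the same session, which re-acquires the data frame on the server side.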

> Not able to work with dataframes on livy
> ----------------------------------------
>
>                 Key: LIVY-417
>                 URL: https://issues.apache.org/jira/browse/LIVY-417
>             Project: Livy
>          Issue Type: Bug
>            Reporter: Partha Pratim Ghosh
>
> I am using Livy's programmatic API. The requirement is to create multiple contexts through Livy, pull DataFrames into my application, and later persist them back. Through a job, a DataFrame can be pulled into my application. However, when persistence is required, the DataFrame is sent to another Job, which receives the DataFrame and tries to persist it; at that point the DataFrame's internals are null.
> So, what is the procedure for extracting a DataFrame from a Spark context into an application via Livy, and later persisting it back to the same Spark context through Livy? At present I am not able to find any such route.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)