Posted to users@zeppelin.apache.org by Paul Brenner <pb...@placeiq.com> on 2017/01/24 17:26:35 UTC

zeppelin memory usage

We are using Zeppelin with multiple users on the same server and often run out of memory on the dedicated VM that Zeppelin runs on, sometimes when just 3-4 users are active. Is this normal behavior? Does each user’s Spark interpreter usually consume 500 MB to 1 GB of memory on the VM?

I thought I saw reports of companies using Zeppelin with tens or even a hundred users. Is there something we are doing wrong?

The VM Zeppelin runs on currently has only 3 GB of RAM, so we can raise that number a little, but I don’t see how we could ever get to 10 simultaneous users.
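
For reference, the arithmetic behind that concern can be sketched in a few lines of Python. The 3 GB VM size and the 500 MB to 1 GB per-interpreter range come from this message. The `reserved_mb` parameter (memory set aside for the OS and the Zeppelin server itself) and its 1024 MB value are hypothetical and only there for illustration.

```python
def max_interpreters(vm_mb, per_interpreter_mb, reserved_mb=0):
    """How many per-user interpreter processes fit in the VM's RAM.

    reserved_mb is a hypothetical allowance for the OS and the
    Zeppelin server process; it is not a measured figure.
    """
    usable_mb = vm_mb - reserved_mb
    return max(usable_mb // per_interpreter_mb, 0)


def ram_needed_mb(users, per_interpreter_mb, reserved_mb=0):
    """RAM needed to give every user their own interpreter process."""
    return users * per_interpreter_mb + reserved_mb


VM_MB = 3 * 1024  # the 3 GB VM described above

# With nothing reserved: between 3 and 6 interpreters fit.
print(max_interpreters(VM_MB, 1024), max_interpreters(VM_MB, 512))

# With an assumed 1 GB reserved: between 2 and 4 interpreters fit.
print(max_interpreters(VM_MB, 1024, 1024), max_interpreters(VM_MB, 512, 1024))

# RAM for 10 simultaneous users at 1 GB each, nothing reserved.
print(ram_needed_mb(10, 1024))
```

Under these assumptions, 10 simultaneous users would need somewhere between 5 GB and 10 GB for the interpreters alone, which is consistent with running out of memory at 3-4 users on a 3 GB VM.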

http://www.placeiq.com/ http://www.placeiq.com/ http://www.placeiq.com/

Paul Brenner

https://twitter.com/placeiq https://twitter.com/placeiq https://twitter.com/placeiq
https://www.facebook.com/PlaceIQ https://www.facebook.com/PlaceIQ
https://www.linkedin.com/company/placeiq https://www.linkedin.com/company/placeiq

DATA SCIENTIST

(217) 390-3033 

 


Re: zeppelin memory usage

Posted by Paul Brenner <pb...@placeiq.com>.
It looks like one of our big problems is that Zeppelin doesn’t always kill all completed processes. Is there an accepted way to kill a Spark instance? Most of the time, executing sys.exit in a paragraph kills the Spark instance in YARN and, I believe, also the corresponding Zeppelin process, but not 100% of the time.

The same goes for restarting the interpreter: most of the time it kills the Spark instance in YARN and the corresponding Zeppelin process, but not always.

Currently we have to check YARN and use “yarn application -kill” on anything that doesn’t get cleaned up properly there. We also have to log in to the Zeppelin VM and manually hunt down old processes that are still running days after their interpreters were stopped.
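
The manual YARN check described above could be scripted roughly as follows. This is only a sketch under assumptions: it assumes the tab-separated column layout printed by `yarn application -list` (application ID first, name second, state sixth), and it assumes the leftover applications can be recognized by a name containing "Zeppelin". Both the helper names and that name filter are hypothetical, so adjust them to whatever your interpreter settings actually use.

```python
import subprocess


def parse_yarn_app_list(text, name_filter="Zeppelin"):
    """Extract (app_id, name, state) from `yarn application -list` output.

    Assumes tab-separated columns with the application ID first, the
    application name second, and the state sixth.
    """
    apps = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("application_"):
            continue  # skip the summary and header lines
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) < 6:
            continue  # not a complete application row
        app_id, name, state = fields[0], fields[1], fields[5]
        if name_filter in name:
            apps.append((app_id, name, state))
    return apps


def kill_commands(apps):
    """Build one `yarn application -kill` command per application."""
    return [["yarn", "application", "-kill", app_id] for app_id, _, _ in apps]


def kill_leftovers(dry_run=True):
    """List running applications and kill (or just print) the matches."""
    listing = subprocess.run(
        ["yarn", "application", "-list", "-appStates", "RUNNING"],
        capture_output=True, text=True, check=True,
    ).stdout
    for cmd in kill_commands(parse_yarn_app_list(listing)):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `kill_leftovers()` with the default `dry_run=True` only prints the commands, so the list can be reviewed before anything is killed. Note that the name filter cannot tell a leftover application from one that is still in use, so this does not replace checking which interpreters are actually stopped.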



Re: zeppelin memory usage

Posted by t p <ta...@gmail.com>.
Running in a VM, we’ve noticed Zeppelin consume a lot of memory and have encountered out-of-memory and GC issues with just a couple of users.

In our case, I attributed it to the use case: the Spark interpreter connected to a Postgres DB to load a few tables with a large number of rows into data frames. We registered these DFs as temp tables (createOrReplaceTempView) while the notebook ran. When we ran joins across these temp tables using Spark SQL, we ran out of memory.

While we’re on AWS, we were not in a position to run/provision scalable clusters (using YARN etc.), nor did we want to provision very large EC2 instances.

To keep memory in check, we’ve changed our approach to use JDBC+Postgres to push the predicates down to Postgres and use Spark data frames to gather the (much smaller) result sets and interface with the Zeppelin UI/pages for displaying the results. Effectively, we ended up dividing the workload: Postgres provided the right balance of memory/disk/CPU for querying, and Spark interfaced well with Zeppelin for analytics (especially across pages for the drill-down use case).

It’d be great to hear experiences from others who use Zeppelin on a single VM/host or in _non_-clustered environments.
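
A minimal PySpark sketch of the push-down idea described above, not our actual notebook code: instead of loading whole tables and joining them in Spark, a filtered subquery is passed as the JDBC `dbtable` option so that Postgres does the filtering and only the small result set reaches Spark. The helper names, the table and column names, and the connection details are all hypothetical.

```python
def pushdown_query(table, columns, predicate, alias="pushed"):
    """Wrap a filtered SELECT as a subquery usable as the JDBC `dbtable` option.

    The predicate is interpolated as-is, so only pass trusted strings.
    """
    cols = ", ".join(columns)
    return f"(SELECT {cols} FROM {table} WHERE {predicate}) AS {alias}"


def load_filtered(spark, jdbc_url, table, columns, predicate, user, password):
    """Read only the rows matching `predicate`; Postgres runs the filter."""
    return (
        spark.read.format("jdbc")
        .option("url", jdbc_url)
        .option("dbtable", pushdown_query(table, columns, predicate))
        .option("user", user)
        .option("password", password)
        .load()
    )


# Hypothetical table, columns, and filter, for illustration only.
print(pushdown_query("orders", ["id", "total"], "created_at >= '2017-01-01'"))
```

In a Zeppelin paragraph, the returned data frame could then be registered with createOrReplaceTempView as before. The difference is that the view now holds only the filtered rows rather than the full table.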


