Posted to users@zeppelin.apache.org by Al...@t-systems.com on 2018/01/23 14:36:55 UTC

What's the best practice limit for query results?

Hi guys

We're using Zeppelin to analyze log files (Cloudera cluster, currently Zeppelin 0.7.1), and we're finding that Zeppelin gets really slow when notebooks or queries return large datasets.


* Is there a best practice for how much data / how large a query result Zeppelin can handle?
* And is there a way to improve performance?
  o (This may actually be browser-specific?)

As an example, we'd like to be able to save the result of a simple query (select timestamp, hostname, etc.), displayed as a table, as a CSV file. This works fine as long as the result set is "small enough". Once a certain size is exceeded, it takes a very long time until the "save as" popup window appears (if it appears at all).

We see the same extremely slow behavior when large result sets are used for charts: the notebooks become unusable (too slow, and the browser becomes unresponsive).

How are you guys dealing with this?

Thanks in advance
Alex

RE: What's the best practice limit for query results?

Posted by Belousov Maksim Eduardovich <m....@tinkoff.ru>.
Hi Alexander!

There was PR 2323 [1], "[ZEPPELIN-2411] Improve Table", which added UI-grid [2].
UI-grid handles large amounts of data very well and offers nice functionality.


[1] https://github.com/apache/zeppelin/pull/2323
[2] http://ui-grid.info/


Regards,

Maksim Belousov

