Posted to user@spark.apache.org by Gaurav1809 <ga...@gmail.com> on 2017/04/17 04:55:20 UTC

Shall I use Apache Zeppelin for data analytics & visualization?

Hi All, I am looking for a data visualization (and analytics) tool. My
processing is done through Spark. There are many tools available, and
Apache Zeppelin has been suggested to me. Can anybody shed some light
on its power and capabilities when it comes to data analytics and
visualization? If there are better options, please suggest them too.
One option that was suggested to me is Kibana (from the ELK stack). Thanks.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Shall-I-use-Apache-Zeppelin-for-data-analytics-visualization-tp28604.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: Shall I use Apache Zeppelin for data analytics & visualization?

Posted by Jayant Shekhar <ja...@gmail.com>.
Hello Gaurav,

Pre-calculating the results is a good idea, as Jörn suggested: load the
results into a serving store and serve them to your customers from there.

Run the job every hour or every day, depending on your requirements.

Zeppelin (as mentioned by Ayan) would not be a good tool for this use case,
as it is more suited to interactive data exploration.

You can hand-code your Spark jobs, use SQL if it does the job, or use a
drag-and-drop tool to build the workflows for your reports and, if needed,
incorporate ML into them.

Jayant
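
A minimal sketch of the "run it every hour/day" idea with SQL, under stated
assumptions: sqlite3 from the Python standard library stands in for the real
engine (in practice the query would run as a Spark SQL job), and the table
names purchases and daily_report and the sample rows are hypothetical. The
point it illustrates is that each scheduled run recomputes one period and
replaces that period's rows, so a rerun does not duplicate results.

```python
import sqlite3

# In-memory database as a stand-in; table names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (user_id TEXT, day TEXT, amount REAL)")
conn.execute("CREATE TABLE daily_report (day TEXT, user_id TEXT, total REAL)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        ("u1", "2017-04-16", 10.0),
        ("u1", "2017-04-17", 4.0),
        ("u2", "2017-04-17", 6.0),
        ("u1", "2017-04-17", 1.0),
    ],
)
conn.commit()


def run_daily_report(conn, day):
    """Recompute the report for one day; safe to rerun for the same day."""
    with conn:  # the DELETE and INSERT below commit together
        conn.execute("DELETE FROM daily_report WHERE day = ?", (day,))
        conn.execute(
            "INSERT INTO daily_report "
            "SELECT day, user_id, SUM(amount) FROM purchases "
            "WHERE day = ? GROUP BY day, user_id",
            (day,),
        )


run_daily_report(conn, "2017-04-17")
run_daily_report(conn, "2017-04-17")  # a rerun does not duplicate rows
rows = conn.execute(
    "SELECT user_id, total FROM daily_report ORDER BY user_id"
).fetchall()
print(rows)
```

A scheduler (cron or similar) would call run_daily_report once per period;
which scheduler to use is left open here.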





Re: Shall I use Apache Zeppelin for data analytics & visualization?

Posted by ayan guha <gu...@gmail.com>.
Zeppelin is more useful for interactive data exploration. If the reports
are known beforehand, then any good reporting tool should work, such as
Tableau, Qlik, or Power BI. Zeppelin is not a good fit for this use case.

Best Regards,
Ayan Guha

Re: Shall I use Apache Zeppelin for data analytics & visualization?

Posted by Gaurav Pandya <ga...@gmail.com>.
Thanks Jorn. Yes, I will precalculate the results. Do you think Zeppelin
can work here?


Re: Shall I use Apache Zeppelin for data analytics & visualization?

Posted by Jörn Franke <jo...@gmail.com>.
Processing through Spark is fine, but I do not recommend that each user triggers a Spark query. So either you precalculate the reports in Spark, so that the reports themselves do not trigger Spark queries, or you have a database that serves the reports. For the latter case there are many commercial tools. Depending on the type of report, you can also use a custom report tool or write your own dashboard with D3.js visualizations.
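
A minimal sketch of the precompute-then-serve pattern described above, under
stated assumptions: plain Python stands in for the Spark batch job, sqlite3
stands in for the serving database, and the function names, the spend_report
table, and the sample events are all hypothetical. What it shows is the
separation: the batch step writes finished results, and the user-facing step
only does a keyed lookup.

```python
import sqlite3
from collections import defaultdict

# Hypothetical raw purchase events: (user_id, category, amount).
events = [
    ("u1", "books", 12.0),
    ("u1", "books", 8.0),
    ("u1", "games", 30.0),
    ("u2", "books", 5.0),
]


def precompute_report(rows):
    """Batch step (a Spark job in practice): total spend per user and category."""
    totals = defaultdict(float)
    for user_id, category, amount in rows:
        totals[(user_id, category)] += amount
    return [(u, c, t) for (u, c), t in sorted(totals.items())]


def load_into_store(conn, report_rows):
    """Replace the served report with the freshly computed one."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spend_report ("
        "user_id TEXT, category TEXT, total REAL, "
        "PRIMARY KEY (user_id, category))"
    )
    with conn:  # the DELETE and the INSERTs commit together
        conn.execute("DELETE FROM spend_report")
        conn.executemany(
            "INSERT INTO spend_report VALUES (?, ?, ?)", report_rows
        )


def serve_report(conn, user_id):
    """Serving step: a cheap keyed lookup that triggers no batch job."""
    cur = conn.execute(
        "SELECT category, total FROM spend_report "
        "WHERE user_id = ? ORDER BY category",
        (user_id,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
load_into_store(conn, precompute_report(events))
print(serve_report(conn, "u1"))
```

With this split, the number of website users affects only the lookup path,
not the number of batch jobs that run.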


Re: Shall I use Apache Zeppelin for data analytics & visualization?

Posted by Gaurav Pandya <ga...@gmail.com>.
Thanks for the reply, Jörn.
In my case, I am going to put the analysis on an e-commerce website, so
naturally there will be many users, and the number will keep growing as the
website gains market share. Users will not be doing any analysis themselves.
The reports will show their purchasing behaviour and patterns (machine
learning kind of results).
Please note that all processing will be done in Spark. Please share
your thoughts. Thanks again.


Re: Shall I use Apache Zeppelin for data analytics & visualization?

Posted by Jörn Franke <jo...@gmail.com>.
I think it depends heavily on your requirements. There are various tools for analyzing and visualizing data. How many concurrent users do you have? What analysis do they do? How much data is involved? Do they have to process the full data every time, or can they live with sampling, which improves performance and response time significantly?
In lambda architecture terms, you may want to think about different technologies in the serving layer.
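
To illustrate the sampling trade-off mentioned above, here is a small sketch
using only the Python standard library. The data and the estimate_mean helper
are hypothetical; in Spark the sample would typically be drawn with
DataFrame.sample before aggregating. The estimate reads 1% of the values and
is approximate, which is the price paid for the faster response.

```python
import random
import statistics


def estimate_mean(values, sample_size, seed=0):
    """Estimate the mean of `values` from a uniform random sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(values, min(sample_size, len(values)))
    return statistics.fmean(sample)


# Hypothetical order amounts standing in for a large dataset.
order_amounts = [float(i % 100) for i in range(100_000)]

exact = statistics.fmean(order_amounts)                   # reads every value
approx = estimate_mean(order_amounts, sample_size=1_000)  # reads 1% of them
print(f"exact={exact:.2f} approx={approx:.2f}")
```

Whether an approximate figure is acceptable depends on the report; totals
shown to a customer about their own purchases usually need the exact value.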
