Posted to user@spark.apache.org by emma davis <em...@aol.com.INVALID> on 2020/05/20 20:43:23 UTC

BOOK review of Spark: WARNING to spark users

Book: Machine Learning with Apache Spark Quick Start Guide
Publisher: Packt

Following this Getting Started with Python in VS Code
https://code.visualstudio.com/docs/python/python-tutorial
I realised Jillur Qudus has written and published a book without any knowledge of the subject matter, amongst other things Python.

Highlighted proof with further details further down the email. 

import findspark  # these lines of code are unnecessary, see link above for setup
findspark.init()

Setting SPARK_HOME or any other Spark variables is unnecessary because Spark, like any framework, is self-contained and has its own conf directory for persistent startup configuration settings. Obviously the software would find its own current directory upon starting, e.g. sbin/start-master.sh
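
Whether findspark is needed depends on how PySpark was installed: a pip- or conda-installed pyspark package is importable directly, whereas a Spark distribution unpacked under a directory such as /opt is usually not on Python's import path, which is the gap findspark.init() fills by reading SPARK_HOME. A minimal sketch of that check, using only the standard library; the function name pyspark_setup_hint and its messages are hypothetical and not part of Spark or findspark:

```python
import importlib.util
import os


def pyspark_setup_hint(env=None):
    """Return a short hint on whether findspark is likely to be needed.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    # If the pyspark package is already importable, nothing else is required.
    if importlib.util.find_spec("pyspark") is not None:
        return "pyspark is importable; findspark is not needed"
    # Otherwise findspark can use SPARK_HOME to locate the Spark distribution.
    if env.get("SPARK_HOME"):
        return "pyspark is not importable; findspark.init() can use SPARK_HOME"
    return "pyspark is not importable and SPARK_HOME is unset"


if __name__ == "__main__":
    print(pyspark_setup_hint())
```
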
Spark is a BIG DATA tool (heavily distributed, parallel processing), so clearly you would expect its hello world demo programs to demonstrate that.
What is the point of setting num_samples=100? Something like 10**10 would make sense to test performance.
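
To make the sample-size point concrete, here is a rough plain-Python sketch. It assumes the program in question is the Monte Carlo Pi estimate from the Spark examples page, which the email does not state explicitly. It deliberately does not use Spark, and estimate_pi is a hypothetical name. With 100 samples the estimate is coarse and finishes instantly; larger counts tighten the estimate and create enough work to be worth distributing.

```python
import random


def estimate_pi(num_samples, seed=0):
    """Monte Carlo estimate of pi from num_samples random points."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        # Draw a point in the unit square and test whether it falls
        # inside the quarter circle of radius 1.
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle covers pi/4 of the unit square.
    return 4.0 * inside / num_samples


if __name__ == "__main__":
    for n in (100, 100_000):
        print(n, estimate_pi(n))
```

A single Python process cannot realistically loop 10**10 times, which is the scale at which parallelising the sampling across a cluster starts to matter.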

This is my warning: do not end up wasting your valuable time as I did. I feel your time is valuable.
I realised it was a scam once I got a better understanding of the product by just doing the correct hello world program from the correct source.
“Research by CISQ found that, in 2018, poor quality software cost organizations $2.8 trillion in the US alone.”
I attribute this to the Indian IT industry claiming they can do the job better than the natives [US, Europeans], implying Indian education or IT people are superior. For example, people like me, born, living and educated in western Europe.

https://www.it-cisq.org/the-cost-of-poor-quality-software-in-the-us-a-2018-report/The-Cost-of-Poor-Quality-Software-in-the-US-2018-Report.pdf
 
Contributors: About the Author
“Jillur Qudus is a lead technical architect, polyglot software engineer and data scientist with over 10 years of hands-on experience in architecting and engineering distributed, scalable, high performance .. to combat serious organised crime. Jillur has extensive experience working with government, intelligence, law enforcement and banking, and has worked across the world including Japan, Singapore, Malaysia, Hong Kong and New Zealand .. founder of keisan, a UK-based company specializing in open source distributed technologies and machine learning…”
This obviously means a lot to many, but when I look at his work: judge for yourself based on the evidence.

Page 54
<quote> “
Additional Python Packages
> conda install -c conda-forge findspark
> conda install -c conda-forge pykafka
...” </quote>

The remainder of the program was copied from the Spark website, so that wasn't wrong.

Page 63
<quote> “
> cd etc/profile.d
> vi spark.sh
$ export SPARK_HOME=/opt/spark-2.3.2-bin-hadoop2.7
> source spark.sh

.. in order for the SPARK_HOME environment variable to be successfully recognized and registered by findspark ...
….

We are now ready to write our first Spark application in Python! …..

# (1) import required Python dependencies
import findspark
findspark.init()

(3)
….
num_samples = 100 ” </quote>
 
emma davis
emma.davis76@aol.com

Re: BOOK review of Spark: WARNING to spark users

Posted by Jacek Laskowski <ja...@japila.pl>.
Hi Emma,

I'm curious about the purpose of the email. Mind elaborating?

Regards,
Jacek Laskowski
----
https://about.me/JacekLaskowski
"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski
