Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/02/09 07:34:42 UTC

[GitHub] [spark] DJeyCodeX opened a new pull request #27506: Demo Pipeline of Migrating OnPremise Database to Spark & Hadoop

DJeyCodeX opened a new pull request #27506: Demo Pipeline of Migrating OnPremise Database to Spark & Hadoop
URL: https://github.com/apache/spark/pull/27506
 
 
   ### What changes were proposed in this pull request?
   This PR consists of the following:
   
   1. Many organisations face difficulties migrating their existing databases, such as MySQL, to the Hadoop & Spark ecosystem.
   2. I created a demo pipeline that covers 3 use cases:
      **Case 1: Storing the data in HDFS and then reading the HDFS part file in Spark**
      **Case 2: Converting the data to Parquet format and then reading the Parquet file in Spark**
      **Special Case: Analysing the data directly in Spark from MySQL without storing it in HDFS**
   3. Finally, after all the aggregations in Spark, a reporting dashboard is generated using Tableau.
   
   This code may help Spark users who want to perform such a migration.
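
The three use cases listed above could be sketched in PySpark roughly as follows. This is only an illustration, not the code in this PR: the function names, host, database, table, credentials, and HDFS paths are hypothetical placeholders, and it assumes the MySQL Connector/J 8.x driver is available on the Spark classpath.

```python
from typing import Dict


def mysql_jdbc_options(host: str, port: int, database: str, table: str,
                       user: str, password: str) -> Dict[str, str]:
    """Build the option map for Spark's built-in JDBC data source."""
    return {
        "url": f"jdbc:mysql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        # Driver class of MySQL Connector/J 8.x (assumed to be on the classpath).
        "driver": "com.mysql.cj.jdbc.Driver",
    }


def run_demo(spark, options: Dict[str, str], csv_path: str, parquet_path: str):
    """Run the three use cases against an existing SparkSession."""
    # Special case: read the MySQL table directly, without storing it in HDFS.
    from_mysql = spark.read.format("jdbc").options(**options).load()

    # Case 1: store the table as part files in HDFS, then read them back.
    from_mysql.write.mode("overwrite").csv(csv_path, header=True)
    from_parts = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Case 2: convert to Parquet, then read the Parquet files back.
    from_parts.write.mode("overwrite").parquet(parquet_path)
    from_parquet = spark.read.parquet(parquet_path)

    return from_mysql, from_parts, from_parquet


if __name__ == "__main__":
    # Hypothetical connection details; replace with real ones.
    opts = mysql_jdbc_options("localhost", 3306, "sales", "orders", "demo", "secret")
    print(opts["url"])
```

Given a running `SparkSession`, a call such as `run_demo(spark, opts, "hdfs:///tmp/orders_csv", "hdfs:///tmp/orders_parquet")` (paths again hypothetical) would return one DataFrame per use case, which could then be aggregated and exported for the Tableau dashboard.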

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on issue #27506: Demo Pipeline of Migrating OnPremise Database to Spark & Hadoop

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on issue #27506: Demo Pipeline of Migrating OnPremise Database to Spark & Hadoop
URL: https://github.com/apache/spark/pull/27506#issuecomment-584045216
 
 
   Please see https://spark.apache.org/contributing.html



[GitHub] [spark] gatorsmile commented on a change in pull request #27506: Demo Pipeline of Migrating OnPremise Database to Spark & Hadoop

Posted by GitBox <gi...@apache.org>.
gatorsmile commented on a change in pull request #27506: Demo Pipeline of Migrating OnPremise Database to Spark & Hadoop
URL: https://github.com/apache/spark/pull/27506#discussion_r379161875
 
 

 ##########
 File path: Migrating-MySQL-Database-to-Spark/README.md
 ##########
 @@ -0,0 +1,135 @@
+# Migrating MySQL Database to Spark
 
 Review comment:
   I would suggest writing a blog post instead of adding this to our documentation.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27506: Demo Pipeline of Migrating OnPremise Database to Spark & Hadoop

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27506: Demo Pipeline of Migrating OnPremise Database to Spark & Hadoop
URL: https://github.com/apache/spark/pull/27506#issuecomment-583814867
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] AmplabJenkins commented on issue #27506: Demo Pipeline of Migrating OnPremise Database to Spark & Hadoop

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27506: Demo Pipeline of Migrating OnPremise Database to Spark & Hadoop
URL: https://github.com/apache/spark/pull/27506#issuecomment-583814867
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] AmplabJenkins commented on issue #27506: Demo Pipeline of Migrating OnPremise Database to Spark & Hadoop

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27506: Demo Pipeline of Migrating OnPremise Database to Spark & Hadoop
URL: https://github.com/apache/spark/pull/27506#issuecomment-583814967
 
 
   Can one of the admins verify this patch?
