Posted to commits@hivemall.apache.org by my...@apache.org on 2018/09/06 09:47:02 UTC

incubator-hivemall git commit: [HIVEMALL-217] Resolve missing links for user manual

Repository: incubator-hivemall
Updated Branches:
  refs/heads/master e4aef6116 -> 30593b14b


[HIVEMALL-217] Resolve missing links for user manual

## What changes were proposed in this pull request?

Fix missing links and unintended redirections in the documentation.
- Resolve unintended redirects
- Use https instead of http where possible
- Add instructions for the KDD Cup 2012 evaluation code

There are still known issues that need to be fixed:
- Changes to the Kaggle documentation have removed good references for [Log Loss](https://www.kaggle.com/wiki/LogarithmicLoss) and [Metrics](https://www.kaggle.com/wiki/Metrics) on the regression evaluation page
- Due to an EMR version upgrade, [this link](https://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-ami.html) redirects to an unintended page
- Mix server related documents are outdated: tips/mixserver, tips/hadoop_tuning

## What type of PR is it?

Improvement

## What is the Jira issue?

https://issues.apache.org/jira/projects/HIVEMALL/issues/HIVEMALL-217

## How was this patch tested?

Manual tests

Author: Aki Ariga <ar...@treasure-data.com>

Closes #162 from chezou/resolve-missinglink.


Project: http://git-wip-us.apache.org/repos/asf/incubator-hivemall/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-hivemall/commit/30593b14
Tree: http://git-wip-us.apache.org/repos/asf/incubator-hivemall/tree/30593b14
Diff: http://git-wip-us.apache.org/repos/asf/incubator-hivemall/diff/30593b14

Branch: refs/heads/master
Commit: 30593b14b4006feb0c01e27321830251b7aeb703
Parents: e4aef61
Author: Aki Ariga <ar...@treasure-data.com>
Authored: Thu Sep 6 18:46:56 2018 +0900
Committer: Makoto Yui <my...@apache.org>
Committed: Thu Sep 6 18:46:56 2018 +0900

----------------------------------------------------------------------
 docs/gitbook/README.md                                    |  2 +-
 docs/gitbook/anomaly/changefinder.md                      |  2 +-
 docs/gitbook/anomaly/lof.md                               |  4 ++--
 docs/gitbook/anomaly/sst.md                               |  4 ++--
 docs/gitbook/binaryclass/a9a_dataset.md                   |  2 +-
 docs/gitbook/binaryclass/general.md                       |  2 +-
 docs/gitbook/binaryclass/kdd2010a_dataset.md              |  4 ++--
 docs/gitbook/binaryclass/kdd2010b_dataset.md              |  4 ++--
 docs/gitbook/binaryclass/news20_dataset.md                |  2 +-
 docs/gitbook/binaryclass/news20_rf.md                     |  2 +-
 docs/gitbook/binaryclass/titanic_rf.md                    |  4 ++--
 docs/gitbook/binaryclass/webspam_dataset.md               |  2 +-
 docs/gitbook/book.json                                    |  4 ++--
 docs/gitbook/clustering/plsa.md                           |  6 +++---
 docs/gitbook/eval/binary_classification_measures.md       |  4 ++--
 docs/gitbook/eval/lr_datagen.md                           |  2 +-
 docs/gitbook/eval/regression.md                           |  2 +-
 docs/gitbook/ft_engineering/polynomial.md                 |  4 ++--
 docs/gitbook/ft_engineering/scaling.md                    |  4 ++--
 docs/gitbook/ft_engineering/selection.md                  |  2 +-
 docs/gitbook/ft_engineering/tfidf.md                      |  2 +-
 docs/gitbook/geospatial/latlon.md                         |  4 ++--
 docs/gitbook/getting_started/input-format.md              | 10 +++++-----
 docs/gitbook/misc/approx.md                               |  2 +-
 docs/gitbook/misc/funcs.md                                |  2 +-
 docs/gitbook/misc/generic_funcs.md                        |  2 +-
 docs/gitbook/misc/tokenizer.md                            |  4 ++--
 docs/gitbook/misc/topk.md                                 |  4 ++--
 docs/gitbook/multiclass/iris_dataset.md                   |  2 +-
 docs/gitbook/multiclass/iris_randomforest.md              |  4 ++--
 docs/gitbook/multiclass/news20_dataset.md                 |  2 +-
 docs/gitbook/multiclass/news20_one-vs-the-rest_dataset.md |  2 +-
 docs/gitbook/recommend/movielens_cv.md                    |  4 ++--
 docs/gitbook/recommend/movielens_dataset.md               |  2 +-
 docs/gitbook/recommend/movielens_slim.md                  |  2 +-
 docs/gitbook/regression/e2006_arow.md                     |  2 +-
 docs/gitbook/regression/e2006_dataset.md                  |  2 +-
 docs/gitbook/regression/general.md                        |  2 +-
 docs/gitbook/regression/kddcup12tr2_dataset.md            |  4 ++--
 docs/gitbook/regression/kddcup12tr2_lr.md                 |  5 +++--
 docs/gitbook/regression/kddcup12tr2_lr_amplify.md         |  4 ++--
 docs/gitbook/spark/binaryclass/a9a_df.md                  |  6 +++---
 docs/gitbook/spark/binaryclass/a9a_sql.md                 |  6 +++---
 docs/gitbook/spark/getting_started/installation.md        |  4 ++--
 docs/gitbook/spark/regression/e2006_df.md                 |  6 +++---
 docs/gitbook/spark/regression/e2006_sql.md                |  7 ++++---
 docs/gitbook/supervised_learning/prediction.md            |  4 ++--
 docs/gitbook/tips/emr.md                                  |  8 ++++----
 docs/gitbook/tips/hadoop_tuning.md                        |  6 +++---
 docs/gitbook/tips/mixserver.md                            |  2 +-
 docs/gitbook/tips/rand_amplify.md                         |  4 ++--
 docs/gitbook/tips/rt_prediction.md                        |  2 +-
 docs/gitbook/troubleshooting/mapjoin_classcastex.md       |  2 +-
 docs/gitbook/troubleshooting/oom.md                       |  2 +-
 54 files changed, 95 insertions(+), 93 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/README.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/README.md b/docs/gitbook/README.md
index 164ef74..107c4df 100644
--- a/docs/gitbook/README.md
+++ b/docs/gitbook/README.md
@@ -29,7 +29,7 @@ Apache Hivemall offers a variety of functionalities: <strong>regression, classif
 
 ## Architecture
 
-Apache Hivemall is mainly designed to run on [Apache Hive](https://hive.apache.org/) but it also supports [Apache Pig](https://pig.apache.org/) and [Apache Spark](http://spark.apache.org/) for the runtime.
+Apache Hivemall is mainly designed to run on [Apache Hive](https://hive.apache.org/) but it also supports [Apache Pig](https://pig.apache.org/) and [Apache Spark](https://spark.apache.org/) for the runtime.
 Thus, it can be considered as a cross platform library for machine learning; prediction models built by a batch query of Apache Hive can be used on Apache Spark/Pig, and conversely, prediction models build by Apache Spark can be used from Apache Hive/Pig.
 
 <div style="text-align:center"><img src="./resources/images/techstack.png" width="80%" height="80%"/></div>

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/anomaly/changefinder.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/anomaly/changefinder.md b/docs/gitbook/anomaly/changefinder.md
index 7157e5d..6f2da22 100644
--- a/docs/gitbook/anomaly/changefinder.md
+++ b/docs/gitbook/anomaly/changefinder.md
@@ -21,7 +21,7 @@ In a context of anomaly detection, there are two types of anomalies, ***outlier*
 
 In some cases, we might want to detect outlier and change-point simultaneously in order to figure out characteristics of a time series both in a local and global scale. **ChangeFinder** is an anomaly detection technique which enables us to detect both of outliers and change-points in a single framework. A key reference for the technique is:
 
-* K. Yamanishi and J. Takeuchi. [A Unifying Framework for Detecting Outliers and Change Points from Non-Stationary Time Series Data](http://dl.acm.org/citation.cfm?id=775148). KDD'02.
+* K. Yamanishi and J. Takeuchi. [A Unifying Framework for Detecting Outliers and Change Points from Non-Stationary Time Series Data](https://dl.acm.org/citation.cfm?id=775148). KDD'02.
 
 <!-- toc -->
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/anomaly/lof.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/anomaly/lof.md b/docs/gitbook/anomaly/lof.md
index c2d396b..2631cb1 100644
--- a/docs/gitbook/anomaly/lof.md
+++ b/docs/gitbook/anomaly/lof.md
@@ -17,7 +17,7 @@
   under the License.
 -->
         
-This article introduces how to find outliers using [Local Outlier Detection (LOF)](http://en.wikipedia.org/wiki/Local_outlier_factor) on Hivemall.
+This article introduces how to find outliers using [Local Outlier Detection (LOF)](https://en.wikipedia.org/wiki/Local_outlier_factor) on Hivemall.
 
 <!-- toc -->
 
@@ -38,7 +38,7 @@ ROW FORMAT DELIMITED
 STORED AS TEXTFILE LOCATION '/dataset/lof/hundred_balls';
 ```
 
-Download [hundred_balls.txt](https://gist.githubusercontent.com/myui/f8b44ab925bc198e6d11b18fdd21269d/raw/bed05f811e4c351ed959e0159405690f2f11e577/hundred_balls.txt) that is originally provides in [this article](http://next.rikunabi.com/tech/docs/ct_s03600.jsp?p=002259).
+Download [hundred_balls.txt](https://gist.githubusercontent.com/myui/f8b44ab925bc198e6d11b18fdd21269d/raw/bed05f811e4c351ed959e0159405690f2f11e577/hundred_balls.txt) that is originally provides in [this article](https://next.rikunabi.com/tech/docs/ct_s03600.jsp?p=002259).
 
 In this example, Rowid `87` is apparently an outlier.
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/anomaly/sst.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/anomaly/sst.md b/docs/gitbook/anomaly/sst.md
index 6fc49af..2494e12 100644
--- a/docs/gitbook/anomaly/sst.md
+++ b/docs/gitbook/anomaly/sst.md
@@ -19,8 +19,8 @@
         
 This page introduces how to find change-points using **Singular Spectrum Transformation** (SST) on Hivemall. The following papers describe the details of this technique:
 
-* T. Idé and K. Inoue. [Knowledge Discovery from Heterogeneous Dynamic Systems using Change-Point Correlations](http://epubs.siam.org/doi/abs/10.1137/1.9781611972757.63). SDM'05.
-* T. Idé and K. Tsuda. [Change-Point Detection using Krylov Subspace Learning](http://epubs.siam.org/doi/abs/10.1137/1.9781611972771.54). SDM'07.
+* T. Idé and K. Inoue. [Knowledge Discovery from Heterogeneous Dynamic Systems using Change-Point Correlations](https://epubs.siam.org/doi/abs/10.1137/1.9781611972757.63). SDM'05.
+* T. Idé and K. Tsuda. [Change-Point Detection using Krylov Subspace Learning](https://epubs.siam.org/doi/abs/10.1137/1.9781611972771.54). SDM'07.
 
 <!-- toc -->
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/binaryclass/a9a_dataset.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/binaryclass/a9a_dataset.md b/docs/gitbook/binaryclass/a9a_dataset.md
index 26b1700..6af9252 100644
--- a/docs/gitbook/binaryclass/a9a_dataset.md
+++ b/docs/gitbook/binaryclass/a9a_dataset.md
@@ -19,7 +19,7 @@
         
 a9a
 ===
-http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#a9a
+https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#a9a
 
 ---
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/binaryclass/general.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/binaryclass/general.md b/docs/gitbook/binaryclass/general.md
index 60483a4..a14130c 100644
--- a/docs/gitbook/binaryclass/general.md
+++ b/docs/gitbook/binaryclass/general.md
@@ -19,7 +19,7 @@
 
 Hivemall has a generic function for classification: `train_classifier`. Compared to the other functions we will see in the later chapters, `train_classifier` provides simpler and configureable generic interface which can be utilized to build binary classification models in a variety of settings.
 
-Here, we briefly introduce usage of the function. Before trying sample queries, you first need to prepare [a9a data](http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#a9a). See [our a9a tutorial page](a9a_dataset.md) for further instructions.
+Here, we briefly introduce usage of the function. Before trying sample queries, you first need to prepare [a9a data](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#a9a). See [our a9a tutorial page](a9a_dataset.md) for further instructions.
 
 <!-- toc -->
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/binaryclass/kdd2010a_dataset.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/binaryclass/kdd2010a_dataset.md b/docs/gitbook/binaryclass/kdd2010a_dataset.md
index d1a346c..b5dcadf 100644
--- a/docs/gitbook/binaryclass/kdd2010a_dataset.md
+++ b/docs/gitbook/binaryclass/kdd2010a_dataset.md
@@ -17,7 +17,7 @@
   under the License.
 -->
         
-[http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#kdd2010 (algebra)](http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#kdd2010 (algebra))
+[https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#kdd2010 (algebra)](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#kdd2010 (algebra))
 
 * the number of classes: 2
 * the number of data: 8,407,752 (training) / 510,302 (testing)
@@ -50,7 +50,7 @@ STORED AS TEXTFILE LOCATION '/dataset/kdd10a/test';
 ```
 
 # Putting data into HDFS
-[conv.awk](https://raw.githubusercontent.com/myui/hivemall/master/scripts/misc/conv.awk)
+[conv.awk](https://raw.githubusercontent.com/apache/incubator-hivemall/master/resources/misc/conv.awk)
 ```sh
 awk -f conv.awk kdda | hadoop fs -put - /dataset/kdd10a/train/kdda
 awk -f conv.awk kdda.t | hadoop fs -put - /dataset/kdd10a/test/kdda.t

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/binaryclass/kdd2010b_dataset.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/binaryclass/kdd2010b_dataset.md b/docs/gitbook/binaryclass/kdd2010b_dataset.md
index 1a1c3ce..6ad71af 100644
--- a/docs/gitbook/binaryclass/kdd2010b_dataset.md
+++ b/docs/gitbook/binaryclass/kdd2010b_dataset.md
@@ -17,7 +17,7 @@
   under the License.
 -->
         
-[http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#kdd2010 (bridge to algebra)](http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#kdd2010 (bridge to algebra))
+[https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#kdd2010 (bridge to algebra)](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#kdd2010 (bridge to algebra))
 
 * the number of classes: 2
 * the number of examples: 19,264,097 (training) / 748,401 (testing)
@@ -50,7 +50,7 @@ STORED AS TEXTFILE LOCATION '/dataset/kdd10b/test';
 ```
 
 # Putting data into HDFS
-[conv.awk](https://raw.githubusercontent.com/myui/hivemall/master/scripts/misc/conv.awk)
+[conv.awk](https://raw.githubusercontent.com/apache/incubator-hivemall/master/resources/misc/conv.awk)
 ```sh
 awk -f conv.awk kddb | hadoop fs -put - /dataset/kdd10b/train/kddb
 awk -f conv.awk kddb.t | hadoop fs -put - /dataset/kdd10b/test/kddb.t

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/binaryclass/news20_dataset.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/binaryclass/news20_dataset.md b/docs/gitbook/binaryclass/news20_dataset.md
index d50452d..2edd3f7 100644
--- a/docs/gitbook/binaryclass/news20_dataset.md
+++ b/docs/gitbook/binaryclass/news20_dataset.md
@@ -18,7 +18,7 @@
 -->
         
 Get the news20b dataset.
-http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#news20.binary
+https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#news20.binary
 
 ```sh
 cat <<EOF > conv.awk

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/binaryclass/news20_rf.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/binaryclass/news20_rf.md b/docs/gitbook/binaryclass/news20_rf.md
index 065c736..659536a 100644
--- a/docs/gitbook/binaryclass/news20_rf.md
+++ b/docs/gitbook/binaryclass/news20_rf.md
@@ -21,7 +21,7 @@ Hivemall Random Forest supports libsvm-like sparse inputs.
 
 > #### Note
 > This feature, i.e., Sparse input support in Random Forest, is supported since Hivemall v0.5.0 or later._
-> [`feature_hashing`](http://hivemall.incubator.apache.org/userguide/ft_engineering/hashing.html#featurehashing-function) function is useful to prepare feature vectors for Random Forest.
+> [`feature_hashing`](https://hivemall.incubator.apache.org/userguide/ft_engineering/hashing.html#featurehashing-function) function is useful to prepare feature vectors for Random Forest.
 
 <!-- toc -->
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/binaryclass/titanic_rf.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/binaryclass/titanic_rf.md b/docs/gitbook/binaryclass/titanic_rf.md
index 9e06094..89691b7 100644
--- a/docs/gitbook/binaryclass/titanic_rf.md
+++ b/docs/gitbook/binaryclass/titanic_rf.md
@@ -253,7 +253,7 @@ Accuracy would gives `0.76555` for a Kaggle submission.
 > `tree_export` feature is supported from Hivemall v0.5.0 or later.
 > Better to limit tree depth on training by `-depth` option to plot a Decision Tree.
 
-Hivemall provide `tree_export` to export a decision tree into [Graphviz](http://www.graphviz.org/) or human-readable Javascript format. You can find the usage by issuing the following query:
+Hivemall provide `tree_export` to export a decision tree into [Graphviz](https://www.graphviz.org/) or human-readable Javascript format. You can find the usage by issuing the following query:
 
 ```
 > select tree_export("","-help");
@@ -283,7 +283,7 @@ from
 ;
 ```
 
-[Here is an example](https://gist.github.com/myui/a83ba3795bad9b278cf8bcc59f946e2c#file-titanic-dot) plotting a decision tree using Graphviz or [Vis.js](http://viz-js.com/).
+[Here is an example](https://gist.github.com/myui/a83ba3795bad9b278cf8bcc59f946e2c#file-titanic-dot) plotting a decision tree using Graphviz or [Vis.js](https://viz-js.com/).
 
 ---
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/binaryclass/webspam_dataset.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/binaryclass/webspam_dataset.md b/docs/gitbook/binaryclass/webspam_dataset.md
index fe00111..a0777c7 100644
--- a/docs/gitbook/binaryclass/webspam_dataset.md
+++ b/docs/gitbook/binaryclass/webspam_dataset.md
@@ -18,7 +18,7 @@
 -->
         
 Get the dataset from 
-http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#webspam
+https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#webspam
 
 # Putting data on HDFS
 ```sql

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/book.json
----------------------------------------------------------------------
diff --git a/docs/gitbook/book.json b/docs/gitbook/book.json
index 9a385bf..a12cd93 100644
--- a/docs/gitbook/book.json
+++ b/docs/gitbook/book.json
@@ -36,7 +36,7 @@
             "url": "https://github.com/apache/incubator-hivemall/"
         },
         "sitemap": {
-            "hostname": "http://hivemall.incubator.apache.org/"
+            "hostname": "https://hivemall.incubator.apache.org/"
         },
         "etoc": {
           "mindepth": 1,
@@ -59,7 +59,7 @@
     },
     "links": {
       "sidebar": {
-        "<i class=\"fa fa-home\"></i> Home": "http://hivemall.incubator.apache.org/"
+        "<i class=\"fa fa-home\"></i> Home": "https://hivemall.incubator.apache.org/"
       }
     }
 }

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/clustering/plsa.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/clustering/plsa.md b/docs/gitbook/clustering/plsa.md
index cfdb0ec..6fc3bef 100644
--- a/docs/gitbook/clustering/plsa.md
+++ b/docs/gitbook/clustering/plsa.md
@@ -19,12 +19,12 @@
 
 As described in [our user guide for Latent Dirichlet Allocation (LDA)](lda.md), Hivemall enables you to apply clustering for your data based on a topic modeling technique. While LDA is one of the most popular techniques, there is another approach named **Probabilistic Latent Semantic Analysis** (pLSA). In fact, pLSA is the predecessor of LDA, but it has an advantage in terms of running time.
 
-- T. Hofmann. [Probabilistic Latent Semantic Indexing](http://dl.acm.org/citation.cfm?id=312649). SIGIR 1999, pp. 50-57.
-- T. Hofmann. [Probabilistic Latent Semantic Analysis](http://www.iro.umontreal.ca/~nie/IFT6255/Hofmann-UAI99.pdf). UAI 1999, pp. 289-296.
+- T. Hofmann. [Probabilistic Latent Semantic Indexing](https://dl.acm.org/citation.cfm?id=312649). SIGIR 1999, pp. 50-57.
+- T. Hofmann. [Probabilistic Latent Semantic Analysis](https://www.iro.umontreal.ca/~nie/IFT6255/Hofmann-UAI99.pdf). UAI 1999, pp. 289-296.
 
 In order to efficiently handle large-scale data, our pLSA implementation is based on the following incremental variant of the original pLSA algorithm:
 
-- H. Wu, et al. [Incremental Probabilistic Latent Semantic Analysis for Automatic Question Recommendation](http://dl.acm.org/citation.cfm?id=1454026). RecSys 2008, pp. 99-106.
+- H. Wu, et al. [Incremental Probabilistic Latent Semantic Analysis for Automatic Question Recommendation](https://dl.acm.org/citation.cfm?id=1454026). RecSys 2008, pp. 99-106.
 
 <!-- toc -->
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/eval/binary_classification_measures.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/eval/binary_classification_measures.md b/docs/gitbook/eval/binary_classification_measures.md
index ddb7bff..0150058 100644
--- a/docs/gitbook/eval/binary_classification_measures.md
+++ b/docs/gitbook/eval/binary_classification_measures.md
@@ -131,7 +131,7 @@ Hivemall's `fmeasure` function provides the option which can switch `micro`(defa
 
 You can learn more about this from the following external resource:
 
-- [scikit-learn's F1-score](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html)
+- [scikit-learn's F1-score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html)
 
 
 ### Micro average
@@ -220,4 +220,4 @@ select fmeasure(truth, predicted, '-beta 2. -average binary') from data;
 
 You can learn more about this from the following external resource:
 
-- [scikit-learn's FMeasure](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.fbeta_score.html)
+- [scikit-learn's FMeasure](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.fbeta_score.html)

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/eval/lr_datagen.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/eval/lr_datagen.md b/docs/gitbook/eval/lr_datagen.md
index a48dad7..4d4c7b0 100644
--- a/docs/gitbook/eval/lr_datagen.md
+++ b/docs/gitbook/eval/lr_datagen.md
@@ -21,7 +21,7 @@
 
 # create a dual table
 
-Create a [dual table](http://en.wikipedia.org/wiki/DUAL_table) as follows:
+Create a [dual table](https://en.wikipedia.org/wiki/DUAL_table) as follows:
 ```sql
 CREATE TABLE dual (
   dummy int

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/eval/regression.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/eval/regression.md b/docs/gitbook/eval/regression.md
index 9a7345e..d8db64b 100644
--- a/docs/gitbook/eval/regression.md
+++ b/docs/gitbook/eval/regression.md
@@ -73,5 +73,5 @@ from t;
 
 # References
 
-* R2 http://en.wikipedia.org/wiki/Coefficient_of_determination
+* R2 https://en.wikipedia.org/wiki/Coefficient_of_determination
 * Evaluation Metrics https://www.kaggle.com/wiki/Metrics

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/ft_engineering/polynomial.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/ft_engineering/polynomial.md b/docs/gitbook/ft_engineering/polynomial.md
index 4a2cde4..e3102d2 100644
--- a/docs/gitbook/ft_engineering/polynomial.md
+++ b/docs/gitbook/ft_engineering/polynomial.md
@@ -19,7 +19,7 @@
 
 <!-- toc -->
 
-[Polynomial features](http://en.wikipedia.org/wiki/Polynomial_kernel) allows you to do [non-linear regression](https://class.coursera.org/ml-005/lecture/23)/classification with a linear model.
+[Polynomial features](https://en.wikipedia.org/wiki/Polynomial_kernel) allows you to do [non-linear regression](https://class.coursera.org/ml-005/lecture/23)/classification with a linear model.
 
 > #### Caution
 >
@@ -27,7 +27,7 @@
 
 # Polynomial Features
 
-As [a similar to one in Scikit-Learn](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html), `polynomial_feature(array<String> features, int degree [, boolean interactionOnly=false, boolean truncate=true])` is a function to generate polynomial and interaction features.
+As [a similar to one in Scikit-Learn](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html), `polynomial_feature(array<String> features, int degree [, boolean interactionOnly=false, boolean truncate=true])` is a function to generate polynomial and interaction features.
 
 ```sql
 select polynomial_features(array("a:0.5","b:0.2"), 2);

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/ft_engineering/scaling.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/ft_engineering/scaling.md b/docs/gitbook/ft_engineering/scaling.md
index ff3ccef..00288e8 100644
--- a/docs/gitbook/ft_engineering/scaling.md
+++ b/docs/gitbook/ft_engineering/scaling.md
@@ -36,7 +36,7 @@ select l2_normalize(array('apple:1.0', 'banana:0.5'))
 > ["apple:0.8944272","banana:0.4472136"]
 
 # Min-Max Normalization
-http://en.wikipedia.org/wiki/Feature_scaling#Rescaling
+https://en.wikipedia.org/wiki/Feature_scaling#Rescaling
 ```sql
 select min(target), max(target)
 from (
@@ -63,7 +63,7 @@ from
 ```
 
 # Feature scaling by zscore
-http://en.wikipedia.org/wiki/Standard_score
+https://en.wikipedia.org/wiki/Standard_score
 
 ```sql
 select avg(target), stddev_pop(target)

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/ft_engineering/selection.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/ft_engineering/selection.md b/docs/gitbook/ft_engineering/selection.md
index b19ba56..e90d36c 100644
--- a/docs/gitbook/ft_engineering/selection.md
+++ b/docs/gitbook/ft_engineering/selection.md
@@ -28,7 +28,7 @@ It is a useful technique to 1) improve prediction results by omitting redundant
 # Supported Feature Selection algorithms
 
 * Chi-square (Chi2)
-    * In statistics, the $$\chi^2$$ test is applied to test the independence of two even events. Chi-square statistics between every feature variable and the target variable can be applied to Feature Selection. Refer [this article](http://nlp.stanford.edu/IR-book/html/htmledition/feature-selectionchi2-feature-selection-1.html) for Mathematical details.
+    * In statistics, the $$\chi^2$$ test is applied to test the independence of two even events. Chi-square statistics between every feature variable and the target variable can be applied to Feature Selection. Refer [this article](https://nlp.stanford.edu/IR-book/html/htmledition/feature-selectionchi2-feature-selection-1.html) for Mathematical details.
 * Signal Noise Ratio (SNR)
     * The Signal Noise Ratio (SNR) is a univariate feature ranking metric, which can be used as a feature selection criterion for binary classification problems. SNR is defined as $$|\mu_{1} - \mu_{2}| / (\sigma_{1} + \sigma_{2})$$, where $$\mu_{k}$$ is the mean value of the variable in classes $$k$$, and $$\sigma_{k}$$ is the standard deviations of the variable in classes $$k$$. Clearly, features with larger SNR are useful for classification.
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/ft_engineering/tfidf.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/ft_engineering/tfidf.md b/docs/gitbook/ft_engineering/tfidf.md
index 4bcaae7..0eb2e29 100644
--- a/docs/gitbook/ft_engineering/tfidf.md
+++ b/docs/gitbook/ft_engineering/tfidf.md
@@ -17,7 +17,7 @@
   under the License.
 -->
 
-This document explains how to compute [TF-IDF](http://en.wikipedia.org/wiki/Tf%E2%80%93idf) with Apache Hive/Hivemall.
+This document explains how to compute [TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) with Apache Hive/Hivemall.
 
 What you need to compute TF-IDF is a table/view composing (docid, word) pair, 2 views, and 1 query.
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/geospatial/latlon.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/geospatial/latlon.md b/docs/gitbook/geospatial/latlon.md
index 4d6ec06..2e25f3e 100644
--- a/docs/gitbook/geospatial/latlon.md
+++ b/docs/gitbook/geospatial/latlon.md
@@ -48,7 +48,7 @@ y &=
 \end{aligned}
 {% endmath %}
 
-Refer [this page](http://wiki.openstreetmap.org/wiki/Slippy_map_tilenames) for detail. Zoom level is well described in [this page](http://wiki.openstreetmap.org/wiki/Zoom_levels).
+Refer [this page](https://wiki.openstreetmap.org/wiki/Slippy_map_tilenames) for detail. Zoom level is well described in [this page](https://wiki.openstreetmap.org/wiki/Zoom_levels).
 
 ### Usage
 
@@ -80,7 +80,7 @@ from
 
 # Distance function
 
-`haversine_distance(double lat1, double lon1, double lat2, double lon2, [const boolean mile=false])` returns [Haversine distance](http://www.movable-type.co.uk/scripts/latlong.html) between given two Geo locations.
+`haversine_distance(double lat1, double lon1, double lat2, double lon2, [const boolean mile=false])` returns [Haversine distance](https://www.movable-type.co.uk/scripts/latlong.html) between given two Geo locations.
 
 ```sql
 -- Tokyo (lat: 35.6833, lon: 139.7667)

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/getting_started/input-format.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/getting_started/input-format.md b/docs/gitbook/getting_started/input-format.md
index a01b5e3..4ff3a43 100644
--- a/docs/gitbook/getting_started/input-format.md
+++ b/docs/gitbook/getting_started/input-format.md
@@ -18,13 +18,13 @@
 -->
         
 This page explains the input format of training data in Hivemall. 
-Here, we use [EBNF](http://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_Form)-like notation for describing the format.
+Here, we use [EBNF](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_Form)-like notation for describing the format.
 
 <!-- toc -->
 
 # Input Format for Classification 
 
-The classifiers of Hivemall takes 2 (or 3) arguments: *features*, *label*, and *options* (a.k.a. [hyperparameters](http://en.wikipedia.org/wiki/Hyperparameter)). The first two arguments of training functions represents training examples. 
+The classifiers of Hivemall takes 2 (or 3) arguments: *features*, *label*, and *options* (a.k.a. [hyperparameters](https://en.wikipedia.org/wiki/Hyperparameter)). The first two arguments of training functions represents training examples. 
 
 In Statistics, *features* and *label* are called [Explanatory variable and Response Variable](http://www.oswego.edu/~srp/stats/variable_types.htm), respectively.
 
@@ -33,7 +33,7 @@ In Statistics, *features* and *label* are called [Explanatory variable and Respo
 The format of *features* is common between (binary and multi-class) classification and regression.
 Hivemall accepts `ARRAY&lt;INT|BIGINT|TEXT>` for the type of *features* column.
 
-Hivemall uses a *sparse* data format (cf. [Compressed Row Storage](http://netlib.org/linalg/html_templates/node91.html)) which is similar to [LIBSVM](http://stackoverflow.com/questions/12112558/read-write-data-in-libsvm-format) and [Vowpal Wabbit](https://github.com/JohnLangford/vowpal_wabbit/wiki/Input-format).
+Hivemall uses a *sparse* data format (cf. [Compressed Row Storage](https://netlib.org/linalg/html_templates/node91.html)) which is similar to [LIBSVM](https://stackoverflow.com/questions/12112558/read-write-data-in-libsvm-format) and [Vowpal Wabbit](https://github.com/JohnLangford/vowpal_wabbit/wiki/Input-format).
 
 The format of each feature in an array is as follows:
 ```
@@ -84,11 +84,11 @@ The [add_bias](../tips/addbias.html) function is Hivemall appends "0:1.0" as an
 
 ## Feature hashing
 
-Hivemall supports [feature hashing/hashing trick](http://en.wikipedia.org/wiki/Feature_hashing) through [mhash function](../ft_engineering/hashing.html#mhash-function).
+Hivemall supports [feature hashing/hashing trick](https://en.wikipedia.org/wiki/Feature_hashing) through [mhash function](../ft_engineering/hashing.html#mhash-function).
 
 The mhash function takes a feature (i.e., *index*) of TEXT format and generates a hash number of a range from 1 to 2^24 (=16777216) by the default setting.
 
-Feature hashing is useful where the dimension of feature vector (i.e., the number of elements in *features*) is so large. Consider applying [mhash function]((../ft_engineering/hashing.html#mhash-function)) when a prediction model does not fit in memory and OutOfMemory exception happens.
+Feature hashing is useful where the dimension of the feature vector (i.e., the number of elements in *features*) is very large. Consider applying the [mhash function](../ft_engineering/hashing.html#mhash-function) when a prediction model does not fit in memory and an OutOfMemory exception happens.
 
 In general, you don't need to use mhash when the dimension of feature vector is less than 16777216.
 If feature *index* is very long TEXT (e.g., "xxxxxxx-yyyyyy-weight:55.3") and uses huge memory spaces, consider using mhash as follows:
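
The hashing trick itself can be sketched outside Hive. Note that CRC32 below is only a dependency-free stand-in for Hivemall's MurmurHash3, so the actual index values produced by `mhash` will differ:

```python
import zlib

def mhash_sketch(feature, num_features=2 ** 24):
    # Map an arbitrary feature string into [1, num_features].
    # Hivemall's mhash is based on MurmurHash3; CRC32 is only a stand-in
    # here to keep the sketch dependency-free.
    return zlib.crc32(feature.encode("utf-8")) % num_features + 1

print(mhash_sketch("xxxxxxx-yyyyyy-weight"))
```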

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/misc/approx.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/misc/approx.md b/docs/gitbook/misc/approx.md
index 1451151..8eeb268 100644
--- a/docs/gitbook/misc/approx.md
+++ b/docs/gitbook/misc/approx.md
@@ -63,7 +63,7 @@ from
 
 > #### Note
 >
-> `p` controls expected precision and memory consumption tradeoff and `default p=15` generally works well. Find More information on [this paper](https://research.google.com/pubs/pub40671.html).
+> `p` controls the tradeoff between expected precision and memory consumption; the default `p=15` generally works well. Find more information in [this paper](https://ai.google/research/pubs/pub40671).
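
As a rough back-of-the-envelope for this tradeoff (assuming 6-bit packed registers and the classic HyperLogLog error bound; the actual `approx_count_distinct` implementation may differ):

```python
import math

def hll_tradeoff(p):
    # Estimate relative error and register memory for HyperLogLog
    # with m = 2^p registers (6 bits each, packed).
    m = 2 ** p
    rel_error = 1.04 / math.sqrt(m)  # classic HLL standard error
    memory_bytes = m * 6 // 8
    return rel_error, memory_bytes

err, mem = hll_tradeoff(15)
print(err, mem)  # about 0.57% expected error using 24 KB of registers
```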
 
 ## Function Signature
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/misc/funcs.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/misc/funcs.md b/docs/gitbook/misc/funcs.md
index 3449419..c80128b 100644
--- a/docs/gitbook/misc/funcs.md
+++ b/docs/gitbook/misc/funcs.md
@@ -310,7 +310,7 @@ This page describes a list of Hivemall functions. See also a [list of generic Hi
 
 - `tile(double lat, double lon, int zoom)`::bigint - Returns a tile number 2^2n where n is zoom level. _FUNC_(lat,lon,zoom) = xtile(lon,zoom) + ytile(lat,zoom) * 2^zoom
   ```
-  refer http://wiki.openstreetmap.org/wiki/Slippy_map_tilenames for detail
+  refer https://wiki.openstreetmap.org/wiki/Slippy_map_tilenames for detail
   ```
 
 - `tilex2lon(int x, int zoom)`::double - Returns longitude of the given tile x and zoom level

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/misc/generic_funcs.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/misc/generic_funcs.md b/docs/gitbook/misc/generic_funcs.md
index dc8f41e..f2629d9 100644
--- a/docs/gitbook/misc/generic_funcs.md
+++ b/docs/gitbook/misc/generic_funcs.md
@@ -591,7 +591,7 @@ This page describes a list of useful Hivemall generic functions. See also a [lis
 
 - `each_top_k(int K, Object group, double cmpKey, *)` - Returns top-K values (or tail-K values when k is less than 0)
 
-- `generate_series(const int|bigint start, const int|bigint end)` - Generate a series of values, from start to end. A similar function to PostgreSQL's [generate_serics](http://www.postgresql.org/docs/current/static/functions-srf.html)
+- `generate_series(const int|bigint start, const int|bigint end)` - Generate a series of values, from start to end. A similar function to PostgreSQL's [generate_series](https://www.postgresql.org/docs/current/static/functions-srf.html)
   ```sql
   SELECT generate_series(2,4);
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/misc/tokenizer.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/misc/tokenizer.md b/docs/gitbook/misc/tokenizer.md
index c7eb6b9..016830c 100644
--- a/docs/gitbook/misc/tokenizer.md
+++ b/docs/gitbook/misc/tokenizer.md
@@ -90,7 +90,7 @@ For detailed APIs, please refer Javadoc of [JapaneseAnalyzer](https://lucene.apa
 
 ## Chinese Tokenizer
 
-Chinese text tokenizer UDF uses [SmartChineseAnalyzer](http://lucene.apache.org/core/5_3_1/analyzers-smartcn/org/apache/lucene/analysis/cn/smart/SmartChineseAnalyzer.html). 
+Chinese text tokenizer UDF uses [SmartChineseAnalyzer](https://lucene.apache.org/core/5_3_1/analyzers-smartcn/org/apache/lucene/analysis/cn/smart/SmartChineseAnalyzer.html). 
 
 The signature of the UDF is as follows:
 ```sql
@@ -103,4 +103,4 @@ select tokenize_cn("Smartcn为Apache2.0协议的开源中文分词系统,Java
 ```
 > [smartcn, 为, apach, 2, 0, 协议, 的, 开源, 中文, 分词, 系统, java, 语言, 编写, 修改, 的, 中科院, 计算, 所, ictcla, 分词, 系统]
 
-For detailed APIs, please refer Javadoc of [SmartChineseAnalyzer](http://lucene.apache.org/core/5_3_1/analyzers-smartcn/org/apache/lucene/analysis/cn/smart/SmartChineseAnalyzer.html) as well.
+For detailed APIs, please refer to the Javadoc of [SmartChineseAnalyzer](https://lucene.apache.org/core/5_3_1/analyzers-smartcn/org/apache/lucene/analysis/cn/smart/SmartChineseAnalyzer.html) as well.

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/misc/topk.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/misc/topk.md b/docs/gitbook/misc/topk.md
index 27cf7ad..9c1efc9 100644
--- a/docs/gitbook/misc/topk.md
+++ b/docs/gitbook/misc/topk.md
@@ -33,7 +33,7 @@ This function is particularly useful for applying a similarity/distance function
 * The third argument `value` is used for the comparison.
 * `Any number types` or `timestamp` are accepted for the type of `value`.
 * If k is less than 0, reverse order is used and `tail-K` records are returned for each `group`.
-* Note that this function returns [a pseudo ranking](http://www.michaelpollmeier.com/selecting-top-k-items-from-a-list-efficiently-in-java-groovy/) for top-k. It always returns `at-most K` records for each group. The ranking scheme is similar to `dense_rank` but slightly different in certain cases.
+* Note that this function returns [a pseudo ranking](https://www.michaelpollmeier.com/selecting-top-k-items-from-a-list-efficiently-in-java-groovy/) for top-k. It always returns `at-most K` records for each group. The ranking scheme is similar to `dense_rank` but slightly different in certain cases.
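
The per-group semantics can be sketched as follows (assuming the input is already clustered by group, as the UDTF requires; tie handling of the real `dense_rank`-like scheme is simplified here):

```python
from itertools import groupby

def each_top_k(rows, k):
    # rows: (group, cmp_key, payload) tuples, already clustered by group.
    # Returns (rank, cmp_key, group, payload); k < 0 yields tail-|k| rows.
    out = []
    for group, members in groupby(rows, key=lambda r: r[0]):
        ranked = sorted(members, key=lambda r: r[1], reverse=(k > 0))[:abs(k)]
        for rank, (_, cmp_key, payload) in enumerate(ranked, start=1):
            out.append((rank, cmp_key, group, payload))
    return out
```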
 
 # Usage
 
@@ -110,7 +110,7 @@ The ranking semantics of `each_top_k` follows SQL's `dense_rank` and then limits
 
 ## top-k clicks 
 
-http://stackoverflow.com/questions/9390698/hive-getting-top-n-records-in-group-by-query/32559050#32559050
+https://stackoverflow.com/questions/9390698/hive-getting-top-n-records-in-group-by-query/32559050#32559050
 
 ```sql
 set hivevar:k=5;

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/multiclass/iris_dataset.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/multiclass/iris_dataset.md b/docs/gitbook/multiclass/iris_dataset.md
index 8dae7c9..db1800b 100644
--- a/docs/gitbook/multiclass/iris_dataset.md
+++ b/docs/gitbook/multiclass/iris_dataset.md
@@ -113,7 +113,7 @@ select * from iris_scaled limit 3;
 > 3       Iris-setosa     ["1:0.11111101","2:0.5","3:0.05084745","4:0.041666664","0:1.0"]
 ```
 
-_[LibSVM web page](http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#iris) provides a normalized (using [ZScore](../ft_engineering/scaling.html#feature-scaling-by-zscore)) version of Iris dataset._
+_[LibSVM web page](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#iris) provides a normalized (using [ZScore](../ft_engineering/scaling.html#feature-scaling-by-zscore)) version of Iris dataset._
 
 # Create training/test data
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/multiclass/iris_randomforest.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/multiclass/iris_randomforest.md b/docs/gitbook/multiclass/iris_randomforest.md
index c173db2..e980229 100644
--- a/docs/gitbook/multiclass/iris_randomforest.md
+++ b/docs/gitbook/multiclass/iris_randomforest.md
@@ -325,7 +325,7 @@ WHERE
 > `tree_export` feature is supported from Hivemall v0.5.0 or later.
 > Better to limit tree depth on training by `-depth` option to plot a Decision Tree.
 
-Hivemall provide `tree_export` to export a decision tree into [Graphviz](http://www.graphviz.org/) or human-readable Javascript format. You can find the usage by issuing the following query:
+Hivemall provides `tree_export` to export a decision tree into [Graphviz](https://www.graphviz.org/) or human-readable JavaScript format. You can find the usage by issuing the following query:
 
 ```
 > select tree_export("","-help");
@@ -389,4 +389,4 @@ digraph Tree {
 
 <img src="../resources/images/iris.png" alt="Iris Graphviz output"/>
 
-You can draw a graph by `dot -Tpng iris.dot -o iris.png` or using [Viz.js](http://viz-js.com/).
+You can draw a graph by `dot -Tpng iris.dot -o iris.png` or using [Viz.js](https://viz-js.com/).

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/multiclass/news20_dataset.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/multiclass/news20_dataset.md b/docs/gitbook/multiclass/news20_dataset.md
index 4cc9b83..0ba0360 100644
--- a/docs/gitbook/multiclass/news20_dataset.md
+++ b/docs/gitbook/multiclass/news20_dataset.md
@@ -18,7 +18,7 @@
 -->
         
 Get the news20 dataset.
-http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#news20
+https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#news20
 
 ```sh
 $ cat <<EOF > conv.awk

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/multiclass/news20_one-vs-the-rest_dataset.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/multiclass/news20_one-vs-the-rest_dataset.md b/docs/gitbook/multiclass/news20_one-vs-the-rest_dataset.md
index 6f76d28..62bc397 100644
--- a/docs/gitbook/multiclass/news20_one-vs-the-rest_dataset.md
+++ b/docs/gitbook/multiclass/news20_one-vs-the-rest_dataset.md
@@ -39,7 +39,7 @@ select collect_set(label) from news20mc_train;
 SET hivevar:possible_labels="1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,17,16,19,18,20";
 ```
 
-[one-vs-rest.awk](https://github.com/myui/hivemall/blob/master/resources/misc/one-vs-rest.awk)
+[one-vs-rest.awk](https://github.com/apache/incubator-hivemall/blob/master/resources/misc/one-vs-rest.awk)
 
 ```
 create or replace view news20_onevsrest_train

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/recommend/movielens_cv.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/recommend/movielens_cv.md b/docs/gitbook/recommend/movielens_cv.md
index 80c0d19..d5e4ca7 100644
--- a/docs/gitbook/recommend/movielens_cv.md
+++ b/docs/gitbook/recommend/movielens_cv.md
@@ -17,7 +17,7 @@
   under the License.
 -->
         
-[Cross-validation](http://en.wikipedia.org/wiki/Cross-validation_%28statistics%29) is a model validation technique for assessing how a prediction model will generalize to an independent data set. This example shows a way to perform [k-fold cross validation](http://en.wikipedia.org/wiki/Cross-validation_%28statistics%29#k-fold_cross-validation) to evaluate prediction performance.
+[Cross-validation](https://en.wikipedia.org/wiki/Cross-validation_%28statistics%29) is a model validation technique for assessing how a prediction model will generalize to an independent data set. This example shows a way to perform [k-fold cross validation](https://en.wikipedia.org/wiki/Cross-validation_%28statistics%29#k-fold_cross-validation) to evaluate prediction performance.
 
 *Caution:* Matrix factorization is supported in Hivemall v0.3 or later.
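
The idea of a k-fold split can be sketched as follows (illustrative only; the tutorial's `generate_cv.sql` assigns folds with a random-number column inside Hive instead):

```python
import random

def kfold_indices(n, k, seed=31):
    # Shuffle example ids once, then deal them round-robin into k folds;
    # each fold serves as the test set exactly once.
    rnd = random.Random(seed)
    ids = list(range(n))
    rnd.shuffle(ids)
    folds = [ids[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test
```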
 
@@ -79,4 +79,4 @@ Then, issue SQL queies in [generate_cv.sql](https://gist.github.com/myui/2e20182
 
 > 0.8502739040257945 (RMSE)
 
-_We recommend to use [Tez](http://tez.apache.org/) for running queries having many stages._
+_We recommend to use [Tez](https://tez.apache.org/) for running queries having many stages._

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/recommend/movielens_dataset.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/recommend/movielens_dataset.md b/docs/gitbook/recommend/movielens_dataset.md
index 27c04ba..ca33815 100644
--- a/docs/gitbook/recommend/movielens_dataset.md
+++ b/docs/gitbook/recommend/movielens_dataset.md
@@ -20,7 +20,7 @@
 # Data preparation
 
 First, download the MovieLens dataset from the following site.
-> http://www.grouplens.org/system/files/ml-1m.zip
+> http://files.grouplens.org/datasets/movielens/ml-1m.zip
 
 Get detail about the dataset in the README.
 > http://files.grouplens.org/papers/ml-1m-README.txt

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/recommend/movielens_slim.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/recommend/movielens_slim.md b/docs/gitbook/recommend/movielens_slim.md
index 760f3cd..613e8c9 100644
--- a/docs/gitbook/recommend/movielens_slim.md
+++ b/docs/gitbook/recommend/movielens_slim.md
@@ -73,7 +73,7 @@ To evaluate a recommendation model, this tutorial uses two type cross validation
 - Leave-one-out cross validation
 - $$K$$-hold cross validation
 
-The former is used in the [SLIM's paper](http://glaros.dtc.umn.edu/gkhome/fetch/papers/SLIM2011icdm.pdf) and the latter is used in [Mendeley's slide](http://slideshare.net/MarkLevy/efficient-slides/).
+The former is used in the [SLIM paper](http://glaros.dtc.umn.edu/gkhome/fetch/papers/SLIM2011icdm.pdf) and the latter is used in [Mendeley's slides](https://www.slideshare.net/MarkLevy/efficient-slides/).
 
 ### Leave-one-out cross validation
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/regression/e2006_arow.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/regression/e2006_arow.md b/docs/gitbook/regression/e2006_arow.md
index ddf6398..169a7dc 100644
--- a/docs/gitbook/regression/e2006_arow.md
+++ b/docs/gitbook/regression/e2006_arow.md
@@ -17,7 +17,7 @@
   under the License.
 -->
         
-http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#E2006-tfidf
+https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#E2006-tfidf
 
 ---
 #[PA1a]

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/regression/e2006_dataset.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/regression/e2006_dataset.md b/docs/gitbook/regression/e2006_dataset.md
index 804fa40..5754103 100644
--- a/docs/gitbook/regression/e2006_dataset.md
+++ b/docs/gitbook/regression/e2006_dataset.md
@@ -20,7 +20,7 @@
 Prerequisite
 ============
 
-* [E2006-tfidf Dataset](http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#E2006-tfidf)
+* [E2006-tfidf Dataset](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#E2006-tfidf)
 * [conv.awk](https://github.com/apache/incubator-hivemall/blob/master/resources/misc/conv.awk)
 
 Data preparation

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/regression/general.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/regression/general.md b/docs/gitbook/regression/general.md
index 4750ea4..a40eb58 100644
--- a/docs/gitbook/regression/general.md
+++ b/docs/gitbook/regression/general.md
@@ -26,7 +26,7 @@ In our regression tutorials, you can tackle realistic prediction problems by usi
 
 Our `train_regressor` function enables you to solve regression problems with flexibly configurable options. Let us try the function below.
 
-It should be noted that the sample queries require you to prepare [E2006-tfidf data](http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#E2006-tfidf). See [our E2006-tfidf tutorial page](../regression/e2006_dataset.md) for further instructions.
+It should be noted that the sample queries require you to prepare [E2006-tfidf data](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#E2006-tfidf). See [our E2006-tfidf tutorial page](../regression/e2006_dataset.md) for further instructions.
 
 <!-- toc -->
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/regression/kddcup12tr2_dataset.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/regression/kddcup12tr2_dataset.md b/docs/gitbook/regression/kddcup12tr2_dataset.md
index e4a541b..8d512cd 100644
--- a/docs/gitbook/regression/kddcup12tr2_dataset.md
+++ b/docs/gitbook/regression/kddcup12tr2_dataset.md
@@ -18,7 +18,7 @@
 -->
         
 The task is predicting the click-through rate (CTR) of advertisements, i.e., the probability of each ad being clicked.
-http://www.kddcup2012.org/c/kddcup2012-track2
+https://www.kaggle.com/c/kddcup2012-track2
 
 ---
 
@@ -210,7 +210,7 @@ create table training_orcfile (
 ```
 _Caution: Joining between training table and user table takes a long time. Consider not to use gender and age and avoid joins if your Hadoop cluster is small._
 
-[kddconv.awk](https://github.com/myui/hivemall/blob/master/resources/examples/kddtrack2/kddconv.awk)
+[kddconv.awk](https://github.com/apache/incubator-hivemall/blob/master/resources/examples/kddtrack2/kddconv.awk)
 
 ```sql
 add file /tmp/kddconv.awk;

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/regression/kddcup12tr2_lr.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/regression/kddcup12tr2_lr.md b/docs/gitbook/regression/kddcup12tr2_lr.md
index b9f8bdf..42b8141 100644
--- a/docs/gitbook/regression/kddcup12tr2_lr.md
+++ b/docs/gitbook/regression/kddcup12tr2_lr.md
@@ -18,7 +18,7 @@
 -->
         
 The task is predicting the click-through rate (CTR) of advertisements, i.e., the probability of each ad being clicked.
-http://www.kddcup2012.org/c/kddcup2012-track2
+https://www.kaggle.com/c/kddcup2012-track2
 
 _Caution: This example just shows a baseline result. Use token tables and amplifier to get better AUC score._
 
@@ -83,7 +83,8 @@ order by
 ```
 ## Evaluation
 
-[scoreKDD.py](https://github.com/myui/hivemall/blob/master/resources/examples/kddtrack2/scoreKDD.py)
+You can download `scoreKDD.py` from the [KDD Cup 2012, Track 2 site](https://www.kaggle.com/c/kddcup2012-track2/data)
+after logging in to Kaggle.
 
 ```sh
 hadoop fs -getmerge /user/hive/warehouse/kdd12track2.db/lr_predict lr_predict.tbl

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/regression/kddcup12tr2_lr_amplify.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/regression/kddcup12tr2_lr_amplify.md b/docs/gitbook/regression/kddcup12tr2_lr_amplify.md
index b363051..992463b 100644
--- a/docs/gitbook/regression/kddcup12tr2_lr_amplify.md
+++ b/docs/gitbook/regression/kddcup12tr2_lr_amplify.md
@@ -19,7 +19,7 @@
         
 This article explains the *amplify* technique, which is useful for improving prediction scores.
 
-Iterations are mandatory in machine learning (e.g., in [stochastic gradient descent](http://en.wikipedia.org/wiki/Stochastic_gradient_descent)) to get good prediction models. However, MapReduce is known to be not suited for iterative algorithms because IN/OUT of each MapReduce job is through HDFS.
+Iterations are mandatory in machine learning (e.g., in [stochastic gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent)) to get good prediction models. However, MapReduce is known to be ill-suited for iterative algorithms because the input/output of each MapReduce job goes through HDFS.
 
 In this example, we show how Hivemall deals with this problem. We use [KDD Cup 2012, Track 2 Task](../regression/kddcup12tr2_dataset.html) as an example.
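
A toy sketch of the amplify idea: duplicating each training example x times and shuffling globally lets a single pass of SGD behave like multiple shuffled epochs (the logistic-loss update and names below are illustrative, not Hivemall's internals):

```python
import math
import random

def sgd_logreg_amplified(examples, x=3, eta=0.1, seed=1):
    # examples: (sparse_features, label) pairs, sparse_features = [(name, value)].
    # amplify(x): duplicate each example x times, then shuffle before one SGD pass.
    rnd = random.Random(seed)
    data = examples * x
    rnd.shuffle(data)
    w = {}
    for features, label in data:
        margin = sum(w.get(f, 0.0) * v for f, v in features)
        p = 1.0 / (1.0 + math.exp(-margin))
        for f, v in features:
            w[f] = w.get(f, 0.0) + eta * (label - p) * v  # logistic gradient step
    return w
```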
 
@@ -78,7 +78,7 @@ Using *trainning_x3*  instead of the plain training table results in higher and
 A problem in amplify() is that the shuffle (copy) and merge phase of the stage 1 could become a bottleneck.
 When the training table is so large that it involves 100 Map tasks, the merge operator needs to merge at least 100 files by (external) merge sort!
 
-Note that the actual bottleneck is not M/R iterations but shuffling training instance. Iteration without shuffling (as in [the Spark example](http://spark.incubator.apache.org/examples.html)) causes very slow convergence and results in requiring more iterations. Shuffling cannot be avoided even in iterative MapReduce variants.
+Note that the actual bottleneck is not M/R iterations but shuffling training instances. Iteration without shuffling (as in [the Spark example](https://spark.incubator.apache.org/examples.html)) causes very slow convergence and requires more iterations. Shuffling cannot be avoided even in iterative MapReduce variants.
 
 <img src="../resources/images/amplify_elapsed.png" alt="amplify elapsed"/>
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/spark/binaryclass/a9a_df.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/spark/binaryclass/a9a_df.md b/docs/gitbook/spark/binaryclass/a9a_df.md
index 88229e3..b9cb68b 100644
--- a/docs/gitbook/spark/binaryclass/a9a_df.md
+++ b/docs/gitbook/spark/binaryclass/a9a_df.md
@@ -19,14 +19,14 @@
 
 a9a
 ===
-http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#a9a
+https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#a9a
 
 Data preparation
 ================
 
 ```sh
-$ wget http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a9a
-$ wget http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a9a.t
+$ wget https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a9a
+$ wget https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a9a.t
 ```
 
 ```scala

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/spark/binaryclass/a9a_sql.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/spark/binaryclass/a9a_sql.md b/docs/gitbook/spark/binaryclass/a9a_sql.md
index 06734d9..c9a7398 100644
--- a/docs/gitbook/spark/binaryclass/a9a_sql.md
+++ b/docs/gitbook/spark/binaryclass/a9a_sql.md
@@ -19,14 +19,14 @@
 
 a9a
 ===
-http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#a9a
+https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#a9a
 
 Data preparation
 ================
 
 ```sh
-$ wget http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a9a
-$ wget http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a9a.t
+$ wget https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a9a
+$ wget https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a9a.t
 ```
 
 ```scala

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/spark/getting_started/installation.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/spark/getting_started/installation.md b/docs/gitbook/spark/getting_started/installation.md
index b244f30..7b9595d 100644
--- a/docs/gitbook/spark/getting_started/installation.md
+++ b/docs/gitbook/spark/getting_started/installation.md
@@ -22,7 +22,7 @@ Prerequisites
 
 * Spark v2.1 or later
 * Java 7 or later
-* `hivemall-spark-xxx-with-dependencies.jar` that can be found in [the ASF distribution mirror](http://www.apache.org/dyn/closer.cgi/incubator/hivemall/).
+* `hivemall-spark-xxx-with-dependencies.jar` that can be found in [the ASF distribution mirror](https://www.apache.org/dyn/closer.cgi/incubator/hivemall/).
 * [define-all.spark](https://github.com/apache/incubator-hivemall/blob/master/resources/ddl/define-all.spark)
 * [import-packages.spark](https://github.com/apache/incubator-hivemall/blob/master/resources/ddl/import-packages.spark)
 
@@ -32,7 +32,7 @@ Prerequisites
 Installation
 ============
 
-First, you download a compiled Spark package from [the Spark official web page](http://spark.apache.org/downloads.html) and invoke spark-shell with a compiled Hivemall binary.
+First, you download a compiled Spark package from [the Spark official web page](https://spark.apache.org/downloads.html) and invoke spark-shell with a compiled Hivemall binary.
 
 ```
 $ ./bin/spark-shell --jars hivemall-spark-xxx-with-dependencies.jar

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/spark/regression/e2006_df.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/spark/regression/e2006_df.md b/docs/gitbook/spark/regression/e2006_df.md
index d6ac138..015ee00 100644
--- a/docs/gitbook/spark/regression/e2006_df.md
+++ b/docs/gitbook/spark/regression/e2006_df.md
@@ -19,14 +19,14 @@
 
 E2006
 ===
-http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#E2006-tfidf
+https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#E2006-tfidf
 
 Data preparation
 ================
 
 ```sh
-$ wget http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression/E2006.train.bz2
-$ wget http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression/E2006.test.bz2
+$ wget https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression/E2006.train.bz2
+$ wget https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression/E2006.test.bz2
 ```
 
 ```scala

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/spark/regression/e2006_sql.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/spark/regression/e2006_sql.md b/docs/gitbook/spark/regression/e2006_sql.md
index 48477d1..bb95cab 100644
--- a/docs/gitbook/spark/regression/e2006_sql.md
+++ b/docs/gitbook/spark/regression/e2006_sql.md
@@ -1,3 +1,4 @@
+
 <!--
   Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements.  See the NOTICE file
@@ -19,14 +20,14 @@
 
 E2006
 ===
-http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#E2006-tfidf
+https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#E2006-tfidf
 
 Data preparation
 ================
 
 ```sh
-$ wget http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression/E2006.train.bz2
-$ wget http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression/E2006.test.bz2
+$ wget https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression/E2006.train.bz2
+$ wget https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression/E2006.test.bz2
 ```
 
 ```scala

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/supervised_learning/prediction.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/supervised_learning/prediction.md b/docs/gitbook/supervised_learning/prediction.md
index 53d0cea..65aad27 100644
--- a/docs/gitbook/supervised_learning/prediction.md
+++ b/docs/gitbook/supervised_learning/prediction.md
@@ -38,8 +38,8 @@ Once a prediction model has been constructed based on the samples, the model can
 
 In order to train prediction models, an algorithm called ***stochastic gradient descent*** (SGD) is normally applied. You can learn more about this from the following external resources:
 
-- [scikit-learn documentation](http://scikit-learn.org/stable/modules/sgd.html)
-- [Spark MLlib documentation](http://spark.apache.org/docs/latest/mllib-optimization.html)
+- [scikit-learn documentation](https://scikit-learn.org/stable/modules/sgd.html)
+- [Spark MLlib documentation](https://spark.apache.org/docs/latest/mllib-optimization.html)
 
 Importantly, depending on types of output value, prediction problem can be categorized into **regression** and **classification** problem.
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/tips/emr.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/tips/emr.md b/docs/gitbook/tips/emr.md
index 44e0855..0d3dd44 100644
--- a/docs/gitbook/tips/emr.md
+++ b/docs/gitbook/tips/emr.md
@@ -21,15 +21,15 @@
         
 ## Prerequisite
 Learn how to use Hive with Elastic MapReduce (EMR).  
-http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive.html
+https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hive.html
 
 Before launching an EMR job, 
 * create ${s3bucket}/emr/outputs for outputs
 * optionally, create ${s3bucket}/emr/logs for logging
-* put [emr_hivemall_bootstrap.sh](https://raw.github.com/myui/hivemall/master/scripts/misc/emr_hivemall_bootstrap.sh) on ${s3bucket}/emr/conf
+* put [emr_hivemall_bootstrap.sh](https://raw.githubusercontent.com/apache/incubator-hivemall/master/resources/misc/emr_hivemall_bootstrap.sh) on ${s3bucket}/emr/conf
 
 Then, launch an EMR job with Hive in an interactive mode.
-I'm usually lunching EMR instances with cheap Spot instances through [CLI client](http://aws.amazon.com/developertools/2264) as follows:
+I usually launch EMR instances with cheap Spot Instances through the [CLI client](https://aws.amazon.com/tools/) as follows:
 ```
 ./elastic-mapreduce --create --alive \
  --name "Hive cluster" \
@@ -43,7 +43,7 @@ I'm usually lunching EMR instances with cheap Spot instances through [CLI client
    --args "instance.isMaster=true,s3://${s3bucket}/emr/conf/emr_hivemall_bootstrap.sh" --bootstrap-name "hivemall setup"
  --bootstrap-action s3://elasticmapreduce/bootstrap-actions/install-ganglia --bootstrap-name "install ganglia"
 ```
-_To use YARN instead of old Hadoop, specify "[--ami-version 3.0.0](http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-ami.html#ami-versions-supported)". Hivemall works on both old Hadoop and YARN._
+_To use YARN instead of old Hadoop, specify "[--ami-version 3.0.0](https://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-ami.html#ami-versions-supported)". Hivemall works on both old Hadoop and YARN._
 
 Or, launch an interactive EMR job using the EMR GUI wizard.
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/tips/hadoop_tuning.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/tips/hadoop_tuning.md b/docs/gitbook/tips/hadoop_tuning.md
index c516820..167a068 100644
--- a/docs/gitbook/tips/hadoop_tuning.md
+++ b/docs/gitbook/tips/hadoop_tuning.md
@@ -24,7 +24,7 @@
 Please refer to the following guides for Hadoop tuning:
 
 * http://hadoopbook.com/
-* http://www.slideshare.net/cloudera/mr-perf
+* https://www.slideshare.net/cloudera/mr-perf
 
 ---
 # Mapper-side configuration
@@ -75,13 +75,13 @@ feature_dimensions (2^24 by the default) * 4 bytes (float) * 2 (iff covariance i
 ```
 > 2^24 * 4 bytes * 2 * 1.2 ≈ 161MB
 
-When [SpaceEfficientDenseModel](https://github.com/myui/hivemall/blob/master/src/main/java/hivemall/io/SpaceEfficientDenseModel.java) is used, the formula changes as follows:
+When [SpaceEfficientDenseModel](https://github.com/apache/incubator-hivemall/blob/master/core/src/main/java/hivemall/model/SpaceEfficientDenseModel.java) is used, the formula changes as follows:
 ```
 feature_dimensions (assume here 2^25) * 2 bytes (short) * 2 (iff covariance is calculated) * 1.2 (heuristics)
 ```
 > 2^25 * 2 bytes * 2 * 1.2 ≈ 161MB
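As a sanity check, both estimates above work out to the same figure. A quick sketch of the arithmetic (a hypothetical helper for illustration, not part of Hivemall):

```python
# Rough heap estimate per the formulas above:
# dimensions * weight size * 2 (iff covariance) * 1.2 (heuristic overhead)
def model_heap_bytes(feature_dimensions, bytes_per_weight, covariance=True, overhead=1.2):
    factor = 2 if covariance else 1
    return feature_dimensions * bytes_per_weight * factor * overhead

dense = model_heap_bytes(2**24, 4)    # DenseModel: 4-byte float weights
compact = model_heap_bytes(2**25, 2)  # SpaceEfficientDenseModel: 2-byte short weights
print(dense == compact, round(dense / 1e6))  # same budget, ~161 MB
```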
 
-Note: Hivemall uses a [sparse representation](https://github.com/myui/hivemall/blob/master/src/main/java/hivemall/io/SparseModel.java) of prediction model (using a hash table) by the default. Use "[-densemodel](https://github.com/myui/hivemall/blob/master/src/main/java/hivemall/LearnerBaseUDTF.java#L87)" option to use a dense model.
+Note: Hivemall uses a [sparse representation](https://github.com/apache/incubator-hivemall/blob/master/core/src/main/java/hivemall/model/SparseModel.java) of the prediction model (using a hash table) by default. Use the "[-densemodel](https://github.com/apache/incubator-hivemall/blob/master/core/src/main/java/hivemall/LearnerBaseUDTF.java#L87)" option to use a dense model.
 
 # Execution Engine of Hive
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/tips/mixserver.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/tips/mixserver.md b/docs/gitbook/tips/mixserver.md
index 91aff87..4eb1221 100644
--- a/docs/gitbook/tips/mixserver.md
+++ b/docs/gitbook/tips/mixserver.md
@@ -46,7 +46,7 @@ _Caution: hivemall-mixserv.jar is large in size and thus only used for Mix serve
 We assume in this example that Mix servers are running on host01, host02 and host03.
 The default port used by the Mix server is 11212, and the port is configurable through the "-port" option of run_mixserv.sh. 
 
-See [MixServer.java](https://github.com/myui/hivemall/blob/master/mixserv/src/main/java/hivemall/mix/server/MixServer.java#L90) to get detail of the Mix server options.
+See [MixServer.java](https://github.com/apache/incubator-hivemall/blob/master/mixserv/src/main/java/hivemall/mix/server/MixServer.java#L90-L104) for details of the Mix server options.
 
 We recommend using multiple MIX servers to get better MIX throughput (3-5 or so would be enough for a normal cluster size). The MIX protocol of Hivemall is *horizontally scalable* by adding MIX server nodes.
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/tips/rand_amplify.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/tips/rand_amplify.md b/docs/gitbook/tips/rand_amplify.md
index 73b1c3a..090926a 100644
--- a/docs/gitbook/tips/rand_amplify.md
+++ b/docs/gitbook/tips/rand_amplify.md
@@ -19,7 +19,7 @@
         
 This article explains the *amplify* technique, which is useful for improving the prediction score.
 
-Iterations are mandatory in machine learning (e.g., in [stochastic gradient descent](http://en.wikipedia.org/wiki/Stochastic_gradient_descent)) to get good prediction models. However, MapReduce is known to be not suited for iterative algorithms because IN/OUT of each MapReduce job is through HDFS.
+Iterations are mandatory in machine learning (e.g., in [stochastic gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent)) to get good prediction models. However, MapReduce is known to be ill-suited for iterative algorithms because the IN/OUT of each MapReduce job goes through HDFS.
 
 In this example, we show how Hivemall deals with this problem. We use [KDD Cup 2012, Track 2 Task](../regression/kddcup12tr2_dataset.html) as an example.
 
@@ -77,7 +77,7 @@ Using *trainning_x3*  instead of the plain training table results in higher and
 A problem with `amplify()` is that the shuffle (copy) and merge phase of stage 1 could become a bottleneck.
 When the training table is so large that it involves 100 Map tasks, the merge operator needs to merge at least 100 files by (external) merge sort! 
 
-Note that the actual bottleneck is not M/R iterations but shuffling training instance. Iteration without shuffling (as in [the Spark example](http://spark.incubator.apache.org/examples.html)) causes very slow convergence and results in requiring more iterations. Shuffling cannot be avoided even in iterative MapReduce variants.
+Note that the actual bottleneck is not M/R iterations but shuffling training instances. Iteration without shuffling (as in [the Spark example](https://spark.incubator.apache.org/examples.html)) causes very slow convergence and results in requiring more iterations. Shuffling cannot be avoided even in iterative MapReduce variants.
 
 <img src="../resources/images/amplify_elapsed.png" alt="amplify_elapsed"/>
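The amplify-then-shuffle idea above can be sketched conceptually as follows (illustrative Python, not Hivemall's actual implementation; row contents and the `xtimes` parameter name mirror the amplification factor described in the docs):

```python
import random

# Conceptual sketch: duplicate each training row xtimes, then shuffle the result,
# mimicking what amplify()/rand_amplify() achieves without multiple M/R iterations.
def rand_amplify(rows, xtimes, seed=42):
    amplified = [row for row in rows for _ in range(xtimes)]
    random.Random(seed).shuffle(amplified)
    return amplified

rows = [("f1:1.0", 0), ("f2:0.5", 1), ("f3:2.0", 0)]
out = rand_amplify(rows, xtimes=3)
print(len(out))  # 9 training instances amplified from 3 original rows
```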
 

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/tips/rt_prediction.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/tips/rt_prediction.md b/docs/gitbook/tips/rt_prediction.md
index e1a1fff..b326e34 100644
--- a/docs/gitbook/tips/rt_prediction.md
+++ b/docs/gitbook/tips/rt_prediction.md
@@ -31,7 +31,7 @@ We assume that you have already run the [a9a binary classification task](../bina
 
     Put mysql-connector-java.jar (JDBC driver) on $SQOOP_HOME/lib.
 
-- [Sqoop](http://sqoop.apache.org/)
+- [Sqoop](https://sqoop.apache.org/)
 
     Sqoop 1.4.5 does not support Hadoop v2.6.0, so you need to build the packages for Hadoop 2.6.
     To do that, edit build.xml and ivy.xml as shown in [this patch](https://gist.github.com/myui/e8db4a31b574103133c6).

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/troubleshooting/mapjoin_classcastex.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/troubleshooting/mapjoin_classcastex.md b/docs/gitbook/troubleshooting/mapjoin_classcastex.md
index ade4f52..4675d33 100644
--- a/docs/gitbook/troubleshooting/mapjoin_classcastex.md
+++ b/docs/gitbook/troubleshooting/mapjoin_classcastex.md
@@ -17,7 +17,7 @@
   under the License.
 -->
         
-Map-side join on Tez causes [ClassCastException](http://markmail.org/message/7cwbgupnhah6ggkv) when a serialized table contains array column(s).
+Map-side join on Tez causes [ClassCastException](https://markmail.org/message/7cwbgupnhah6ggkv) when a serialized table contains array column(s).
 
 [Workaround] Try setting _hive.mapjoin.optimized.hashtable_ off as follows:
 ```sql

http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/30593b14/docs/gitbook/troubleshooting/oom.md
----------------------------------------------------------------------
diff --git a/docs/gitbook/troubleshooting/oom.md b/docs/gitbook/troubleshooting/oom.md
index dc375bf..e413612 100644
--- a/docs/gitbook/troubleshooting/oom.md
+++ b/docs/gitbook/troubleshooting/oom.md
@@ -31,7 +31,7 @@ Then, the number of training examples used for each trainer is reduced (as the n
 
 # OOM in shuffle/merge
 
-If OOM caused during the merge step, try setting a larger **mapred.reduce.tasks** value before training and revise [shuffle/reduce parameters](http://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html#Shuffle%2FReduce+Parameters).
+If an OOM is caused during the merge step, try setting a larger **mapred.reduce.tasks** value before training and revise the [shuffle/reduce parameters](https://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html#Shuffle%2FReduce+Parameters).
 ```
 SET mapred.reduce.tasks=64;
 ```