Posted to issues@carbondata.apache.org by geetikagupta16 <gi...@git.apache.org> on 2018/04/20 11:20:46 UTC
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
GitHub user geetikagupta16 opened a pull request:
https://github.com/apache/carbondata/pull/2199
[CARBONDATA-2370] Added document for presto multinode setup for carbondata
Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/geetikagupta16/incubator-carbondata CARBONDATA-2370
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2199.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2199
----
commit 21160515ccecdf34b00d36182d2594a2e3467c28
Author: Geetika Gupta <ge...@...>
Date: 2018-04-20T11:17:35Z
Added document for presto multinode setup for carbondata
----
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183027976
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
--- End diff --
Leave a space after #
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183304329
--- Diff: integration/presto/Presto_Cluster_Setup_For_Carbondata.md ---
@@ -0,0 +1,133 @@
+# Presto Multinode Cluster setup For Carbondata
+
+## Installing Presto
+
+ 1. Download the 0.187 version of Presto using:
+ `wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz`
+
+ 2. Extract Presto tar file: `tar zxvf presto-server-0.187.tar.gz`
+
+ 3. Download the Presto CLI for the coordinator and name it presto.
+
+ ```
+ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+ mv presto-cli-0.187-executable.jar presto
+
+ chmod +x presto
+ ```
+
+ ## Create Configuration Files
+
+ 1. Create `etc` folder in presto-server-0.187 directory.
+ 2. Create `config.properties`, `jvm.config`, `log.properties`, and `node.properties` files.
+ 3. Install uuid to generate a node.id
+
+ ```
+ sudo apt-get install uuid
+
+ uuid
+ ```
+
+
+##### Contents of your node.properties file
+
+ ```
+ node.environment=production
+ node.id=<generated uuid>
+ node.data-dir=/home/ubuntu/data
+ ```
+
+##### Contents of your jvm.config file
+
+ ```
+ -server
+ -Xmx16G
+ -XX:+UseG1GC
+ -XX:G1HeapRegionSize=32M
+ -XX:+UseGCOverheadLimit
+ -XX:+ExplicitGCInvokesConcurrent
+ -XX:+HeapDumpOnOutOfMemoryError
+ -XX:OnOutOfMemoryError=kill -9 %p
+ ```
+
+##### Contents of your log.properties file
+ ```
+ com.facebook.presto=INFO
+ ```
+
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`.
+
+## Coordinator Configurations
+
+ ##### Contents of your config.properties
+ ```
+ coordinator=true
+ node-scheduler.include-coordinator=false
+ http-server.http.port=8080
+ query.max-memory=50GB
+ query.max-memory-per-node=2GB
+ discovery-server.enabled=true
+ discovery.uri=<coordinator_ip>:8080
+ ```
+The options `node-scheduler.include-coordinator=false` and `coordinator=true` indicate that the node is the coordinator and tells the coordinator not to do any of the computation work itself and to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the JVM config max memory, though if your workload is highly concurrent, you may want to use a lower value for `query.max-memory-per-node`.
+
+Also relation between below two configuration-properties should be like:
+If, `query.max-memory-per-node=30GB`
+Then, `query.max-memory=<30GB * number of nodes>`
+
+## Worker Configurations
+
+##### Contents of your config.properties
+
+ ```
+ coordinator=false
+ http-server.http.port=8080
+ query.max-memory=50GB
+ query.max-memory-per-node=2GB
+ discovery.uri=<coordinator_ip>:8080
+ ```
+
+**Note**: `jvm.config` and `node.properties` files are same for all the nodes (worker + coordinator). All the nodes should have different `node.id`
+
+## Catalog Configurations
+
+1. Create a folder named `catalog` in etc directory of presto on all the nodes of the cluster including the coordinator.
+
+##### Configuring Carbondata in Presto
+1. Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes.
+
+## Add Plugins
+
+1. Create a directory named `carbondata` in plugin directory of presto
--- End diff --
Add a period at the end of the sentence for both points. Check all the other sentences as well.
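
For reference, the directory and file steps in the quoted diff (the `etc` folder, the four config files, the `catalog` folder, and the `carbondata` plugin directory) could be scripted roughly as follows. This is only a sketch: the function name `create_presto_layout` is hypothetical, and the files are created empty so that their contents can be filled in as described in the diff.

```shell
# Create the etc/, etc/catalog/ and plugin/carbondata/ layout under a Presto home directory.
# Argument: <presto_home>, for example presto-server-0.187 (hypothetical helper, not part of Presto).
create_presto_layout() {
  local home="$1"
  mkdir -p "$home/etc/catalog" "$home/plugin/carbondata"
  # Create empty config files; their contents are filled in separately.
  touch "$home/etc/config.properties" \
        "$home/etc/jvm.config" \
        "$home/etc/log.properties" \
        "$home/etc/node.properties" \
        "$home/etc/catalog/carbondata.properties"
}
```

Running `create_presto_layout presto-server-0.187` on each node would give every node the same layout, which is what the catalog and plugin steps ask for.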
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2199
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4153/
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2199
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4706/
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183029300
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+ * Download the 0.187 version of presto using:
+
+ ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+ * Extract presto tar file
+ ``tar zxvf presto-server-0.187.tar.gz``
+
+ * Download the presto CLI for the coordinator and name it presto.
+
+ ```
+ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+ mv presto-cli-0.187-executable.jar presto
+
+ chmod +x presto
+ ```
+
+ ### Create configuration Files
+
+ * Create etc folder in presto-server-0.187 directory.
--- End diff --
This is a procedure, so change it to a numbered list.
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2199
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5238/
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183030457
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+ * Download the 0.187 version of presto using:
+
+ ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+ * Extract presto tar file
+ ``tar zxvf presto-server-0.187.tar.gz``
+
+ * Download the presto CLI for the coordinator and name it presto.
+
+ ```
+ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+ mv presto-cli-0.187-executable.jar presto
+
+ chmod +x presto
+ ```
+
+ ### Create configuration Files
+
+ * Create etc folder in presto-server-0.187 directory.
+ * Create config.properties, jvm.config, log.properties, and node.properties files.
+ * Install uuid to generate a node.id
+
+ ```
+ sudo apt-get install uuid
+
+ uuid
+ ```
+
+
+##### Contents of your node.properties file
+
+ ```
+ node.environment=production
+ node.id=<generated uuid>
+ node.data-dir=/home/ubuntu/data
+ ```
+
+##### Contents of your jvm.config file
+
+ ```
+ -server
+ -Xmx16G
+ -XX:+UseG1GC
+ -XX:G1HeapRegionSize=32M
+ -XX:+UseGCOverheadLimit
+ -XX:+ExplicitGCInvokesConcurrent
+ -XX:+HeapDumpOnOutOfMemoryError
+ -XX:OnOutOfMemoryError=kill -9 %p
+ ```
+
+##### Contents of your log.properties file
+ ```
+ com.facebook.presto=INFO
+ ```
+
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`.
+
+### Coordinator Configurations
+
+ ##### Contents of your config.properties
+```
+coordinator=true
+node-scheduler.include-coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery-server.enabled=true
+discovery.uri=<coordinator_ip>:8080
+```
+The options `node-scheduler.include-coordinator=false` and `coordinator=true` indicate that the node is the coordinator and tells the coordinator not to do any of the computation work itself and to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the JVM config max memory, though if your workload is highly concurrent, you may want to use a lower value for `query.max-memory-per-node`.
+
+Also relation between below two configuration-properties should be like:
+If, `query.max-memory-per-node=30GB`
+Then, `query.max-memory=<30GB * number of nodes>`
+
+### Worker Configurations
--- End diff --
Make this a Heading 2 (##).
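
As a side note on the memory settings quoted in the diff above: the two rules it states (per-node limit at half of the JVM max memory, and `query.max-memory` equal to the per-node limit times the number of nodes) are plain arithmetic. A minimal sketch, assuming whole-GB values and an illustrative cluster of 4 nodes; the function names are hypothetical, and the thread does not say whether the coordinator counts as a node here.

```shell
# Half of the JVM heap in GB (integer division), as a starting point
# for query.max-memory-per-node.
per_node_query_memory_gb() {
  local xmx_gb="$1"
  echo $(( xmx_gb / 2 ))
}

# Cluster-wide query.max-memory in GB: per-node limit times the number of nodes.
total_query_memory_gb() {
  local per_node_gb="$1" nodes="$2"
  echo $(( per_node_gb * nodes ))
}

per_node=$(per_node_query_memory_gb 16)       # -Xmx16G from the quoted jvm.config
total=$(total_query_memory_gb "$per_node" 4)  # 4 nodes is an assumed cluster size
echo "query.max-memory-per-node=${per_node}GB"
echo "query.max-memory=${total}GB"
```

With these inputs the per-node value comes out at 8GB, which differs from the `query.max-memory-per-node=2GB` shown in the sample `config.properties`, so the sample values and the note may need to be reconciled.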
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2199
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4140/
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183028111
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
--- End diff --
We can change it to Heading 2 (##) and change the heading to "Installing Presto"
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2199
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5333/
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:
https://github.com/apache/carbondata/pull/2199
LGTM
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183028486
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+ * Download the 0.187 version of presto using:
+
+ ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+ * Extract presto tar file
+ ``tar zxvf presto-server-0.187.tar.gz``
+
+ * Download the presto CLI for the coordinator and name it presto.
--- End diff --
All 'presto' instances can be changed to title case 'Presto'
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183028214
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+ * Download the 0.187 version of presto using:
--- End diff --
If these are steps, change them from bullet points to a numbered list.
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183027087
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
--- End diff --
Give a space after #
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183304497
--- Diff: integration/presto/Presto_Cluster_Setup_For_Carbondata.md ---
@@ -0,0 +1,133 @@
+# Presto Multinode Cluster setup For Carbondata
+
+## Installing Presto
+
+ 1. Download the 0.187 version of Presto using:
+ `wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz`
+
+ 2. Extract Presto tar file: `tar zxvf presto-server-0.187.tar.gz`
+
+ 3. Download the Presto CLI for the coordinator and name it presto.
+
+ ```
+ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+ mv presto-cli-0.187-executable.jar presto
+
+ chmod +x presto
+ ```
+
+ ## Create Configuration Files
+
+ 1. Create `etc` folder in presto-server-0.187 directory.
+ 2. Create `config.properties`, `jvm.config`, `log.properties`, and `node.properties` files.
+ 3. Install uuid to generate a node.id
+
+ ```
+ sudo apt-get install uuid
+
+ uuid
+ ```
+
+
+##### Contents of your node.properties file
+
+ ```
+ node.environment=production
+ node.id=<generated uuid>
+ node.data-dir=/home/ubuntu/data
+ ```
+
+##### Contents of your jvm.config file
+
+ ```
+ -server
+ -Xmx16G
+ -XX:+UseG1GC
+ -XX:G1HeapRegionSize=32M
+ -XX:+UseGCOverheadLimit
+ -XX:+ExplicitGCInvokesConcurrent
+ -XX:+HeapDumpOnOutOfMemoryError
+ -XX:OnOutOfMemoryError=kill -9 %p
+ ```
+
+##### Contents of your log.properties file
+ ```
+ com.facebook.presto=INFO
+ ```
+
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`.
+
+## Coordinator Configurations
+
+ ##### Contents of your config.properties
+ ```
+ coordinator=true
+ node-scheduler.include-coordinator=false
+ http-server.http.port=8080
+ query.max-memory=50GB
+ query.max-memory-per-node=2GB
+ discovery-server.enabled=true
+ discovery.uri=<coordinator_ip>:8080
+ ```
+The options `node-scheduler.include-coordinator=false` and `coordinator=true` indicate that the node is the coordinator and tells the coordinator not to do any of the computation work itself and to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the JVM config max memory, though if your workload is highly concurrent, you may want to use a lower value for `query.max-memory-per-node`.
+
+Also relation between below two configuration-properties should be like:
+If, `query.max-memory-per-node=30GB`
+Then, `query.max-memory=<30GB * number of nodes>`
+
+## Worker Configurations
+
+##### Contents of your config.properties
+
+ ```
+ coordinator=false
+ http-server.http.port=8080
+ query.max-memory=50GB
+ query.max-memory-per-node=2GB
+ discovery.uri=<coordinator_ip>:8080
+ ```
+
+**Note**: `jvm.config` and `node.properties` files are same for all the nodes (worker + coordinator). All the nodes should have different `node.id`
+
+## Catalog Configurations
+
+1. Create a folder named `catalog` in etc directory of presto on all the nodes of the cluster including the coordinator.
+
+##### Configuring Carbondata in Presto
+1. Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes.
+
+## Add Plugins
+
+1. Create a directory named `carbondata` in plugin directory of presto
+2. Copy `carbondata` jars to `plugin/carbondata` directory on all nodes
+
+## Start Presto Server on all nodes
+
+```
+./presto-server-0.187/bin/launcher start
+```
+To run it as a background process.
+
+```
+./presto-server-0.187/bin/launcher run
+```
+To run it in foreground.
+
+## Start Presto CLI
+```
+./presto
+```
+To connect to carbondata catalog use the following command:
+
+```
+./presto --server <coordinator_ip>:8080 --catalog carbondata --schema <schema_name>
+```
+Execute the following command to ensure the workers are connected
--- End diff --
Add a colon (:) at the end of the sentence.
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2199
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4059/
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2199
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5860/
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183030442
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+ * Download the 0.187 version of presto using:
+
+ ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+ * Extract presto tar file
+ ``tar zxvf presto-server-0.187.tar.gz``
+
+ * Download the presto CLI for the coordinator and name it presto.
+
+ ```
+ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+ mv presto-cli-0.187-executable.jar presto
+
+ chmod +x presto
+ ```
+
+ ### Create configuration Files
+
+ * Create etc folder in presto-server-0.187 directory.
+ * Create config.properties, jvm.config, log.properties, and node.properties files.
+ * Install uuid to generate a node.id
+
+ ```
+ sudo apt-get install uuid
+
+ uuid
+ ```
+
+
+##### Contents of your node.properties file
+
+ ```
+ node.environment=production
+ node.id=<generated uuid>
+ node.data-dir=/home/ubuntu/data
+ ```
+
+##### Contents of your jvm.config file
+
+ ```
+ -server
+ -Xmx16G
+ -XX:+UseG1GC
+ -XX:G1HeapRegionSize=32M
+ -XX:+UseGCOverheadLimit
+ -XX:+ExplicitGCInvokesConcurrent
+ -XX:+HeapDumpOnOutOfMemoryError
+ -XX:OnOutOfMemoryError=kill -9 %p
+ ```
+
+##### Contents of your log.properties file
+ ```
+ com.facebook.presto=INFO
+ ```
+
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`.
+
+### Coordinator Configurations
--- End diff --
Make this a Heading 2 (##).
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183031033
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+ * Download the 0.187 version of presto using:
+
+ ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+ * Extract presto tar file
+ ``tar zxvf presto-server-0.187.tar.gz``
+
+ * Download the presto CLI for the coordinator and name it presto.
+
+ ```
+ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+ mv presto-cli-0.187-executable.jar presto
+
+ chmod +x presto
+ ```
+
+ ### Create configuration Files
+
+ * Create etc folder in presto-server-0.187 directory.
+ * Create config.properties, jvm.config, log.properties, and node.properties files.
+ * Install uuid to generate a node.id
+
+ ```
+ sudo apt-get install uuid
+
+ uuid
+ ```
+
+
+##### Contents of your node.properties file
+
+ ```
+ node.environment=production
+ node.id=<generated uuid>
+ node.data-dir=/home/ubuntu/data
+ ```
+
+##### Contents of your jvm.config file
+
+ ```
+ -server
+ -Xmx16G
+ -XX:+UseG1GC
+ -XX:G1HeapRegionSize=32M
+ -XX:+UseGCOverheadLimit
+ -XX:+ExplicitGCInvokesConcurrent
+ -XX:+HeapDumpOnOutOfMemoryError
+ -XX:OnOutOfMemoryError=kill -9 %p
+ ```
+
+##### Contents of your log.properties file
+ ```
+ com.facebook.presto=INFO
+ ```
+
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`.
+
+### Coordinator Configurations
+
+ ##### Contents of your config.properties
+```
+coordinator=true
+node-scheduler.include-coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery-server.enabled=true
+discovery.uri=<coordinator_ip>:8080
+```
+The options `node-scheduler.include-coordinator=false` and `coordinator=true` indicate that the node is the coordinator and tells the coordinator not to do any of the computation work itself and to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the JVM config max memory, though if your workload is highly concurrent, you may want to use a lower value for `query.max-memory-per-node`.
+
+Also relation between below two configuration-properties should be like:
+If, `query.max-memory-per-node=30GB`
+Then, `query.max-memory=<30GB * number of nodes>`
+
+### Worker Configurations
+
+##### Contents of your config.properties
+
+```
+coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery.uri=<coordinator_ip>:8080
+```
+
+**Note**: `jvm.config`, `node.properties` file is same for all the nodes (worker + coordinator). All the nodes should have different `node.id`
+
+### Catalog Configurations
+
+Create a folder named `catalog` in etc directory of presto on all the nodes of the cluster including the coordinator.
+
+##### Configuring Carbondata in Presto
+* Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes.
+
+### Add Plugins
+
+* Create a directory named `carbondata` in plugin directory of presto
+* Copy `carbondata` jars to `plugin/carbondata` directory on all nodes
+
+### Start Presto Server on all nodes
+
+```
+./presto-server-0.187/bin/launcher start
+```
+To run it as a background process.
+
+```
+./presto-server-0.187/bin/launcher run
+```
+To run it in foreground.
+
+### Start presto CLI
+```
+./presto
+```
+To connect to carbondata catalog use the following command:
--- End diff --
To connect to carbondata catalog, use the following command:
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183027350
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
--- End diff --
We can make it heading 2 (##) and change the heading to "Installing Presto"
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/carbondata/pull/2199
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by geetikagupta16 <gi...@git.apache.org>.
Github user geetikagupta16 commented on the issue:
https://github.com/apache/carbondata/pull/2199
@chenliang613 Please review this PR
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2199
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5322/
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by geetikagupta16 <gi...@git.apache.org>.
Github user geetikagupta16 commented on the issue:
https://github.com/apache/carbondata/pull/2199
@chenliang613 I have made the required changes. Please check
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183030674
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+ * Download the 0.187 version of presto using:
+
+ ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+ * Extract presto tar file
+ ``tar zxvf presto-server-0.187.tar.gz``
+
+ * Download the presto CLI for the coordinator and name it presto.
+
+ ```
+ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+ mv presto-cli-0.187-executable.jar presto
+
+ chmod +x presto
+ ```
+
+ ### Create configuration Files
+
+ * Create etc folder in presto-server-0.187 directory.
+ * Create config.properties, jvm.config, log.properties, and node.properties files.
+ * Install uuid to generate a node.id
+
+ ```
+ sudo apt-get install uuid
+
+ uuid
+ ```
+
+
+##### Contents of your node.properties file
+
+ ```
+ node.environment=production
+ node.id=<generated uuid>
+ node.data-dir=/home/ubuntu/data
+ ```
+
+##### Contents of your jvm.config file
+
+ ```
+ -server
+ -Xmx16G
+ -XX:+UseG1GC
+ -XX:G1HeapRegionSize=32M
+ -XX:+UseGCOverheadLimit
+ -XX:+ExplicitGCInvokesConcurrent
+ -XX:+HeapDumpOnOutOfMemoryError
+ -XX:OnOutOfMemoryError=kill -9 %p
+ ```
+
+##### Contents of your log.properties file
+ ```
+ com.facebook.presto=INFO
+ ```
+
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`.
+
+### Coordinator Configurations
+
+ ##### Contents of your config.properties
+```
+coordinator=true
+node-scheduler.include-coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery-server.enabled=true
+discovery.uri=<coordinator_ip>:8080
+```
+The options `node-scheduler.include-coordinator=false` and `coordinator=true` indicate that the node is the coordinator and tells the coordinator not to do any of the computation work itself and to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the JVM config max memory, though if your workload is highly concurrent, you may want to use a lower value for `query.max-memory-per-node`.
+
+Also relation between below two configuration-properties should be like:
+If, `query.max-memory-per-node=30GB`
+Then, `query.max-memory=<30GB * number of nodes>`
+
+### Worker Configurations
+
+##### Contents of your config.properties
+
+```
+coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery.uri=<coordinator_ip>:8080
+```
+
+**Note**: `jvm.config`, `node.properties` file is same for all the nodes (worker + coordinator). All the nodes should have different `node.id`
--- End diff --
The `jvm.config` and `node.properties` files are the same for all the nodes (worker + coordinator), except that every node should have a different `node.id`.
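
To illustrate the point about `node.id`, here is a rough sketch of generating `node.properties` per node and checking that the ids are distinct. The function names `node_properties` and `node_ids_unique` are hypothetical helpers, and the id is passed in by the caller rather than generated inside the function.

```shell
# Print the contents of node.properties for one node.
# Arguments: <node_id> <data_dir>; both are supplied by the caller.
node_properties() {
  local node_id="$1" data_dir="$2"
  printf 'node.environment=production\n'
  printf 'node.id=%s\n' "$node_id"
  printf 'node.data-dir=%s\n' "$data_dir"
}

# Check that a list of node ids contains no duplicates.
# Returns 0 when all ids are distinct, 1 otherwise.
node_ids_unique() {
  local dupes
  dupes=$(printf '%s\n' "$@" | sort | uniq -d)
  [ -z "$dupes" ]
}
```

On each node this could be used as `node_properties "$(uuid)" /home/ubuntu/data > presto-server-0.187/etc/node.properties`, with `uuid` being the tool installed in the quoted steps.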
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r187832216
--- Diff: integration/presto/Presto_Cluster_Setup_For_Carbondata.md ---
@@ -0,0 +1,133 @@
+# Presto Multinode Cluster setup For Carbondata
+
+## Installing Presto
+
+ 1. Download the 0.187 version of Presto using:
+ `wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz`
+
+ 2. Extract Presto tar file: `tar zxvf presto-server-0.187.tar.gz`.
+
+ 3. Download the Presto CLI for the coordinator and name it presto.
+
+ ```
+ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+ mv presto-cli-0.187-executable.jar presto
+
+ chmod +x presto
+ ```
+
+ ## Create Configuration Files
+
+ 1. Create `etc` folder in presto-server-0.187 directory.
+ 2. Create `config.properties`, `jvm.config`, `log.properties`, and `node.properties` files.
+ 3. Install uuid to generate a node.id.
+
+ ```
+ sudo apt-get install uuid
+
+ uuid
+ ```
+
+
+##### Contents of your node.properties file
+
+ ```
+ node.environment=production
+ node.id=<generated uuid>
+ node.data-dir=/home/ubuntu/data
+ ```
+
+##### Contents of your jvm.config file
+
+ ```
+ -server
+ -Xmx16G
+ -XX:+UseG1GC
+ -XX:G1HeapRegionSize=32M
+ -XX:+UseGCOverheadLimit
+ -XX:+ExplicitGCInvokesConcurrent
+ -XX:+HeapDumpOnOutOfMemoryError
+ -XX:OnOutOfMemoryError=kill -9 %p
+ ```
+
+##### Contents of your log.properties file
+ ```
+ com.facebook.presto=INFO
+ ```
+
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`.
+
+## Coordinator Configurations
+
+ ##### Contents of your config.properties
+ ```
+ coordinator=true
+ node-scheduler.include-coordinator=false
+ http-server.http.port=8080
--- End diff --
For the example port here, I suggest not using 8080, because it easily conflicts with other services.
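
One way to make the port easy to change is to generate the coordinator `config.properties` from a parameter. The sketch below is only an illustration: `coordinator_config` is a hypothetical helper, port 8086 and the address 192.0.2.10 are arbitrary example values, and the `http://` prefix on `discovery.uri` follows the form used in the Presto deployment documentation (the quoted diff omits it).

```shell
# Print a coordinator config.properties using a caller-chosen HTTP port.
# Arguments: <coordinator_host> <port>
coordinator_config() {
  local host="$1" port="$2"
  cat <<EOF
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=${port}
query.max-memory=50GB
query.max-memory-per-node=2GB
discovery-server.enabled=true
discovery.uri=http://${host}:${port}
EOF
}

# Example values only: 192.0.2.10 is a documentation address, 8086 an arbitrary port.
coordinator_config 192.0.2.10 8086
```

The workers would then need the same host and port in their own `discovery.uri`.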
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by geetikagupta16 <gi...@git.apache.org>.
Github user geetikagupta16 commented on the issue:
https://github.com/apache/carbondata/pull/2199
@sgururajshetty @chenliang613. Please review
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2199
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4903/
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183030962
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+ * Download the 0.187 version of presto using:
+
+ ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+ * Extract presto tar file
+ ``tar zxvf presto-server-0.187.tar.gz``
+
+ * Download the presto CLI for the coordinator and name it presto.
+
+ ```
+ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+ mv presto-cli-0.187-executable.jar presto
+
+ chmod +x presto
+ ```
+
+ ### Create configuration Files
+
+ * Create etc folder in presto-server-0.187 directory.
+ * Create config.properties, jvm.config, log.properties, and node.properties files.
+ * Install uuid to generate a node.id
+
+ ```
+ sudo apt-get install uuid
+
+ uuid
+ ```
+
+
+##### Contents of your node.properties file
+
+ ```
+ node.environment=production
+ node.id=<generated uuid>
+ node.data-dir=/home/ubuntu/data
+ ```
+
+##### Contents of your jvm.config file
+
+ ```
+ -server
+ -Xmx16G
+ -XX:+UseG1GC
+ -XX:G1HeapRegionSize=32M
+ -XX:+UseGCOverheadLimit
+ -XX:+ExplicitGCInvokesConcurrent
+ -XX:+HeapDumpOnOutOfMemoryError
+ -XX:OnOutOfMemoryError=kill -9 %p
+ ```
+
+##### Contents of your log.properties file
+ ```
+ com.facebook.presto=INFO
+ ```
+
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`.
+
+### Coordinator Configurations
+
+ ##### Contents of your config.properties
+```
+coordinator=true
+node-scheduler.include-coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery-server.enabled=true
+discovery.uri=<coordinator_ip>:8080
+```
+The options `node-scheduler.include-coordinator=false` and `coordinator=true` indicate that the node is the coordinator and tells the coordinator not to do any of the computation work itself and to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the JVM config max memory, though if your workload is highly concurrent, you may want to use a lower value for `query.max-memory-per-node`.
+
+Also relation between below two configuration-properties should be like:
+If, `query.max-memory-per-node=30GB`
+Then, `query.max-memory=<30GB * number of nodes>`
+
+### Worker Configurations
+
+##### Contents of your config.properties
+
+```
+coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery.uri=<coordinator_ip>:8080
+```
+
+**Note**: `jvm.config`, `node.properties` file is same for all the nodes (worker + coordinator). All the nodes should have different `node.id`
+
+### Catalog Configurations
+
+Create a folder named `catalog` in etc directory of presto on all the nodes of the cluster including the coordinator.
+
+##### Configuring Carbondata in Presto
+* Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes.
+
+### Add Plugins
+
+* Create a directory named `carbondata` in plugin directory of presto
--- End diff --
This is a procedure, so change it to numbered steps.
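The memory guidance in the quoted diff (per-node limit at half of the JVM maximum, cluster limit equal to the per-node limit times the number of nodes) can be sketched as shell arithmetic. The function names and the node count below are hypothetical and only illustrate the rule; they are not Presto settings.

```shell
#!/bin/sh
# Hypothetical helpers applying the sizing rule from the quoted note.
# All values are whole gigabytes.

# query.max-memory-per-node: half of the JVM -Xmx value.
per_node_memory_gb() {
    echo $(( $1 / 2 ))
}

# query.max-memory: per-node limit multiplied by the number of nodes.
cluster_memory_gb() {
    echo $(( $1 * $2 ))
}

xmx_gb=16   # matches -Xmx16G in the quoted jvm.config
nodes=3     # assumed cluster size, for illustration only

per_node=$(per_node_memory_gb "$xmx_gb")
total=$(cluster_memory_gb "$per_node" "$nodes")

echo "query.max-memory-per-node=${per_node}GB"
echo "query.max-memory=${total}GB"
```

As the note says, a highly concurrent workload may call for a per-node value lower than this rule produces.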
---
[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2199#discussion_r183028525
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+ * Download the 0.187 version of presto using:
+
+ ``wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+ * Extract presto tar file
+ ``tar zxvf presto-server-0.187.tar.gz``
+
+ * Download the presto CLI for the coordinator and name it presto.
+
+ ```
+ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+ mv presto-cli-0.187-executable.jar presto
+
+ chmod +x presto
+ ```
+
+ ### Create configuration Files
--- End diff --
Use Heading 2 (##) and make it title case.
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on the issue:
https://github.com/apache/carbondata/pull/2199
LGTM
---
[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...
Posted by geetikagupta16 <gi...@git.apache.org>.
Github user geetikagupta16 commented on the issue:
https://github.com/apache/carbondata/pull/2199
@sgururajshetty I have made the required changes. Please review.
---