You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@drill.apache.org by ts...@apache.org on 2015/05/12 07:56:58 UTC

[16/25] drill git commit: html > markdown table format

html > markdown table format


Project: http://git-wip-us.apache.org/repos/asf/drill/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill/commit/598be281
Tree: http://git-wip-us.apache.org/repos/asf/drill/tree/598be281
Diff: http://git-wip-us.apache.org/repos/asf/drill/diff/598be281

Branch: refs/heads/gh-pages
Commit: 598be2810054cb608cd4854c132f9df6165c6b3c
Parents: 7a40745
Author: Kristine Hahn <kh...@maprtech.com>
Authored: Sat May 9 06:42:11 2015 -0700
Committer: Kristine Hahn <kh...@maprtech.com>
Committed: Sat May 9 06:42:11 2015 -0700

----------------------------------------------------------------------
 .../010-configuration-options-introduction.md   | 349 +++----------------
 1 file changed, 58 insertions(+), 291 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/drill/blob/598be281/_docs/configure-drill/configuration-options/010-configuration-options-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/configuration-options/010-configuration-options-introduction.md b/_docs/configure-drill/configuration-options/010-configuration-options-introduction.md
index 8cfadf8..8443036 100644
--- a/_docs/configure-drill/configuration-options/010-configuration-options-introduction.md
+++ b/_docs/configure-drill/configuration-options/010-configuration-options-introduction.md
@@ -14,297 +14,64 @@ The sys.options table in Drill contains information about boot (start-up) and sy
 ## System Options
 The sys.options table lists the following options that you can set at the session or system level as described in the section, ["Planning and Execution Options"]({{site.baseurl}}/docs/planning-and-execution-options) 
 
-<table>
-  <tr>
-    <th>Name</th>
-    <th>Default</th>
-    <th>Comments</th>
-  </tr>
-  <tr>
-    <td>drill.exec.functions.cast_empty_string_to_null</td>
-    <td>FALSE</td>
-    <td>Not supported in this release.</td>
-  </tr>
-  <tr>
-    <td>drill.exec.storage.file.partition.column.label</td>
-    <td>dir</td>
-    <td>Accepts a string input.</td>
-  </tr>
-  <tr>
-    <td>exec.errors.verbose</td>
-    <td>FALSE</td>
-    <td>Toggles verbose output of executable error messages</td>
-  </tr>
-  <tr>
-    <td>exec.java_compiler</td>
-    <td>DEFAULT</td>
-    <td>Switches between DEFAULT, JDK, and JANINO mode for the current session. Uses Janino by default for generated source code of less than exec.java_compiler_janino_maxsize; otherwise, switches to the JDK compiler.</td>
-  </tr>
-  <tr>
-    <td>exec.java_compiler_debug</td>
-    <td>TRUE</td>
-    <td>Toggles the output of debug-level compiler error messages in runtime generated code.</td>
-  </tr>
-  <tr>
-    <td>exec.java_compiler_janino_maxsize</td>
-    <td>262144</td>
-    <td>See the exec.java_compiler option comment. Accepts inputs of type LONG.</td>
-  </tr>
-  <tr>
-    <td>exec.max_hash_table_size</td>
-    <td>1073741824</td>
-    <td>Ending size for hash tables. Range: 0 - 1073741824</td>
-  </tr>
-  <tr>
-    <td>exec.min_hash_table_size</td>
-    <td>65536</td>
-    <td>Starting size for hash tables. Increase according to available memory to improve performance. Range: 0 - 1073741824</td>
-  </tr>
-  <tr>
-    <td>exec.queue.enable</td>
-    <td>FALSE</td>
-    <td>Changes the state of query queues to control the number of queries that run simultaneously.</td>
-  </tr>
-  <tr>
-    <td>exec.queue.large</td>
-    <td>10</td>
-    <td>Sets the number of large queries that can run concurrently in the cluster.
-Range: 0-1000</td>
-  </tr>
-  <tr>
-    <td>exec.queue.small</td>
-    <td>100</td>
-    <td>Sets the number of small queries that can run concurrently in the cluster. Range: 0-1001</td>
-  </tr>
-  <tr>
-    <td>exec.queue.threshold</td>
-    <td>30000000</td>
-    <td>Sets the cost threshold, which depends on the complexity of the queries in queue, for determining whether query is large or small. Complex queries have higher thresholds. Range: 0-9223372036854775807</td>
-  </tr>
-  <tr>
-    <td>exec.queue.timeout_millis</td>
-    <td>300000</td>
-    <td>Indicates how long a query can wait in queue before the query fails. Range: 0-9223372036854775807</td>
-  </tr>
-  <tr>
-    <td>planner.add_producer_consumer</td>
-    <td>FALSE</td>
-    <td>Increase prefetching of data from disk. Disable for in-memory reads.</td>
-  </tr>
-  <tr>
-    <td>planner.affinity_factor</td>
-    <td>1.2</td>
-    <td>Accepts inputs of type DOUBLE.</td>
-  </tr>
-  <tr>
-    <td>planner.broadcast_factor</td>
-    <td>1</td>
-    <td></td>
-  </tr>
-  <tr>
-    <td>planner.broadcast_threshold</td>
-    <td>10000000</td>
-    <td>The maximum number of records allowed to be broadcast as part of a query. After one million records, Drill reshuffles data rather than doing a broadcast to one side of the join. Range: 0-2147483647</td>
-  </tr>
-  <tr>
-    <td>planner.disable_exchanges</td>
-    <td>FALSE</td>
-    <td>Toggles the state of hashing to a random exchange.</td>
-  </tr>
-  
-  <tr>
-    <td>planner.enable_broadcast_join</td>
-    <td>TRUE</td>
-    <td>Changes the state of aggregation and join operators. Do not disable.</td>
-  </tr>
-  <tr>
-    <td>planner.enable_constant_folding</td>
-    <td>TRUE</td>
-    <td></td>
-  </tr>
-  <tr>
-    <td>planner.enable_demux_exchange</td>
-    <td>FALSE</td>
-    <td>Toggles the state of hashing to a demulitplexed exchange.</td>
-  </tr>
-  <tr>
-    <td>planner.enable_hash_single_key</td>
-    <td>TRUE</td>
-    <td></td>
-  </tr>
-  <tr>
-    <td>planner.enable_hashagg</td>
-    <td>TRUE</td>
-    <td>Enable hash aggregation; otherwise, Drill does a sort-based aggregation. Does not write to disk. Enable is recommended.</td>
-  </tr>
-  <tr>
-    <td>planner.enable_hashjoin</td>
-    <td>TRUE</td>
-    <td>Enable the memory hungry hash join. Drill assumes that a query with have adequate memory to complete and tries to use the fastest operations possible to complete the planned inner, left, right, or full outer joins using a hash table. Does not write to disk. Disabling hash join allows Drill to manage arbitrarily large data in a small memory footprint.</td>
-  </tr>
-  <tr>
-    <td>planner.enable_hashjoin_swap</td>
-    <td>TRUE</td>
-    <td></td>
-  </tr>
-  <tr>
-    <td>planner.enable_mergejoin</td>
-    <td>TRUE</td>
-    <td>Sort-based operation. A merge join is used for inner join, left and right outer joins.  Inputs to the merge join must be sorted. It reads the sorted input streams from both sides and finds matching rows. Writes to disk.</td>
-  </tr>
-  <tr>
-    <td>planner.enable_multiphase_agg</td>
-    <td>TRUE</td>
-    <td>Each minor fragment does a local aggregation in phase 1, distributes on a hash basis using GROUP-BY keys partially aggregated results to other fragments, and all the fragments perform a total aggregation using this data.  
- </td>
-  </tr>
-  <tr>
-    <td>planner.enable_mux_exchange</td>
-    <td>TRUE</td>
-    <td>Toggles the state of hashing to a multiplexed exchange.</td>
-  </tr>
-  <tr>
-    <td>planner.enable_nestedloopjoin</td>
-    <td>TRUE</td>
-    <td>Sort-based operation. Writes to disk.</td>
-  </tr>
-  <tr>
-    <td>planner.enable_nljoin_for_scalar_only</td>
-    <td>TRUE</td>
-    <td></td>
-  </tr>
-  <tr>
-    <td>planner.enable_streamagg</td>
-    <td>TRUE</td>
-    <td>Sort-based operation. Writes to disk.</td>
-  </tr>
-  <tr>
-    <td>planner.identifier_max_length</td>
-    <td>1024</td>
-    <td></td>
-  </tr>
-  <tr>
-    <td>planner.join.hash_join_swap_margin_factor</td>
-    <td>10</td>
-    <td></td>
-  </tr>
-  <tr>
-    <td>planner.join.row_count_estimate_factor</td>
-    <td>1</td>
-    <td></td>
-  </tr>
-  <tr>
-    <td>planner.memory.average_field_width</td>
-    <td>8</td>
-    <td></td>
-  </tr>
-  <tr>
-    <td>planner.memory.enable_memory_estimation</td>
-    <td>FALSE</td>
-    <td>Toggles the state of memory estimation and re-planning of the query.
-When enabled, Drill conservatively estimates memory requirements and typically excludes these operators from the plan and negatively impacts performance.</td>
-  </tr>
-  <tr>
-    <td>planner.memory.hash_agg_table_factor</td>
-    <td>1.1</td>
-    <td></td>
-  </tr>
-  <tr>
-    <td>planner.memory.hash_join_table_factor</td>
-    <td>1.1</td>
-    <td></td>
-  </tr>
-  <tr>
-    <td>planner.memory.max_query_memory_per_node</td>
-    <td>2147483648</td>
-    <td>Sets the maximum estimate of memory for a query per node. If the estimate is too low, Drill re-plans the query without memory-constrained operators.</td>
-  </tr>
-  <tr>
-    <td>planner.memory.non_blocking_operators_memory</td>
-    <td>64</td>
-    <td>Range: 0-2048</td>
-  </tr>
-  <tr>
-    <td>planner.nestedloopjoin_factor</td>
-    <td>100</td>
-    <td></td>
-  </tr>
-    <tr>
-    <td>planner.partitioner_sender_max_threads</td>
-    <td>8</td>
-    <td>Upper limit of threads for outbound queuing.</td>
-  </tr>
-  <tr>
-    <td>planner.partitioner_sender_set_threads</td>
-    <td>-1</td>
-    <td></td>
-  </tr>
-  <tr>
-    <td>planner.partitioner_sender_threads_factor</td>
-    <td>1</td>
-    <td></td>
-  </tr>
-  <tr>
-    <td>planner.producer_consumer_queue_size</td>
-    <td>10</td>
-    <td>How much data to prefetch from disk (in record batches) out of band of query execution</td>
-  </tr>
-  <tr>
-    <td>planner.slice_target</td>
-    <td>100000</td>
-    <td>The number of records manipulated within a fragment before Drill parallelizes operations.</td>
-  </tr>
-  <tr>
-    <td>planner.width.max_per_node</td>
-    <td>3</td>
-    <td>Maximum number of threads that can run in parallel for a query on a node. A slice is an individual thread. This number indicates the maximum number of slices per query for the query’s major fragment on a node.</td>
-  </tr>
-  <tr>
-    <td>planner.width.max_per_query</td>
-    <td>1000</td>
-    <td>Same as max per node but applies to the query as executed by the entire cluster.</td>
-  </tr>
-  <tr>
-    <td>store.format</td>
-    <td>parquet</td>
-    <td>Output format for data written to tables with the CREATE TABLE AS (CTAS) command. Allowed values are parquet, json, or text. Allowed values: 0, -1, 1000000</td>
-  </tr>
-  <tr>
-    <td>store.json.all_text_mode</a></td>
-    <td>FALSE</td>
-    <td>Drill reads all data from the JSON files as VARCHAR. Prevents schema change errors.</td>
-  </tr>
-  <tr>
-    <td>store.mongo.all_text_mode</td>
-    <td>FALSE</td>
-    <td>Similar to store.json.all_text_mode for MongoDB.</td>
-  </tr>
-  <tr>
-    <td>store.parquet.block-size</a></td>
-    <td>536870912</td>
-    <td>Sets the size of a Parquet row group to the number of bytes less than or equal to the block size of MFS, HDFS, or the file system.</td>
-  </tr>
-  <tr>
-    <td>store.parquet.compression</td>
-    <td>snappy</td>
-    <td>Compression type for storing Parquet output. Allowed values: snappy, gzip, none</td>
-  </tr>
-  <tr>
-    <td>store.parquet.enable_dictionary_encoding*</td>
-    <td>FALSE</td>
-    <td>Do not change.</td>
-  </tr>
-  <tr>
-    <td>store.parquet.use_new_reader</td>
-    <td>FALSE</td>
-    <td>Not supported</td>
-  </tr>
-  <tr>
-    <td>window.enable*</td>
-    <td>FALSE</td>
-    <td>Coming soon.</td>
-  </tr>
-</table>
+| Name                                           | Default    | Comments                                                                                                                                                                                                                                                                                                                                                         |
+|------------------------------------------------|------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| drill.exec.functions.cast_empty_string_to_null | FALSE      | Not supported in this release.                                                                                                                                                                                                                                                                                                                                   |
+| drill.exec.storage.file.partition.column.label | dir        | Accepts a string input.                                                                                                                                                                                                                                                                                                                                          |
+| exec.errors.verbose                            | FALSE      | Toggles verbose output of executable error messages                                                                                                                                                                                                                                                                                                              |
+| exec.java_compiler                             | DEFAULT    | Switches between DEFAULT, JDK, and JANINO mode for the current session. Uses Janino by default for generated source code of less than exec.java_compiler_janino_maxsize; otherwise, switches to the JDK compiler.                                                                                                                                                |
+| exec.java_compiler_debug                       | TRUE       | Toggles the output of debug-level compiler error messages in runtime generated code.                                                                                                                                                                                                                                                                             |
+| exec.java_compiler_janino_maxsize              | 262144     | See the exec.java_compiler option comment. Accepts inputs of type LONG.                                                                                                                                                                                                                                                                                          |
+| exec.max_hash_table_size                       | 1073741824 | Ending size for hash tables. Range: 0 - 1073741824                                                                                                                                                                                                                                                                                                               |
+| exec.min_hash_table_size                       | 65536      | Starting size for hash tables. Increase according to available memory to improve performance. Range: 0 - 1073741824                                                                                                                                                                                                                                              |
+| exec.queue.enable                              | FALSE      | Changes the state of query queues to control the number of queries that run simultaneously.                                                                                                                                                                                                                                                                      |
+| exec.queue.large                               | 10         | Sets the number of large queries that can run concurrently in the cluster. Range: 0-1000                                                                                                                                                                                                                                                                         |
+| exec.queue.small                               | 100        | Sets the number of small queries that can run concurrently in the cluster. Range: 0-1001                                                                                                                                                                                                                                                                         |
+| exec.queue.threshold                           | 30000000   | Sets the cost threshold, which depends on the complexity of the queries in queue, for determining whether query is large or small. Complex queries have higher thresholds. Range: 0-9223372036854775807                                                                                                                                                          |
+| exec.queue.timeout_millis                      | 300000     | Indicates how long a query can wait in queue before the query fails. Range: 0-9223372036854775807                                                                                                                                                                                                                                                                |
+| planner.add_producer_consumer                  | FALSE      | Increase prefetching of data from disk. Disable for in-memory reads.                                                                                                                                                                                                                                                                                             |
+| planner.affinity_factor                        | 1.2        | Accepts inputs of type DOUBLE.                                                                                                                                                                                                                                                                                                                                   |
+| planner.broadcast_factor                       | 1          |                                                                                                                                                                                                                                                                                                                                                                  |
+| planner.broadcast_threshold                    | 10000000   | The maximum number of records allowed to be broadcast as part of a query. After one million records, Drill reshuffles data rather than doing a broadcast to one side of the join. Range: 0-2147483647                                                                                                                                                            |
+| planner.disable_exchanges                      | FALSE      | Toggles the state of hashing to a random exchange.                                                                                                                                                                                                                                                                                                               |
+| planner.enable_broadcast_join                  | TRUE       | Changes the state of aggregation and join operators. Do not disable.                                                                                                                                                                                                                                                                                             |
+| planner.enable_constant_folding                | TRUE       |                                                                                                                                                                                                                                                                                                                                                                  |
+| planner.enable_demux_exchange                  | FALSE      | Toggles the state of hashing to a demulitplexed exchange.                                                                                                                                                                                                                                                                                                        |
+| planner.enable_hash_single_key                 | TRUE       |                                                                                                                                                                                                                                                                                                                                                                  |
+| planner.enable_hashagg                         | TRUE       | Enable hash aggregation; otherwise, Drill does a sort-based aggregation. Does not write to disk. Enable is recommended.                                                                                                                                                                                                                                          |
+| planner.enable_hashjoin                        | TRUE       | Enable the memory hungry hash join. Drill assumes that a query with have adequate memory to complete and tries to use the fastest operations possible to complete the planned inner, left, right, or full outer joins using a hash table. Does not write to disk. Disabling hash join allows Drill to manage arbitrarily large data in a small memory footprint. |
+| planner.enable_hashjoin_swap                   | TRUE       |                                                                                                                                                                                                                                                                                                                                                                  |
+| planner.enable_mergejoin                       | TRUE       | Sort-based operation. A merge join is used for inner join, left and right outer joins. Inputs to the merge join must be sorted. It reads the sorted input streams from both sides and finds matching rows. Writes to disk.                                                                                                                                       |
+| planner.enable_multiphase_agg                  | TRUE       | Each minor fragment does a local aggregation in phase 1, distributes on a hash basis using GROUP-BY keys partially aggregated results to other fragments, and all the fragments perform a total aggregation using this data.                                                                                                                                     |
+| planner.enable_mux_exchange                    | TRUE       | Toggles the state of hashing to a multiplexed exchange.                                                                                                                                                                                                                                                                                                          |
+| planner.enable_nestedloopjoin                  | TRUE       | Sort-based operation. Writes to disk.                                                                                                                                                                                                                                                                                                                            |
+| planner.enable_nljoin_for_scalar_only          | TRUE       |                                                                                                                                                                                                                                                                                                                                                                  |
+| planner.enable_streamagg                       | TRUE       | Sort-based operation. Writes to disk.                                                                                                                                                                                                                                                                                                                            |
+| planner.identifier_max_length                  | 1024       |                                                                                                                                                                                                                                                                                                                                                                  |
+| planner.join.hash_join_swap_margin_factor      | 10         |                                                                                                                                                                                                                                                                                                                                                                  |
+| planner.join.row_count_estimate_factor         | 1          |                                                                                                                                                                                                                                                                                                                                                                  |
+| planner.memory.average_field_width             | 8          |                                                                                                                                                                                                                                                                                                                                                                  |
+| planner.memory.enable_memory_estimation        | FALSE      | Toggles the state of memory estimation and re-planning of the query. When enabled, Drill conservatively estimates memory requirements and typically excludes these operators from the plan and negatively impacts performance.                                                                                                                                   |
+| planner.memory.hash_agg_table_factor           | 1.1        |                                                                                                                                                                                                                                                                                                                                                                  |
+| planner.memory.hash_join_table_factor          | 1.1        |                                                                                                                                                                                                                                                                                                                                                                  |
+| planner.memory.max_query_memory_per_node       | 2147483648 | Sets the maximum estimate of memory for a query per node. If the estimate is too low, Drill re-plans the query without memory-constrained operators.                                                                                                                                                                                                             |
+| planner.memory.non_blocking_operators_memory   | 64         | Range: 0-2048                                                                                                                                                                                                                                                                                                                                                    |
+| planner.nestedloopjoin_factor                  | 100        |                                                                                                                                                                                                                                                                                                                                                                  |
+| planner.partitioner_sender_max_threads         | 8          | Upper limit of threads for outbound queuing.                                                                                                                                                                                                                                                                                                                     |
+| planner.partitioner_sender_set_threads         | -1         |                                                                                                                                                                                                                                                                                                                                                                  |
+| planner.partitioner_sender_threads_factor      | 1          |                                                                                                                                                                                                                                                                                                                                                                  |
+| planner.producer_consumer_queue_size           | 10         | How much data to prefetch from disk (in record batches) out of band of query execution                                                                                                                                                                                                                                                                           |
+| planner.slice_target                           | 100000     | The number of records manipulated within a fragment before Drill parallelizes operations.                                                                                                                                                                                                                                                                        |
+| planner.width.max_per_node                     | 3          | Maximum number of threads that can run in parallel for a query on a node. A slice is an individual thread. This number indicates the maximum number of slices per query for the query’s major fragment on a node.                                                                                                                                                |
+| planner.width.max_per_query                    | 1000       | Same as max per node but applies to the query as executed by the entire cluster.                                                                                                                                                                                                                                                                                 |
+| store.format                                   | parquet    | Output format for data written to tables with the CREATE TABLE AS (CTAS) command. Allowed values are parquet, json, or text. Allowed values: 0, -1, 1000000                                                                                                                                                                                                      |
+| store.json.all_text_mode                       | FALSE      | Drill reads all data from the JSON files as VARCHAR. Prevents schema change errors.                                                                                                                                                                                                                                                                              |
+| store.mongo.all_text_mode                      | FALSE      | Similar to store.json.all_text_mode for MongoDB.                                                                                                                                                                                                                                                                                                                 |
+| store.parquet.block-size                       | 536870912  | Sets the size of a Parquet row group to the number of bytes less than or equal to the block size of MFS, HDFS, or the file system.                                                                                                                                                                                                                               |
+| store.parquet.compression                      | snappy     | Compression type for storing Parquet output. Allowed values: snappy, gzip, none                                                                                                                                                                                                                                                                                  |
+| store.parquet.enable_dictionary_encoding*      | FALSE      | Do not change.                                                                                                                                                                                                                                                                                                                                                   |
+| store.parquet.use_new_reader                   | FALSE      | Not supported                                                                                                                                                                                                                                                                                                                                                    |
+| window.enable*                                 | FALSE      | Coming soon.                                                                                                                                                                                                                                                                                                                                                     |
 
 \* Not supported in this release.