Posted to commits@iceberg.apache.org by bl...@apache.org on 2021/01/29 01:20:08 UTC

[iceberg] branch asf-site updated: Deployed b4f73d2ba with MkDocs version: 1.0.4

This is an automated email from the ASF dual-hosted git repository.

blue pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/iceberg.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new f8f9494  Deployed b4f73d2ba with MkDocs version: 1.0.4
f8f9494 is described below

commit f8f9494280112b4df4028989c699da7aeb67e12d
Author: Ryan Blue <bl...@apache.org>
AuthorDate: Thu Jan 28 17:19:57 2021 -0800

    Deployed b4f73d2ba with MkDocs version: 1.0.4
---
 aws/index.html      |   1 +
 flink/index.html    |  10 ++++++++++
 releases/index.html |  14 +++++++++-----
 sitemap.xml.gz      | Bin 230 -> 230 bytes
 spark/index.html    |   3 +--
 5 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/aws/index.html b/aws/index.html
index 29fc397..fd5b537 100644
--- a/aws/index.html
+++ b/aws/index.html
@@ -468,6 +468,7 @@ spark-sql --packages $DEPENDENCIES \
 </code></pre>
 
 <p>In the shell command above, we use <code>--packages</code> to specify the additional AWS bundle and HTTP client dependencies, pinning their version to <code>2.15.40</code>.</p>
+<p>For integration with other engines such as Flink, refer to each engine&rsquo;s documentation page, which explains how to load a custom catalog.</p>
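+<p>As an illustrative sketch (not taken from this page), loading the Glue catalog in Flink SQL as a custom catalog might look like the following; the catalog name and warehouse path are placeholders:</p>
+<pre><code class="sql">CREATE CATALOG glue_catalog WITH (
+  'type'='iceberg',
+  'catalog-impl'='org.apache.iceberg.aws.glue.GlueCatalog',
+  'io-impl'='org.apache.iceberg.aws.s3.S3FileIO',
+  'warehouse'='s3://my-bucket/my/key/prefix'
+);
+</code></pre>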
 <h2 id="glue-catalog">Glue Catalog<a class="headerlink" href="#glue-catalog" title="Permanent link">&para;</a></h2>
 <p>Iceberg enables the use of <a href="https://aws.amazon.com/glue">AWS Glue</a> as the <code>Catalog</code> implementation.
 When used, an Iceberg namespace is stored as a <a href="https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-catalog-databases.html">Glue Database</a>, 
diff --git a/flink/index.html b/flink/index.html
index cc59ed1..7b24572 100644
--- a/flink/index.html
+++ b/flink/index.html
@@ -404,6 +404,7 @@
                 <li class="third-level"><a href="#hive-catalog">Hive catalog</a></li>
                 <li class="third-level"><a href="#hadoop-catalog">Hadoop catalog</a></li>
                 <li class="third-level"><a href="#custom-catalog">Custom catalog</a></li>
+                <li class="third-level"><a href="#create-through-yaml-config">Create through YAML config</a></li>
             <li class="second-level"><a href="#ddl-commands">DDL commands</a></li>
                 
                 <li class="third-level"><a href="#create-database">CREATE DATABASE</a></li>
@@ -625,6 +626,15 @@ When <code>catalog-impl</code> is set, the value of <code>catalog-type</code> is
 );
 </code></pre>
 
+<h3 id="create-through-yaml-config">Create through YAML config<a class="headerlink" href="#create-through-yaml-config" title="Permanent link">&para;</a></h3>
+<p>Catalogs can be registered in <code>sql-client-defaults.yaml</code> before starting the SQL client. Here is an example:</p>
+<pre><code class="yaml">catalogs: 
+  - name: my_catalog
+    type: iceberg
+    catalog-type: hadoop
+    warehouse: hdfs://nn:8020/warehouse/path
+</code></pre>
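+<p>Once the SQL client is started with this configuration, the registered catalog can be selected like any other Flink catalog (a usage sketch, reusing the <code>my_catalog</code> name from above):</p>
+<pre><code class="sql">USE CATALOG my_catalog;
+SHOW DATABASES;
+</code></pre>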
+
 <h2 id="ddl-commands">DDL commands<a class="headerlink" href="#ddl-commands" title="Permanent link">&para;</a></h2>
 <h3 id="create-database"><code>CREATE DATABASE</code><a class="headerlink" href="#create-database" title="Permanent link">&para;</a></h3>
 <p>By default, Iceberg will use the <code>default</code> database in Flink. Use the following example to create a separate database if you don&rsquo;t want to create tables under the <code>default</code> database:</p>
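 <p>A minimal sketch of such a statement, with the database name <code>iceberg_db</code> chosen arbitrarily:</p>
 <pre><code class="sql">CREATE DATABASE iceberg_db;
 USE iceberg_db;
 </code></pre>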
diff --git a/releases/index.html b/releases/index.html
index 35904f8..267015b 100644
--- a/releases/index.html
+++ b/releases/index.html
@@ -484,11 +484,13 @@
 </ul>
 <p>Important bug fixes:</p>
 <ul>
-<li><a href="https://github.com/apache/iceberg/pull/2091">#2091</a> fixes Parquet vectorized reads when column types are promoted</li>
-<li><a href="https://github.com/apache/iceberg/pull/1991">#1991</a> fixes Avro schema conversions to preserve field docs</li>
-<li><a href="https://github.com/apache/iceberg/pull/1981">#1981</a> fixes bug that date and timestamp transforms were producing incorrect values for negative dates and times</li>
-<li><a href="https://github.com/apache/iceberg/pull/1798">#1798</a> fixes read failure when encountering duplicate entries of data files</li>
-<li><a href="https://github.com/apache/iceberg/pull/1785">#1785</a> fixes invalidation of metadata tables in CachingCatalog</li>
+<li><a href="https://github.com/apache/iceberg/pull/1981">#1981</a> fixes bug that date and timestamp transforms were producing incorrect values for dates and times before 1970. Before the fix, negative values were incorrectly transformed by date and timestamp transforms to 1 larger than the correct value. For example, <code>day(1969-12-31 10:00:00)</code> produced 0 instead of -1. The fix is backwards compatible, which means predicate projection can still work with the incorrectly trans [...]
+<li><a href="https://github.com/apache/iceberg/pull/2091">#2091</a> fixes <code>ClassCastException</code> for type promotion <code>int</code> to <code>long</code> and <code>float</code> to <code>double</code> during Parquet vectorized read. Now Arrow vector is created by looking at Parquet file schema instead of Iceberg schema for <code>int</code> and <code>float</code> fields.</li>
+<li><a href="https://github.com/apache/iceberg/pull/1998">#1998</a> fixes bug in <code>HiveTableOperation</code> that <code>unlock</code> is not called if new metadata cannot be deleted. Now it is guaranteed that <code>unlock</code> is always called for Hive catalog users.</li>
+<li><a href="https://github.com/apache/iceberg/pull/1979">#1979</a> fixes table listing failure in Hadoop catalog when user does not have permission to some tables. Now the tables with no permission are ignored in listing.</li>
+<li><a href="https://github.com/apache/iceberg/pull/1798">#1798</a> fixes scan task failure when encountering duplicate entries of data files. Spark and Flink readers can now ignore duplicated entries in data files for each scan task.</li>
+<li><a href="https://github.com/apache/iceberg/pull/1785">#1785</a> fixes invalidation of metadata tables in <code>CachingCatalog</code>. When a table is dropped, all the metadata tables associated with it are also invalidated in the cache.</li>
+<li><a href="https://github.com/apache/iceberg/pull/1960">#1960</a> fixes bug that ORC writer does not read metrics config and always use the default. Now customized metrics config is respected.</li>
 </ul>
 <p>Other notable changes:</p>
 <ul>
@@ -497,8 +499,10 @@
 <li>Spark and Flink now support dynamically loading customized <code>Catalog</code> and <code>FileIO</code> implementations</li>
 <li>Spark 2 now supports loading tables from other catalogs, like Spark 3</li>
 <li>Spark 3 now supports catalog names in DataFrameReader when using Iceberg as a format</li>
+<li>Flink now uses the number of Iceberg read splits as its job parallelism, improving performance and saving resources.</li>
 <li>Hive (experimental) now supports INSERT INTO, case insensitive query, projection pushdown, create DDL with schema and auto type conversion</li>
 <li>ORC now supports reading tinyint, smallint, char, varchar types</li>
+<li>Avro to Iceberg schema conversion now preserves field docs</li>
 </ul>
 <h2 id="past-releases">Past releases<a class="headerlink" href="#past-releases" title="Permanent link">&para;</a></h2>
 <h3 id="0100">0.10.0<a class="headerlink" href="#0100" title="Permanent link">&para;</a></h3>
diff --git a/sitemap.xml.gz b/sitemap.xml.gz
index 5b765f1..c51d235 100644
Binary files a/sitemap.xml.gz and b/sitemap.xml.gz differ
diff --git a/spark/index.html b/spark/index.html
index 103da30..70837fc 100644
--- a/spark/index.html
+++ b/spark/index.html
@@ -490,7 +490,7 @@
 <td><a href="#alter-table"><code>ALTER TABLE</code></a></td>
 <td>✔️</td>
 <td></td>
-<td>⚠ requires extensions enabled to update partition field and sort order</td>
+<td>⚠ Requires <a href="../spark-configuration/#sql-extensions">SQL extensions</a> enabled to update partition field and sort order</td>
 </tr>
 <tr>
 <td><a href="#drop-table"><code>DROP TABLE</code></a></td>
@@ -560,7 +560,6 @@
 </tr>
 </tbody>
 </table>
-<p>To enable Iceberg SQL extensions, set Spark configuration <code>spark.sql.extensions</code> as <code>org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions</code>. </p>
 <h2 id="configuring-catalogs">Configuring catalogs<a class="headerlink" href="#configuring-catalogs" title="Permanent link">&para;</a></h2>
 <p>Spark 3.0 adds an API to plug in table catalogs that are used to load, create, and manage Iceberg tables. Spark catalogs are configured by setting <a href="../configuration/#catalogs">Spark properties</a> under <code>spark.sql.catalog</code>.</p>
 <p>This creates an Iceberg catalog named <code>hive_prod</code> that loads tables from a Hive metastore:</p>
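 <p>A sketch of that configuration, with a placeholder metastore URI:</p>
 <pre><code class="plain">spark.sql.catalog.hive_prod = org.apache.iceberg.spark.SparkCatalog
 spark.sql.catalog.hive_prod.type = hive
 spark.sql.catalog.hive_prod.uri = thrift://metastore-host:port
 </code></pre>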