Posted to commits@iceberg.apache.org by bl...@apache.org on 2019/07/15 04:40:53 UTC

[incubator-iceberg] 02/06: Deployed b3d3ab9f with MkDocs version: 1.0.4

This is an automated email from the ASF dual-hosted git repository.

blue pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-iceberg.git

commit 318eff477da3416619f8d6b4676d52c6a9098969
Author: Ryan Blue <bl...@apache.org>
AuthorDate: Wed Jun 26 15:37:14 2019 -0800

    Deployed b3d3ab9f with MkDocs version: 1.0.4
---
 community/index.html       |   8 +++---
 configuration/index.html   |  16 +++++------
 evolution/index.html       |   8 +++---
 getting-started/index.html |   2 +-
 index.html                 |  10 +++----
 partitioning/index.html    |  10 +++----
 performance/index.html     |   8 +++---
 presto/index.html          |   2 +-
 reliability/index.html     |  10 +++----
 schemas/index.html         |   2 +-
 sitemap.xml.gz             | Bin 219 -> 219 bytes
 snapshots/index.html       |   6 ++--
 spark/index.html           |  12 ++++----
 spec/index.html            |  70 ++++++++++++++++++++++-----------------------
 terms/index.html           |  14 ++++-----
 why-iceberg/index.html     |   8 +++---
 16 files changed, 93 insertions(+), 93 deletions(-)

diff --git a/community/index.html b/community/index.html
index f98ee6c..1bcde68 100644
--- a/community/index.html
+++ b/community/index.html
@@ -267,23 +267,23 @@
   - under the License.
   -->
 
-<h1 id="welcome">Welcome!<a class="headerlink" href="#welcome" title="Permanent link">&para;</a></h1>
+<h1 id="welcome">Welcome!</h1>
 <p>Apache Iceberg tracks issues in GitHub and prefers to receive contributions as pull requests.</p>
 <p>Community discussions happen primarily on the dev mailing list or on specific issues.</p>
-<h2 id="contributing">Contributing<a class="headerlink" href="#contributing" title="Permanent link">&para;</a></h2>
+<h2 id="contributing">Contributing</h2>
 <p>Iceberg uses Apache&rsquo;s GitHub integration. The code is available at <a href="https://github.com/apache/incubator-iceberg">https://github.com/apache/incubator-iceberg</a></p>
 <p>The Iceberg community prefers to receive contributions as <a href="https://help.github.com/articles/about-pull-requests/">GitHub pull requests</a>.</p>
 <ul>
 <li><a href="https://github.com/apache/incubator-iceberg/pulls">View open pull requests</a></li>
 <li><a href="https://help.github.com/articles/about-pull-requests/">Learn about pull requests</a></li>
 </ul>
-<h2 id="issues">Issues<a class="headerlink" href="#issues" title="Permanent link">&para;</a></h2>
+<h2 id="issues">Issues</h2>
 <p>Issues are tracked in GitHub:</p>
 <ul>
 <li><a href="https://github.com/apache/incubator-iceberg/issues">View open issues</a></li>
 <li><a href="https://github.com/apache/incubator-iceberg/issues/new">Open a new issue</a></li>
 </ul>
-<h2 id="mailing-lists">Mailing Lists<a class="headerlink" href="#mailing-lists" title="Permanent link">&para;</a></h2>
+<h2 id="mailing-lists">Mailing Lists</h2>
 <p>Iceberg has three mailing lists:</p>
 <ul>
 <li><strong>Developers</strong>: <a href="&#109;&#97;&#105;&#108;&#116;&#111;&#58;&#100;&#101;&#118;&#64;&#105;&#99;&#101;&#98;&#101;&#114;&#103;&#46;&#97;&#112;&#97;&#99;&#104;&#101;&#46;&#111;&#114;&#103;">&#100;&#101;&#118;&#64;&#105;&#99;&#101;&#98;&#101;&#114;&#103;&#46;&#97;&#112;&#97;&#99;&#104;&#101;&#46;&#111;&#114;&#103;</a> &ndash; used for community discussions<ul>
diff --git a/configuration/index.html b/configuration/index.html
index f7dc7af..efe9eab 100644
--- a/configuration/index.html
+++ b/configuration/index.html
@@ -251,10 +251,10 @@
 </div></div>
         <div class="col-md-9" role="main">
 
-<h1 id="configuration">Configuration<a class="headerlink" href="#configuration" title="Permanent link">&para;</a></h1>
-<h2 id="table-properties">Table properties<a class="headerlink" href="#table-properties" title="Permanent link">&para;</a></h2>
+<h1 id="configuration">Configuration</h1>
+<h2 id="table-properties">Table properties</h2>
 <p>Iceberg tables support table properties to configure table behavior, like the default split size for readers.</p>
-<h3 id="read-properties">Read properties<a class="headerlink" href="#read-properties" title="Permanent link">&para;</a></h3>
+<h3 id="read-properties">Read properties</h3>
 <table>
 <thead>
 <tr>
@@ -276,7 +276,7 @@
 </tr>
 </tbody>
 </table>
-<h3 id="write-properties">Write properties<a class="headerlink" href="#write-properties" title="Permanent link">&para;</a></h3>
+<h3 id="write-properties">Write properties</h3>
 <table>
 <thead>
 <tr>
@@ -318,7 +318,7 @@
 </tr>
 </tbody>
 </table>
-<h3 id="table-behavior-properties">Table behavior properties<a class="headerlink" href="#table-behavior-properties" title="Permanent link">&para;</a></h3>
+<h3 id="table-behavior-properties">Table behavior properties</h3>
 <table>
 <thead>
 <tr>
@@ -360,8 +360,8 @@
 </tr>
 </tbody>
 </table>
-<h2 id="spark-options">Spark options<a class="headerlink" href="#spark-options" title="Permanent link">&para;</a></h2>
-<h3 id="read-options">Read options<a class="headerlink" href="#read-options" title="Permanent link">&para;</a></h3>
+<h2 id="spark-options">Spark options</h2>
+<h3 id="read-options">Read options</h3>
 <p>Spark read options are passed when configuring the DataFrameReader, like this:</p>
 <pre><code class="scala">// time travel
 spark.read
@@ -391,7 +391,7 @@ spark.read
 </tr>
 </tbody>
 </table>
-<h3 id="write-options">Write options<a class="headerlink" href="#write-options" title="Permanent link">&para;</a></h3>
+<h3 id="write-options">Write options</h3>
 <p>Spark write options are passed when configuring the DataFrameWriter, like this:</p>
 <pre><code class="scala">// write with Avro instead of Parquet
 df.write
diff --git a/evolution/index.html b/evolution/index.html
index 5211f50..b5d93c6 100644
--- a/evolution/index.html
+++ b/evolution/index.html
@@ -247,10 +247,10 @@
 </div></div>
         <div class="col-md-9" role="main">
 
-<h1 id="table-evolution">Table Evolution<a class="headerlink" href="#table-evolution" title="Permanent link">&para;</a></h1>
+<h1 id="table-evolution">Table Evolution</h1>
 <p>Iceberg supports <strong>in-place table evolution</strong>. You can <a href="#schema-evolution">evolve a table schema</a> just like SQL &ndash; even in nested structures &ndash; or <a href="#partition-evolution">change partition layout</a> when data volume changes. Iceberg does not require costly distractions, like rewriting table data or migrating to a new table.</p>
 <p>For example, Hive table partitioning cannot change, so moving from a daily partition layout to an hourly partition layout requires a new table. And because queries are dependent on partitions, queries must be rewritten for the new table. In some cases, even changes as simple as renaming a column are either not supported, or can cause <a href="#correctness">data correctness</a> problems.</p>
-<h2 id="schema-evolution">Schema evolution<a class="headerlink" href="#schema-evolution" title="Permanent link">&para;</a></h2>
+<h2 id="schema-evolution">Schema evolution</h2>
 <p>Iceberg supports the following schema evolution changes:</p>
 <ul>
 <li><strong>Add</strong> &ndash; add a new column to the table or to a nested struct</li>
@@ -261,7 +261,7 @@
 </ul>
 <p>Iceberg schema updates are metadata changes. Data files are not eagerly rewritten.</p>
 <p>Note that map keys do not support adding or dropping struct fields that would change equality.</p>
-<h3 id="correctness">Correctness<a class="headerlink" href="#correctness" title="Permanent link">&para;</a></h3>
+<h3 id="correctness">Correctness</h3>
 <p>Iceberg guarantees that <strong>schema evolution changes are independent and free of side-effects</strong>:</p>
 <ol>
 <li>Added columns never read existing values from another column.</li>
@@ -274,7 +274,7 @@
 <li>Formats that track columns by name can inadvertently un-delete a column if a name is reused, which violates #1.</li>
 <li>Formats that track columns by position cannot delete columns without changing the names that are used for each column, which violates #2.</li>
 </ul>
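
The ID-vs-name distinction above can be sketched with a toy model (hypothetical `Column`/`Schema` classes, not Iceberg's real metadata model): because each column gets a stable, never-reused ID, renames keep old data readable and a reused name cannot un-delete a dropped column.

```scala
// Toy sketch of ID-based column tracking (hypothetical classes, not
// Iceberg's real metadata model). Stable, never-reused IDs make renames
// safe and prevent a reused name from resurrecting deleted data.
case class Column(id: Int, name: String)

case class Schema(columns: List[Column], nextId: Int) {
  def rename(oldName: String, newName: String): Schema =
    copy(columns = columns.map(c => if (c.name == oldName) c.copy(name = newName) else c))
  def drop(name: String): Schema =
    copy(columns = columns.filterNot(_.name == name))
  def add(name: String): Schema =
    Schema(columns :+ Column(nextId, name), nextId + 1) // fresh ID, never reused
}

val v1 = Schema(List(Column(1, "id"), Column(2, "data")), nextId = 3)
val v2 = v1.drop("data").add("data") // drop a column, then reuse its name
// The re-added "data" column gets a new ID, so old values stay deleted.
println(v2.columns)
```

A name-tracking format would give the re-added `data` column the same identity as the dropped one, which is exactly the un-delete problem described above.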
-<h2 id="partition-evolution">Partition evolution<a class="headerlink" href="#partition-evolution" title="Permanent link">&para;</a></h2>
+<h2 id="partition-evolution">Partition evolution</h2>
 <p>Iceberg table partitioning can be updated in an existing table because queries do not reference partition values directly.</p>
 <p>Iceberg uses <a href="../partitioning">hidden partitioning</a>, so you don&rsquo;t <em>need</em> to write queries for a specific partition layout to be fast. Instead, you can write queries that select the data you need, and Iceberg automatically prunes out files that don&rsquo;t contain matching data.</p>
 <p>Partition evolution is a metadata operation and does not eagerly rewrite files.</p></div>
diff --git a/getting-started/index.html b/getting-started/index.html
index c10b038..f88b303 100644
--- a/getting-started/index.html
+++ b/getting-started/index.html
@@ -232,7 +232,7 @@
 </div></div>
         <div class="col-md-9" role="main">
 
-<h2 id="getting-started">Getting Started<a class="headerlink" href="#getting-started" title="Permanent link">&para;</a></h2></div>
+<h2 id="getting-started">Getting Started</h2></div>
         
         
     </div>
diff --git a/index.html b/index.html
index a14543d..e7c0de3 100644
--- a/index.html
+++ b/index.html
@@ -263,9 +263,9 @@
   - under the License.
   -->
 
-<h1 id="_1"><img alt="Iceberg" src="img/Iceberg-logo.png" /><a class="headerlink" href="#_1" title="Permanent link">&para;</a></h1>
+<h1 id="_1"><img alt="Iceberg" src="img/Iceberg-logo.png" /></h1>
 <p><strong>Apache Iceberg is an open table format for huge analytic datasets.</strong> Iceberg adds tables to Presto and Spark that use a high-performance format that works just like a SQL table.</p>
-<h3 id="user-experience">User experience<a class="headerlink" href="#user-experience" title="Permanent link">&para;</a></h3>
+<h3 id="user-experience">User experience</h3>
 <p>Iceberg avoids unpleasant surprises. Schema evolution works and won&rsquo;t inadvertently un-delete data. Users don&rsquo;t need to know about partitioning to get fast queries.</p>
 <ul>
 <li><a href="evolution#schema-evolution">Schema evolution</a> supports add, drop, update, or rename, and has <a href="evolution#correctness">no side-effects</a></li>
@@ -274,7 +274,7 @@
 <li><a href="spark#time-travel">Time travel</a> enables reproducible queries that use exactly the same table snapshot, or lets users easily examine changes</li>
 <li>Version rollback allows users to quickly correct problems by resetting tables to a good state</li>
 </ul>
-<h3 id="reliability-and-performance">Reliability and performance<a class="headerlink" href="#reliability-and-performance" title="Permanent link">&para;</a></h3>
+<h3 id="reliability-and-performance">Reliability and performance</h3>
 <p>Iceberg was built for huge tables. Iceberg is used in production where a single table can contain tens of petabytes of data and even these huge tables can be read without a distributed SQL engine.</p>
 <ul>
 <li><a href="performance#scan-planning">Scan planning is fast</a> &ndash; a distributed SQL engine isn&rsquo;t needed to read a table or find files</li>
@@ -286,7 +286,7 @@
 <li><a href="reliability">Serializable isolation</a> &ndash; table changes are atomic and readers never see partial or uncommitted changes</li>
 <li><a href="reliability#concurrent-write-operations">Multiple concurrent writers</a> use optimistic concurrency and will retry to ensure that compatible updates succeed, even when writes conflict</li>
 </ul>
-<h3 id="open-standard">Open standard<a class="headerlink" href="#open-standard" title="Permanent link">&para;</a></h3>
+<h3 id="open-standard">Open standard</h3>
 <p>Iceberg has been designed and developed to be an open community standard with a <a href="spec">specification</a> to ensure compatibility across languages and implementations.</p>
 <p><a href="community">Apache Iceberg is open source</a>, and is an incubating project at the <a href="https://www.apache.org/">Apache Software Foundation</a>.</p></div>
         
@@ -356,5 +356,5 @@
 
 <!--
 MkDocs version : 1.0.4
-Build Date UTC : 2019-06-26 23:33:37
+Build Date UTC : 2019-06-26 23:37:14
 -->
diff --git a/partitioning/index.html b/partitioning/index.html
index 457c882..be546c0 100644
--- a/partitioning/index.html
+++ b/partitioning/index.html
@@ -249,7 +249,7 @@
 </div></div>
         <div class="col-md-9" role="main">
 
-<h2 id="what-is-partitioning">What is partitioning?<a class="headerlink" href="#what-is-partitioning" title="Permanent link">&para;</a></h2>
+<h2 id="what-is-partitioning">What is partitioning?</h2>
 <p>Partitioning is a way to make queries faster by grouping similar rows together when writing.</p>
 <p>For example, queries for log entries from a <code>logs</code> table would usually include a time range, like this query for logs between 10 and 12 AM:</p>
 <pre><code class="sql">SELECT level, message FROM logs
@@ -258,14 +258,14 @@ WHERE event_time BETWEEN '2018-12-01 10:00:00' AND '2018-12-01 12:00:00'
 
 <p>Configuring the <code>logs</code> table to partition by the date of <code>event_time</code> will group log events into files with the same event date. Iceberg keeps track of that date and will use it to skip files for other dates that don&rsquo;t have useful data.</p>
 <p>Iceberg can partition timestamps by year, month, day, and hour granularity. It can also use a categorical column, like <code>level</code> in this logs example, to store rows together and speed up queries.</p>
-<h2 id="what-does-iceberg-do-differently">What does Iceberg do differently?<a class="headerlink" href="#what-does-iceberg-do-differently" title="Permanent link">&para;</a></h2>
+<h2 id="what-does-iceberg-do-differently">What does Iceberg do differently?</h2>
 <p>Other table formats like Hive support partitioning, but Iceberg supports <em>hidden partitioning</em>.</p>
 <ul>
 <li>Iceberg handles the tedious and error-prone task of producing partition values for rows in a table.</li>
 <li>Iceberg avoids reading unnecessary partitions automatically. Consumers don&rsquo;t need to know how the table is partitioned or add extra filters to their queries.</li>
 <li>Iceberg partition layouts can evolve as needed.</li>
 </ul>
-<h3 id="partitioning-in-hive">Partitioning in Hive<a class="headerlink" href="#partitioning-in-hive" title="Permanent link">&para;</a></h3>
+<h3 id="partitioning-in-hive">Partitioning in Hive</h3>
 <p>To demonstrate the difference, consider how Hive would handle a <code>logs</code> table.</p>
 <p>In Hive, partitions are explicit and appear as a column, so the <code>logs</code> table would have a column called <code>event_date</code>. When writing, an insert needs to supply the data for the <code>event_date</code> column:</p>
 <pre><code class="sql">INSERT INTO logs PARTITION (event_date)
@@ -280,7 +280,7 @@ WHERE event_time BETWEEN '2018-12-01 10:00:00' AND '2018-12-01 12:00:00'
 </code></pre>
 
 <p>If the <code>event_date</code> filter were missing, Hive would scan through every file in the table because it doesn&rsquo;t know that the <code>event_time</code> column is related to the <code>event_date</code> column.</p>
-<h3 id="problems-with-hive-partitioning">Problems with Hive partitioning<a class="headerlink" href="#problems-with-hive-partitioning" title="Permanent link">&para;</a></h3>
+<h3 id="problems-with-hive-partitioning">Problems with Hive partitioning</h3>
 <p>Hive must be given partition values. In the logs example, it doesn&rsquo;t know the relationship between <code>event_time</code> and <code>event_date</code>.</p>
 <p>This leads to several problems:</p>
 <ul>
@@ -296,7 +296,7 @@ WHERE event_time BETWEEN '2018-12-01 10:00:00' AND '2018-12-01 12:00:00'
 </li>
 <li>Working queries are tied to the table&rsquo;s partitioning scheme, so partitioning configuration cannot be changed without breaking queries</li>
 </ul>
-<h3 id="icebergs-hidden-partitioning">Iceberg&rsquo;s hidden partitioning<a class="headerlink" href="#icebergs-hidden-partitioning" title="Permanent link">&para;</a></h3>
+<h3 id="icebergs-hidden-partitioning">Iceberg&rsquo;s hidden partitioning</h3>
 <p>Iceberg produces partition values by taking a column value and optionally transforming it. Iceberg is responsible for converting <code>event_time</code> into <code>event_date</code>, and keeps track of the relationship.</p>
 <p>Table partitioning is configured using these relationships. The <code>logs</code> table would be partitioned by <code>date(event_time)</code> and <code>level</code>.</p>
 <p>Because Iceberg doesn&rsquo;t require user-maintained partition columns, it can hide partitioning. Partition values are produced correctly every time and always used to speed up queries, when possible. Producers and consumers wouldn&rsquo;t even see <code>event_date</code>.</p>
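
A minimal sketch of the transform idea, using a hypothetical `dateTransform` helper rather than Iceberg's actual Transform API: the partition value is always derived from `event_time`, so writers never supply `event_date` by hand.

```scala
// Illustrative partition transform (not Iceberg's actual Transform API):
// the table configuration maps event_time through date(), so the
// partition value is always derived and never supplied by a writer.
import java.time.{Instant, LocalDate, ZoneOffset}

// date(event_time): derive the UTC date used as the partition value
def dateTransform(eventTime: Instant): LocalDate =
  eventTime.atOffset(ZoneOffset.UTC).toLocalDate

val partitionValue = dateTransform(Instant.parse("2018-12-01T10:30:00Z"))
println(partitionValue) // 2018-12-01
```

Because the relationship is recorded in table metadata, a predicate on `event_time` can be converted into a predicate on the derived date automatically.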
diff --git a/performance/index.html b/performance/index.html
index eeadbd7..2d3542c 100644
--- a/performance/index.html
+++ b/performance/index.html
@@ -246,12 +246,12 @@
 </div></div>
         <div class="col-md-9" role="main">
 
-<h1 id="performance">Performance<a class="headerlink" href="#performance" title="Permanent link">&para;</a></h1>
+<h1 id="performance">Performance</h1>
 <ul>
 <li>Iceberg is designed for huge tables and is used in production where a <em>single table</em> can contain tens of petabytes of data.</li>
 <li>Even multi-petabyte tables can be read from a single node, without needing a distributed SQL engine to sift through table metadata.</li>
 </ul>
-<h2 id="scan-planning">Scan planning<a class="headerlink" href="#scan-planning" title="Permanent link">&para;</a></h2>
+<h2 id="scan-planning">Scan planning</h2>
 <p>Scan planning is the process of finding the files in a table that are needed for a query.</p>
 <p>Planning in an Iceberg table fits on a single node because Iceberg&rsquo;s metadata can be used to prune <em>metadata</em> files that aren&rsquo;t needed, in addition to filtering <em>data</em> files that don&rsquo;t contain matching data.</p>
 <p>Fast scan planning from a single node enables:</p>
@@ -259,7 +259,7 @@
 <li>Lower latency SQL queries &ndash; by eliminating a distributed scan to plan a distributed scan</li>
 <li>Access from any client &ndash; stand-alone processes can read data directly from Iceberg tables</li>
 </ul>
-<h3 id="metadata-filtering">Metadata filtering<a class="headerlink" href="#metadata-filtering" title="Permanent link">&para;</a></h3>
+<h3 id="metadata-filtering">Metadata filtering</h3>
 <p>Iceberg uses two levels of metadata to track the files in a snapshot.</p>
 <ul>
 <li><strong>Manifest files</strong> store a list of data files, along each data file&rsquo;s partition data and column-level stats</li>
@@ -267,7 +267,7 @@
 </ul>
 <p>For fast scan planning, Iceberg first filters manifests using the partition value ranges in the manifest list. Then, it reads each manifest to get data files. With this scheme, the manifest list acts as an index over the manifest files, making it possible to plan without reading all manifests.</p>
 <p>In addition to partition value ranges, a manifest list also stores the number of files added or deleted in a manifest to speed up operations like snapshot expiration.</p>
-<h3 id="data-filtering">Data filtering<a class="headerlink" href="#data-filtering" title="Permanent link">&para;</a></h3>
+<h3 id="data-filtering">Data filtering</h3>
 <p>Manifest files include a tuple of partition data and column-level stats for each data file.</p>
 <p>During planning, query predicates are automatically converted to predicates on the partition data and applied first to filter data files. Next, column-level value counts, null counts, lower bounds, and upper bounds are used to eliminate files that cannot match the query predicate.</p>
 <p>By using upper and lower bounds to filter data files at planning time, Iceberg uses clustered data to eliminate splits without running tasks. In some cases, this is a <a href="https://cdn.oreillystatic.com/en/assets/1/event/278/Introducing%20Iceberg_%20Tables%20designed%20for%20object%20stores%20Presentation.pdf">10x performance improvement</a>.</p></div>
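
The two pruning levels described above can be sketched roughly as follows (made-up `Manifest`/`DataFile` fields, not Iceberg's manifest schema): manifests are filtered by partition value ranges first, then surviving data files are filtered by column bounds.

```scala
// Sketch of Iceberg-style planning with made-up field names: first prune
// manifests using the partition value ranges in the manifest list, then
// prune data files using per-column lower/upper bounds.
case class DataFile(path: String, minVal: Int, maxVal: Int)
case class Manifest(path: String, minDay: Int, maxDay: Int, files: List[DataFile])

def planFiles(manifestList: List[Manifest], day: Int, value: Int): List[String] =
  manifestList
    .filter(m => day >= m.minDay && day <= m.maxDay)   // level 1: skip manifests
    .flatMap(_.files)                                  // read only surviving manifests
    .filter(f => value >= f.minVal && value <= f.maxVal) // level 2: skip data files
    .map(_.path)

val manifests = List(
  Manifest("m1.avro", 1, 10, List(DataFile("a.parquet", 0, 50))),
  Manifest("m2.avro", 11, 20, List(DataFile("b.parquet", 0, 50), DataFile("c.parquet", 51, 99)))
)
println(planFiles(manifests, day = 15, value = 42)) // List(b.parquet)
```

In this sketch `m1.avro` is never opened at all, which is the sense in which the manifest list acts as an index over the manifest files.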
diff --git a/presto/index.html b/presto/index.html
index 9393d6a..51d2637 100644
--- a/presto/index.html
+++ b/presto/index.html
@@ -242,7 +242,7 @@
 </div></div>
         <div class="col-md-9" role="main">
 
-<h1 id="presto">Presto<a class="headerlink" href="#presto" title="Permanent link">&para;</a></h1>
+<h1 id="presto">Presto</h1>
 <p>An Iceberg connector for Presto is available in <a href="https://github.com/prestosql/presto/pull/458">pull request #458 on prestosql/presto</a></p></div>
         
         
diff --git a/reliability/index.html b/reliability/index.html
index 5d21733..b869fd0 100644
--- a/reliability/index.html
+++ b/reliability/index.html
@@ -248,7 +248,7 @@
 </div></div>
         <div class="col-md-9" role="main">
 
-<h1 id="reliability">Reliability<a class="headerlink" href="#reliability" title="Permanent link">&para;</a></h1>
+<h1 id="reliability">Reliability</h1>
 <p>Iceberg was designed to solve correctness problems that affect Hive tables running in S3.</p>
 <p>Hive tables track data files using both a central metastore for partitions and a file system for individual files. This makes atomic changes to a table&rsquo;s contents impossible, and eventually consistent stores like S3 may return incorrect results due to the use of listing files to reconstruct the state of a table. It also requires job planning to make many slow listing calls: O(n) with the number of partitions.</p>
 <p>Iceberg tracks the complete list of data files in each <a href="../terms#snapshot">snapshot</a> using a persistent tree structure. Every write or delete produces a new snapshot that reuses as much of the previous snapshot&rsquo;s metadata tree as possible to avoid high write volumes.</p>
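
The metadata reuse described above can be sketched with an illustrative `Snapshot` class (not Iceberg's actual format): an append creates a new snapshot that shares the previous snapshot's manifest entries and adds one new manifest, instead of rewriting the full file list.

```scala
// Sketch of snapshot metadata reuse (illustrative, not Iceberg's format):
// a new snapshot keeps the previous snapshot's manifests and adds one new
// manifest for the appended files.
case class Snapshot(id: Long, manifests: List[String])

def append(prev: Snapshot, newManifest: String): Snapshot =
  Snapshot(prev.id + 1, prev.manifests :+ newManifest) // shares prior entries

val s1 = Snapshot(1, List("m1.avro"))
val s2 = append(s1, "m2.avro")
println(s2.manifests) // List(m1.avro, m2.avro)
```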
@@ -266,17 +266,17 @@
 <li><strong>Distributed planning</strong>: File pruning and predicate push-down is distributed to jobs, removing the metastore as a bottleneck</li>
 <li><strong>Finer granularity partitioning</strong>: Distributed planning and O(1) RPC calls remove the current barriers to finer-grained partitioning</li>
 </ul>
-<h2 id="concurrent-write-operations">Concurrent write operations<a class="headerlink" href="#concurrent-write-operations" title="Permanent link">&para;</a></h2>
+<h2 id="concurrent-write-operations">Concurrent write operations</h2>
 <p>Iceberg supports multiple concurrent writes using optimistic concurrency.</p>
 <p>Each writer assumes that no other writers are operating and writes out new table metadata for an operation. Then, the writer attempts to commit by atomically swapping the new table metadata file for the existing metadata file.</p>
 <p>If the atomic swap fails because another writer has committed, the failed writer retries by writing a new metadata tree based on the new current table state.</p>
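
A rough sketch of this optimistic commit loop, using an in-memory `AtomicReference` to stand in for the catalog's atomic pointer swap (illustrative only; real catalogs provide the swap):

```scala
// Sketch of optimistic concurrency: commit by atomically swapping the
// metadata pointer; on conflict, rebase on the new current state and retry.
// (Illustrative; an AtomicReference stands in for the catalog's swap.)
import java.util.concurrent.atomic.AtomicReference

val metadataPointer = new AtomicReference[String]("v1.metadata.json")

@annotation.tailrec
def commit(writeNewMetadata: String => String): String = {
  val base = metadataPointer.get            // current table state
  val proposed = writeNewMetadata(base)     // write new metadata optimistically
  if (metadataPointer.compareAndSet(base, proposed)) proposed // atomic swap
  else commit(writeNewMetadata)             // another writer won: retry
}

val committed = commit(_ => "v2.metadata.json")
println(committed) // v2.metadata.json
```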
-<h3 id="cost-of-retries">Cost of retries<a class="headerlink" href="#cost-of-retries" title="Permanent link">&para;</a></h3>
+<h3 id="cost-of-retries">Cost of retries</h3>
 <p>Writers avoid expensive retry operations by structuring changes so that work can be reused across retries.</p>
 <p>For example, appends usually create a new manifest file for the appended data files, which can be added to the table without rewriting the manifest on every attempt.</p>
-<h3 id="retry-validation">Retry validation<a class="headerlink" href="#retry-validation" title="Permanent link">&para;</a></h3>
+<h3 id="retry-validation">Retry validation</h3>
 <p>Commits are structured as assumptions and actions. After a conflict, a writer checks that the assumptions are met by the current table state. If the assumptions are met, then it is safe to re-apply the actions and commit.</p>
 <p>For example, a compaction might rewrite <code>file_a.avro</code> and <code>file_b.avro</code> as <code>merged.parquet</code>. This is safe to commit as long as the table still contains both <code>file_a.avro</code> and <code>file_b.avro</code>. If either file was deleted by a conflicting commit, then the operation must fail. Otherwise, it is safe to remove the source files and add the merged file.</p>
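
That compaction example can be sketched as an assumptions-plus-actions check (hypothetical `TableState` class, not Iceberg's implementation): the actions are re-applied only when the assumption that both source files still exist holds against the current state.

```scala
// Sketch of a commit as assumptions + actions (illustrative classes).
// The compaction re-applies only if both source files are still present.
case class TableState(files: Set[String])

val sources = Set("file_a.avro", "file_b.avro")

def tryCompaction(state: TableState): Option[TableState] =
  if (sources.subsetOf(state.files))                            // check assumptions
    Some(TableState(state.files -- sources + "merged.parquet")) // apply actions
  else
    None                                                        // conflicting delete: fail

println(tryCompaction(TableState(Set("file_a.avro", "file_b.avro"))))
println(tryCompaction(TableState(Set("file_a.avro")))) // None: file_b was deleted
```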
-<h2 id="compatibility">Compatibility<a class="headerlink" href="#compatibility" title="Permanent link">&para;</a></h2>
+<h2 id="compatibility">Compatibility</h2>
 <p>By avoiding file listing and rename operations, Iceberg tables are compatible with any object store. No consistent listing is required.</p></div>
         
         
diff --git a/schemas/index.html b/schemas/index.html
index 5885a99..c3636cf 100644
--- a/schemas/index.html
+++ b/schemas/index.html
@@ -242,7 +242,7 @@
 </div></div>
         <div class="col-md-9" role="main">
 
-<h1 id="schemas">Schemas<a class="headerlink" href="#schemas" title="Permanent link">&para;</a></h1>
+<h1 id="schemas">Schemas</h1>
 <p>Iceberg tables support the following types:</p>
 <table>
 <thead>
diff --git a/sitemap.xml.gz b/sitemap.xml.gz
index b533f9f..726115f 100644
Binary files a/sitemap.xml.gz and b/sitemap.xml.gz differ
diff --git a/snapshots/index.html b/snapshots/index.html
index d7cc24e..b69c501 100644
--- a/snapshots/index.html
+++ b/snapshots/index.html
@@ -236,9 +236,9 @@
 </div></div>
         <div class="col-md-9" role="main">
 
-<h1 id="snapshots">Snapshots<a class="headerlink" href="#snapshots" title="Permanent link">&para;</a></h1>
-<h2 id="time-travel">Time travel<a class="headerlink" href="#time-travel" title="Permanent link">&para;</a></h2>
-<h2 id="expiration">Expiration<a class="headerlink" href="#expiration" title="Permanent link">&para;</a></h2></div>
+<h1 id="snapshots">Snapshots</h1>
+<h2 id="time-travel">Time travel</h2>
+<h2 id="expiration">Expiration</h2></div>
         
         
     </div>
diff --git a/spark/index.html b/spark/index.html
index cd26b72..9b509f2 100644
--- a/spark/index.html
+++ b/spark/index.html
@@ -248,7 +248,7 @@
 </div></div>
         <div class="col-md-9" role="main">
 
-<h1 id="spark">Spark<a class="headerlink" href="#spark" title="Permanent link">&para;</a></h1>
+<h1 id="spark">Spark</h1>
 <p>Iceberg uses Spark&rsquo;s DataSourceV2 API for data source and catalog implementations. Spark DSv2 is an evolving API with different levels of support in Spark versions.</p>
 <table>
 <thead>
@@ -291,10 +291,10 @@
 </tr>
 </tbody>
 </table>
-<h2 id="spark-24">Spark 2.4<a class="headerlink" href="#spark-24" title="Permanent link">&para;</a></h2>
+<h2 id="spark-24">Spark 2.4</h2>
 <p>To use Iceberg in Spark 2.4, add the <code>iceberg-runtime</code> Jar to Spark&rsquo;s <code>jars</code> folder.</p>
 <p>Spark 2.4 is limited to reading and writing existing Iceberg tables. Use the <a href="api">Iceberg API</a> to create Iceberg tables.</p>
-<h3 id="reading-an-iceberg-table">Reading an Iceberg table<a class="headerlink" href="#reading-an-iceberg-table" title="Permanent link">&para;</a></h3>
+<h3 id="reading-an-iceberg-table">Reading an Iceberg table</h3>
 <p>To read an Iceberg table, use the <code>iceberg</code> format in <code>DataFrameReader</code>:</p>
 <pre><code class="scala">spark.read.format(&quot;iceberg&quot;).load(&quot;db.table&quot;)
 </code></pre>
@@ -303,7 +303,7 @@
 <pre><code class="scala">spark.read.format(&quot;iceberg&quot;).load(&quot;hdfs://nn:8020/path/to/table&quot;)
 </code></pre>
 
-<h3 id="time-travel">Time travel<a class="headerlink" href="#time-travel" title="Permanent link">&para;</a></h3>
+<h3 id="time-travel">Time travel</h3>
 <p>To select a specific table snapshot or the snapshot at some time, Iceberg supports two Spark read options:</p>
 <ul>
 <li><code>snapshot-id</code> selects a specific table snapshot</li>
@@ -323,7 +323,7 @@ spark.read
     .load(&quot;db.table&quot;)
 </code></pre>
 
-<h3 id="querying-with-sql">Querying with SQL<a class="headerlink" href="#querying-with-sql" title="Permanent link">&para;</a></h3>
+<h3 id="querying-with-sql">Querying with SQL</h3>
 <p>To run SQL <code>SELECT</code> statements on Iceberg tables in 2.4, register the DataFrame as a temporary table:</p>
 <pre><code class="scala">val df = spark.read.format(&quot;iceberg&quot;).load(&quot;db.table&quot;)
 df.createOrReplaceTempView(&quot;table&quot;)
@@ -331,7 +331,7 @@ df.createOrReplaceTempView(&quot;table&quot;)
 spark.sql(&quot;&quot;&quot;select count(1) from table&quot;&quot;&quot;).show()
 </code></pre>
 
-<h3 id="appending-to-an-iceberg-table">Appending to an Iceberg table<a class="headerlink" href="#appending-to-an-iceberg-table" title="Permanent link">&para;</a></h3>
+<h3 id="appending-to-an-iceberg-table">Appending to an Iceberg table</h3>
 <p>To append a dataframe to an Iceberg table, use the <code>iceberg</code> format with <code>DataFrameWriter</code>:</p>
 <pre><code class="scala">spark.write
     .format(&quot;iceberg&quot;)
diff --git a/spec/index.html b/spec/index.html
index dede2d2..23ef46c 100644
--- a/spec/index.html
+++ b/spec/index.html
@@ -270,9 +270,9 @@
 </div></div>
         <div class="col-md-9" role="main">
 
-<h1 id="iceberg-table-spec">Iceberg Table Spec<a class="headerlink" href="#iceberg-table-spec" title="Permanent link">&para;</a></h1>
+<h1 id="iceberg-table-spec">Iceberg Table Spec</h1>
 <p>This is a specification for the Iceberg table format that is designed to manage a large, slow-changing collection of files in a distributed file system or key-value store as a table.</p>
-<h2 id="goals">Goals<a class="headerlink" href="#goals" title="Permanent link">&para;</a></h2>
+<h2 id="goals">Goals</h2>
 <ul>
 <li><strong>Snapshot isolation</strong> &ndash; Reads will be isolated from concurrent writes and always use a committed snapshot of a table’s data. Writes will support removing and adding files in a single operation and are never partially visible. Readers will not acquire locks.</li>
 <li><strong>Speed</strong> &ndash; Operations will use O(1) remote calls to plan the files for a scan and not O(n) where n grows with the size of the table, like the number of partitions or files.</li>
@@ -282,17 +282,17 @@
 <li><strong>Storage separation</strong> &ndash; Partitioning will be table configuration. Reads will be planned using predicates on data values, not partition values. Tables will support evolving partition schemes.</li>
 <li><strong>Formats</strong> &ndash; Underlying data file formats will support identical schema evolution rules and types. Both read- and write-optimized formats will be available.</li>
 </ul>
-<h2 id="overview">Overview<a class="headerlink" href="#overview" title="Permanent link">&para;</a></h2>
+<h2 id="overview">Overview</h2>
 <p><img alt="Iceberg snapshot structure" class="floating" src="../img/iceberg-metadata.png" /></p>
 <p>This table format tracks individual data files in a table instead of directories. This allows writers to create data files in-place and only adds files to the table in an explicit commit.</p>
 <p>Table state is maintained in metadata files. All changes to table state create a new metadata file and replace the old metadata with an atomic swap. The table metadata file tracks the table schema, partitioning config, custom properties, and snapshots of the table contents. A snapshot represents the state of a table at some time and is used to access the complete set of data files in the table.</p>
 <p>Data files in snapshots are tracked by one or more manifest files that contain a row for each data file in the table, the file&rsquo;s partition data, and its metrics. The data in a snapshot is the union of all files in its manifests. Manifest files are reused across snapshots to avoid rewriting metadata that is slow-changing. Manifests can track data files with any subset of a table and are not associated with partitions.</p>
 <p>The manifests that make up a snapshot are stored in a manifest list file. Each manifest list stores metadata about manifests, including partition stats and data file counts. These stats are used to avoid reading manifests that are not required for an operation.</p>
-<h4 id="mvcc-and-optimistic-concurrency">MVCC and Optimistic Concurrency<a class="headerlink" href="#mvcc-and-optimistic-concurrency" title="Permanent link">&para;</a></h4>
+<h4 id="mvcc-and-optimistic-concurrency">MVCC and Optimistic Concurrency</h4>
 <p>An atomic swap of one table metadata file for another provides serializable isolation. Readers use the snapshot that was current when they load the table metadata and are not affected by changes until they refresh and pick up a new metadata location.</p>
 <p>Writers create table metadata files optimistically, assuming that the current version will not be changed before the writer&rsquo;s commit. Once a writer has created an update, it commits by swapping the table’s metadata file pointer from the base version to the new version.</p>
 <p>If the snapshot on which an update is based is no longer current, the writer must retry the update based on the new current version. Some operations support retry by re-applying metadata changes and committing, under well-defined conditions. For example, a change that rewrites files can be applied to a new table snapshot if all of the rewritten files are still in the table.</p>
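<p>As an illustration, the optimistic retry loop can be sketched in Python; <code>TableStore</code>, <code>check_and_put</code>, and <code>apply_update</code> are hypothetical names for this sketch, not part of the Iceberg API:</p>

```python
import threading

class TableStore:
    """Toy metadata pointer with a check-and-put (compare-and-swap) commit."""
    def __init__(self, version=0):
        self._lock = threading.Lock()
        self.version = version

    def check_and_put(self, base_version, new_version):
        # Succeeds only if no other writer has committed since base_version.
        with self._lock:
            if self.version != base_version:
                return False
            self.version = new_version
            return True

def commit_with_retries(store, apply_update, max_attempts=10):
    """Optimistically apply an update, retrying against the latest version."""
    for _ in range(max_attempts):
        base = store.version                # read the current metadata version
        new = apply_update(base)            # re-apply metadata changes to it
        if store.check_and_put(base, new):  # atomic swap of the pointer
            return new
    raise RuntimeError("commit failed after retries")
```

<p>Re-applying the update to the latest base version is what makes retry safe: the writer never blindly overwrites a version it has not seen.</p>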
-<h4 id="file-system-operations">File System Operations<a class="headerlink" href="#file-system-operations" title="Permanent link">&para;</a></h4>
+<h4 id="file-system-operations">File System Operations</h4>
 <p>Iceberg only requires that file systems support the following operations:</p>
 <ul>
 <li><strong>In-place write</strong>: files are not moved or altered once they are written</li>
@@ -302,8 +302,8 @@
 <p>These requirements are compatible with object stores, like S3.</p>
 <p>Tables do not require random-access writes. Once written, data and metadata files are immutable until they are deleted.</p>
 <p>Tables do not require rename, except for tables that use atomic rename to implement the commit operation for new metadata files.</p>
-<h2 id="specification">Specification<a class="headerlink" href="#specification" title="Permanent link">&para;</a></h2>
-<h4 id="terms">Terms<a class="headerlink" href="#terms" title="Permanent link">&para;</a></h4>
+<h2 id="specification">Specification</h2>
+<h4 id="terms">Terms</h4>
 <ul>
 <li><strong>Schema</strong> &ndash; names and types of fields in a table</li>
 <li><strong>Partition spec</strong> &ndash; a definition of how partition values are derived from data fields</li>
@@ -311,14 +311,14 @@
 <li><strong>Manifest</strong> &ndash; a file that lists data files; a subset of a snapshot</li>
 <li><strong>Manifest list</strong> &ndash; a file that lists manifest files; one per snapshot</li>
 </ul>
-<h3 id="schemas-and-data-types">Schemas and Data Types<a class="headerlink" href="#schemas-and-data-types" title="Permanent link">&para;</a></h3>
+<h3 id="schemas-and-data-types">Schemas and Data Types</h3>
 <p>A table&rsquo;s <strong>schema</strong> is a list of named columns. All data types are either primitives or nested types, which are maps, lists, or structs. A table schema is also a struct type.</p>
 <p>For the representations of these types in Avro, ORC, and Parquet file formats, see Appendix A.</p>
-<h4 id="nested-types">Nested Types<a class="headerlink" href="#nested-types" title="Permanent link">&para;</a></h4>
+<h4 id="nested-types">Nested Types</h4>
 <p>A <strong><code>struct</code></strong> is a tuple of typed values. Each field in the tuple is named and has an integer id that is unique in the table schema. Each field can be either optional or required, meaning that values can (or cannot) be null. Fields may be any type. Fields may have an optional comment or doc string.</p>
 <p>A <strong><code>list</code></strong> is a collection of values with some element type. The element field has an integer id that is unique in the table schema. Elements can be either optional or required. Element types may be any type.</p>
 <p>A <strong><code>map</code></strong> is a collection of key-value pairs with a key type and a value type. Both the key field and value field each have an integer id that is unique in the table schema. Map keys are required and map values can be either optional or required. Both map keys and map values may be any type, including nested types.</p>
-<h4 id="primitive-types">Primitive Types<a class="headerlink" href="#primitive-types" title="Permanent link">&para;</a></h4>
+<h4 id="primitive-types">Primitive Types</h4>
 <table>
 <thead>
 <tr>
@@ -409,7 +409,7 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 <li>Character strings must be stored as UTF-8 encoded byte arrays.</li>
 </ol>
 <p>For details on how to serialize a schema to JSON, see Appendix C.</p>
-<h4 id="schema-evolution">Schema Evolution<a class="headerlink" href="#schema-evolution" title="Permanent link">&para;</a></h4>
+<h4 id="schema-evolution">Schema Evolution</h4>
 <p>Schema evolution is limited to type promotion and adding, deleting, and renaming fields in structs (both nested structs and the top-level schema’s struct).</p>
 <p>Valid type promotions are:</p>
 <ul>
@@ -419,7 +419,7 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 </ul>
 <p>Any struct, including a top-level schema, can evolve through deleting fields, adding new fields, renaming existing fields, or promoting a primitive using the valid type promotions. Adding a new field assigns a new ID for that field and for any nested fields. Renaming an existing field must change the name, but not the field ID. Deleting a field removes it from the current schema. Field deletion cannot be rolled back unless the field was nullable or if the current snapshot has not chan [...]
 <p>Grouping a subset of a struct’s fields into a nested struct is <strong>not</strong> allowed, nor is moving fields from a nested struct into its immediate parent struct (<code>struct&lt;a, b, c&gt; ↔ struct&lt;a, struct&lt;b, c&gt;&gt;</code>). Evolving primitive types to structs is <strong>not</strong> allowed, nor is evolving a single-field struct to a primitive (<code>map&lt;string, int&gt; ↔ map&lt;string, struct&lt;int&gt;&gt;</code>).</p>
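<p>A minimal sketch of checking valid type promotions, assuming the rules described above (<code>int</code> to <code>long</code>, <code>float</code> to <code>double</code>, and widening <code>decimal</code> precision with unchanged scale); the function name is illustrative, not Iceberg API:</p>

```python
# Valid widening promotions between primitive type names (decimal handled below).
VALID_PROMOTIONS = {
    "int": {"long"},
    "float": {"double"},
}

def is_valid_promotion(old, new):
    """Return True if a column of type `old` may evolve to type `new`."""
    if old == new:
        return True
    if old.startswith("decimal(") and new.startswith("decimal("):
        # decimal(P, S) may widen precision P but never change scale S
        p_old, s_old = map(int, old[len("decimal("):-1].split(","))
        p_new, s_new = map(int, new[len("decimal("):-1].split(","))
        return s_old == s_new and p_new >= p_old
    return new in VALID_PROMOTIONS.get(old, set())
```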
-<h3 id="partitioning">Partitioning<a class="headerlink" href="#partitioning" title="Permanent link">&para;</a></h3>
+<h3 id="partitioning">Partitioning</h3>
 <p>Data files are stored in manifests with a tuple of partition values that are used in scans to filter out files that cannot contain records that match the scan’s filter predicate. Partition values for a data file must be the same for all records stored in the data file. (Manifests store data files from any partition, as long as the partition spec is the same for the data files.)</p>
 <p>Tables are configured with a <strong>partition spec</strong> that defines how to produce a tuple of partition values from a record. A partition spec has a list of fields that consist of:</p>
 <ul>
@@ -429,7 +429,7 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 </ul>
 <p>The source column, selected by id, must be a primitive type and cannot be contained in a map or list, but may be nested in a struct. For details on how to serialize a partition spec to JSON, see Appendix C.</p>
 <p>Partition specs capture the transform from table data to partition values. This is used to transform predicates to partition predicates, in addition to transforming data values. Deriving partition predicates from column predicates on the table data is used to separate the logical queries from physical storage: the partitioning can change and the correct partition filters are always derived from column predicates. This simplifies queries because users don’t have to supply both logical  [...]
-<h4 id="partition-transforms">Partition Transforms<a class="headerlink" href="#partition-transforms" title="Permanent link">&para;</a></h4>
+<h4 id="partition-transforms">Partition Transforms</h4>
 <table>
 <thead>
 <tr>
@@ -485,14 +485,14 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 </tbody>
 </table>
 <p>All transforms must return <code>null</code> for a <code>null</code> input value.</p>
-<h4 id="bucket-transform-details">Bucket Transform Details<a class="headerlink" href="#bucket-transform-details" title="Permanent link">&para;</a></h4>
+<h4 id="bucket-transform-details">Bucket Transform Details</h4>
 <p>Bucket partition transforms use a 32-bit hash of the source value. The 32-bit hash implementation is the 32-bit Murmur3 hash, x86 variant, seeded with 0.</p>
 <p>Transforms are parameterized by a number of buckets[^3], <code>N</code>. The hash mod <code>N</code> must produce a positive value by first discarding the sign bit of the hash value. In pseudo-code, the function is:</p>
 <pre><code>  def bucket_N(x) = (murmur3_x86_32_hash(x) &amp; Integer.MAX_VALUE) % N
 </code></pre>
 
 <p>For hash function details by type, see Appendix B.</p>
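<p>The sign-bit masking in the pseudo-code above can be demonstrated without a Murmur3 implementation; <code>bucket</code> below applies only the masking and modulo step to example signed 32-bit hash values:</p>

```python
INT_MAX = 0x7FFFFFFF  # Integer.MAX_VALUE: mask that discards the sign bit

def bucket(hash_value, n):
    """Map a signed 32-bit hash to a bucket in [0, n)."""
    return (hash_value & INT_MAX) % n

# Even negative hash values yield non-negative buckets:
bucket(-1, 16)           # -> 15 (0x7FFFFFFF % 16)
bucket(-2147483648, 16)  # -> 0  (only the sign bit was set)
```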
-<h4 id="truncate-transform-details">Truncate Transform Details<a class="headerlink" href="#truncate-transform-details" title="Permanent link">&para;</a></h4>
+<h4 id="truncate-transform-details">Truncate Transform Details</h4>
 <table>
 <thead>
 <tr>
@@ -534,7 +534,7 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 <li>The remainder, <code>v % W</code>, must be positive. For languages where <code>%</code> can produce negative values, the correct truncate function is: <code>v - (((v % W) + W) % W)</code></li>
 <li>The width, <code>W</code>, used to truncate decimal values is applied using the scale of the decimal column to avoid additional (and potentially conflicting) parameters.</li>
 </ol>
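<p>The truncate function for integers, including the adjustment for languages where <code>%</code> can be negative, can be sketched as:</p>

```python
def truncate(v, w):
    """Truncate integer v down to a multiple of width w.

    Python's % is already non-negative for w > 0, but the general form
    below is safe in languages where % can return negative values."""
    return v - (((v % w) + w) % w)

truncate(7, 10)   # -> 0
truncate(-1, 10)  # -> -10 (rounds toward negative infinity, not zero)
```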
-<h3 id="manifests">Manifests<a class="headerlink" href="#manifests" title="Permanent link">&para;</a></h3>
+<h3 id="manifests">Manifests</h3>
 <p>A manifest is an immutable Avro file that lists a set of data files, along with each file’s partition data tuple, metrics, and tracking information. One or more manifest files are used to store a snapshot, which tracks all of the files in a table at some point in time.</p>
 <p>A manifest is a valid Iceberg data file. Files must use Iceberg schemas and column projection.</p>
 <p>A manifest stores files for a single partition spec. When a table’s partition spec changes, old files remain in the older manifest and newer files are written to a new manifest. This is required because a manifest file’s schema is based on its partition spec (see below). This restriction also simplifies selecting files from a manifest because the same boolean expression can be used to select or filter all rows.</p>
@@ -664,11 +664,11 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 </ol>
 <p>The <code>partition</code> struct stores the tuple of partition values for each file. Its type is derived from the partition fields of the partition spec for the manifest file.</p>
 <p>Each manifest file must store its partition spec and the current table schema in the Avro file’s key-value metadata. The partition spec is used to transform predicates on the table’s data rows into predicates on the manifest’s partition values during job planning.</p>
-<h4 id="manifest-entry-fields">Manifest Entry Fields<a class="headerlink" href="#manifest-entry-fields" title="Permanent link">&para;</a></h4>
+<h4 id="manifest-entry-fields">Manifest Entry Fields</h4>
 <p>The manifest entry fields are used to keep track of the snapshot in which files were added or logically deleted. The <code>data_file</code> struct is nested inside of the manifest entry so that it can be easily passed to job planning without the manifest entry fields.</p>
 <p>When a data file is added to the dataset, its manifest entry should store the snapshot ID in which the file was added and set status to 1 (added).</p>
 <p>When a data file is replaced or deleted from the dataset, its manifest entry fields store the snapshot ID in which the file was deleted and status 2 (deleted). The file may be deleted from the file system when the snapshot in which it was deleted is garbage collected, assuming that older snapshots have also been garbage collected[^4].</p>
-<h3 id="snapshots">Snapshots<a class="headerlink" href="#snapshots" title="Permanent link">&para;</a></h3>
+<h3 id="snapshots">Snapshots</h3>
 <p>A snapshot consists of the following fields:</p>
 <ul>
 <li><strong><code>snapshot-id</code></strong>: a unique long ID.</li>
@@ -691,12 +691,12 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 <li>Large tables can be split across multiple manifests so that implementations can parallelize job planning or reduce the cost of rewriting a manifest.</li>
 </ul>
 <p>Valid snapshots are stored as a list in table metadata. For serialization, see Appendix C.</p>
-<h4 id="scan-planning">Scan Planning<a class="headerlink" href="#scan-planning" title="Permanent link">&para;</a></h4>
+<h4 id="scan-planning">Scan Planning</h4>
 <p>Scans are planned by reading the manifest files for the current snapshot listed in the table metadata. Deleted entries in a manifest are not included in the scan.</p>
 <p>For each manifest, scan predicates that filter data rows are converted to partition predicates that filter data files, and are used to select the data files in the manifest. This conversion uses the partition spec used to write the manifest file.</p>
 <p>Scan predicates are converted to partition predicates using an inclusive projection: if a scan predicate matches a row, then the partition predicate must match that row’s partition. This is an <em>inclusive projection</em>[^5] because rows that do not match the scan predicate may be included in the scan by the partition predicate.</p>
 <p>For example, an <code>events</code> table with a timestamp column named <code>ts</code> that is partitioned by <code>ts_day=day(ts)</code> is queried by users with ranges over the timestamp column: <code>ts &gt; X</code>. The inclusive projection is <code>ts_day &gt;= day(X)</code>, which is used to select files that may have matching rows. Note that, in most cases, timestamps just before <code>X</code> will be included in the scan because the file contains rows that match the predica [...]
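<p>A sketch of this projection for the <code>day</code> transform; <code>day</code> and <code>project_gt</code> are illustrative helpers, not Iceberg API:</p>

```python
from datetime import datetime, timezone

def day(ts):
    """Days from 1970-01-01, matching a day(ts) partition transform."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return (ts - epoch).days

def project_gt(x):
    """Inclusive projection of the scan predicate ts > x: any file whose
    partition satisfies ts_day >= day(x) may contain matching rows."""
    return ("ts_day", ">=", day(x))

project_gt(datetime(2019, 6, 26, 15, 30, tzinfo=timezone.utc))
# -> ('ts_day', '>=', 18073)
```

<p>The projection is inclusive because the file for <code>day(X)</code> may also contain rows with timestamps at or before <code>X</code>; those rows are filtered out later when data rows are read.</p>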
-<h4 id="manifest-lists">Manifest Lists<a class="headerlink" href="#manifest-lists" title="Permanent link">&para;</a></h4>
+<h4 id="manifest-lists">Manifest Lists</h4>
 <p>Snapshots are embedded in table metadata, but the list of manifests for a snapshot can be stored in a separate manifest list file.</p>
 <p>A manifest list encodes extra fields that can be used to avoid scanning all of the manifests in a snapshot when planning a table scan. </p>
 <p>Manifest list files store <code>manifest_file</code>, a struct with the following fields:</p>
@@ -782,10 +782,10 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 <ol>
 <li>Lower and upper bounds are serialized to bytes using the single-object serialization in Appendix D. The type used to encode the value is the type of the partition field data.</li>
 </ol>
-<h3 id="table-metadata">Table Metadata<a class="headerlink" href="#table-metadata" title="Permanent link">&para;</a></h3>
+<h3 id="table-metadata">Table Metadata</h3>
 <p>Table metadata is stored as JSON. Each table metadata change creates a new table metadata file that is committed by an atomic operation. This operation is used to ensure that a new version of table metadata replaces the version on which it was based. This produces a linear history of table versions and ensures that concurrent writes are not lost.</p>
 <p>The atomic operation used to commit metadata depends on how tables are tracked and is not standardized by this spec. See the sections below for examples.</p>
-<h4 id="commit-conflict-resolution-and-retry">Commit Conflict Resolution and Retry<a class="headerlink" href="#commit-conflict-resolution-and-retry" title="Permanent link">&para;</a></h4>
+<h4 id="commit-conflict-resolution-and-retry">Commit Conflict Resolution and Retry</h4>
 <p>When two commits happen at the same time and are based on the same version, only one commit will succeed. In most cases, the failed commit can be applied to the new current version of table metadata and retried. Updates verify the conditions under which they can be applied to a new version and retry if those conditions are met.</p>
 <ul>
 <li>Append operations have no requirements and can always be applied.</li>
@@ -793,7 +793,7 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 <li>Delete operations must verify that specific files to delete are still in the table. Delete operations based on expressions can always be applied (e.g., where timestamp &lt; X).</li>
 <li>Table schema updates and partition spec changes must validate that the schema has not changed between the base version and the current version.</li>
 </ul>
-<h4 id="table-metadata-fields">Table Metadata Fields<a class="headerlink" href="#table-metadata-fields" title="Permanent link">&para;</a></h4>
+<h4 id="table-metadata-fields">Table Metadata Fields</h4>
 <p>Table metadata consists of the following fields:</p>
 <table>
 <thead>
@@ -854,7 +854,7 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 </tbody>
 </table>
 <p>For serialization details, see Appendix C.</p>
-<h4 id="file-system-tables">File System Tables<a class="headerlink" href="#file-system-tables" title="Permanent link">&para;</a></h4>
+<h4 id="file-system-tables">File System Tables</h4>
 <p>An atomic swap can be implemented using atomic rename in file systems that support it, like HDFS or most local file systems[^6].</p>
 <p>Each version of table metadata is stored in a metadata folder under the table’s base location using a file naming scheme that includes a version number, <code>V</code>: <code>v&lt;V&gt;.metadata.json</code>. To commit a new metadata version, <code>V+1</code>, the writer performs the following steps:</p>
 <ol>
@@ -867,7 +867,7 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 </ol>
 </li>
 </ol>
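<p>A minimal sketch of this commit protocol, using a POSIX hard link as a stand-in for an atomic rename that fails when the destination exists (as HDFS rename does); <code>commit_metadata</code> is a hypothetical helper:</p>

```python
import os
import tempfile

def commit_metadata(table_dir, version, metadata_json):
    """Write new metadata and atomically publish it as v<V>.metadata.json.

    Raises FileExistsError if another writer committed this version first."""
    fd, tmp = tempfile.mkstemp(dir=table_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        f.write(metadata_json)
    final = os.path.join(table_dir, "v%d.metadata.json" % version)
    try:
        os.link(tmp, final)  # atomic: fails if final already exists
    finally:
        os.unlink(tmp)       # drop the temporary name either way
```

<p>A writer that hits <code>FileExistsError</code> lost the race and must re-apply its changes to the new current metadata and try again with the next version number.</p>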
-<h4 id="metastore-tables">Metastore Tables<a class="headerlink" href="#metastore-tables" title="Permanent link">&para;</a></h4>
+<h4 id="metastore-tables">Metastore Tables</h4>
 <p>The atomic swap needed to commit new versions of table metadata can be implemented by storing a pointer in a metastore or database that is updated with a check-and-put operation[^7]. The check-and-put validates that the version of the table that a write is based on is still current and then makes the new metadata from the write the current version.</p>
 <p>Each version of table metadata is stored in a metadata folder under the table’s base location using a naming scheme that includes a version and UUID: <code>&lt;V&gt;-&lt;uuid&gt;.metadata.json</code>. To commit a new metadata version, <code>V+1</code>, the writer performs the following steps:</p>
 <ol start="2">
@@ -879,8 +879,8 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 </ol>
 </li>
 </ol>
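<p>The check-and-put can be sketched with an in-memory pointer table; a real metastore would perform the same conditional update transactionally:</p>

```python
class Metastore:
    """Toy metastore: table name -> current metadata file location."""
    def __init__(self):
        self.pointers = {}

    def check_and_put(self, name, expected_location, new_location):
        # The commit succeeds only if the base metadata is still current.
        if self.pointers.get(name) != expected_location:
            return False
        self.pointers[name] = new_location
        return True
```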
-<h2 id="appendix-a-format-specific-requirements">Appendix A: Format-specific Requirements<a class="headerlink" href="#appendix-a-format-specific-requirements" title="Permanent link">&para;</a></h2>
-<h3 id="avro">Avro<a class="headerlink" href="#avro" title="Permanent link">&para;</a></h3>
+<h2 id="appendix-a-format-specific-requirements">Appendix A: Format-specific Requirements</h2>
+<h3 id="avro">Avro</h3>
 <p><strong>Data Type Mappings</strong></p>
 <p>Values should be stored in Avro using the Avro types and logical type annotations in the table below.</p>
 <p>Optional fields, array elements, and map values must be wrapped in an Avro <code>union</code> with <code>null</code>. This is the only union type allowed in Iceberg data files.</p>
@@ -1028,7 +1028,7 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 </tbody>
 </table>
 <p>Note that the string map case is for maps where the key type is a string. Using Avro’s map type in this case is optional. Maps with string keys may be stored as arrays.</p>
-<h3 id="parquet">Parquet<a class="headerlink" href="#parquet" title="Permanent link">&para;</a></h3>
+<h3 id="parquet">Parquet</h3>
 <p><strong>Data Type Mappings</strong></p>
 <p>Values should be stored in Parquet using the types and logical type annotations in the table below. Column IDs are required.</p>
 <p>Lists must use the <a href="https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#lists">3-level representation</a>.</p>
@@ -1146,7 +1146,7 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 </tr>
 </tbody>
 </table>
-<h3 id="orc">ORC<a class="headerlink" href="#orc" title="Permanent link">&para;</a></h3>
+<h3 id="orc">ORC</h3>
 <p><strong>Data Type Mappings</strong></p>
 <table>
 <thead>
@@ -1271,7 +1271,7 @@ Timestamps <em>without time zone</em> represent a date and time of day regardles
 </tr>
 </tbody>
 </table>
-<h2 id="appendix-b-32-bit-hash-requirements">Appendix B: 32-bit Hash Requirements<a class="headerlink" href="#appendix-b-32-bit-hash-requirements" title="Permanent link">&para;</a></h2>
+<h2 id="appendix-b-32-bit-hash-requirements">Appendix B: 32-bit Hash Requirements</h2>
 <p>The 32-bit hash implementation is the 32-bit Murmur3 hash, x86 variant, seeded with 0.</p>
 <table>
 <thead>
@@ -1363,8 +1363,8 @@ Hash results are not dependent on decimal scale, which is part of the type, not
 <li>UUIDs are encoded using big endian. The test UUID for the example above is: <code>f79c3e09-677c-4bbd-a479-3f349cb785e7</code>. This UUID encoded as a byte array is:
 <code>F7 9C 3E 09 67 7C 4B BD A4 79 3F 34 9C B7 85 E7</code></li>
 </ol>
-<h2 id="appendix-c-json-serialization">Appendix C: JSON serialization<a class="headerlink" href="#appendix-c-json-serialization" title="Permanent link">&para;</a></h2>
-<h3 id="schemas">Schemas<a class="headerlink" href="#schemas" title="Permanent link">&para;</a></h3>
+<h2 id="appendix-c-json-serialization">Appendix C: JSON serialization</h2>
+<h3 id="schemas">Schemas</h3>
 <p>Schemas are serialized to JSON as a struct. Types are serialized according to this table:</p>
 <table>
 <thead>
@@ -1462,7 +1462,7 @@ Hash results are not dependent on decimal scale, which is part of the type, not
 </tr>
 </tbody>
 </table>
-<h3 id="partition-specs">Partition Specs<a class="headerlink" href="#partition-specs" title="Permanent link">&para;</a></h3>
+<h3 id="partition-specs">Partition Specs</h3>
 <p>Partition specs are serialized as a JSON object with the following fields:</p>
 <table>
 <thead>
@@ -1538,7 +1538,7 @@ Hash results are not dependent on decimal scale, which is part of the type, not
 </tbody>
 </table>
 <p>In some cases partition specs are stored using only the field list instead of the object format that includes the spec ID, like the deprecated <code>partition-spec</code> field in table metadata. The object format should be used unless otherwise noted in this spec.</p>
-<h3 id="table-metadata-and-snapshots">Table Metadata and Snapshots<a class="headerlink" href="#table-metadata-and-snapshots" title="Permanent link">&para;</a></h3>
+<h3 id="table-metadata-and-snapshots">Table Metadata and Snapshots</h3>
 <p>Table metadata is serialized as a JSON object according to the following table. Snapshots are not serialized separately. Instead, they are stored in the table metadata JSON.</p>
 <table>
 <thead>
@@ -1611,7 +1611,7 @@ Hash results are not dependent on decimal scale, which is part of the type, not
 </tr>
 </tbody>
 </table>
-<h2 id="appendix-d-single-value-serialization">Appendix D: Single-value serialization<a class="headerlink" href="#appendix-d-single-value-serialization" title="Permanent link">&para;</a></h2>
+<h2 id="appendix-d-single-value-serialization">Appendix D: Single-value serialization</h2>
 <p>This serialization scheme is for storing single values as individual binary values in the lower and upper bounds maps of manifest files.</p>
 <table>
 <thead>
diff --git a/terms/index.html b/terms/index.html
index 94783a8..5917aed 100644
--- a/terms/index.html
+++ b/terms/index.html
@@ -254,24 +254,24 @@
 </div></div>
         <div class="col-md-9" role="main">
 
-<h1 id="terms">Terms<a class="headerlink" href="#terms" title="Permanent link">&para;</a></h1>
-<h3 id="snapshot">Snapshot<a class="headerlink" href="#snapshot" title="Permanent link">&para;</a></h3>
+<h1 id="terms">Terms</h1>
+<h3 id="snapshot">Snapshot</h3>
 <p>A <strong>snapshot</strong> is the state of a table at some time.</p>
 <p>Each snapshot lists all of the data files that make up the table&rsquo;s contents at the time of the snapshot. Data files are stored across multiple <a href="#manifest-file">manifest</a> files, and the manifests for a snapshot are listed in a single <a href="#manifest-list">manifest list</a> file.</p>
-<h3 id="manifest-list">Manifest list<a class="headerlink" href="#manifest-list" title="Permanent link">&para;</a></h3>
+<h3 id="manifest-list">Manifest list</h3>
 <p>A <strong>manifest list</strong> is a metadata file that lists the <a href="#manifest-file">manifests</a> that make up a table snapshot.</p>
 <p>Each manifest file in the manifest list is stored with information about its contents, like partition value ranges, used to speed up metadata operations.</p>
-<h3 id="manifest-file">Manifest file<a class="headerlink" href="#manifest-file" title="Permanent link">&para;</a></h3>
+<h3 id="manifest-file">Manifest file</h3>
 <p>A <strong>manifest file</strong> is a metadata file that lists a subset of data files that make up a snapshot.</p>
 <p>Each data file in a manifest is stored with a <a href="#partition-tuple">partition tuple</a>, column-level stats, and summary information used to prune splits during <a href="../performance#scan-planning">scan planning</a>.</p>
-<h3 id="partition-spec">Partition spec<a class="headerlink" href="#partition-spec" title="Permanent link">&para;</a></h3>
+<h3 id="partition-spec">Partition spec</h3>
 <p>A <strong>partition spec</strong> is a description of how to <a href="../partitioning">partition</a> data in a table.</p>
 <p>A spec consists of a list of source columns and transforms. A transform produces a partition value from a source value. For example, <code>date(ts)</code> produces the date associated with a timestamp column named <code>ts</code>.</p>
-<h3 id="partition-tuple">Partition tuple<a class="headerlink" href="#partition-tuple" title="Permanent link">&para;</a></h3>
+<h3 id="partition-tuple">Partition tuple</h3>
 <p>A <strong>partition tuple</strong> is a tuple or struct of partition data stored with each data file.</p>
 <p>All values in a partition tuple are the same for all rows stored in a data file. Partition tuples are produced by transforming values from row data using a partition spec.</p>
 <p>Iceberg stores partition values unmodified, unlike Hive tables that convert values to and from strings in file system paths and keys.</p>
-<h3 id="snapshot-log-history-table">Snapshot log (history table)<a class="headerlink" href="#snapshot-log-history-table" title="Permanent link">&para;</a></h3>
+<h3 id="snapshot-log-history-table">Snapshot log (history table)</h3>
 <p>The <strong>snapshot log</strong> is a metadata log of how the table&rsquo;s current snapshot has changed over time.</p>
 <p>The log is a list of timestamp and ID pairs: when the current snapshot changed and the snapshot ID the current snapshot was changed to.</p>
 <p>The snapshot log is stored in <a href="../spec#table-metadata-fields">table metadata as <code>snapshot-log</code></a>.</p></div>
diff --git a/why-iceberg/index.html b/why-iceberg/index.html
index e787e5b..86388ee 100644
--- a/why-iceberg/index.html
+++ b/why-iceberg/index.html
@@ -257,14 +257,14 @@
   - under the License.
   -->
 
-<h2 id="why-iceberg">Why Iceberg?<a class="headerlink" href="#why-iceberg" title="Permanent link">&para;</a></h2>
+<h2 id="why-iceberg">Why Iceberg?</h2>
 <p>Iceberg was created because no other table format provides a complete solution for SQL tables. Some improve atomicity guarantees, but no other format delivers both reliable operations and the behavior that data engineers need.</p>
 <p>Iceberg tracks individual data files in a table instead of directories. This allows writers to create data files in-place and only adds files to the table in an explicit commit.</p>
 <p>Table state is maintained in metadata files. All changes to table state create a new metadata file and replace the old metadata with an atomic operation. The table metadata file tracks the table schema, partitioning config, other properties, and snapshots of the table contents.</p>
 <p>The atomic transitions from one table metadata file to the next provide snapshot isolation. Readers use the latest table state (snapshot) that was current when they load the table metadata and are not affected by changes until they refresh and pick up a new metadata location.</p>
 <p>A <em>snapshot</em> is a complete set of data files in the table at some point in time. Snapshots are listed in the metadata file, but the files in a snapshot are stored across a separate set of <em>manifest</em> files.</p>
 <p>Data files in snapshots are stored in one or more manifest files that contain a row for each data file in the table, its partition data, and its metrics. A snapshot is the union of all files in its manifests. Manifest files can be shared between snapshots to avoid rewriting metadata that is slow-changing.</p>
-<h3 id="goals">Goals<a class="headerlink" href="#goals" title="Permanent link">&para;</a></h3>
+<h3 id="goals">Goals</h3>
 <p>This design also provides improved guarantees and performance:</p>
 <ul>
 <li><strong>Snapshot isolation</strong>: Readers always use a consistent snapshot of the table, without needing to hold a lock. All table updates are atomic.</li>
@@ -274,7 +274,7 @@
 <li><strong>Finer granularity partitioning</strong>: Distributed planning and O(1) RPC calls remove the current barriers to finer-grained partitioning.</li>
 <li><strong>Safe file-level operations</strong>. By supporting atomic changes, Iceberg enables new use cases, like safely compacting small files and safely appending late data to tables.</li>
 </ul>
-<h3 id="why-a-new-table-format">Why a new table format?<a class="headerlink" href="#why-a-new-table-format" title="Permanent link">&para;</a></h3>
+<h3 id="why-a-new-table-format">Why a new table format?</h3>
 <p>The central metastore can be a scale bottleneck and the file system doesn&rsquo;t&mdash;and shouldn&rsquo;t&mdash;provide transactions to isolate concurrent reads and writes.</p>
 <p>There are several problems with the current format:</p>
 <ul>
@@ -283,7 +283,7 @@
 <li><strong>Operations depend on file rename</strong>: Most output committers depend on rename operations to implement guarantees and reduce the amount of time tables only have partial data from a write. But rename is not a metadata-only operation in S3 and will copy data. The <a href="https://issues.apache.org/jira/browse/HADOOP-13786">new S3 committers</a> that use multipart upload make this better, but can’t entirely solve the problem and put a lot of load on the file system during jo [...]
 </ul>
 <p>The current format&rsquo;s dependence on listing and rename cannot be changed, so a new format is needed.</p>
-<h3 id="other-design-goals">Other design goals<a class="headerlink" href="#other-design-goals" title="Permanent link">&para;</a></h3>
+<h3 id="other-design-goals">Other design goals</h3>
 <p>In addition to changes in how table contents are tracked, Iceberg&rsquo;s design improves a few other areas:</p>
 <ul>
 <li><strong>Reliable types</strong>: Iceberg provides a core set of types, tested to work consistently across all of the supported data formats. Types include date, timestamp, and decimal, as well as nested combinations of map, list, and struct.</li>