Posted to commits@kudu.apache.org by gr...@apache.org on 2019/11/25 03:09:12 UTC

[kudu] branch master updated (c050809 -> 1d38243)

This is an automated email from the ASF dual-hosted git repository.

granthenke pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git.


    from c050809  [examples] use org.apache.kudu packages v1.11.1
     new a5e5840  www: Add tablet On-Disk Size info to /table
     new 1d38243  [spark] Add a test for sink based writing

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../apache/kudu/spark/kudu/DefaultSourceTest.scala | 37 ++++++++++++++++++++++
 src/kudu/master/master_path_handlers.cc            |  3 ++
 www/table.mustache                                 |  4 ++-
 3 files changed, 43 insertions(+), 1 deletion(-)


[kudu] 02/02: [spark] Add a test for sink based writing


granthenke pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit 1d382435d1d2e46d279503ab64225342cf0068ea
Author: Grant Henke <gr...@apache.org>
AuthorDate: Thu Oct 31 09:05:38 2019 -0500

    [spark] Add a test for sink based writing
    
    Adds a test to verify that writing via the KuduSink, as opposed to
    the KuduContext, works as expected.
    
    Change-Id: Ic1f28be80ad21b0783d8a0889ad7b1847601442b
    Reviewed-on: http://gerrit.cloudera.org:8080/14603
    Tested-by: Kudu Jenkins
    Reviewed-by: Andrew Wong <aw...@cloudera.com>
---
 .../apache/kudu/spark/kudu/DefaultSourceTest.scala | 37 ++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala b/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
index 6dca719..b2226e5 100644
--- a/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
+++ b/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
@@ -20,6 +20,7 @@ import scala.collection.JavaConverters._
 import scala.collection.immutable.IndexedSeq
 import org.apache.spark.sql.Row
 import org.apache.spark.sql.SQLContext
+import org.apache.spark.sql.SaveMode
 import org.apache.spark.sql.functions._
 import org.apache.spark.sql.types.DataTypes
 import org.apache.spark.sql.types.StructField
@@ -338,6 +339,42 @@ class DefaultSourceTest extends KuduTestSuite with Matchers {
   }
 
   @Test
+  def testWriteWithSink() {
+    val df = sqlContext.read.options(kuduOptions).format("kudu").load
+    val baseDF = df.limit(1) // Filter down to just the first row.
+
+    // Change the c2 string to abc and upsert.
+    val upsertDF = baseDF.withColumn("c2_s", lit("abc"))
+    upsertDF.write
+      .format("kudu")
+      .option("kudu.master", harness.getMasterAddressesAsString)
+      .option("kudu.table", tableName)
+      // Default kudu.operation is upsert.
+      .mode(SaveMode.Append)
+      .save()
+
+    // Change the key and insert.
+    val insertDF = df
+      .limit(1)
+      .withColumn("key", df("key").plus(100))
+      .withColumn("c2_s", lit("def"))
+    insertDF.write
+      .format("kudu")
+      .option("kudu.master", harness.getMasterAddressesAsString)
+      .option("kudu.table", tableName)
+      .option("kudu.operation", "insert")
+      .mode(SaveMode.Append)
+      .save()
+
+    // Read the data back.
+    val newDF = sqlContext.read.options(kuduOptions).format("kudu").load
+    val collectedUpdate = newDF.filter("key = 0").collect()
+    assertEquals("abc", collectedUpdate(0).getAs[String]("c2_s"))
+    val collectedInsert = newDF.filter("key = 100").collect()
+    assertEquals("def", collectedInsert(0).getAs[String]("c2_s"))
+  }
+
+  @Test
   def testUpsertRowsIgnoreNulls() {
     val nonNullDF =
       sqlContext.createDataFrame(Seq((0, "foo"))).toDF("key", "val")


[kudu] 01/02: www: Add tablet On-Disk Size info to /table


granthenke pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit a5e58406ae9195e66f4fbc91c56d49bbaca5dd98
Author: zhangyifan27 <ch...@163.com>
AuthorDate: Thu Nov 21 16:22:46 2019 +0800

    www: Add tablet On-Disk Size info to /table
    
    This patch adds 'On-Disk Size' info for each tablet of a table to
    the master web UI /table page and supports sorting tablets by
    on_disk_size, so that data skew in the table is easy to spot.
    
    Screenshot: http://ww1.sinaimg.cn/large/9b7ebaddly1g96lqe35qgj21eg0q87au.jpg
    
    Change-Id: I8cd84420968383d11658df45719a5b2070505291
    Reviewed-on: http://gerrit.cloudera.org:8080/14771
    Reviewed-by: Adar Dembo <ad...@cloudera.com>
    Tested-by: Kudu Jenkins
---
 src/kudu/master/master_path_handlers.cc | 3 +++
 www/table.mustache                      | 4 +++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/kudu/master/master_path_handlers.cc b/src/kudu/master/master_path_handlers.cc
index a80a76b..790fe34 100644
--- a/src/kudu/master/master_path_handlers.cc
+++ b/src/kudu/master/master_path_handlers.cc
@@ -61,6 +61,7 @@
 #include "kudu/master/ts_manager.h"
 #include "kudu/server/monitored_task.h"
 #include "kudu/server/webui_util.h"
+#include "kudu/tablet/metadata.pb.h"
 #include "kudu/util/cow_object.h"
 #include "kudu/util/easy_json.h"
 #include "kudu/util/jsonwriter.h"
@@ -481,6 +482,8 @@ void MasterPathHandlers::HandleTablePage(const Webserver::WebRequest& req,
     Capitalize(&state);
     tablet_detail_json["state"] = state;
     tablet_detail_json["state_msg"] = l.data().pb.state_msg();
+    tablet_detail_json["on_disk_size"] =
+        HumanReadableNumBytes::ToString(tablet->GetStats().on_disk_size());
     EasyJson peers_json = tablet_detail_json.Set("peers", EasyJson::kArray);
     for (const auto& e : sorted_replicas) {
       EasyJson peer_json = peers_json.PushBack(EasyJson::kObject);
diff --git a/www/table.mustache b/www/table.mustache
index b96f6e3..af1a6de 100644
--- a/www/table.mustache
+++ b/www/table.mustache
@@ -96,12 +96,13 @@ under the License.
   <h4>Detail</h4>
   <a href='#detail' data-toggle='collapse'>(toggle)</a>
   <div id='detail' class='collapse'>
-    <table class='table table-striped table-hover'>
+    <table data-toggle="table" class='table table-striped table-hover'>
       <thead><tr>
         <th>Tablet ID</th>
         {{{detail_partition_schema_header}}}
         <th>State</th>
         <th>Message</th>
+        <th data-sorter="bytesSorter" data-sortable="true">On-Disk Size (leaders only)</th>
         <th>Peers</th>
       </tr></thead>
       <tbody>
@@ -111,6 +112,7 @@ under the License.
           {{{partition_cols}}}
           <td>{{state}}</td>
           <td>{{state_msg}}</td>
+          <td>{{on_disk_size}}</td>
           <td>
             <ul>
               {{#peers}}