Posted to commits@kudu.apache.org by gr...@apache.org on 2019/11/25 03:09:12 UTC
[kudu] branch master updated (c050809 -> 1d38243)
This is an automated email from the ASF dual-hosted git repository.
granthenke pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git.
from c050809 [examples] use org.apache.kudu packages v1.11.1
new a5e5840 www: Add tablet On-Disk Size info to /table
new 1d38243 [spark] Add a test for sink based writing
The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
Summary of changes:
.../apache/kudu/spark/kudu/DefaultSourceTest.scala | 37 ++++++++++++++++++++++
src/kudu/master/master_path_handlers.cc | 3 ++
www/table.mustache | 4 ++-
3 files changed, 43 insertions(+), 1 deletion(-)
[kudu] 02/02: [spark] Add a test for sink based writing
Posted by gr...@apache.org.
granthenke pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git
commit 1d382435d1d2e46d279503ab64225342cf0068ea
Author: Grant Henke <gr...@apache.org>
AuthorDate: Thu Oct 31 09:05:38 2019 -0500
[spark] Add a test for sink based writing
Adds a test to verify writing via the KuduSink, as opposed to the
KuduContext, works as expected.
Change-Id: Ic1f28be80ad21b0783d8a0889ad7b1847601442b
Reviewed-on: http://gerrit.cloudera.org:8080/14603
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <aw...@cloudera.com>
---
.../apache/kudu/spark/kudu/DefaultSourceTest.scala | 37 ++++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala b/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
index 6dca719..b2226e5 100644
--- a/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
+++ b/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
@@ -20,6 +20,7 @@ import scala.collection.JavaConverters._
import scala.collection.immutable.IndexedSeq
import org.apache.spark.sql.Row
import org.apache.spark.sql.SQLContext
+import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.DataTypes
import org.apache.spark.sql.types.StructField
@@ -338,6 +339,42 @@ class DefaultSourceTest extends KuduTestSuite with Matchers {
}
@Test
+ def testWriteWithSink() {
+ val df = sqlContext.read.options(kuduOptions).format("kudu").load
+ val baseDF = df.limit(1) // Filter down to just the first row.
+
+ // Change the c2 string to abc and upsert.
+ val upsertDF = baseDF.withColumn("c2_s", lit("abc"))
+ upsertDF.write
+ .format("kudu")
+ .option("kudu.master", harness.getMasterAddressesAsString)
+ .option("kudu.table", tableName)
+ // Default kudu.operation is upsert.
+ .mode(SaveMode.Append)
+ .save()
+
+ // Change the key and insert.
+ val insertDF = df
+ .limit(1)
+ .withColumn("key", df("key").plus(100))
+ .withColumn("c2_s", lit("def"))
+ insertDF.write
+ .format("kudu")
+ .option("kudu.master", harness.getMasterAddressesAsString)
+ .option("kudu.table", tableName)
+ .option("kudu.operation", "insert")
+ .mode(SaveMode.Append)
+ .save()
+
+ // Read the data back.
+ val newDF = sqlContext.read.options(kuduOptions).format("kudu").load
+ val collectedUpdate = newDF.filter("key = 0").collect()
+ assertEquals("abc", collectedUpdate(0).getAs[String]("c2_s"))
+ val collectedInsert = newDF.filter("key = 100").collect()
+ assertEquals("def", collectedInsert(0).getAs[String]("c2_s"))
+ }
+
+ @Test
def testUpsertRowsIgnoreNulls() {
val nonNullDF =
sqlContext.createDataFrame(Seq((0, "foo"))).toDF("key", "val")
[kudu] 01/02: www: Add tablet On-Disk Size info to /table
Posted by gr...@apache.org.
granthenke pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git
commit a5e58406ae9195e66f4fbc91c56d49bbaca5dd98
Author: zhangyifan27 <ch...@163.com>
AuthorDate: Thu Nov 21 16:22:46 2019 +0800
www: Add tablet On-Disk Size info to /table
This patch adds 'On-Disk Size' info for each tablet of a table to the
master web UI /table page and supports sorting tablets by on_disk_size,
so that data skew in the table is easy to spot.
Screenshot: http://ww1.sinaimg.cn/large/9b7ebaddly1g96lqe35qgj21eg0q87au.jpg
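The new column carries a human-readable size string and the table header names a dedicated "bytesSorter". The following minimal sketch illustrates why such a sorter is likely needed: sorting the formatted strings as plain text does not order them by size. The helpers FormatBytes, ParseBytes, SortLexicographically, and SortByBytes are hypothetical names introduced for illustration. They are not Kudu's HumanReadableNumBytes or the web UI's actual sorter, and the exact output format of Kudu's formatter may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (NOT Kudu's HumanReadableNumBytes): formats a byte
// count with a binary-unit suffix, e.g. 1536 -> "1.50K".
std::string FormatBytes(uint64_t bytes) {
  static const char* kUnits[] = {"B", "K", "M", "G", "T"};
  double value = static_cast<double>(bytes);
  size_t unit = 0;
  while (value >= 1024.0 && unit < 4) {
    value /= 1024.0;
    ++unit;
  }
  char buf[32];
  if (unit == 0) {
    std::snprintf(buf, sizeof(buf), "%lluB",
                  static_cast<unsigned long long>(bytes));
  } else {
    std::snprintf(buf, sizeof(buf), "%.2f%s", value, kUnits[unit]);
  }
  return std::string(buf);
}

// Hypothetical inverse of FormatBytes: converts "1.50K" back to an
// approximate byte count. std::stod throws if the text has no leading number.
double ParseBytes(const std::string& text) {
  size_t pos = 0;
  double value = std::stod(text, &pos);
  const std::string units = "BKMGT";
  size_t idx = pos < text.size() ? units.find(text[pos]) : 0;
  if (idx == std::string::npos) idx = 0;  // Unknown suffix: treat as bytes.
  for (size_t i = 0; i < idx; ++i) value *= 1024.0;
  return value;
}

// Plain string ordering, which is what a generic text sort would do.
std::vector<std::string> SortLexicographically(std::vector<std::string> sizes) {
  std::sort(sizes.begin(), sizes.end());
  return sizes;
}

// Size-aware ordering: compare the parsed byte counts instead of the text.
std::vector<std::string> SortByBytes(std::vector<std::string> sizes) {
  std::sort(sizes.begin(), sizes.end(),
            [](const std::string& a, const std::string& b) {
              return ParseBytes(a) < ParseBytes(b);
            });
  return sizes;
}
```

With the inputs "9.50M", "10.00K", and "512B", the text sort places "10.00K" first because '1' sorts before '5' and '9', while the size-aware sort yields "512B", "10.00K", "9.50M".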
Change-Id: I8cd84420968383d11658df45719a5b2070505291
Reviewed-on: http://gerrit.cloudera.org:8080/14771
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Kudu Jenkins
---
src/kudu/master/master_path_handlers.cc | 3 +++
www/table.mustache | 4 +++-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/kudu/master/master_path_handlers.cc b/src/kudu/master/master_path_handlers.cc
index a80a76b..790fe34 100644
--- a/src/kudu/master/master_path_handlers.cc
+++ b/src/kudu/master/master_path_handlers.cc
@@ -61,6 +61,7 @@
#include "kudu/master/ts_manager.h"
#include "kudu/server/monitored_task.h"
#include "kudu/server/webui_util.h"
+#include "kudu/tablet/metadata.pb.h"
#include "kudu/util/cow_object.h"
#include "kudu/util/easy_json.h"
#include "kudu/util/jsonwriter.h"
@@ -481,6 +482,8 @@ void MasterPathHandlers::HandleTablePage(const Webserver::WebRequest& req,
Capitalize(&state);
tablet_detail_json["state"] = state;
tablet_detail_json["state_msg"] = l.data().pb.state_msg();
+ tablet_detail_json["on_disk_size"] =
+ HumanReadableNumBytes::ToString(tablet->GetStats().on_disk_size());
EasyJson peers_json = tablet_detail_json.Set("peers", EasyJson::kArray);
for (const auto& e : sorted_replicas) {
EasyJson peer_json = peers_json.PushBack(EasyJson::kObject);
diff --git a/www/table.mustache b/www/table.mustache
index b96f6e3..af1a6de 100644
--- a/www/table.mustache
+++ b/www/table.mustache
@@ -96,12 +96,13 @@ under the License.
<h4>Detail</h4>
<a href='#detail' data-toggle='collapse'>(toggle)</a>
<div id='detail' class='collapse'>
- <table class='table table-striped table-hover'>
+ <table data-toggle="table" class='table table-striped table-hover'>
<thead><tr>
<th>Tablet ID</th>
{{{detail_partition_schema_header}}}
<th>State</th>
<th>Message</th>
+ <th data-sorter="bytesSorter" data-sortable="true">On-Disk Size (leaders only)</th>
<th>Peers</th>
</tr></thead>
<tbody>
@@ -111,6 +112,7 @@ under the License.
{{{partition_cols}}}
<td>{{state}}</td>
<td>{{state_msg}}</td>
+ <td>{{on_disk_size}}</td>
<td>
<ul>
{{#peers}}