Posted to commits@spark.apache.org by rx...@apache.org on 2016/01/20 20:20:30 UTC

spark git commit: [SPARK-12925][SQL] Improve HiveInspectors.unwrap for StringObjectIns…

Repository: spark
Updated Branches:
  refs/heads/master 9753835cf -> e75e340a4


[SPARK-12925][SQL] Improve HiveInspectors.unwrap for StringObjectIns…

Text is already UTF-8 encoded, so converting it via "UTF8String.fromString" first decodes it into a String and then re-encodes it, which turns out to be expensive and redundant. Profiler snapshot details are attached in the JIRA (ref: https://issues.apache.org/jira/secure/attachment/12783331/SPARK-12925_profiler_cpu_samples.png)

Author: Rajesh Balamohan <rb...@apache.org>

Closes #10848 from rajeshbalamohan/SPARK-12925.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e75e340a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e75e340a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e75e340a

Branch: refs/heads/master
Commit: e75e340a406b765608258b49f7e2f1107d4605fb
Parents: 9753835
Author: Rajesh Balamohan <rb...@apache.org>
Authored: Wed Jan 20 11:20:26 2016 -0800
Committer: Reynold Xin <rx...@databricks.com>
Committed: Wed Jan 20 11:20:26 2016 -0800

----------------------------------------------------------------------
 .../main/scala/org/apache/spark/sql/hive/HiveInspectors.scala    | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/e75e340a/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala
----------------------------------------------------------------------
diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala b/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala
index 7a260e7..5d84feb 100644
--- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala
+++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala
@@ -320,7 +320,9 @@ private[hive] trait HiveInspectors {
       case hvoi: HiveCharObjectInspector =>
         UTF8String.fromString(hvoi.getPrimitiveJavaObject(data).getValue)
       case x: StringObjectInspector if x.preferWritable() =>
-        UTF8String.fromString(x.getPrimitiveWritableObject(data).toString)
+        // Text is in UTF-8 already. No need to convert again via fromString
+        val wObj = x.getPrimitiveWritableObject(data)
+        UTF8String.fromBytes(wObj.getBytes, 0, wObj.getLength)
       case x: StringObjectInspector =>
         UTF8String.fromString(x.getPrimitiveJavaObject(data))
       case x: IntObjectInspector if x.preferWritable() => x.get(data)
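
The change above swaps a decode-then-encode round trip for a direct read of the bytes that Text already holds. The following is a minimal, JDK-only sketch of that idea; it does not use Hadoop's Text or Spark's UTF8String, and the names Utf8RoundTripSketch, viaString and viaBytes are hypothetical, chosen only for illustration. It also shows why the patch passes wObj.getLength: Text.getBytes returns the backing array, and only the first getLength bytes of it are valid. The sketch copies the valid prefix for simplicity and makes no claim about how UTF8String.fromBytes manages memory.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.util.Arrays

object Utf8RoundTripSketch {
  // Old path, simulated: decode the valid UTF-8 bytes into a String,
  // then encode that String back into UTF-8 bytes.
  def viaString(buf: Array[Byte], length: Int): Array[Byte] =
    new String(buf, 0, length, UTF_8).getBytes(UTF_8)

  // New path, simulated: take the valid prefix of the buffer as-is,
  // with no decoding or encoding step.
  def viaBytes(buf: Array[Byte], length: Int): Array[Byte] =
    Arrays.copyOfRange(buf, 0, length)

  def main(args: Array[String]): Unit = {
    val payload = "héllo, wörld".getBytes(UTF_8)

    // Simulate a reused backing array that is larger than the valid data
    // and carries stale bytes past the valid length.
    val backing = Arrays.copyOf(payload, payload.length + 8)
    Arrays.fill(backing, payload.length, backing.length, 'x'.toByte)

    val roundTripped = viaString(backing, payload.length)
    val direct = viaBytes(backing, payload.length)

    // For valid UTF-8 input, both paths yield the same bytes.
    assert(Arrays.equals(roundTripped, direct))
    assert(Arrays.equals(direct, payload))

    // Ignoring the length would include the stale trailing bytes.
    assert(!Arrays.equals(backing, payload))

    println("both paths produce identical bytes")
  }
}
```

Both helpers return the same bytes for valid UTF-8 input, so the round trip through String adds work without changing the result, which is the redundancy the commit message describes.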

