Posted to reviews@impala.apache.org by "Greg Rahn (Code Review)" <ge...@cloudera.org> on 2018/11/28 06:20:45 UTC

[Impala-ASF-CR] IMPALA-7759: Add Levenshtein edit distance built-in function

Greg Rahn has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/11793 )

Change subject: IMPALA-7759: Add Levenshtein edit distance built-in function
......................................................................

IMPALA-7759: Add Levenshtein edit distance built-in function

This patch adds new built-in functions to calculate Levenshtein edit
distance. The function is implemented as levenshtein() to match
PostgreSQL in both functionality and name, and le_dst() is added as an
alias for Netezza compatibility. Note that levenshtein() differs in
functionality from Netezza's le_dst(): if either value is NULL or both
values are NULL, levenshtein() returns NULL, whereas Netezza's le_dst()
returns the length of the non-NULL value, or 0 if both values are NULL.
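
The NULL handling described above can be illustrated with a minimal
Python sketch. This is not the code in string-functions-ir.cc; it is a
textbook two-row dynamic-programming version, and the function names
levenshtein_null_aware and le_dst_netezza_style are hypothetical names
chosen for this illustration. It also assumes comparison by Python
characters, which may differ from how the backend treats multi-byte
strings.

```python
from typing import Optional


def levenshtein_null_aware(a: Optional[str], b: Optional[str]) -> Optional[int]:
    """Edit distance that returns None if either input is None.

    Mirrors the NULL semantics the commit message describes for
    levenshtein(); the algorithm itself is the standard DP, assumed here.
    """
    if a is None or b is None:
        return None
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def le_dst_netezza_style(a: Optional[str], b: Optional[str]) -> int:
    """Hypothetical helper showing the Netezza behavior described above:
    a None input counts as empty, so the result is the other length or 0."""
    if a is None and b is None:
        return 0
    if a is None:
        return len(b)
    if b is None:
        return len(a)
    return levenshtein_null_aware(a, b)
```

Under these assumptions, levenshtein_null_aware("kitten", "sitting")
gives 3 and levenshtein_null_aware(None, "abc") gives None, while
le_dst_netezza_style(None, "abc") gives 3.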

Testing:
- Added unit tests to expr-test.cc
- Manually tested on 966289 string pairs; results match PostgreSQL
- Added changes to qgen tests for PostgreSQL comparison

Change-Id: I549d33ab7cebfa10db2934461c8ec91e2cc1cdcb
---
M be/src/exprs/expr-test.cc
M be/src/exprs/string-functions-ir.cc
M be/src/exprs/string-functions.h
M common/function-registry/impala_functions.py
A tests/comparison/compat.py
M tests/comparison/discrepancy_searcher.py
M tests/comparison/funcs.py
7 files changed, 128 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/11793/3
-- 
To view, visit http://gerrit.cloudera.org:8080/11793

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I549d33ab7cebfa10db2934461c8ec91e2cc1cdcb
Gerrit-Change-Number: 11793
Gerrit-PatchSet: 3
Gerrit-Owner: Greg Rahn <gr...@cloudera.com>
Gerrit-Reviewer: Greg Rahn <gr...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>