Posted to dev@drill.apache.org by "Edmon Begoli (JIRA)" <ji...@apache.org> on 2015/09/08 16:55:46 UTC

[jira] [Created] (DRILL-3747) UDF for "fuzzy" string and similarity matching

Edmon Begoli created DRILL-3747:
-----------------------------------

             Summary: UDF for "fuzzy" string and similarity matching
                 Key: DRILL-3747
                 URL: https://issues.apache.org/jira/browse/DRILL-3747
             Project: Apache Drill
          Issue Type: New Feature
          Components: Functions - Drill
    Affects Versions: Future
            Reporter: Edmon Begoli
            Assignee: Mehant Baid
            Priority: Minor


I propose implementing string similarity and distance matching functions similar to those found in most other databases: soundex, metaphone, levenshtein (and more advanced measures such as Damerau-Levenshtein, Jaro-Winkler, etc.).

See fuzzystrmatch http://www.postgresql.org/docs/9.5/static/fuzzystrmatch.html, 
and pg_similarity http://pgsimilarity.projects.pgfoundry.org/
for inspiration.
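
To make the request concrete, here is a minimal sketch of the kind of function being proposed, using plain Levenshtein edit distance as the example. It is written in Python purely for illustration; the function name and signature are hypothetical and do not correspond to any existing Drill UDF or API.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance where insert, delete and substitute each cost 1.

    Hypothetical illustration only; not an existing Drill function.
    """
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short

    # previous[j] holds the distance between a[:i-1] and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca from a
                current[j - 1] + 1,      # insert cb into a
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

A real UDF would also need to decide how to treat NULL inputs and whether to accept a maximum-distance cutoff; those choices are left open here.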



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)