Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/10/10 09:00:00 UTC

[jira] [Work logged] (HIVE-24762) StringValueBoundaryScanner ignores boundary which leads to incorrect results

     [ https://issues.apache.org/jira/browse/HIVE-24762?focusedWorklogId=815137&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-815137 ]

ASF GitHub Bot logged work on HIVE-24762:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 10/Oct/22 08:59
            Start Date: 10/Oct/22 08:59
    Worklog Time Spent: 10m 
      Work Description: sonarcloud[bot] commented on PR #1965:
URL: https://github.com/apache/hive/pull/1965#issuecomment-1272997758

   Kudos, SonarCloud Quality Gate passed! (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=1965)

   6 Bugs (rating C)
   0 Vulnerabilities (rating A)
   1 Security Hotspot (rating E)
   68 Code Smells (rating A)

   No Coverage information
   No Duplication information




Issue Time Tracking
-------------------

    Worklog Id:     (was: 815137)
    Time Spent: 2h 20m  (was: 2h 10m)

>  StringValueBoundaryScanner ignores boundary which leads to incorrect results
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-24762
>                 URL: https://issues.apache.org/jira/browse/HIVE-24762
>             Project: Hive
>          Issue Type: Bug
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/ValueBoundaryScanner.java#L901
> {code}
>   public boolean isDistanceGreater(Object v1, Object v2, int amt) {
>     ...
>     return s1 != null && s2 != null && s1.compareTo(s2) > 0;
> {code}
> Like the other boundary scanners, StringValueBoundaryScanner should take amt into account; otherwise it produces the same range regardless of the given window size. This typically affects queries where the range is defined on a string column:
> {code}
> select p_mfgr, p_name, p_retailprice,
> count(*) over(partition by p_mfgr order by p_name range between 1 preceding and current row) as cs1,
> count(*) over(partition by p_mfgr order by p_name range between 3 preceding and current row) as cs2
> from vector_ptf_part_simple_orc;
> {code} 
> With "> 0", cs1 and cs2 are calculated on the same window, so cs1 == cs2 in every row. The two counts should actually differ in some rows; this is the correct result (see "almond antique olive coral navajo"):
> {code}
> +-----------------+---------------------------------------------+------+------+
> |     p_mfgr      |                   p_name                    | cs1  | cs2  |
> +-----------------+---------------------------------------------+------+------+
> | Manufacturer#1  | almond antique burnished rose metallic      | 2    | 2    |
> | Manufacturer#1  | almond antique burnished rose metallic      | 2    | 2    |
> | Manufacturer#1  | almond antique chartreuse lavender yellow   | 6    | 6    |
> | Manufacturer#1  | almond antique chartreuse lavender yellow   | 6    | 6    |
> | Manufacturer#1  | almond antique chartreuse lavender yellow   | 6    | 6    |
> | Manufacturer#1  | almond antique chartreuse lavender yellow   | 6    | 6    |
> | Manufacturer#1  | almond antique salmon chartreuse burlywood  | 1    | 1    |
> | Manufacturer#1  | almond aquamarine burnished black steel     | 1    | 8    |
> | Manufacturer#1  | almond aquamarine pink moccasin thistle     | 4    | 4    |
> | Manufacturer#1  | almond aquamarine pink moccasin thistle     | 4    | 4    |
> | Manufacturer#1  | almond aquamarine pink moccasin thistle     | 4    | 4    |
> | Manufacturer#1  | almond aquamarine pink moccasin thistle     | 4    | 4    |
> | Manufacturer#2  | almond antique violet chocolate turquoise   | 1    | 1    |
> | Manufacturer#2  | almond antique violet turquoise frosted     | 3    | 3    |
> | Manufacturer#2  | almond antique violet turquoise frosted     | 3    | 3    |
> | Manufacturer#2  | almond antique violet turquoise frosted     | 3    | 3    |
> | Manufacturer#2  | almond aquamarine midnight light salmon     | 1    | 5    |
> | Manufacturer#2  | almond aquamarine rose maroon antique       | 2    | 2    |
> | Manufacturer#2  | almond aquamarine rose maroon antique       | 2    | 2    |
> | Manufacturer#2  | almond aquamarine sandy cyan gainsboro      | 3    | 3    |
> | Manufacturer#3  | almond antique chartreuse khaki white       | 1    | 1    |
> | Manufacturer#3  | almond antique forest lavender goldenrod    | 4    | 5    |
> | Manufacturer#3  | almond antique forest lavender goldenrod    | 4    | 5    |
> | Manufacturer#3  | almond antique forest lavender goldenrod    | 4    | 5    |
> | Manufacturer#3  | almond antique forest lavender goldenrod    | 4    | 5    |
> | Manufacturer#3  | almond antique metallic orange dim          | 1    | 1    |
> | Manufacturer#3  | almond antique misty red olive              | 1    | 1    |
> | Manufacturer#3  | almond antique olive coral navajo           | 1    | 3    |
> | Manufacturer#4  | almond antique gainsboro frosted violet     | 1    | 1    |
> | Manufacturer#4  | almond antique violet mint lemon            | 1    | 1    |
> | Manufacturer#4  | almond aquamarine floral ivory bisque       | 2    | 4    |
> | Manufacturer#4  | almond aquamarine floral ivory bisque       | 2    | 4    |
> | Manufacturer#4  | almond aquamarine yellow dodger mint        | 1    | 1    |
> | Manufacturer#4  | almond azure aquamarine papaya violet       | 1    | 1    |
> | Manufacturer#5  | almond antique blue firebrick mint          | 1    | 1    |
> | Manufacturer#5  | almond antique medium spring khaki          | 2    | 2    |
> | Manufacturer#5  | almond antique medium spring khaki          | 2    | 2    |
> | Manufacturer#5  | almond antique sky peru orange              | 1    | 1    |
> | Manufacturer#5  | almond aquamarine dodger light gainsboro    | 1    | 5    |
> | Manufacturer#5  | almond azure blanched chiffon midnight      | 1    | 1    |
> +-----------------+---------------------------------------------+------+------+
> {code}
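
To make the expected numbers above easier to check by hand, here is a minimal Java sketch. It assumes that "taking amt into account" means comparing the result of String.compareTo against amt instead of 0; the description does not spell out the exact fix, so treat that as an assumption. The class StringRangeSketch, the helper countInWindow and the MFGR3 list are hypothetical names made up for this illustration and are not part of Hive. The sketch relies on String.compareTo returning the difference of the first mismatching characters, and on the rows being non-null and sorted ascending.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; this is not the Hive ValueBoundaryScanner class.
class StringRangeSketch {

  // p_name values of Manufacturer#3, copied from the result table, in ascending order.
  static final List<String> MFGR3 = Arrays.asList(
      "almond antique chartreuse khaki white",
      "almond antique forest lavender goldenrod",
      "almond antique forest lavender goldenrod",
      "almond antique forest lavender goldenrod",
      "almond antique forest lavender goldenrod",
      "almond antique metallic orange dim",
      "almond antique misty red olive",
      "almond antique olive coral navajo");

  // Assumption: the compareTo result is checked against amt instead of 0.
  static boolean isDistanceGreater(String s1, String s2, int amt) {
    return s1 != null && s2 != null && s1.compareTo(s2) > amt;
  }

  // count(*) over "range between amt preceding and current row" for one row:
  // counts rows that sort at or before the current value (peers included)
  // and are not farther than amt from it. Values are assumed non-null.
  static int countInWindow(List<String> sorted, int current, int amt) {
    String cur = sorted.get(current);
    int count = 0;
    for (String row : sorted) {
      if (row.compareTo(cur) <= 0 && !isDistanceGreater(cur, row, amt)) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    int row = 7; // "almond antique olive coral navajo"
    System.out.println("cs1=" + countInWindow(MFGR3, row, 1)
        + ", cs2=" + countInWindow(MFGR3, row, 3)); // prints cs1=1, cs2=3
  }
}
```

Under this assumption, "olive" is at distance 2 from both "misty" and "metallic" ('o' - 'm'), so those two rows fall outside the "1 preceding" window but inside the "3 preceding" one, which matches the cs1 = 1 and cs2 = 3 shown in the table for that row.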



--
This message was sent by Atlassian Jira
(v8.20.10#820010)