Posted to reviews@spark.apache.org by wzhfy <gi...@git.apache.org> on 2017/03/03 06:57:52 UTC

[GitHub] spark pull request #17065: [SPARK-17075][SQL][followup] fix some minor issue...

GitHub user wzhfy commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17065#discussion_r104098256
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala ---
    @@ -95,15 +84,16 @@ case class FilterEstimation(plan: Filter, catalystConf: CatalystConf) extends Lo
        * @param condition the compound logical expression
        * @param update a boolean flag to specify if we need to update ColumnStat of a column
        *               for subsequent conditions
    -   * @return a double value to show the percentage of rows meeting a given condition.
    +   * @return an optional double value to show the percentage of rows meeting a given condition.
        *         It returns None if the condition is not supported.
        */
       def calculateFilterSelectivity(condition: Expression, update: Boolean = true): Option[Double] = {
    -
         condition match {
           case And(cond1, cond2) =>
    -        (calculateFilterSelectivity(cond1, update), calculateFilterSelectivity(cond2, update))
    -        match {
    +        // For ease of debugging, we compute percent1 and percent2 in 2 statements.
    +        val percent1 = calculateFilterSelectivity(cond1, update)
    +        val percent2 = calculateFilterSelectivity(cond2, update)
    +        (percent1, percent2) match {
               case (Some(p1), Some(p2)) => Some(p1 * p2)
               case (Some(p1), None) => Some(p1)
    --- End diff --
    
    @cloud-fan @ron8hu I'm a little confused about this. For a `Not` expression, an over-estimation of the inner condition always becomes an under-estimation once it is negated, whether or not the `Not` is nested. So should we remove support only for nested `Not`, or for `Not` altogether?
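
To make the concern concrete, here is a minimal, self-contained sketch. It does not use the real Catalyst classes: `Cond`, `Known`, `Unsupported`, and `SelectivitySketch` are hypothetical stand-ins, the `(None, Some(p2))` branch is assumed to mirror the `(Some(p1), None)` branch shown in the diff, and `Not` is assumed to be estimated as `1 - p`.

```scala
// Hypothetical stand-ins for Catalyst expressions, for illustration only.
sealed trait Cond
case class Known(selectivity: Double) extends Cond   // a predicate we can estimate
case object Unsupported extends Cond                 // a predicate we cannot estimate
case class And(left: Cond, right: Cond) extends Cond
case class Not(child: Cond) extends Cond

object SelectivitySketch {
  // Returns None when the condition is not supported at all.
  def estimate(c: Cond): Option[Double] = c match {
    case Known(p)    => Some(p)
    case Unsupported => None
    case And(l, r) =>
      val percent1 = estimate(l)
      val percent2 = estimate(r)
      (percent1, percent2) match {
        case (Some(p1), Some(p2)) => Some(p1 * p2)
        // Over-estimation: the unsupported side is treated as if it filtered nothing.
        case (Some(p1), None)     => Some(p1)
        case (None, Some(p2))     => Some(p2)   // assumed symmetric case
        case (None, None)         => None
      }
    // Assumption: negation is estimated as the complement of the child's selectivity,
    // so an over-estimated child yields an under-estimated result.
    case Not(child) => estimate(child).map(1.0 - _)
  }
}
```

Suppose the supported predicate has selectivity 0.5 and the unsupported one would really have selectivity 0.2 (independent of the first). The true selectivity of the conjunction is 0.1, but `estimate(And(Known(0.5), Unsupported))` gives `Some(0.5)`, an over-estimate. Wrapping it in `Not` gives `Some(0.5)` as well, while the true value is 0.9, so the estimate is now too low. This happens for a single `Not`, not only for nested ones.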


