You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by "TANG Wen-hui (Jira)" <ji...@apache.org> on 2020/03/20 08:42:00 UTC

[jira] [Created] (CALCITE-3866) ReturnTypes.AGG_SUM may cause "numeric field overflow" on PostgreSQL when generate the sql after using the rule AggregateJoinTransposeRule.EXTENDED.

TANG Wen-hui created CALCITE-3866:
-------------------------------------

             Summary:  ReturnTypes.AGG_SUM may cause "numeric field overflow" on PostgreSQL when generate the sql after using the rule AggregateJoinTransposeRule.EXTENDED.
                 Key: CALCITE-3866
                 URL: https://issues.apache.org/jira/browse/CALCITE-3866
             Project: Calcite
          Issue Type: Bug
            Reporter: TANG Wen-hui
            Assignee: TANG Wen-hui


When we try to generate a sql after applying AggregateJoinTransposeRule.EXTENDED, the result sql can not run on PostgreSQL, and throws the following exception:
{code:java}
PSQLException: ERROR: numeric field overflow Detail: A field with precision 7, scale 2 must round to an absolute value less than 10^5.
{code}
I found that the main reason is that :

the return types of sum may have wrong precision when the type of 
 its operand is the decimal with precision, for example:
{code:java}
@Test public void testSum() {
  String query =
      "select sum(e1.\"store_sales\"), sum(e2.\"store_sales\") from \"sales_fact_dec_1998\" as "
          + "e1 , \"sales_fact_dec_1998\" as e2 where e1.\"product_id\" = e2.\"product_id\"";
  String expect = "";
  HepProgramBuilder builder = new HepProgramBuilder();
  builder.addRuleClass(FilterJoinRule.class);
  builder.addRuleClass(AggregateProjectMergeRule.class);
  builder.addRuleClass(AggregateJoinTransposeRule.class);
  HepPlanner hepPlanner = new HepPlanner(builder.build());
  RuleSet rules = RuleSets.ofList(FilterJoinRule.FILTER_ON_JOIN, FilterJoinRule.JOIN,
      AggregateProjectMergeRule.INSTANCE,
      AggregateJoinTransposeRule.EXTENDED);
  sql(query).withPostgresql().optimize(rules, hepPlanner).ok(expect);
}
{code}
the result generated sql of the query is :
{code:java}
SELECT SUM(CAST(\"t\".\"EXPR$0\" * \"t0\".\"$f1\" AS DECIMAL(10, 4))), SUM(CAST(\"t\".\"$f2\" * \"t0\".\"EXPR$1\" AS DECIMAL(10, 4)))
FROM (SELECT \"product_id\", SUM(\"store_sales\") AS \"EXPR$0\", COUNT(*) AS \"$f2\"
FROM \"foodmart\".\"sales_fact_dec_1998\"
GROUP BY \"product_id\") AS \"t\"
INNER JOIN (SELECT \"product_id\", COUNT(*) AS \"$f1\", SUM(\"store_sales\") AS \"EXPR$1\"
FROM \"foodmart\".\"sales_fact_dec_1998\"
GROUP BY \"product_id\") AS \"t0\" ON \"t\".\"product_id\" = \"t0\".\"product_id\"
{code}
AggregateJoinTransposeRule.EXTENDED generates a Aggregate  to sum up the sub-totals:
{code:java}
// Aggregate above to sum up the sub-totals
final List<AggregateCall> newAggCalls = new ArrayList<>();
final int groupCount = aggregate.getGroupCount();
final int newLeftWidth = sides.get(0).newInput.getRowType().getFieldCount();
final List<RexNode> projects =
    new ArrayList<>(
        rexBuilder.identityProjects(relBuilder.peek().getRowType()));
for (Ord<AggregateCall> aggCall : Ord.zip(aggregate.getAggCallList())) {
  final SqlAggFunction aggregation = aggCall.e.getAggregation();
  final SqlSplittableAggFunction splitter =
      Objects.requireNonNull(
          aggregation.unwrap(SqlSplittableAggFunction.class));
  final Integer leftSubTotal = sides.get(0).split.get(aggCall.i);
  final Integer rightSubTotal = sides.get(1).split.get(aggCall.i);
  newAggCalls.add(
      splitter.topSplit(rexBuilder, registry(projects),
          groupCount, relBuilder.peek().getRowType(), aggCall.e,
          leftSubTotal == null ? -1 : leftSubTotal,
          rightSubTotal == null ? -1 : rightSubTotal + newLeftWidth));
}

public AggregateCall topSplit(RexBuilder rexBuilder,
    Registry<RexNode> extra, int offset, RelDataType inputRowType,
    AggregateCall aggregateCall, int leftSubTotal, int rightSubTotal) {
  final List<RexNode> merges = new ArrayList<>();
  final List<RelDataTypeField> fieldList = inputRowType.getFieldList();
  if (leftSubTotal >= 0) {
    final RelDataType type = fieldList.get(leftSubTotal).getType();
    merges.add(rexBuilder.makeInputRef(type, leftSubTotal));
  }
  if (rightSubTotal >= 0) {
    final RelDataType type = fieldList.get(rightSubTotal).getType();
    merges.add(rexBuilder.makeInputRef(type, rightSubTotal));
  }
  RexNode node;
  switch (merges.size()) {
  case 1:
    node = merges.get(0);
    break;
  case 2:
    node = rexBuilder.makeCall(SqlStdOperatorTable.MULTIPLY, merges);
    node = rexBuilder.makeAbstractCast(aggregateCall.type, node);
    break;
  default:
    throw new AssertionError("unexpected count " + merges);
  }
  int ordinal = extra.register(node);
  return AggregateCall.create(getMergeAggFunctionOfTopSplit(), false, false,
      false, ImmutableList.of(ordinal), -1, aggregateCall.collation,
      aggregateCall.type, aggregateCall.name);
}
{code}
but it use the type of origin sum to generate a cast like 
{code:java}
SUM(CAST(\"t\".\"$f2\" * \"t0\".\"EXPR$1\" AS DECIMAL(10, 4))){code}
ReturnTypes.AGG_SUM simplely use the operand type as its return type:
{code:java}
public static final SqlReturnTypeInference AGG_SUM = opBinding -> {
  final RelDataTypeFactory typeFactory = opBinding.getTypeFactory();
  final RelDataType type = typeFactory.getTypeSystem()
      .deriveSumType(typeFactory, opBinding.getOperandType(0));
  if (opBinding.getGroupCount() == 0 || opBinding.hasFilter()) {
    return typeFactory.createTypeWithNullability(type, true);
  } else {
    return type;
  }
};
@Override public RelDataType deriveSumType(RelDataTypeFactory typeFactory,
    RelDataType argumentType) {
  return argumentType;
}
{code}
So the result of \"t\".\"$f2\" * \"t0\".\"EXPR$1\" may too large to cause overflow 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)