Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/09/26 20:50:28 UTC

[GitHub] [spark] aokolnychyi commented on pull request #36304: [SPARK-38959][SQL] DS V2: Support runtime group filtering in row-level commands

aokolnychyi commented on PR #36304:
URL: https://github.com/apache/spark/pull/36304#issuecomment-1258618345

   I want to resume working on this PR, but I need feedback on one point.
   
   In the original implementation, @cloud-fan and I discussed supporting a separate scan builder for runtime group filtering in row-level operations. That way, we can prune columns and push down filters while looking for groups that have matches. We can't do that in the main row-level scan for group-based data sources, as non-matching records in matching groups have to be copied over. See PR #35395 for context.
   
   The only challenge is ensuring the same version of the table is scanned in the main row-level scan and in the scan that searches for matching groups to rewrite. There are multiple solutions to consider.
   
   **Option 1**
   
   The first option is shown in this PR. We can add a new method to `RowLevelOperation` that provides a scan builder for runtime group filtering.
   
   ```
   interface RowLevelOperation {
     // existing method
     ScanBuilder newScanBuilder(CaseInsensitiveStringMap options);
   
     // new method
     default ScanBuilder newAffectedGroupsScanBuilder(CaseInsensitiveStringMap options) {
        return newScanBuilder(options);
     }
   
     ...
   }
   ```
   
   Under this implementation, it is up to data sources to ensure the same version is scanned in both scans. It is a fairly simple approach, but it complicates the row-level API. On top of that, the new method is useless for data sources that can handle a delta of rows.
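
   For illustration, a group-based data source could satisfy that requirement by pinning the table version once, when the operation is created, so that both scan builders read the same snapshot. A rough sketch with hypothetical names (`MyRowLevelOperation`, `MyScanBuilder`, `currentSnapshotId`):

   ```
   class MyRowLevelOperation implements RowLevelOperation {
     // hypothetical: capture the snapshot once so that every scan builder
     // produced by this operation reads the same table version
     private final long snapshotId = currentSnapshotId();

     @Override
     public ScanBuilder newScanBuilder(CaseInsensitiveStringMap options) {
       // main row-level scan: must read full rows of matching groups
       return new MyScanBuilder(snapshotId, options);
     }

     @Override
     public ScanBuilder newAffectedGroupsScanBuilder(CaseInsensitiveStringMap options) {
       // group-filter scan: free to prune columns and push down filters
       return new MyScanBuilder(snapshotId, options);
     }

     ...
   }
   ```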
   
   **Option 2**
   
   The main row-level `Scan` can report the scanned `tableVersion`, and we can use that information to load the correct table version in the rule that assigns the runtime filter. This can be done via the time-travel method `TableCatalog#loadTable(ident, version)`. The only API change is to extend `Scan` with `tableVersion` so we know which table version is being read by the main scan.
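
   A rough sketch of Option 2 (the `tableVersion` method is the proposed addition, not an existing API; `loadTable(ident, version)` is the existing time-travel method on `TableCatalog`):

   ```
   interface Scan {
     // proposed addition: the table version read by this scan,
     // or null if the scan is not pinned to a specific version
     default String tableVersion() {
       return null;
     }

     ...
   }
   ```

   The rule that assigns the runtime filter would then call something like `catalog.loadTable(ident, scan.tableVersion())` to obtain a `Table` at the matching version before building the filtering scan.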
   
   **Option 3**
   
   The rule that assigns a runtime group filter has access to the original `Table` object. We could simply call `newScanBuilder` on it. However, nothing in the API implies that reusing the same `Table` instance guarantees the same version of the table will be scanned. If we call `newScanBuilder` twice on one `Table` instance, do we expect both scans to see the same version? That does NOT seem to be the assumption right now.
   
   If we can somehow rely on reusing the `Table` object, it would be the cleanest option from the API perspective.
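
   For comparison, the rule-side logic under Option 3 would be roughly the following (a sketch: `table` stands for the original `Table` instance captured by the row-level plan, and correctness hinges on the unstated same-version guarantee):

   ```
   // inside the rule that assigns the runtime group filter
   Table table = ...;  // original Table instance from the row-level plan
   ScanBuilder builder = table.newScanBuilder(options);
   // prune columns and push down filters on the builder, then:
   Scan groupFilterScan = builder.build();
   // correct only if two scans obtained from the same Table instance are
   // guaranteed to read the same table version, which the API does not
   // state today
   ```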
   
   Any ideas how to make Option 3 work?
   
   cc @cloud-fan @rdblue @huaxingao @dongjoon-hyun @sunchao @viirya
   
   

