Posted to dev@datafu.apache.org by "Eyal Allweil (Jira)" <ji...@apache.org> on 2022/11/10 15:24:00 UTC
[jira] [Updated] (DATAFU-164) Improve test cases
[ https://issues.apache.org/jira/browse/DATAFU-164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eyal Allweil updated DATAFU-164:
--------------------------------
Description:
We can get better code coverage and cover edge cases that are currently missing in our main test file, [TestSparkDFUtils|https://github.com/apache/datafu/blob/master/datafu-spark/src/test/scala/datafu/spark/TestSparkDFUtils.scala].
For example, another test for [joinWithRange|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L419] that covers the case where a record falls into {{decreased_range_single}}, but the range from {{range_start}} to {{range_end}} does not contain {{single}}.
Another test case for the [flatten|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L256] API would also be useful.
Or for [dedupRandomN|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L567].
Or adding to our [randomJoinSkewedTests|https://github.com/apache/datafu/blob/main/datafu-spark/src/test/scala/datafu/spark/TestSparkDFUtils.scala#L256] a test that verifies that _joinSkewed_ gives the same results as a regular join (_broadcastJoinSkewed_ is already checked).
Or for anything else, for that matter.
It's perfectly alright to only do one of them - either as a patch or GitHub PR.
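
To make the joinWithRange edge case above concrete, here is a rough sketch in plain Scala (no Spark). It assumes the join first matches records on a coarse bucket obtained by integer division with a decrease factor (the role {{decreased_range_single}} plays) and then filters on the real range bounds. The names Point, Interval, bucketOf, bucketsOf, candidates and rangeJoin are hypothetical and are not part of the DataFu API; the real method works on DataFrames and may differ in detail.

```scala
object RangeJoinSketch {
  // Hypothetical record types for illustration only; not DataFu classes.
  final case class Point(id: String, single: Long)
  final case class Interval(name: String, rangeStart: Long, rangeEnd: Long)

  // Assumed bucketing scheme: integer division (values taken to be non-negative).
  def bucketOf(value: Long, decreaseFactor: Long): Long = value / decreaseFactor

  // Every bucket an interval overlaps; stands in for decreased_range_single.
  def bucketsOf(r: Interval, decreaseFactor: Long): Seq[Long] =
    bucketOf(r.rangeStart, decreaseFactor) to bucketOf(r.rangeEnd, decreaseFactor)

  // Step 1: coarse match, pairing records that share a bucket.
  def candidates(points: Seq[Point], ranges: Seq[Interval], decreaseFactor: Long): Seq[(Point, Interval)] =
    for {
      p <- points
      r <- ranges
      if bucketsOf(r, decreaseFactor).contains(bucketOf(p.single, decreaseFactor))
    } yield (p, r)

  // Step 2: keep only pairs whose interval really contains the point.
  def rangeJoin(points: Seq[Point], ranges: Seq[Interval], decreaseFactor: Long): Seq[(Point, Interval)] =
    candidates(points, ranges, decreaseFactor).filter { case (p, r) =>
      r.rangeStart <= p.single && p.single <= r.rangeEnd
    }
}
```

With a decrease factor of 10, the values 13 and 16 both share a bucket with the range 15 to 18, but only 16 lies inside it. A new test along these lines would assert that the record with 13 is absent from the joined output.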
was:
We can get better code coverage and cover edge cases that are currently missing in our main tests file, [TestSparkDFUtils|https://github.com/apache/datafu/blob/master/datafu-spark/src/test/scala/datafu/spark/TestSparkDFUtils.scala].
For example, another test for [joinWithRange|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L419] that includes the case that a record falls into {{decreased_range_single}}, but {{range_start}} and {{range_end}} do not contain {{single}}.
Another case for the [flatten|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L256] API could also be good.
Or for [dedupRandomN|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L567].
Or adding tests that verify that our skewed join methods (_broadcastJoinSkewed_ and _joinSkewed_) give the same results as a regular join.
Or for anything else, for that matter.
It's perfectly alright to only do one of them - either as a patch or GitHub PR.
> Improve test cases
> ------------------
>
> Key: DATAFU-164
> URL: https://issues.apache.org/jira/browse/DATAFU-164
> Project: DataFu
> Issue Type: Test
> Reporter: Eyal Allweil
> Priority: Major
> Labels: good-first-issue, newbie, up-for-grabs
>
> We can get better code coverage and cover edge cases that are currently missing in our main test file, [TestSparkDFUtils|https://github.com/apache/datafu/blob/master/datafu-spark/src/test/scala/datafu/spark/TestSparkDFUtils.scala].
>
> For example, another test for [joinWithRange|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L419] that covers the case where a record falls into {{decreased_range_single}}, but the range from {{range_start}} to {{range_end}} does not contain {{single}}.
> Another test case for the [flatten|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L256] API would also be useful.
> Or for [dedupRandomN|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L567].
> Or adding to our [randomJoinSkewedTests|https://github.com/apache/datafu/blob/main/datafu-spark/src/test/scala/datafu/spark/TestSparkDFUtils.scala#L256] a test that verifies that _joinSkewed_ gives the same results as a regular join (_broadcastJoinSkewed_ is already checked).
> Or for anything else, for that matter.
>
> It's perfectly alright to only do one of them - either as a patch or GitHub PR.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)