Posted to dev@datafu.apache.org by "Eyal Allweil (Jira)" <ji...@apache.org> on 2022/11/10 15:24:00 UTC

[jira] [Updated] (DATAFU-164) Improve test cases

     [ https://issues.apache.org/jira/browse/DATAFU-164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eyal Allweil updated DATAFU-164:
--------------------------------
    Description: 
We can get better code coverage and cover edge cases that are currently missing in our main tests file, [TestSparkDFUtils|https://github.com/apache/datafu/blob/master/datafu-spark/src/test/scala/datafu/spark/TestSparkDFUtils.scala].

 

For example, another test for [joinWithRange|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L419] that covers the case where a record falls into {{decreased_range_single}}, but {{range_start}} and {{range_end}} do not contain {{single}}.

Another case for the [flatten|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L256] API could also be good.

Or for [dedupRandomN|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L567].

Or adding to our [randomJoinSkewedTests|https://github.com/apache/datafu/blob/main/datafu-spark/src/test/scala/datafu/spark/TestSparkDFUtils.scala#L256] a test that verifies that _joinSkewed_ gives the same results as a regular join (_broadcastJoinSkewed_ is already checked).

Or for anything else, for that matter.

 

It's perfectly alright to only do one of them - either as a patch or GitHub PR.
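
As a rough illustration of the last kind of test (a skewed join must return the same rows as a regular join), here is a minimal sketch that uses plain Python collections instead of Spark. It assumes a salting-style strategy, which is a common way to handle skew and is not taken from DataFu's source; the names regular_join and salted_join and the sample data are hypothetical. The part meant to carry over to a real test is the comparison at the end: both results are compared as multisets, so row order does not matter but duplicates do.

```python
import random
from collections import Counter


def regular_join(left, right):
    """Inner join of (key, value) pairs; the baseline the test compares against."""
    return [(k, lv, rv) for k, lv in left for rk, rv in right if rk == k]


def salted_join(left, right, num_shards, seed=0):
    """Inner join that spreads hot keys over num_shards buckets (illustrative only)."""
    rng = random.Random(seed)
    # Skewed side: tag each row with one randomly chosen shard.
    salted_left = [((k, rng.randrange(num_shards)), lv) for k, lv in left]
    # Other side: replicate each row once per shard so every tagged row finds its matches.
    index = {}
    for k, rv in right:
        for shard in range(num_shards):
            index.setdefault((k, shard), []).append(rv)
    return [(sk[0], lv, rv) for sk, lv in salted_left for rv in index.get(sk, [])]


# Hypothetical data: one heavily skewed key, one ordinary key, and one
# unmatched key on each side.
left = [("hot", i) for i in range(50)] + [("cold", 100), ("unmatched", 101)]
right = [("hot", "a"), ("hot", "b"), ("cold", "c"), ("orphan", "d")]

expected = Counter(regular_join(left, right))
actual = Counter(salted_join(left, right, num_shards=4))
assert actual == expected  # same rows with the same multiplicities, order ignored
```

In a Spark test the same idea would amount to running both joins on identical inputs and comparing the collected rows after sorting them, or comparing them as multisets.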

  was:
We can get better code coverage and cover edge cases that are currently missing in our main tests file, [TestSparkDFUtils|https://github.com/apache/datafu/blob/master/datafu-spark/src/test/scala/datafu/spark/TestSparkDFUtils.scala].

 

For example, another test for [joinWithRange|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L419] that covers the case where a record falls into {{decreased_range_single}}, but {{range_start}} and {{range_end}} do not contain {{single}}.

Another case for the [flatten|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L256] API could also be good.

Or for [dedupRandomN|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L567].

Or adding tests that verify that our skewed join methods (_broadcastJoinSkewed_ and _joinSkewed_) give the same results as a regular join.

Or for anything else, for that matter.

 

It's perfectly alright to only do one of them - either as a patch or GitHub PR.


> Improve test cases
> ------------------
>
>                 Key: DATAFU-164
>                 URL: https://issues.apache.org/jira/browse/DATAFU-164
>             Project: DataFu
>          Issue Type: Test
>            Reporter: Eyal Allweil
>            Priority: Major
>              Labels: good-first-issue, newbie, up-for-grabs
>
> We can get better code coverage and cover edge cases that are currently missing in our main tests file, [TestSparkDFUtils|https://github.com/apache/datafu/blob/master/datafu-spark/src/test/scala/datafu/spark/TestSparkDFUtils.scala].
>  
> For example, another test for [joinWithRange|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L419] that covers the case where a record falls into {{decreased_range_single}}, but {{range_start}} and {{range_end}} do not contain {{single}}.
> Another case for the [flatten|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L256] API could also be good.
> Or for [dedupRandomN|https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala#L567].
> Or adding to our [randomJoinSkewedTests|https://github.com/apache/datafu/blob/main/datafu-spark/src/test/scala/datafu/spark/TestSparkDFUtils.scala#L256] a test that verifies that _joinSkewed_ gives the same results as a regular join (_broadcastJoinSkewed_ is already checked).
> Or for anything else, for that matter.
>  
> It's perfectly alright to only do one of them - either as a patch or GitHub PR.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)