Posted to commits@sdap.apache.org by sk...@apache.org on 2023/03/23 16:28:05 UTC

[incubator-sdap-nexus] branch master updated: SDAP 412 - Solution to Duplicate Primary Issue in /match_spark Endpoint (#216)

This is an automated email from the ASF dual-hosted git repository.

skperez pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-sdap-nexus.git


The following commit(s) were added to refs/heads/master by this push:
     new 675f7c2  SDAP 412 - Solution to Duplicate Primary Issue in /match_spark Endpoint (#216)
675f7c2 is described below

commit 675f7c2afe2beb9898243be1a3024a72bdab7ca0
Author: Riley Kuttruff <72...@users.noreply.github.com>
AuthorDate: Thu Mar 23 09:27:59 2023 -0700

    SDAP 412 - Solution to Duplicate Primary Issue in /match_spark Endpoint (#216)
    
    * Explicitly defined equality for DomsPoint.
    
    This prevents the duplicate primary points from appearing in the final results by merging them in the combineByKey step.
    
    * Lazy hashing for domspoint
    
    * Moved changelog entry
    
    * Simplified __eq__ and __hash__
    
    * Updated changelog entry to better reflect reasoning for fix
    
    * Switched equality field from data_id to object id @ construction time
    
    ---------
    
    Co-authored-by: rileykk <ri...@jpl.nasa.gov>
---
 CHANGELOG.md                                    | 1 +
 analysis/webservice/algorithms_spark/Matchup.py | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index a99ee89..dc3baeb 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -28,6 +28,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - SDAP-449: Fixed 404 error when populating datasets; script was still using `/domslist`
 - SDAP-415: Fixed bug where mask was incorrectly combined across all variables for multi-variable satellite to satellite matchup
 - SDAP-434: Fix for webapp Docker image build failure
+- SDAP-412: Explicit definition of `__eq__` and `__hash__` in matchup `DomsPoint` class. This ensures all primary-secondary pairs with the same primary point are merged in the `combineByKey` step.
 ### Security
 
 ## [1.0.0] - 2022-12-05
diff --git a/analysis/webservice/algorithms_spark/Matchup.py b/analysis/webservice/algorithms_spark/Matchup.py
index 2bc91c3..1274b64 100644
--- a/analysis/webservice/algorithms_spark/Matchup.py
+++ b/analysis/webservice/algorithms_spark/Matchup.py
@@ -368,9 +368,17 @@ class DomsPoint(object):
         self.device = None
         self.file_url = None
 
+        self.__id = id(self)
+
     def __repr__(self):
         return str(self.__dict__)
 
+    def __eq__(self, other):
+        return isinstance(other, DomsPoint) and other.__id == self.__id
+
+    def __hash__(self):
+        return hash(self.data_id) if self.data_id else id(self)
+
     @staticmethod
     def _variables_to_device(variables):
         """
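
For illustration only, and not part of the commit: a minimal Python sketch of why recording an object's identity at construction time lets copies of one primary point collapse into a single key. `Point`, `PlainPoint`, and `merge_pairs` are hypothetical stand-ins for `DomsPoint` and the `combineByKey` step, and `copy.deepcopy` stands in for the serialization round trip a distributed job would perform. The hash fallback here uses the stored identity rather than `id(self)` as in the diff. All of these are assumptions made for the sketch.

```python
import copy


class PlainPoint:
    """Hypothetical point with default (identity-based) equality."""

    def __init__(self, data_id=None):
        self.data_id = data_id


class Point:
    """Hypothetical stand-in for DomsPoint; not the SDAP class."""

    def __init__(self, data_id=None):
        self.data_id = data_id
        # Record the identity once. The value lives in __dict__, so it is
        # carried along when the object is copied or serialized.
        self._id = id(self)

    def __eq__(self, other):
        return isinstance(other, Point) and other._id == self._id

    def __hash__(self):
        # Assumption: fall back to the stored identity, not id(self).
        return hash(self.data_id) if self.data_id else hash(self._id)


def merge_pairs(pairs):
    """Group secondaries by primary key, loosely like a combine-by-key."""
    merged = {}
    for primary, secondary in pairs:
        merged.setdefault(primary, []).append(secondary)
    return merged


# Two copies of the same primary, as might arrive from different tasks.
primary = Point(data_id="p1")
copy_a, copy_b = copy.deepcopy(primary), copy.deepcopy(primary)
merged = merge_pairs([(copy_a, "s1"), (copy_b, "s2")])

# With default equality, the copies stay separate keys.
plain = PlainPoint(data_id="p1")
unmerged = merge_pairs([(copy.deepcopy(plain), "s1"), (copy.deepcopy(plain), "s2")])
```

In this sketch `merged` ends up with one key holding both secondaries, while `unmerged` keeps two keys, which mirrors the duplicate-primary symptom described in the commit message.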