Posted to commits@sdap.apache.org by sk...@apache.org on 2023/03/23 16:28:05 UTC
[incubator-sdap-nexus] branch master updated: SDAP 412 - Solution to Duplicate Primary Issue in /match_spark Endpoint (#216)
This is an automated email from the ASF dual-hosted git repository.
skperez pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-sdap-nexus.git
The following commit(s) were added to refs/heads/master by this push:
new 675f7c2 SDAP 412 - Solution to Duplicate Primary Issue in /match_spark Endpoint (#216)
675f7c2 is described below
commit 675f7c2afe2beb9898243be1a3024a72bdab7ca0
Author: Riley Kuttruff <72...@users.noreply.github.com>
AuthorDate: Thu Mar 23 09:27:59 2023 -0700
SDAP 412 - Solution to Duplicate Primary Issue in /match_spark Endpoint (#216)
* Explicitly defined equality for DomsPoint.
  This prevents the duplicate primary points from appearing in the final results by merging them in the combineByKey step.
* Lazy hashing for domspoint
* Moved changelog entry
* Simplified __eq__ and __hash__
* Updated changelog entry to better reflect reasoning for fix
* Switched equality field from data_id to object id @ construction time
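
The effect of the bullets above can be sketched without Spark. This is an illustrative sketch only, not code from the repository: `PlainPoint`, `KeyedPoint`, `group_by_primary` and `round_trip` are hypothetical names, a plain dict stands in for the key-based merge that `combineByKey` performs, and `copy.deepcopy` stands in for the copies that serialization between workers would produce. It assumes that duplicate primaries arise because such copies are distinct objects under default identity-based equality.

```python
import copy


class PlainPoint:
    """Hypothetical stand-in: default identity-based __eq__ and __hash__."""

    def __init__(self, data_id):
        self.data_id = data_id


class KeyedPoint:
    """Hypothetical stand-in mirroring the __eq__/__hash__ added in this commit."""

    def __init__(self, data_id):
        self.data_id = data_id
        # Captured once at construction; the stored value travels with copies,
        # whereas id(copy) would differ from id(original).
        self.__id = id(self)

    def __eq__(self, other):
        return isinstance(other, KeyedPoint) and other.__id == self.__id

    def __hash__(self):
        return hash(self.data_id) if self.data_id else id(self)


def group_by_primary(pairs):
    """Dict-based stand-in for merging secondary values under an equal key."""
    merged = {}
    for primary, secondary in pairs:
        merged.setdefault(primary, []).append(secondary)
    return merged


def round_trip(obj):
    """Produce an independent copy, roughly as serialization would."""
    return copy.deepcopy(obj)


plain = PlainPoint("p1")
keyed = KeyedPoint("p1")

# Two copies of the same primary point, each paired with a different secondary.
plain_groups = group_by_primary([(round_trip(plain), "s1"), (round_trip(plain), "s2")])
keyed_groups = group_by_primary([(round_trip(keyed), "s1"), (round_trip(keyed), "s2")])

print(len(plain_groups))  # 2: the copies are treated as different keys
print(len(keyed_groups))  # 1: the copies compare equal and are merged
```

In this sketch the copies of `KeyedPoint` land under one key because they share both the `data_id` hash and the id recorded at construction, while two separately constructed points with the same `data_id` would still compare unequal.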
---------
Co-authored-by: rileykk <ri...@jpl.nasa.gov>
---
CHANGELOG.md | 1 +
analysis/webservice/algorithms_spark/Matchup.py | 8 ++++++++
2 files changed, 9 insertions(+)
diff --git a/CHANGELOG.md b/CHANGELOG.md
index a99ee89..dc3baeb 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -28,6 +28,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- SDAP-449: Fixed 404 error when populating datasets; script was still using `/domslist`
- SDAP-415: Fixed bug where mask was incorrectly combined across all variables for multi-variable satellite to satellite matchup
- SDAP-434: Fix for webapp Docker image build failure
+- SDAP-412: Explicit definition of `__eq__` and `__hash__` in matchup `DomsPoint` class. This ensures all primary-secondary pairs with the same primary point are merged in the `combineByKey` step.
### Security
## [1.0.0] - 2022-12-05
diff --git a/analysis/webservice/algorithms_spark/Matchup.py b/analysis/webservice/algorithms_spark/Matchup.py
index 2bc91c3..1274b64 100644
--- a/analysis/webservice/algorithms_spark/Matchup.py
+++ b/analysis/webservice/algorithms_spark/Matchup.py
@@ -368,9 +368,17 @@ class DomsPoint(object):
         self.device = None
         self.file_url = None
 
+        self.__id = id(self)
+
     def __repr__(self):
         return str(self.__dict__)
 
+    def __eq__(self, other):
+        return isinstance(other, DomsPoint) and other.__id == self.__id
+
+    def __hash__(self):
+        return hash(self.data_id) if self.data_id else id(self)
+
     @staticmethod
     def _variables_to_device(variables):
         """