Posted to dev@sdap.apache.org by "Frank Greguska (JIRA)" <ji...@apache.org> on 2018/07/24 18:11:00 UTC
[jira] [Created] (SDAP-127) ID should be unique in SOLR schema
Frank Greguska created SDAP-127:
-----------------------------------
Summary: ID should be unique in SOLR schema
Key: SDAP-127
URL: https://issues.apache.org/jira/browse/SDAP-127
Project: Apache Science Data Analytics Platform
Issue Type: Improvement
Reporter: Frank Greguska
The "solr_id_s" field is currently the "uniqueKey" for the schema:
https://github.com/apache/incubator-sdap-nexus/blob/107438af45b479348ffb75a667b276ee3c81f9da/data-access/config/schemas/solr/nexustiles/conf/managed-schema#L200
This is fine, but many of the algorithms rely on the plain "id" field when working with tiles (the "id" field holds the same value as "solr_id_s", minus the prefix used for document routing):
https://github.com/apache/incubator-sdap-nexus/blob/107438af45b479348ffb75a667b276ee3c81f9da/data-access/config/schemas/solr/nexustiles/conf/managed-schema#L120
If possible, the "id" field should also be marked as unique so that it is impossible to generate tiles with identical "id"s.
This problem surfaced with SLCP ice shelf data, where 2 variables from the same granule were being ingested. The ID is generated from the granule name, the section spec, and an optional 'salt' value. In this case the salt was mistakenly omitted, so the tiles for both variables were generated with identical "id"s. No error occurred because the tiles had different dataset names, which kept the "solr_id_s" values unique.
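To illustrate the collision described above, here is a minimal Python sketch. It is not the actual ingestion code: the helper names (make_tile_id, make_solr_id), the name-based UUID scheme, the "<dataset>!<id>" prefix format, and the granule, section spec, and dataset values are all assumptions made for the example.

```python
import uuid
from typing import Optional


def make_tile_id(granule_name: str, section_spec: str,
                 salt: Optional[str] = None) -> str:
    """Hypothetical tile id: a name-based UUID over granule name,
    section spec, and an optional salt (assumed scheme)."""
    name = granule_name + section_spec + (salt or "")
    return str(uuid.uuid3(uuid.NAMESPACE_DNS, name))


def make_solr_id(dataset_name: str, tile_id: str) -> str:
    """Hypothetical solr_id_s: the tile id with a dataset prefix used
    for document routing (assumed "<dataset>!<id>" format)."""
    return f"{dataset_name}!{tile_id}"


# Made-up granule and section spec shared by two variables.
granule = "example_granule.nc"
section = "time:0:1,y:0:100,x:0:100"

# Without a salt, both variables produce the same "id" ...
id_a = make_tile_id(granule, section)
id_b = make_tile_id(granule, section)

# ... but different dataset names keep "solr_id_s" unique,
# so a uniqueKey on solr_id_s does not catch the duplicate.
solr_a = make_solr_id("dataset_var_a", id_a)
solr_b = make_solr_id("dataset_var_b", id_b)

# With a per-variable salt, the "id" values differ as intended.
salted_a = make_tile_id(granule, section, salt="var_a")
salted_b = make_tile_id(granule, section, salt="var_b")

print(id_a == id_b, solr_a == solr_b, salted_a == salted_b)
```

Under these assumptions, the unsalted ids are equal while the prefixed values differ, which matches the behavior reported here: the duplicate "id"s went unnoticed because only "solr_id_s" is checked for uniqueness.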
I am not sure whether it is possible to have more than one unique field in a SOLR schema.