Posted to commits@oodt.apache.org by "Luca Cinquini (Created) (JIRA)" <ji...@apache.org> on 2012/01/21 12:52:40 UTC

[jira] [Created] (OODT-371) Opendapps patch #3 for CMDS

Opendapps patch #3 for CMDS
---------------------------

                 Key: OODT-371
                 URL: https://issues.apache.org/jira/browse/OODT-371
             Project: OODT
          Issue Type: Improvement
          Components: opendapps
    Affects Versions: 0.4
            Reporter: Luca Cinquini
         Attachments: opendapps-asf-20120121.patch

The main purpose of this patch is to improve the richness and consistency of metadata extracted from the THREDDS catalogs and OpenDAP streams, and to check that the required information is indeed present in the resulting OODT metadata profiles. 

Details on all classes affected follow:

Profiler
- invokes a profile-checking utility and prints out a summary of the most important metadata fields for quick review by the publisher

DasMetadataExtractor
- extracts variable names, long names, and CF standard names from the OpenDAP DAS stream

ThreddsMetadataExtractor
- stores additional metadata, such as the hostname
- parses all types of <documentation> tags, including xlinks, and uses the "type" attribute to create different metadata elements
- adds additional geospatial and temporal coverage elements
- stores multiple THREDDS access URLs as OODT <resLocation> attributes: the OpenDAP URL, the THREDDS catalog URL, and the TDS HTML landing page. Each <resLocation> is encoded as a tuple so that it can hold multiple fields
- does NOT parse the variable information in the THREDDS catalog, since this metadata is more reliably and consistently extracted from the OpenDAP stream

ProfileChecker
- new utility class that checks an OODT Profile against a list of required/optional elements.
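
A minimal sketch of the kind of check described for ProfileChecker, for illustration only. It is not the code in the attached patch: the class name, the method name, the map-based stand-in for an OODT Profile, and the element names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ProfileCheckSketch {

    /**
     * Returns the names of required elements that are absent from the
     * profile or present with no values. The profile is modelled here as a
     * plain map of element name to values (an assumption, not the OODT API).
     */
    public static List<String> missingRequired(Map<String, List<String>> elements,
                                               List<String> required) {
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            List<String> values = elements.get(name);
            if (values == null || values.isEmpty()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical element names, chosen only for the example.
        Map<String, List<String>> elements = new HashMap<>();
        elements.put("long_name", Arrays.asList("Sea Surface Temperature"));
        elements.put("hostname", new ArrayList<String>());

        List<String> required =
            Arrays.asList("long_name", "hostname", "cf_standard_name");

        // Prints [hostname, cf_standard_name]
        System.out.println(missingRequired(elements, required));
    }
}
```

Optional elements could be handled the same way, reporting them as warnings rather than failures.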

ProfileUtils
- fixes bug that caused any metadata value containing a ',' to be split across multiple XML elements
- checks for the string "null" before adding a value to the metadata
- allows multiple values of the same profile element to be provided in the configuration file, and includes them in the resulting OODT profile

opendap.config.xml
- updated example of configuration with new metadata fields

General changes
- changed level of log output in several classes so that relevant information stands out more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (OODT-371) Opendapps patch #3 for CMDS

Posted by "Luca Cinquini (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OODT-371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Cinquini updated OODT-371:
-------------------------------

    Attachment: opendapps-asf-20120121.patch

Patch attached
                

        

[jira] [Commented] (OODT-371) Improve the richness and consistency of metadata extracted from the THREDDS catalogs

Posted by "Chris A. Mattmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190548#comment-13190548 ] 

Chris A. Mattmann commented on OODT-371:
----------------------------------------

Hey Luca, +1, you are right. Have you found that "," is used prevalently in a lot of the datasets you are looking at?
                

        

[jira] [Commented] (OODT-371) Improve the richness and consistency of metadata extracted from the THREDDS catalogs

Posted by "Luca Cinquini (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190558#comment-13190558 ] 

Luca Cinquini commented on OODT-371:
------------------------------------

Yes, mostly when parsing tags like the THREDDS documentation, which contains full sentences, including commas... That's how I realized we should use a different delimiter. For single-value fields, the comma would work just fine.
                

        

[jira] [Updated] (OODT-371) Improve the richness and consistency of metadata extracted from the THREDDS catalogs

Posted by "Chris A. Mattmann (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OODT-371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated OODT-371:
-----------------------------------

    Affects Version/s:     (was: 0.4)
        Fix Version/s: 0.4
              Summary: Improve the richness and consistency of metadata extracted from the THREDDS catalogs  (was: Opendapps patch #3 for CMDS)

- clearer title
- set fix version
                

        

[jira] [Commented] (OODT-371) Improve the richness and consistency of metadata extracted from the THREDDS catalogs

Posted by "Chris A. Mattmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190589#comment-13190589 ] 

Chris A. Mattmann commented on OODT-371:
----------------------------------------

+1, sounds good Luca.
                

        

[jira] [Commented] (OODT-371) Opendapps patch #3 for CMDS

Posted by "Chris A. Mattmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190468#comment-13190468 ] 

Chris A. Mattmann commented on OODT-371:
----------------------------------------

Thanks for this patch, Luca!

FTR:

bq. fixes bug that caused any metadata value containing a ',' to be split across multiple XML elements

This wasn't a bug but, in fact, intended functionality. It allows THREDDS dataset metadata and environment variable replacement to work together. When PathUtils replaces environment variables and/or cas-metadata in some path value, and the key being referenced has multiple values, those values are represented as a comma-delimited string. When converting such a string to OODT Profile Elements, we wanted each value to be considered when adding them to the Profile.
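
The behaviour described above, and the side effect reported in the issue, can be sketched roughly as follows. This is an illustration only: the class and method names are hypothetical and this is not the actual PathUtils or ProfileUtils code.

```java
import java.util.Arrays;
import java.util.List;

public class CommaExpansionSketch {

    /** Splits a comma-delimited string into its individual values. */
    public static List<String> expand(String raw) {
        return Arrays.asList(raw.split(","));
    }

    public static void main(String[] args) {
        // Intended case: a multi-valued key flattened to one string
        // becomes two separate element values again.
        System.out.println(expand("temperature,salinity").size()); // 2

        // Side effect: a single free-text value that happens to contain
        // a comma is also broken into two values.
        System.out.println(expand("Monthly means, gridded at 1 degree").size()); // 2
    }
}
```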

                

        

[jira] [Commented] (OODT-371) Improve the richness and consistency of metadata extracted from the THREDDS catalogs

Posted by "Luca Cinquini (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190527#comment-13190527 ] 

Luca Cinquini commented on OODT-371:
------------------------------------

Hi Chris,
poor choice of words on my part... indeed, it is not a bug, because it serves a very valuable purpose. The fact remains, though, that using ',' as the delimiter causes metadata fields that contain a comma to be treated as multiple values, when they should be kept intact. That's why the opendapps module, which often deals with metadata fields that include a comma, overrides the default PathUtils delimiter with '&', a character used far less frequently in metadata fields. Other characters would work too, but it has to be something that is not mistaken for a regular expression special character; otherwise the split won't work.
Would you agree, or have I got something totally wrong?
thanks, Luca
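
A small sketch of the delimiter point above, assuming the split is done with Java's String.split, which treats its argument as a regular expression. The class and method names are hypothetical and are not taken from the patch; Pattern.quote is shown only as one possible way to make any delimiter safe.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class DelimiterSketch {

    /** Splits using the delimiter as a regular expression (String.split default). */
    public static List<String> splitRegex(String raw, String delimiter) {
        return Arrays.asList(raw.split(delimiter));
    }

    /** Splits using the delimiter as a literal string, whatever characters it contains. */
    public static List<String> splitLiteral(String raw, String delimiter) {
        return Arrays.asList(raw.split(Pattern.quote(delimiter)));
    }

    public static void main(String[] args) {
        String doc = "Monthly means, gridded at 1 degree";

        // With '&' as the delimiter the sentence stays intact.
        System.out.println(splitRegex(doc, "&").size()); // 1

        // Multiple values joined with '&' are still separated.
        System.out.println(splitRegex("temperature&salinity", "&")); // [temperature, salinity]

        // '|' is a regex metacharacter, so an unquoted split does not
        // produce the two intended values.
        System.out.println(splitRegex("a|b", "|").equals(Arrays.asList("a", "b"))); // false

        // Quoting the delimiter restores the literal behaviour.
        System.out.println(splitLiteral("a|b", "|")); // [a, b]
    }
}
```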
                

        

[jira] [Assigned] (OODT-371) Opendapps patch #3 for CMDS

Posted by "Chris A. Mattmann (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OODT-371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann reassigned OODT-371:
--------------------------------------

    Assignee: Chris A. Mattmann
    

        

[jira] [Resolved] (OODT-371) Improve the richness and consistency of metadata extracted from the THREDDS catalogs

Posted by "Luca Cinquini (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OODT-371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Cinquini resolved OODT-371.
--------------------------------

    Resolution: Fixed

All changes implemented as described in the task.
                