Posted to commits@oodt.apache.org by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org> on 2012/04/03 23:56:26 UTC

[jira] [Commented] (OODT-426) Introduce a CAS-Metadata based renaming interface

    [ https://issues.apache.org/jira/browse/OODT-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245796#comment-13245796 ] 

jiraposter@reviews.apache.org commented on OODT-426:
----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4628/
-----------------------------------------------------------

Review request for oodt, Chris Mattmann, Ricky Nguyen, Paul Ramirez, and Thomas Bennett.


Summary
-------

CAS-PGE changes for this issue:
- Renaming and metadata extraction have been removed from CAS-PGE; CAS-PGE now uses AutoDetectProductCrawler instead of StdProductCrawler.


This addresses bug OODT-426.
    https://issues.apache.org/jira/browse/OODT-426


Diffs
-----

  trunk/pge/pom.xml 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/PGETaskInstance.java 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/config/OutputDir.java 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/config/PgeConfig.java 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/config/PgeConfigBuilder.java 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/config/PgeConfigMetKeys.java 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/config/RegExprOutputFiles.java 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/config/RenamingConv.java 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/config/XmlFilePgeConfigBuilder.java 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/metadata/PgeTaskMetKeys.java 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/writers/ExternExtractorMetWriter.java 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/writers/FilenameExtractorWriter.java 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/writers/PcsMetFileWriter.java 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/writers/SciPgeConfigFileWriter.java 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/writers/metlist/MetadataListPcsMetFileWriter.java 1302648 
  trunk/pge/src/main/java/org/apache/oodt/cas/pge/writers/xslt/XslTransformWriter.java 1302648 
  trunk/pge/src/main/resources/examples/Crawler/action-beans.xml PRE-CREATION 
  trunk/pge/src/main/resources/examples/Crawler/crawler-config.xml PRE-CREATION 
  trunk/pge/src/main/resources/examples/Crawler/mime-extractor-map.xml PRE-CREATION 
  trunk/pge/src/main/resources/examples/Crawler/mime-types.xml PRE-CREATION 
  trunk/pge/src/main/resources/examples/Crawler/naming-beans.xml PRE-CREATION 
  trunk/pge/src/main/resources/examples/Crawler/precondition-beans.xml PRE-CREATION 
  trunk/pge/src/main/resources/examples/MetadataOutputFiles/metadata-output.xml 1302648 
  trunk/pge/src/main/resources/examples/PgeConfigFiles/pge-config.xml 1302648 
  trunk/pge/src/test/org/apache/oodt/cas/pge/TestPGETaskInstance.java 1302781 

Diff: https://reviews.apache.org/r/4628/diff


Testing
-------

Several unit tests.


Thanks,

brian


                
> Introduce a CAS-Metadata based renaming interface
> -------------------------------------------------
>
>                 Key: OODT-426
>                 URL: https://issues.apache.org/jira/browse/OODT-426
>             Project: OODT
>          Issue Type: Sub-task
>          Components: crawler, metadata container, pge wrapper framework
>    Affects Versions: 0.3
>         Environment: none
>            Reporter: Brian Foster
>            Assignee: Brian Foster
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: OODT-426.2012-03-20.cas-crawler.patch.txt, OODT-426.2012-03-20.cas-metadata.patch.txt, OODT-426.2012-03-24.cas-crawler.patch.txt, OODT-426.2012-04-03.cas-pge.txt
>
>
> The idea here is that CAS-Metadata will introduce a new NamingConvention interface, which will allow files to be renamed.  CAS-Crawler will then be modified to support specified NamingConventions, which will be run after all preconditions have passed for a given file.  This will allow CAS-PGE to use AutoDetectProductCrawler instead of StdProductCrawler, which will standardize file extraction across the board (currently CAS-PGE has its own file extraction interface, which uses regular expressions to determine which files should be extracted and ingested). The only feature CAS-PGE supports that is missing from CAS-Crawler is file renaming, which this new NamingConvention interface will introduce.  Here is what the NamingConvention interface will look like:
> {code}
> public interface NamingConvention {
>    public File rename(File file, Metadata metadata)
>          throws NamingConventionException;
> }
> {code}
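
For illustration only, here is a minimal sketch of how an implementation of the proposed interface might look. It is not taken from the patch: MetadataPrefixNamingConvention is a hypothetical class, and the Metadata and NamingConventionException types below are small stand-ins (assumptions) so the sketch compiles on its own without the CAS-Metadata jar. The stand-in Metadata models only single-valued keys.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Stand-in for the CAS-Metadata container (assumption: single-valued keys only).
class Metadata {
    private final Map<String, String> values = new HashMap<>();

    void addMetadata(String key, String value) { values.put(key, value); }

    String getMetadata(String key) { return values.get(key); }
}

// Stand-in for the exception named in the proposed interface.
class NamingConventionException extends Exception {
    NamingConventionException(String message) { super(message); }
}

// The interface as proposed in the issue description.
interface NamingConvention {
    File rename(File file, Metadata metadata) throws NamingConventionException;
}

// Hypothetical convention: prefix the file name with the value of one metadata key.
class MetadataPrefixNamingConvention implements NamingConvention {
    private final String key;

    MetadataPrefixNamingConvention(String key) { this.key = key; }

    @Override
    public File rename(File file, Metadata metadata) throws NamingConventionException {
        String prefix = metadata.getMetadata(key);
        if (prefix == null || prefix.isEmpty()) {
            throw new NamingConventionException("No value for metadata key '" + key + "'");
        }
        // Keep the file in its current directory; only the name changes.
        File target = new File(file.getParentFile(), prefix + "_" + file.getName());
        if (!file.renameTo(target)) {
            throw new NamingConventionException("Could not rename " + file + " to " + target);
        }
        return target;
    }
}
```

With a convention like this, a crawler could call rename(file, metadata) once the preconditions for a file have passed and then continue with the returned File. Failing with NamingConventionException when the key is missing is a design choice of this sketch, not something the issue specifies.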
