Posted to commits@oodt.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/09/25 03:11:33 UTC

[jira] [Commented] (OODT-754) contribute ProdTypePatternMetExtractor

    [ https://issues.apache.org/jira/browse/OODT-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147207#comment-14147207 ] 

Lewis John McGibbney commented on OODT-754:
-------------------------------------------

[~rickdn] this is an excellent idea. [~skhudiky] and I were discussing this today, and it is certainly a shortcoming of the other extractor implementations that they do not account for the following case.
Say you have a file named AAAA-BB-CCCCCC-DD.png which you wish to consider as a product:
 * AAAA represents the instrument/device which produced the picture
 * BB is an identifier for the project the picture was produced for
 * CCCCCC is the date, e.g. YYMMDD
 * DD is the number of products produced on that date for that project by that instrument.
What happens if DD > 99?
Well, what happens is that the FileNameExtractor (or whatever it is called) policy breaks and we begin ingesting incorrect information.
The extractor you describe on the wiki makes life so much easier to deal with cases like the above.
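To make the DD > 99 case concrete, here is a minimal sketch in Python. It is not OODT code and does not reproduce the ProdTypePatternMetExtractor from the wiki. The function names, the field names, the sample file names, and the regular expression are all hypothetical, and the fixed-offset parser only assumes that a file-name extractor slices fields at fixed positions.

```python
import re

# Hypothetical fixed-offset parser: assumes every field has a fixed width,
# i.e. AAAA-BB-CCCCCC-DD.png with exactly two characters for DD.
def extract_fixed(filename):
    return {
        "instrument": filename[0:4],
        "project": filename[5:7],
        "date": filename[8:14],
        "count": filename[15:17],
    }

# Hypothetical pattern-based parser: the count field may have any number
# of digits, so a third digit does not shift or truncate anything.
PATTERN = re.compile(
    r"^(?P<instrument>[A-Za-z0-9]{4})-"
    r"(?P<project>[A-Za-z0-9]{2})-"
    r"(?P<date>\d{6})-"
    r"(?P<count>\d+)\.png$"
)

def extract_pattern(filename):
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError("file name does not match pattern: " + filename)
    return match.groupdict()

# With a two-digit count both parsers agree.
print(extract_fixed("CAMX-P1-140925-07.png")["count"])
print(extract_pattern("CAMX-P1-140925-07.png")["count"])

# With a three-digit count the fixed-offset parser silently drops a digit.
print(extract_fixed("CAMX-P1-140925-100.png")["count"])
print(extract_pattern("CAMX-P1-140925-100.png")["count"])
```

The fixed-offset version raises no error on the three-digit name, which is how incorrect metadata ends up being ingested unnoticed.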
Thanks 

> contribute ProdTypePatternMetExtractor
> --------------------------------------
>
>                 Key: OODT-754
>                 URL: https://issues.apache.org/jira/browse/OODT-754
>             Project: OODT
>          Issue Type: New Feature
>          Components: metadata container
>            Reporter: Ricky Nguyen
>            Assignee: Ricky Nguyen
>             Fix For: 0.8
>
>
> There has been renewed interest in implementing the ProdTypePatternMetExtractor proposed [here|https://cwiki.apache.org/confluence/display/OODT/MetExtractors+for+Crawler].
> I was going to add it to the "metadata" module under the "org.apache.oodt.cas.metadata.extractors" package.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)