Posted to dev@uima.apache.org by "Myles, Stuart" <SM...@ap.org> on 2016/10/06 13:20:44 UTC

UIMA Ruta and EXTRA - Developers Needed For IPTC's EXTRA Rules-based Classification Engine

Hello,

Greetings from The Associated Press. We're working within the IPTC - along with a number of other news organizations - on a project that is relevant to the UIMA Ruta project and wanted to reach out to see if you might be interested.

The IPTC project is "EXTRA" (shorthand for EXTraction Rules Apparatus), an open-source, rules-based classification engine for news content. The IPTC was awarded a grant of 50,000 Euros from Google's Digital News Initiative Innovation Fund to build and freely distribute the initial version of EXTRA. As part of the IPTC, we plan to work with news providers to supply sets of news documents, and with linguists to write rules to classify the documents. We've been working on defining the technical requirements and now we're looking for software developers to design, develop, document and test EXTRA.

Given the similarity of UIMA Ruta to our requirements, we thought we'd reach out to you to see if you might be interested in this project or if you know anyone who might be. Below is the formal announcement of our search for developers to work with us on EXTRA.

Regards,

Stuart Myles
--
Director of Information Management at the Associated Press
Chairman of the Board of the IPTC


https://iptc.org/news/developers-needed-for-extra/
Developers Needed For IPTC's EXTRA Rules-based Classification Engine

IPTC <https://iptc.org/> is looking for software developers to design, develop, document and test EXTRA <https://iptc.github.io/extra/>, an open source rules-based classification engine for news. First preference will be given to applications received by 21st October 2016, and review will continue until the positions are filled. Apply here: <https://docs.google.com/forms/d/e/1FAIpQLSdROT-cefP57cmCRbW90cnvuYKNTJ4XKQ2cQgA8ZffLYoLwPQ/viewform>

"Classification" means assigning one or more categories to the text of a news document. Rules-based classifiers use a set of Boolean rules, rather than machine-learning or statistical techniques, to determine which categories to apply.
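
To make the idea of Boolean rules concrete, here is a minimal Python sketch. It is only an illustration and not EXTRA's rule language or design, which had not been built at the time of this announcement. The category names, the rules themselves, and the helpers `tokenize` and `classify` are all hypothetical.

```python
import re

# Hypothetical rules, for illustration only: each category maps to a
# Boolean predicate over the set of lowercased words in a document.
RULES = {
    "sport": lambda w: "football" in w or ("match" in w and "goal" in w),
    "economy": lambda w: ("inflation" in w or "gdp" in w) and "football" not in w,
}


def tokenize(text):
    """Return the set of lowercased alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def classify(text, rules=RULES):
    """Return the sorted list of categories whose rule evaluates to True."""
    words = tokenize(text)
    return sorted(category for category, rule in rules.items() if rule(words))
```

Calling `classify` on a document returns zero, one, or several categories, depending on which rules match. Because the rules are plain Boolean expressions, a linguist can read and adjust them directly, which is the main contrast with a statistical classifier.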

EXTRA is the EXTraction Rules Apparatus, a multilingual open-source platform for rules-based classification of news content. IPTC was awarded a grant of €50,000, covering the entire project, from the first round of Google's Digital News Initiative (DNI) Innovation Fund <https://www.digitalnewsinitiative.com/> to build and freely distribute the initial version of EXTRA.


We are working with news providers to supply sets of news documents and with linguists to write rules to classify the documents. IPTC is looking for qualified developers to create the rules engine to accurately and efficiently categorize the documents using the rules. mandatory and preferred requirements.

Please consult this page <https://docs.google.com/forms/d/e/1FAIpQLSdROT-cefP57cmCRbW90cnvuYKNTJ4XKQ2cQgA8ZffLYoLwPQ/viewform> for more information and to let us know if you're interested in being considered.