Posted to dev@stanbol.apache.org by Fabian Christ <fc...@apache.org> on 2013/02/15 11:17:26 UTC

[VOTE] Release enhancement-engines-0.10.0

Hi,

this is a vote for releasing a set of enhancement engines bundled as
the enhancement-engines-0.10.0 release.

The release was cut via a release branch created from trunk revision 1443935.

Included are releases of the following engines:

* Tika
* HTML Extractor
* XMP Extractor
* Language Detection
* Language Identifier
* OpenNLP Sentence Detection
* OpenNLP Tokenizer
* OpenNLP POS Tagging
* OpenNLP NER
* OpenNLP Chunker
* Smart Chinese Tokenizer
* Paoding Chinese Tokenizer
* NLP to RDF converter
* RESTful NLP processing
* RESTful Language Identification
* Entity Linking Engine
* Entity Linking LabelTokenizer : Lucene
* Entity Linking LabelTokenizer : OpenNLP
* Entity Linking LabelTokenizer : Smart Chinese
* Entity Linking LabelTokenizer : Paoding
* Entityhub Linking
* Entity Tagging
* Keyword Extraction
* Topic Classification
* Topic Classification : Web API
* Sentiment Word Classifier
* Sentiment Summarization
* UIMA Remote Client
* UIMA To Triples
* UIMA Local Client
* CELI Engine
* DBPedia Spotlight
* Geonames Linking
* OpenCalais
* Zemanta

Please vote on the following release packages:
apache-stanbol-enhancement-engines-0.10.0-source-release
* tar.gz MD5 08ad280426962f77c7e1c888050188aa
* zip MD5 65feb9bd94434d417a1cdca309270930
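
For anyone checking the digests by hand, here is a minimal Python sketch. It
assumes the two archives were downloaded into one directory and are named
after the package above plus the ".tar.gz" / ".zip" suffix; that file naming
and the helper names md5_of / check_release are illustrative assumptions, not
part of the release tooling.

```python
import hashlib
from pathlib import Path

# Package name and MD5 digests as quoted in the vote mail.
BASE_NAME = "apache-stanbol-enhancement-engines-0.10.0-source-release"
EXPECTED_MD5 = {
    "tar.gz": "08ad280426962f77c7e1c888050188aa",
    "zip": "65feb9bd94434d417a1cdca309270930",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_release(directory="."):
    """Compare local archives against the announced digests.

    Returns {file name: True | False | None}; None means the file
    was not found in the directory (assumed file naming, see above).
    """
    results = {}
    for suffix, expected in EXPECTED_MD5.items():
        path = Path(directory) / f"{BASE_NAME}.{suffix}"
        if path.is_file():
            results[path.name] = (md5_of(path) == expected)
        else:
            results[path.name] = None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, ok in check_release().items():
        print(f"{name}: {labels[ok]}")
```

A matching MD5 only shows that the download is intact; the PGP signatures
still need to be checked against the KEYS file referenced below.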

All release artifacts are staged at
https://repository.apache.org/content/repositories/orgapachestanbol-234/

The source release packages are also available in our dist/dev repo
https://dist.apache.org/repos/dist/dev/stanbol/234

PGP release signing keys are available at:
https://dist.apache.org/repos/dist/release/stanbol/KEYS

Release Notes - Stanbol - Version enhancement-engines-0.10.0

** Sub-task
    * [STANBOL-735] - OpenNLP POS Tagger Engine
    * [STANBOL-736] - OpenNLP Chunker Engine
    * [STANBOL-737] - Sentiment Tagger Engine
    * [STANBOL-739] - Migrate the Celi Lemmatizer Engine to use the
AnalyzedText contentPart
    * [STANBOL-740] - Adapt the KeywordLinkingEngine to use the
AnalyzedText content part
    * [STANBOL-792] - Extend the NamedEntityExtraction engine to
support custom NameFinder Models
    * [STANBOL-795] - OpenNLP Tokenizer Engine
    * [STANBOL-796] - OpenNLP Sentence Detection Engine
    * [STANBOL-797] - Adapt the OpenNLP NER engine to support the
AnalyzedText ContentPart
    * [STANBOL-812] - Rename the AnalyzedText based
KeywordLinkingEngine to EntityhubLinkingEngine
    * [STANBOL-851] - Extract the OpenNLP labeltokenizer from the
EntityLinkingEngine
    * [STANBOL-856] - Add Lucene LabelTokenizer configuration for
Chinese (zh) based on the Lucene smartcn analyzer
    * [STANBOL-860] - Add AnalyzedText Tokenizer based on the Smart
Chinese Analyzer
** Bug
    * [STANBOL-617] - Define how TopicEnhancements are written to the
Enhancement Structure
    * [STANBOL-622] - The KeywordLinkingEngine should check if all
Tokens of a Label match against the text
    * [STANBOL-623] - The KeywordLinkingEngine does not select the
best fitting label for suggested Entities
    * [STANBOL-624] - The NamedEntityTagging engine should use
confidence values between [0..1]
    * [STANBOL-625] - EnhancementEngines that suggest Entities from
the Stanbol Entityhub should add the name of the ReferencedSite
    * [STANBOL-636] - KeywordLinkingEngine should report an
EngineException instead of an IllegalStateException if the configured
ReferencedSite is not available
    * [STANBOL-725] - NamedEntityTaggingEngine does not correctly
convert labels to lowercase when calculating Levenshtein distance
    * [STANBOL-726] - The KeywordLinkingEngine sets the value
configured for "Min Token Length" to "Max Suggestions"
    * [STANBOL-767] - LocationEnhancementEngine needs to add
dc:relation properties for dc:requires
    * [STANBOL-770] - Wrong changes of the structure of HTML5 docs by
Tidy based HtmlParser
    * [STANBOL-809] - Parse ContentItem URI to the Tika content type detector
    * [STANBOL-813] - codification problem with the
removeNonUtf8CompliantCharacters method used by the opennlp-ner engine
    * [STANBOL-818] - EntityLinkingEngine encounters
StringIndexOutOfBounds exceptions
    * [STANBOL-821] - EntityLinkingEngine encounters
java.lang.IllegalArgumentException: parsed span MUST be > 0!
    * [STANBOL-865] - Tika engine is unable to create Temporary files
if SecurityManager is active
    * [STANBOL-882] - Loading Paoding Analyzer needs to be done with
AccessController.doPrivileged
    * [STANBOL-899] - EntityLinking engine MUST NOT fail on empty Spans
** Improvement
    * [STANBOL-611] - Make the list of properties included for
dereferenced Entities of the KeywordLinkingEngine configurable
    * [STANBOL-627] - Update to Tika 1.1
    * [STANBOL-685] - Improve POS tag handling of the KeywordLinkingEngine
    * [STANBOL-686] - Make the "Minimum Token Match Factor"
configurable for the KeywordLinkingEngine
    * [STANBOL-718] - Add support for suggesting multiple languages and
confidence to the LanguageDetectionEnhancementEngine
    * [STANBOL-862] - Add support for country specific matching to the
EntityLinkingEngine
    * [STANBOL-866] - Add support for CharFilter to the Lucene LabelTokenizer
    * [STANBOL-867] - Add support for configuration parameters to the
Lucene LabelTokenizer
    * [STANBOL-871] - Support updating of the LabelTokenizer used by
the EntityLinkingEngine
    * [STANBOL-896] - EntityLinkingEngine should reuse existing TextAnnotations
** New Feature
    * [STANBOL-689] - Refactor RDFa/Microformat extractor to be
independent of external repositories
    * [STANBOL-706] - DBpedia Spotlight EnhancementEngines integration
    * [STANBOL-707] - Language detection for CJK languages
    * [STANBOL-771] - HtmlExtractor: Add an extractor for Microdata
    * [STANBOL-849] - Implement Lucene Tokenizer based LabelTokenizer
    * [STANBOL-850] - Modularize EntityLinking
    * [STANBOL-875] - Add support for Paoding (Chinese)
    * [STANBOL-876] - Add Smartcn Sentence detection engine
    * [STANBOL-892] - RESTful Service Specification for Stanbol NLP analysis
    * [STANBOL-893] - RESTful NLP analyses Enhancement Engine
    * [STANBOL-894] - RESTful Language Identification service
    * [STANBOL-895] - RESTful Language Identification Engine
** Task
    * [STANBOL-885] - Let UIMA local template use latest UIMA SDK and
Lucene analyzers
    * [STANBOL-913] - Release enhancement-engines-0.10.0
    * [STANBOL-916] - Move all OpenNLP related engines into
/enhancement-engines/opennlp
    * [STANBOL-917] - Move uima engine artifacts into /enhancement-engines/uima
** Test
    * [STANBOL-612] - Add helper for validating the Stanbol
EnhancementStructure to the Enhancer test module

The vote is open for at least 48 hours.

Best,
 - Fabian

Re: [VOTE] Release enhancement-engines-0.10.0

Posted by Fabian Christ <ch...@googlemail.com>.

Hi,

the vote is over and here are the results.

[+1 binding / non binding] 4 / 1 votes
[0] 0 votes
[-1] 0 votes

The vote has passed. I will publish the release artifacts.
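
The tally above can be reproduced with a short Python sketch. It assumes the
usual ASF rule for release votes (at least three binding +1 votes and more
binding +1 than binding -1). Only one vote in this thread is marked
"non binding"; treating the other four as binding is inferred from the 4 / 1
result line. The Vote class and the function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int     # +1, 0 or -1
    binding: bool  # assumption: inferred from the announced result


def tally(votes):
    """Count votes as {(value, binding): count}."""
    counts = {}
    for vote in votes:
        key = (vote.value, vote.binding)
        counts[key] = counts.get(key, 0) + 1
    return counts


def passes(votes, min_binding_plus_one=3):
    """Apply the assumed rule: enough binding +1, and more +1 than -1."""
    counts = tally(votes)
    plus = counts.get((1, True), 0)
    minus = counts.get((-1, True), 0)
    return plus >= min_binding_plus_one and plus > minus


# Votes as they appear in this thread.
THREAD_VOTES = [
    Vote("Fabian Christ", 1, True),
    Vote("Sergio Fernández", 1, False),
    Vote("Rupert Westenthaler", 1, True),
    Vote("Szaby Grünwald", 1, True),
    Vote("Suat Gönül", 1, True),
]

if __name__ == "__main__":
    counts = tally(THREAD_VOTES)
    print("+1 binding / non binding:",
          counts.get((1, True), 0), "/", counts.get((1, False), 0))
    print("passed:", passes(THREAD_VOTES))
```

Non-binding votes are counted and reported but do not affect the outcome in
this sketch.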

Thanks for checking,
 - Fabian

2013/2/18 Suat Gönül <su...@gmail.com>:
> Hi,
>
> Digests and signatures were ok. Build was successful also. So, +1
>
> Thanks Fabian,
> Best,
> Suat
>
> On Mon, Feb 18, 2013 at 1:34 PM, Szaby Grünwald <sz...@apache.org> wrote:
>
>> Hi,
>> Build successful, signatures ok.
>> +1
>> best,
>> Szaby
>>
>> On 17 February 2013 23:04, Rupert Westenthaler <
>> rupert.westenthaler@gmail.com> wrote:
>>
>> > Hi Fabian, all
>> >
>> > Sorry for the late response, but I was offline the last 4 days.
>> >
>> > I checked signatures and digests; the release matches the tag and the
>> > build was successful
>> >
>> > +1 from my side
>> >
>> >
>> > Thx Fabian for forging the release
>> > best
>> > Rupert
>> >
>> > On Sun, Feb 17, 2013 at 8:24 PM, Sergio Fernández
>> > <se...@salzburgresearch.at> wrote:
>> > > +1 (non binding)
>> > >
>> > >
>> > > On 15/02/13 12:38, Fabian Christ wrote:
>> > >>
>> > >> Hi,
>> > >>
>> > >> and here is my +1.
>> > >>
>> > >> * Checked signatures and digest
>> > >> * Checked build
>> > >>
>> > >> Best,
>> > >>   - Fabian
>> > >>
>> > >> 2013/2/15 Fabian Christ<fc...@apache.org>:
>> > >>>
>> > >>> Hi,
>> > >>>
>> > >>> forgot to mention one important step to build the release:
>> > >>>
>> > >>> The release can be built on a clean system if tests are excluded via
>> > >>> 'mvn install -DskipTests'.
>> > >>>
>> > >>> If tests are activated, the build requires the
>> > >>> apache-stanbol-data-1.1.0 bundles to be in place in the local Maven
>> > >>> repository. The data bundles can be built from the source package
>> > >>> available at [1].
>> > >>>
>> > >>> [1] http://stanbol.apache.org/downloads/releases.html
>> > >>>
>> > >>> Best,
>> > >>>   - Fabian
>> > >>>
>> > >
>> > >
>> > > --
>> > > Sergio Fernández
>> > > Salzburg Research
>> > > +43 662 2288 318
>> > > Jakob-Haringer Strasse 5/II
>> > > A-5020 Salzburg (Austria)
>> > > http://www.salzburgresearch.at
>> >
>> >
>> >
>> > --
>> > | Rupert Westenthaler             rupert.westenthaler@gmail.com
>> > | Bodenlehenstraße 11                             ++43-699-11108907
>> > | A-5500 Bischofshofen
>> >
>>



--
Fabian
http://twitter.com/fctwitt

Re: [VOTE] Release enhancement-engines-0.10.0

Posted by Suat Gönül <su...@gmail.com>.

Hi,

Digests and signatures were ok. Build was successful also. So, +1

Thanks Fabian,
Best,
Suat

On Mon, Feb 18, 2013 at 1:34 PM, Szaby Grünwald <sz...@apache.org> wrote:

> Hi,
> Build successful, signatures ok.
> +1
> best,
> Szaby
>
> On 17 February 2013 23:04, Rupert Westenthaler <
> rupert.westenthaler@gmail.com> wrote:
>
> > Hi Fabian, all
> >
> > Sorry for the late response, but I was offline the last 4 days.
> >
> > I checked signatures and digests; the release matches the tag and the
> > build was successful
> >
> > +1 from my side
> >
> >
> > Thx Fabian for forging the release
> > best
> > Rupert
> >
> > On Sun, Feb 17, 2013 at 8:24 PM, Sergio Fernández
> > <se...@salzburgresearch.at> wrote:
> > > +1 (non binding)
> > >
> > >
> > > On 15/02/13 12:38, Fabian Christ wrote:
> > >>
> > >> Hi,
> > >>
> > >> and here is my +1.
> > >>
> > >> * Checked signatures and digest
> > >> * Checked build
> > >>
> > >> Best,
> > >>   - Fabian
> > >>
> > >> 2013/2/15 Fabian Christ<fc...@apache.org>:
> > >>>
> > >>> Hi,
> > >>>
> > >>> forgot to mention one important step to build the release:
> > >>>
> > >>> The release can be built on a clean system if tests are excluded via
> > >>> 'mvn install -DskipTests'.
> > >>>
> > >>> If tests are activated, the build requires the
> > >>> apache-stanbol-data-1.1.0 bundles to be in place in the local Maven
> > >>> repository. The data bundles can be built from the source package
> > >>> available at [1].
> > >>>
> > >>> [1] http://stanbol.apache.org/downloads/releases.html
> > >>>
> > >>> Best,
> > >>>   - Fabian
> > >>>
> > >
> > >
> > > --
> > > Sergio Fernández
> > > Salzburg Research
> > > +43 662 2288 318
> > > Jakob-Haringer Strasse 5/II
> > > A-5020 Salzburg (Austria)
> > > http://www.salzburgresearch.at
> >
> >
> >
> > --
> > | Rupert Westenthaler             rupert.westenthaler@gmail.com
> > | Bodenlehenstraße 11                             ++43-699-11108907
> > | A-5500 Bischofshofen
> >
>

Re: [VOTE] Release enhancement-engines-0.10.0

Posted by Szaby Grünwald <sz...@apache.org>.

Hi,
Build successful, signatures ok.
+1
best,
Szaby

On 17 February 2013 23:04, Rupert Westenthaler <
rupert.westenthaler@gmail.com> wrote:

> Hi Fabian, all
>
> Sorry for the late response, but I was offline the last 4 days.
>
> I checked signatures and digests; the release matches the tag and the
> build was successful
>
> +1 from my side
>
>
> Thx Fabian for forging the release
> best
> Rupert
>
> On Sun, Feb 17, 2013 at 8:24 PM, Sergio Fernández
> <se...@salzburgresearch.at> wrote:
> > +1 (non binding)
> >
> >
> > On 15/02/13 12:38, Fabian Christ wrote:
> >>
> >> Hi,
> >>
> >> and here is my +1.
> >>
> >> * Checked signatures and digest
> >> * Checked build
> >>
> >> Best,
> >>   - Fabian
> >>
> >> 2013/2/15 Fabian Christ<fc...@apache.org>:
> >>>
> >>> Hi,
> >>>
> >>> forgot to mention one important step to build the release:
> >>>
> >>> The release can be built on a clean system if tests are excluded via
> >>> 'mvn install -DskipTests'.
> >>>
> >>> If tests are activated, the build requires the
> >>> apache-stanbol-data-1.1.0 bundles to be in place in the local Maven
> >>> repository. The data bundles can be built from the source package
> >>> available at [1].
> >>>
> >>> [1] http://stanbol.apache.org/downloads/releases.html
> >>>
> >>> Best,
> >>>   - Fabian
> >>>
> >>> 2013/2/15 Fabian Christ<fc...@apache.org>:
> >>>>
> >>>> Hi,
> >>>>
> >>>> this is a vote for releasing a set of enhancement engines bundled as
> >>>> the enhancement-engines-0.10.0 release.
> >>>>
> >>>> The release was cut via a release branch created from trunk revision
> >>>> 1443935.
> >>>>
> >>>> Included are releases of the following engines:
> >>>>
> >>>> * Tika
> >>>> * HTML Extractor
> >>>> * XMP Extractor
> >>>> * Language Detection
> >>>> * Language Identifier
> >>>> * OpenNLP Sentence Detection
> >>>> * OpenNLP Tokenizer
> >>>> * OpenNLP POS Tagging
> >>>> * OpenNLP NER
> >>>> * OpenNLP Chunker
> >>>> * Smart Chinese Tokenizer
> >>>> * Paoding Chinese Tokenizer
> >>>> * NLP to RDF converter
> >>>> * RESTful NLP processing
> >>>> * RESTful Language Identification
> >>>> * Entity Linking Engine
> >>>> * Entity Linking LabelTokenizer : Lucene
> >>>> * Entity Linking LabelTokenizer : OpenNLP
> >>>> * Entity Linking LabelTokenizer : Smart Chinese
> >>>> * Entity Linking LabelTokenizer : Paoding
> >>>> * Entityhub Linking
> >>>> * Entity Tagging
> >>>> * Keyword Extraction
> >>>> * Topic Classification
> >>>> * Topic Classification : Web API
> >>>> * Sentiment Word Classifier
> >>>> * Sentiment Summarization
> >>>> * UIMA Remote Client
> >>>> * UIMA To Triples
> >>>> * UIMA Local Client
> >>>> * CELI Engine
> >>>> * DBPedia Spotlight
> >>>> * Geonames Linking
> >>>> * OpenCalais
> >>>> * Zemanta
> >>>>
> >>>> Please, vote on the following release packages:
> >>>> apache-stanbol-enhancement-engines-0.10.0-source-release
> >>>> * tar.gz MD5 08ad280426962f77c7e1c888050188aa
> >>>> * zip MD5 65feb9bd94434d417a1cdca309270930
> >>>>
> >>>> All release artifacts are staged at
> >>>>
> https://repository.apache.org/content/repositories/orgapachestanbol-234/
> >>>>
> >>>> The source release packages are also available in our dist/dev repo
> >>>> https://dist.apache.org/repos/dist/dev/stanbol/234
> >>>>
> >>>> PGP release signing keys are available at:
> >>>> https://dist.apache.org/repos/dist/release/stanbol/KEYS
> >>>>
> >>>> Release Notes - Stanbol - Version enhancement-engines-0.10.0
> >>>>
> >>>> ** Sub-task
> >>>>      * [STANBOL-735] - OpenNLP POS Tagger Engine
> >>>>      * [STANBOL-736] - OpenNLP Chunker Engine
> >>>>      * [STANBOL-737] - Sentiment Tagger Engine
> >>>>      * [STANBOL-739] - Migrate the Celi Lemmatizer Engine to use the
> >>>> AnalyzedText contentPart
> >>>>      * [STANBOL-740] - Adopt the KeywordLinkingEngine to use the
> >>>> AnalyzedText content part
> >>>>      * [STANBOL-792] - Extend the NamedEntityExtraction engine to
> >>>> support custom NameFinder Models
> >>>>      * [STANBOL-795] - OpenNLP Tokenizer Engine
> >>>>      * [STANBOL-796] - OpenNLP Sentence Detection Engine
> >>>>      * [STANBOL-797] - Adapt the OpenNLP NER engine to support the
> >>>> AnalyzedText ContentPart
> >>>>      * [STANBOL-812] - Rename the AnalyzedText based
> >>>> KeywordLinkingEngine to EntityhubLinkingEngine
> >>>>      * [STANBOL-851] - Extract the OpenNLP labeltokenizer from the
> >>>> EntityLikingEngine
> >>>>      * [STANBOL-856] - Add Lucene LabelTokenizer configuration for
> >>>> Chinese (zh) based on the Lucene smartcn analyzer
> >>>>      * [STANBOL-860] - Add AnalyzedText Tokenizer based on the Smart
> >>>> Chinese Analyzer
> >>>> ** Bug
> >>>>      * [STANBOL-617] - Define how TopicEnhancements are written to the
> >>>> Enhancement Structure
> >>>>      * [STANBOL-622] - The KeywordLinkingEngine should check if all
> >>>> Tokens of a Label match against the text
> >>>>      * [STANBOL-623] - The KeywordLinkingEngine does not select the
> >>>> best fitting label for suggested Entities
> >>>>      * [STANBOL-624] - The NamedEntityTagging engine should use
> >>>> confidence values between [0..1]
> >>>>      * [STANBOL-625] - EnhancementEngines that suggest Entities from
> >>>> the Stanbol Entityhub should add the name of the ReferencedSite
> >>>>      * [STANBOL-636] - KeywordLinkingEngine should report a
> >>>> EngineException instead of a IllegalStateException if the configured
> >>>> ReferencedSite is not available
> >>>>      * [STANBOL-725] - NamedEntityTaggingEngine does not correctly
> >>>> convert labels to lowercase when calculating Levenshtein distance
> >>>>      * [STANBOL-726] - The KeywordlinkingEngine sets the value
> >>>> configured for "Min Token Length" to "Max Suggestions"
> >>>>      * [STANBOL-767] - LocationEnhancementEngine needs to add
> >>>> dc:relation properties for dc:requires
> >>>>      * [STANBOL-770] - Wrong changes of the structure of HTML5 docs by
> >>>> Tidy based HtmlParser
> >>>>      * [STANBOL-809] - Parse ConentItem URI to the Tika content type
> >>>> detector
> >>>>      * [STANBOL-813] - codification problem with the
> >>>> removeNonUtf8CompliantCharacters method used by the opennlp-ner engine
> >>>>      * [STANBOL-818] - EntitylinkingEngine encounters
> >>>> StringIndexOutOfBounds exceptions
> >>>>      * [STANBOL-821] - EntitylinkingEngine encounters
> >>>> java.lang.IllegalArgumentException: parsed span MUST be > 0!
> >>>>      * [STANBOL-865] - Tika engine is unable to create Temporary files
> >>>> if SecurityManager is active
> >>>>      * [STANBOL-882] - Loading Paoding Analyzer needs to be done with
> >>>> AccessController.doPrivileged
> >>>>      * [STANBOL-899] - EntityLinking engine MUST NOT fail on empty Spans
> >>>> ** Improvement
> >>>>      * [STANBOL-611] - Make the list of properties included for
> >>>> dereferenced Entities of the KeywordLinkingEngine configureable
> >>>>      * [STANBOL-627] - Update to Tika 1.1
> >>>>      * [STANBOL-685] - Improve POS tag handling of the
> >>>> KeywordLinkingEngine
> >>>>      * [STANBOL-686] - Make the "Minimum Token Match Factor"
> >>>> configurable for the KeywordLinkingEngine
> >>>>      * [STANBOL-718] - Add support for suggesting mutiple languages and
> >>>> confidence to the LanguageDetectionEnhancementEngine
> >>>>      * [STANBOL-862] - Add support for country specific matching to the
> >>>> EntityLinkingEngine
> >>>>      * [STANBOL-866] - Add support for CharFilter to the Lucene
> >>>> LabelTokenizer
> >>>>      * [STANBOL-867] - Add support for configuration parameters to the
> >>>> Lucene LabelTokenizer
> >>>>      * [STANBOL-871] - Support updating of the LabelTokenizer used by
> >>>> the EntityLinkingEngine
> >>>>      * [STANBOL-896] - EntityLinkingEngine should reuse existing
> >>>> TextAnnotations
> >>>> ** New Feature
> >>>>      * [STANBOL-689] - Refactor RDFa/Microformat extractor to be
> >>>> independent of external repositories
> >>>>      * [STANBOL-706] - DBpedia Spotlight EnhancementEngines integration
> >>>>      * [STANBOL-707] - Language detection for CJK languages
> >>>>      * [STANBOL-771] - HtmlExtractor: Add an extractor for Microdata
> >>>>      * [STANBOL-849] - Implement Lucene Tokenizer based LabelTokenizer
> >>>>      * [STANBOL-850] - Modularize EntityLinking
> >>>>      * [STANBOL-875] - Add support for Paoding (Chinese)
> >>>>      * [STANBOL-876] - Add Smartcn Sentence detection engine
> >>>>      * [STANBOL-892] - RESTful Service Specification for Stanbol NLP
> >>>> analysis
> >>>>      * [STANBOL-893] - RESTful NLP analyses Enhancement Engine
> >>>>      * [STANBOL-894] - RESTful Language Identification service
> >>>>      * [STANBOL-895] - RESTful Language Identification Engine
> >>>> ** Task
> >>>>      * [STANBOL-885] - Let UIMA local template use latest UIMA SDK and
> >>>> Lucene analyzers
> >>>>      * [STANBOL-913] - Release enhancement-engines-0.10.0
> >>>>      * [STANBOL-916] - Move all OpenNLP related engines into
> >>>> /enhancement-engines/opennlp
> >>>>      * [STANBOL-917] - Move uima engine artifacts into
> >>>> /enhancement-engines/uima
> >>>> ** Test
> >>>>      * [STANBOL-612] - Add helper for validating the Stanbol
> >>>> EnhancementStructure to the Enhancer test module
> >>>>
> >>>> The vote is open for at least 48 hours.
> >>>>
> >>>> Best,
> >>>>   - Fabian
> >
> >
> > --
> > Sergio Fernández
> > Salzburg Research
> > +43 662 2288 318
> > Jakob-Haringer Strasse 5/II
> > A-5020 Salzburg (Austria)
> > http://www.salzburgresearch.at
>
>
>
> --
> | Rupert Westenthaler             rupert.westenthaler@gmail.com
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen
>

Re: [VOTE] Release enhancement-engines-0.10.0

Posted by Rupert Westenthaler <ru...@gmail.com>.

Hi Fabian, all

Sorry for the late response, but I was offline for the last 4 days.

I checked the signatures and digests; the release matches the tag and the
build was successful.

+1 from my side


Thx Fabian for forging the release
best
Rupert

On Sun, Feb 17, 2013 at 8:24 PM, Sergio Fernández
<se...@salzburgresearch.at> wrote:
> +1 (non binding)
>
>
> On 15/02/13 12:38, Fabian Christ wrote:
>>
>> Hi,
>>
>> and here is my +1.
>>
>> * Checked signatures and digest
>> * Checked build
>>
>> Best,
>>   - Fabian
>>
>> 2013/2/15 Fabian Christ<fc...@apache.org>:
>>>
>>> Hi,
>>>
>>> forgot to mention one important step to build the release:
>>>
>>> The release can be built on a clean system if tests are excluded via
>>> 'mvn install -DskipTests'.
>>>
>>> If tests are activated, the build requires the
>>> apache-stanbol-data-1.1.0 bundles to be in place in the local Maven
>>> repository. The data bundles can be built from the source package
>>> available at [1].
>>>
>>> [1] http://stanbol.apache.org/downloads/releases.html
>>>
>>> Best,
>>>   - Fabian
>
>
> --
> Sergio Fernández
> Salzburg Research
> +43 662 2288 318
> Jakob-Haringer Strasse 5/II
> A-5020 Salzburg (Austria)
> http://www.salzburgresearch.at



--
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen

Re: [VOTE] Release enhancement-engines-0.10.0

Posted by Sergio Fernández <se...@salzburgresearch.at>.

+1 (non-binding)

On 15/02/13 12:38, Fabian Christ wrote:
> Hi,
>
> and here is my +1.
>
> * Checked signatures and digest
> * Checked build
>
> Best,
>   - Fabian
>
> 2013/2/15 Fabian Christ<fc...@apache.org>:
>> Hi,
>>
>> forgot to mention one important step to build the release:
>>
>> The release can be built on a clean system if tests are excluded via
>> 'mvn install -DskipTests'.
>>
>> If tests are activated, the build requires the
>> apache-stanbol-data-1.1.0 bundles to be in place in the local Maven
>> repository. The data bundles can be built from the source package
>> available at [1].
>>
>> [1] http://stanbol.apache.org/downloads/releases.html
>>
>> Best,
>>   - Fabian

-- 
Sergio Fernández
Salzburg Research
+43 662 2288 318
Jakob-Haringer Strasse 5/II
A-5020 Salzburg (Austria)
http://www.salzburgresearch.at

Re: [VOTE] Release enhancement-engines-0.10.0

Posted by Fabian Christ <fc...@apache.org>.

Hi,

and here is my +1.

* Checked signatures and digest
* Checked build
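
For anyone repeating the digest check, here is a minimal sketch of how it
could be done. It is not taken from this thread: the helper names md5_of and
digest_matches are hypothetical, and only the two MD5 values come from the
vote mail.

```python
import hashlib

# MD5 values as announced in the vote mail for the source-release packages.
EXPECTED_MD5 = {
    "tar.gz": "08ad280426962f77c7e1c888050188aa",
    "zip": "65feb9bd94434d417a1cdca309270930",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, expected_hex):
    """Compare the file's MD5 with an announced value, ignoring case and padding."""
    return md5_of(path) == expected_hex.strip().lower()
```

Assuming the tar.gz was downloaded under the package name from the vote mail
(the local file name is an assumption), the check would be
digest_matches("apache-stanbol-enhancement-engines-0.10.0-source-release.tar.gz",
EXPECTED_MD5["tar.gz"]). This covers the digest only; the PGP signatures still
need to be checked against the KEYS file.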

Best,
 - Fabian

2013/2/15 Fabian Christ <fc...@apache.org>:
> Hi,
>
> forgot to mention one important step to build the release:
>
> The release can be built on a clean system if tests are excluded via
> 'mvn install -DskipTests'.
>
> If tests are activated, the build requires the
> apache-stanbol-data-1.1.0 bundles to be in place in the local Maven
> repository. The data bundles can be built from the source package
> available at [1].
>
> [1] http://stanbol.apache.org/downloads/releases.html
>
> Best,
>  - Fabian

Re: [VOTE] Release enhancement-engines-0.10.0

Posted by Fabian Christ <fc...@apache.org>.

Hi,

forgot to mention one important step to build the release:

The release can be built on a clean system if tests are excluded via
'mvn install -DskipTests'.

If tests are activated, the build requires the
apache-stanbol-data-1.1.0 bundles to be in place in the local Maven
repository. The data bundles can be built from the source package
available at [1].

[1] http://stanbol.apache.org/downloads/releases.html
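
The choice between the two build variants above can be summed up in a small
helper. This is only a sketch: the function name and its parameter are
hypothetical, and only 'mvn install' and '-DskipTests' come from this message.

```python
def maven_build_command(data_bundles_installed):
    """Return the Maven command line for building the release.

    `data_bundles_installed` says whether the apache-stanbol-data-1.1.0
    bundles are already in the local Maven repository (hypothetical flag;
    this helper does not inspect the repository itself).
    """
    command = ["mvn", "install"]
    if not data_bundles_installed:
        # Without the data bundles the tests cannot run, so skip them.
        command.append("-DskipTests")
    return command
```

On a clean system, maven_build_command(False) gives the argument list for
'mvn install -DskipTests', which could be passed to subprocess.run from the
root of the unpacked source release.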

Best,
 - Fabian
