Posted to dev@uima.apache.org by Jens Grivolla <j+...@grivolla.net> on 2014/07/01 18:52:51 UTC

Re: CFP: Workshop on Open Infrastructures and Analysis Frameworks for HLT

The list of accepted papers is now available:
http://glicom.upf.edu/OIAF4HLT/Papers.html

For anybody interested in attending the workshop and COLING, please
remember that the early registration deadline is tomorrow, July 2nd.

Looking forward to seeing many of you there...

-- Jens


On Wed, Mar 26, 2014 at 2:34 PM, Jens Grivolla <j+...@grivolla.net> wrote:

> Workshop on Open Infrastructures and Analysis Frameworks for HLT
> ================================================================
>
> http://glicom.upf.edu/OIAF4HLT/
>
> At the 25th International Conference on Computational Linguistics (COLING
> 2014)
> Helix Conference Centre at Dublin City University (DCU)
> 23-29 August 2014
>
> Description
> -----------
>
> Recent advances in digital storage and networking, coupled with the
> extension of human language technologies (HLT) into ever broader areas and
> the persistence of difficulties in software portability, have led to an
> increased focus on development and deployment of web-based infrastructures
> that allow users to access tools and other resources and combine them to
> create novel solutions that can be efficiently composed, tuned, evaluated,
> disseminated and consumed. This in turn engenders collaborative development
> and deployment among individuals and teams across the globe. It also
> increases the need for robust, widely available evaluation methods and
> tools, means to achieve interoperability of software and data from diverse
> sources, means to handle licensing for limited-access resources distributed
> over the web, and, perhaps crucially, the need to develop strategies for
> multi-site collaborative work.
>
> For many decades, NLP has suffered from low software engineering standards,
> causing a limited degree of re-usability of code and interoperability of
> different modules within larger NLP systems. While this did not really
> hamper success in limited task areas (such as implementing a parser), it
> caused serious problems for building complex integrated software systems,
> e.g., for information extraction or machine translation. This lack of
> integration has led to duplicated software development, work-arounds for
> programs written in different (versions of) programming languages, and
> ad-hoc tweaking of interfaces between modules developed at different sites.
>
> In recent years, two main frameworks, UIMA and GATE, have emerged that aim
> to allow the easy integration of varied tools through common type systems
> and standardized communication methods for components analysing
> unstructured textual information, such as natural language. Both frameworks
> offer a solid processing infrastructure that allows developers to
> concentrate on the implementation of the actual analytics components. An
> increasing number of members of the NLP community have adopted one of these
> frameworks as a platform for facilitating the creation of reusable NLP
> components that can be assembled to address different NLP tasks depending
> on their order, combination and configuration. Analysis frameworks also
> reduce the problem of reproducibility of NLP results by formalising
> solution composition and making language processing tools shareable.
>
> Very recently, several efforts have been devoted to the development of web
> service platforms for NLP. These platforms exploit the growing number of
> web-based tools and services available for tasks related to HLT, including
> corpus annotation, configuration and execution of NLP pipelines, and
> evaluation of results and automatic parameter tuning. These platforms can
> also integrate modules and pipelines from existing frameworks such as UIMA
> and GATE, in order to achieve interoperability with a wide variety of
> modules from different sources.
>
> Many of the issues and challenges surrounding these developments have been
> addressed individually in particular projects and workshops, but there are
> ramifications that cut across all of them. We therefore feel that this is
> the moment to bring together participants representing the range of
> interests that comprise the comprehensive picture for community-driven,
> distributed, collaborative, web-based development and use for language
> processing software and resources. This includes those engaged in
> development of infrastructures for HLT as well as those who will use these
> services and infrastructures, especially for multi-site collaborative work.
>
>
> Workshop Objectives
> -------------------
>
> The overall goal of this workshop is to provide a forum for discussion of
> the requirements for an envisaged open “global laboratory” for HLT research
> and development and establish the basis of a community effort to develop
> and support it. To this end, the workshop will include both presentations
> addressing the issues and challenges of developing, deploying, and using
> the global laboratory for distributed and collaborative efforts and
> discussion that will identify next steps for moving forward, fostering
> community-wide awareness, and establishing and encouraging communication
> among the various players.
>
> It aims to bring together members of the NLP community, specifically
> users, developers or providers of components and tools for these
> frameworks, in order to explore and discuss the opportunities and
> challenges in using such platforms for modern, well-engineered NLP
> applications.
>
> The challenges of creating reusable and interoperable components are of
> particular interest. They are affected by legal issues, such as potentially
> incompatible licenses of components and tools, as well as by the technical
> aspects of packaging and distribution of components. Tools are also
> important, for example to assemble complex processing pipelines, to manage
> the bodies of data that are to be analysed, and to visualize, explore, and
> further deploy the analysis results. Further challenges are involved in
> embedding framework-based analysis within applications or using it in
> distributed computing scenarios, such as deployment of and access to
> required resources. Finally, the preservation of analysis results, their
> provenance, and their reproducibility are of particular interest to the
> scientific user community.
>
> Topics
> ------
>
> Workshop topics include, but are not limited to:
>
> - processing of very large data collections: scale-out, parallelization,
> and performance optimization
> - advanced applications driven by an NLP framework
> - sophisticated tools to build and manage complex processing pipelines
> - analysis of results: exploration, evaluation, visualization, and
> statistical analysis
> - experience reports combining components from different sources, as well
> as solutions to interoperability issues
> - experience reports combining different frameworks (e.g.
> GATE/UIMA/WebLicht/etc.)
> - UIMA components with a special focus on genericity and type-system
> independence
> - repositories of ready-to-use components for UIMA and/or GATE
> - distribution of components: documentation, licensing and packaging
> - developing for UIMA or GATE: simplified APIs, debugging, unit testing,
> and limitations of the frameworks
> - combining annotation type systems in processing frameworks (GATE, UIMA,
> etc.) with standardization efforts, such as those in the ISO TC37/SC4 or
> TEI contexts
> - use of NLP frameworks in real-world "industry" settings
> - reports on current projects and frameworks, their challenges and
> proposed or implemented solutions, including efforts to address
> interoperability
> - issues and challenges of multi-site collaborative projects, including
> reports of implemented or proposed strategies
> - pipeline management, including authentication, strategies for passing
> resources through disparate tools and across hosting nodes, and licensing
> - development and use of evaluation environments that facilitate
> assessment of HLT component performance, iterative application development,
> and replication of results
> - community awareness and implementation of open infrastructures,
> including how to engage the community, establish confidence in the process,
> and promote use
>
> Dates
> -----
> Paper Submission Deadline: 2nd May 2014
> Author Notification Deadline: 6th June 2014
> Camera-Ready Paper Deadline: 27th June 2014
> Workshop: 23rd August 2014
>
> Organisers
> ----------
> Nancy Ide
> Department of Computer Science, Vassar College
>
> James Pustejovsky
> Department of Computer Science, Brandeis University
>
> Eric Nyberg
> Language Technologies Institute, School of Computer Science, Carnegie
> Mellon University
>
> Christopher Cieri
> Linguistic Data Consortium, University of Pennsylvania
>
> Jonathan Wright
> Linguistic Data Consortium, University of Pennsylvania
>
> Jens Grivolla
> GLiCom, Universitat Pompeu Fabra
>
> Kalina Bontcheva
> Department of Computer Science, University of Sheffield
>
>

Re: CFP: Workshop on Open Infrastructures and Analysis Frameworks for HLT

Posted by Jens Grivolla <j+...@grivolla.net>.
The workshop program, along with links to the full papers, is now
available: http://glicom.upf.edu/OIAF4HLT/Program.html

I'm looking forward to seeing many of you there.  I'll be staying at DCU
(College Park).

-- Jens


