Posted to dev@stanbol.apache.org by Suman Saurabh <ss...@gmail.com> on 2014/06/23 19:13:30 UTC

Mid- Term Evaluation GSOC 2014: Building Speech To Text Engine

Hi All,

Here are my updates on the Speech To Text Engine.

[1] A project that downloads the CMU Sphinx models as a dependency, unpacks
them, and repackages them for the Stanbol repository.

[2] A DataFileProvider service that provides the acoustic, language, and
dictionary models to the SpeechToText Engine.

The next step is to feed the captured audio to the Sphinx engine to obtain
the transcripts. More details about the project are available at [3].

I am thankful for the support provided by Rupert and my mentor Andreas.

Regards,
Suman Saurabh

[1] https://github.com/sumansaurabh/Sphinx-Model
[2] https://github.com/sumansaurabh/SphinxModelProvider
[3] https://sites.google.com/site/gsoc2014stanbol/

Re: Mid- Term Evaluation GSOC 2014: Building Speech To Text Engine

Posted by Suman Saurabh <ss...@gmail.com>.

Hi Rupert,

I am working on your feedback. I currently have limited network access and
cannot transfer large amounts of data. I will report back to you at the end
of this week.

Regards,
Suman Saurabh




On Fri, Jun 27, 2014 at 6:26 PM, Rupert Westenthaler <
rupert.westenthaler@gmail.com> wrote:


Re: Mid- Term Evaluation GSOC 2014: Building Speech To Text Engine

Posted by Suman Saurabh <ss...@gmail.com>.

Hi Rupert,

I have updated the code [1] as per your feedback and completed the
ModelProvider module; please review it. The clearTempResource() method is
still public for testing; I will change that. I have also added JUnit test
classes for this module.

I have also pushed the code for the SpeechToText Engine [2] and renamed the
repository from Stanbol-1007 to SpeechToTextEngine. Currently the code only
generates the transcript of the parsed sound file. As a reference, I am
using the Tika Engine source code for parsing the media file and its
metadata and for feeding the contents to a blob [3].

Regards,
Suman Saurabh

[1] https://github.com/sumansaurabh/SphinxModelProvider
[2] https://github.com/sumansaurabh/SpeechToTextEngine
[3] https://issues.apache.org/jira/browse/STANBOL-579


Re: Mid- Term Evaluation GSOC 2014: Building Speech To Text Engine

Posted by Rupert Westenthaler <ru...@gmail.com>.

Hi Suman,

Here is some feedback on the ModelProvider module:

First, as far as I understand, the ModelProvider can currently only
provide a single acoustic model. This is not exactly what I would
expect from such a service. It would be good to support different
languages as well as the ability to have several named models for the
same language.

What I would expect is something like a ModelProvider interface with methods like

    /** Getter for the default model for the given language */
    LanguageModel getDefaultModel(String language);
    /** Getter for a specific named model for the given language */
    LanguageModel getModel(String name, String language);

LanguageModel would be a class with API-based access to the
Map<String,String> you are currently returning from the "initModel()"
method. "initModel()" would then become an internal method used to load
models for specific languages and names. Supporting the use of
multiple models will also require changing some parts of this code.
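
A minimal sketch of how this could look. The map keys ("acoustic",
"dictionary", "language"), the "default" model name and the
InMemoryModelProvider class are assumptions made for illustration; the
keys used by the current initModel() code may differ.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Read-only view of the files that make up one Sphinx model set. */
final class LanguageModel {
    private final String name;
    private final String language;
    private final Map<String, String> files; // hypothetical keys, see accessors

    LanguageModel(String name, String language, Map<String, String> files) {
        this.name = name;
        this.language = language;
        // defensive copy so later changes to the caller's map are not visible
        this.files = Collections.unmodifiableMap(new HashMap<>(files));
    }

    String getName() { return name; }
    String getLanguage() { return language; }
    String getAcousticModel() { return files.get("acoustic"); }
    String getDictionary() { return files.get("dictionary"); }
    String getLanguageModelFile() { return files.get("language"); }
}

interface ModelProvider {
    /** Default model for the given language, or null if none is registered. */
    LanguageModel getDefaultModel(String language);

    /** Named model for the given language, or null if none is registered. */
    LanguageModel getModel(String name, String language);
}

/** Toy implementation; a real one would load the files in initModel(). */
final class InMemoryModelProvider implements ModelProvider {
    private static final String DEFAULT_NAME = "default"; // assumed convention

    // language -> (model name -> model)
    private final Map<String, Map<String, LanguageModel>> models = new HashMap<>();

    void register(LanguageModel model) {
        models.computeIfAbsent(model.getLanguage(), k -> new HashMap<>())
              .put(model.getName(), model);
    }

    @Override
    public LanguageModel getDefaultModel(String language) {
        return getModel(DEFAULT_NAME, language);
    }

    @Override
    public LanguageModel getModel(String name, String language) {
        Map<String, LanguageModel> byName = models.get(language);
        return byName == null ? null : byName.get(name);
    }
}
```

With this shape the engine only ever sees LanguageModel accessors, so the
Map<String,String> stays an implementation detail of the provider.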

Please also separate the ModelProvider interface from the
ModelProviderImpl. You can keep the interface in the current folder.
Move the implementation to an "impl" subpackage and adapt the OSGi
configuration accordingly.

Second, the "clearTempResource()" method MUST NOT be part of the public
interface; instead it should be called by the @Deactivate method (an OSGi
component lifecycle method):

    @Deactivate
    protected void deactivate(ComponentContext ctx){
        log.debug("deactivating {}",getClass().getSimpleName());
        clearTempResource(); //clean up temp resources
    }

    private void clearTempResource() { /* same code as now */
       [..]
    }

Note that there is also an

    @Activate
    protected void activate(ComponentContext ctx){

    }

if you need to do some initialization during startup.
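
The lifecycle described above can be tried out without an OSGi container.
The following is only a plain-Java stand-in: the class name, the temp
directory prefix and the placeholder file are made up, and the
@Activate/@Deactivate annotations and the ComponentContext parameter are
deliberately left out so that it runs on its own.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/**
 * Stand-in for the component. In the real bundle activate() and deactivate()
 * would carry the OSGi lifecycle annotations and take a ComponentContext.
 */
class ModelProviderLifecycleSketch {
    private Path tempDir; // hypothetical location of the unpacked model files

    /** Called on startup: creates the temp resources. */
    protected void activate() {
        try {
            tempDir = Files.createTempDirectory("sphinx-models");
            Files.createFile(tempDir.resolve("model.placeholder"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Called on shutdown: the only caller of the private cleanup method. */
    protected void deactivate() {
        clearTempResource();
    }

    Path getTempDir() {
        return tempDir;
    }

    /** Deletes the temp directory tree, children before their parents. */
    private void clearTempResource() {
        if (tempDir == null) {
            return; // nothing was created, or cleanup already ran
        }
        try (Stream<Path> paths = Files.walk(tempDir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            tempDir = null;
        }
    }
}
```

Because clearTempResource() is private, a unit test would exercise it
through activate()/deactivate() instead of calling it directly.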

Third, the pom file of the ModelProvider also needs some work:

1. The junit dependency is missing. I needed to add it so that your
code would compile.
2. The packaging needs to be set to bundle. Otherwise no OSGi bundle
will be created and you will not be able to deploy the jar file in an
OSGi environment.
3. Please add a basic configuration for the maven bundle plugin (just
start from the configuration of any Stanbol module).
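
As a rough illustration of the three points, a pom fragment could look like
the one below. The package names are placeholders, and the missing
<version> elements assume that a parent pom manages the versions; neither
is taken from the actual module.

```xml
<packaging>bundle</packaging>

<dependencies>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <scope>test</scope>
  </dependency>
</dependencies>

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.felix</groupId>
      <artifactId>maven-bundle-plugin</artifactId>
      <!-- needed so that Maven knows the "bundle" packaging type -->
      <extensions>true</extensions>
      <configuration>
        <instructions>
          <!-- placeholder package names, adapt to the real ones -->
          <Export-Package>org.example.sphinx.model</Export-Package>
          <Private-Package>org.example.sphinx.model.impl</Private-Package>
        </instructions>
      </configuration>
    </plugin>
  </plugins>
</build>
```

Exporting only the interface package and keeping the "impl" package
private matches the interface/implementation split requested above.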

Finally, I would like to have more information on the status of the
implementation of the actual engine. Do you plan to do the coding in
https://github.com/sumansaurabh/stanbol-1007? It would be great if you
could push your code as early as possible so that I can follow the
progress.

best
Rupert


-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/