Posted to dev@stanbol.apache.org by Rajan Shah <ra...@gmail.com> on 2015/05/27 05:31:51 UTC

General Stanbol questions

Hi,

As I am trying to get my head around Stanbol, I have a couple of general design
questions.

*1. Enhancement Chain firing and results*

How can I find out which enhancement chain detected which entities? One way I
can see is adding and removing a particular chain. Is it possible to enable
this via logging within the current code?

For example, I have a chain named categorized-linking and would like to find
out whether this chain fired and labeled entities properly.

*2. Categorize entities differently*

Is it possible to categorize detected entities as something else,
i.e. other than People, Organizations or Places?

What steps does one need to take in the current framework to achieve this?

*3. Domain specific modeling*

Suppose I have a small domain and various types of entities. I am
interested in

a. analyzing various entities
b. linking them with other entities and finding relations from DBpedia/Freebase
c. inferring interesting aspects using reasoning

Is Stanbol the way to go, or Marmotta? Or is it preferred to develop a
custom engine using Stanbol that uses internal components to perform all
of the above tasks?

*4. Enhance detected entities by annotation*

Suppose opennlp-ner detected an entity xyz. If I want to annotate this
entity with additional attributes/fields using different custom
vocabularies, what development steps do I need to take?

*5. Previous demo project(s)*

Also, any luck with restoring the demo project(s) within the 0.12
branch? I believe they demonstrate various aspects, and it would be great
to have them restored.

Thanks in advance,
Rajan

Re: General Stanbol questions

Posted by Rupert Westenthaler <ru...@gmail.com>.
On Tue, Jun 2, 2015 at 2:53 PM, Rajan Shah <ra...@gmail.com> wrote:
> I could setup the demo and look at the results. It's extremely powerful.
> One quick question, what's the purpose of these bin files under resources
> directory.
>
> bionlp2004-DNA-en.bin
> bionlp2004-RNA-en.bin
> bionlp2004-cell_line-en.bin
> bionlp2004-cell_type-en.bin
> bionlp2004-protein-en.bin

Those are the OpenNLP models for Named Entity Extraction.

best
Rupert




-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/

Re: General Stanbol questions

Posted by Rajan Shah <ra...@gmail.com>.
Hi Rupert,

Thanks again for the detailed answer.

I was able to set up the demo and look at the results. It's extremely powerful.
One quick question: what is the purpose of these bin files under the resources
directory?

bionlp2004-DNA-en.bin
bionlp2004-RNA-en.bin
bionlp2004-cell_line-en.bin
bionlp2004-cell_type-en.bin
bionlp2004-protein-en.bin

With best regards,
Rajan

On Thu, May 28, 2015 at 3:52 AM, Rupert Westenthaler <
rupert.westenthaler@gmail.com> wrote:

> Hi Rajan,
>
> The demo never included any Java code.
>
> The module just provides configurations [1] and datafiles [2]. Those
> will be installed with the bundle using the Sling Installer and
> Stanbol DataFileProvider infrastructure when the bundle is installed.
> Note the <Install-Path> and <Data-Files> instructions configured for
> the maven-bundle-plugin in the pom.xml file.
>
> The demo also provides a shell script [3] the indexes eHealth related
> datasets and of corse the README explaining the demo
>
> best
> Rupert
>
> [1]
> http://svn.apache.org/repos/asf/stanbol/branches/release-0.12/demos/ehealth/src/main/resources/config/
> [2]
> http://svn.apache.org/repos/asf/stanbol/branches/release-0.12/demos/ehealth/src/main/resources/datafiles/
> [3]
> http://svn.apache.org/repos/asf/stanbol/branches/release-0.12/demos/ehealth/index.sh
>

Re: General Stanbol questions

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Rajan,

The demo never included any Java code.

The module just provides configurations [1] and datafiles [2]. Those
will be installed with the bundle using the Sling Installer and
Stanbol DataFileProvider infrastructure when the bundle is installed.
Note the <Install-Path> and <Data-Files> instructions configured for
the maven-bundle-plugin in the pom.xml file.

The demo also provides a shell script [3] that indexes eHealth-related
datasets and, of course, the README explaining the demo.

best
Rupert

[1] http://svn.apache.org/repos/asf/stanbol/branches/release-0.12/demos/ehealth/src/main/resources/config/
[2] http://svn.apache.org/repos/asf/stanbol/branches/release-0.12/demos/ehealth/src/main/resources/datafiles/
[3] http://svn.apache.org/repos/asf/stanbol/branches/release-0.12/demos/ehealth/index.sh

On Wed, May 27, 2015 at 1:27 PM,  <ra...@gmail.com> wrote:
> Hi Rupert,
>
> Thanks a lot for the detailed answers. Let me play a little bit further before I ask additional follow-up questions.
>
> As far as demo is concerned, I am interested in eHealth demo as it covers lots of items from my questions. At present, the Java code for it is missing. Is it possible to restore Java code for eHealth demo in 0.12 branch?
>
> With best regards,
> Rajan




Re: General Stanbol questions

Posted by ra...@gmail.com.
Hi Rupert,

Thanks a lot for the detailed answers. Let me play a little bit further before I ask additional follow-up questions.

As far as the demo is concerned, I am interested in the eHealth demo, as it covers many of the items from my questions. At present, the Java code for it is missing. Is it possible to restore the Java code for the eHealth demo in the 0.12 branch?

With best regards,
Rajan


> On May 27, 2015, at 6:42 AM, Rupert Westenthaler <ru...@gmail.com> wrote:
>> *5. Previous demo project(s)*
>> 
>> At the same time, any luck with restoring demo project(s) within 0.12
>> branch ? I believe, it demonstrates various aspects and it would be great
>> to have it restored.
>> 
> 
> I hope those are still functional in the 0.12 branch. No immediate
> plans to move them to 1.0.0 (mainly because of lack of time).
> Contributions are very welcome.

Re: General Stanbol questions

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi

On Wed, May 27, 2015 at 5:31 AM, Rajan Shah <ra...@gmail.com> wrote:
> Hi,
>
> As I am trying to get my hands around stanbol, I have couple general design
> questions.
>
> *1. Enhancement Chain firing and results*
>
> How to find out which enhancement chain detected which entities? One way, I
> could see that by adding/removing particular chain. Is it possible to just
> enable it via logging within current code?
>
> For ex.
> I have a chain categorized-linking and would like to find out whether this
> chain fired and labeled entities properly

An enhancement chain has 1..* enhancement engines. The engines, not the
chain, create the annotations. So your question should be which engine
is creating an annotation. This information is provided by the
dc:creator and dc:contributor metadata of the enhancement. See also
the documentation at [1].
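
To illustrate how that metadata can be used, here is a minimal Python
sketch (not part of Stanbol) that groups enhancements by their dc:creator
value. It assumes the enhancer response has been saved as N-Triples. The
sample triples, the example.org predicate and the org.example.* engine
names are made up for illustration; only the dc:creator property URI comes
from the Dublin Core terms vocabulary.

```python
import re
from collections import defaultdict

DC_CREATOR = "http://purl.org/dc/terms/creator"

# Matches simple N-Triples lines of the form
#   <subject> <predicate> <object> .
#   <subject> <predicate> "literal"[^^<datatype> | @lang] .
# Blank nodes and other forms are deliberately ignored in this sketch.
TRIPLE = re.compile(
    r'^<([^>]*)>\s+<([^>]*)>\s+'
    r'(?:<([^>]*)>|"((?:[^"\\]|\\.)*)"(?:\^\^<[^>]*>|@[A-Za-z-]+)?)'
    r'\s*\.\s*$'
)


def annotations_by_creator(ntriples):
    """Return {creator value: [enhancement URIs]} for all dc:creator triples."""
    groups = defaultdict(list)
    for line in ntriples.splitlines():
        match = TRIPLE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything the simple pattern cannot parse
        subject, predicate, obj_iri, obj_literal = match.groups()
        if predicate == DC_CREATOR:
            creator = obj_literal if obj_literal is not None else obj_iri
            groups[creator].append(subject)
    return dict(groups)


# Hypothetical sample data; real enhancement URIs and engine names will differ.
SAMPLE = """
<urn:enhancement-1> <http://purl.org/dc/terms/creator> "org.example.NerEngine" .
<urn:enhancement-1> <http://example.org/selected-text> "Paris"@en .
<urn:enhancement-2> <http://purl.org/dc/terms/creator> "org.example.LinkingEngine" .
<urn:enhancement-3> <http://purl.org/dc/terms/creator> "org.example.NerEngine" .
"""

if __name__ == "__main__":
    for creator, enhancements in annotations_by_creator(SAMPLE).items():
        print(creator, enhancements)
```

Grouping by creator in this way shows which engine produced each
enhancement, no matter which chain the engine was configured in.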

>
> *2. Categorize entities differently*
>
> Is it possible to categorize your detected entities as something else?
>  i.e. other than People, Organizations or Places
>
> What steps one need to take in current framework to achieve the same?

You can use the Custom NER Model Extraction Engine [2].
The models used in the documentation of this engine can be found at [3]
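
As a small illustration of where custom entity types come from: an OpenNLP
name finder model learns its types from the labels in its training data,
which marks spans as <START:type> ... <END>. The Python sketch below only
lists the typed spans in one such sentence. The sample sentence and the
function name are invented for illustration, and training the model itself
(done with the OpenNLP tooling) is not shown.

```python
import re

# One annotated span in the OpenNLP name finder training format:
#   <START:type> tokens of the entity <END>
SPAN = re.compile(r"<START:([^>\s]+)>\s+(.*?)\s+<END>")


def typed_spans(sentence):
    """Return (entity type, entity text) pairs found in one training sentence."""
    return [(m.group(1), m.group(2)) for m in SPAN.finditer(sentence)]


# Hypothetical training sentence using two custom (non person/organization/place) types.
SAMPLE = (
    "Expression of <START:protein> IL-2 <END> was measured in "
    "<START:cell_type> T lymphocytes <END> ."
)

if __name__ == "__main__":
    for entity_type, text in typed_spans(SAMPLE):
        print(entity_type, "->", text)
```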

>
> *3. Domain specific modeling*
>
> Suppose, I have a small domain and various types of entities. I am
> interested in
>
> a. analyzing various entities
> b. linking them with other entities and find relations from dbpedia/freebase
> c. infer interesting aspects using reasoning
>
> Is Stanbol the way to go or Marmotta? or Is it preferred to develop a
> custom engine using Stanbol which uses internal components to perform all
> of the above tasks?

* Entity linking to your custom vocabulary can be done in Stanbol.
* If you want to have your custom entities linked with
DBpedia/Freebase, it is better to do that in the vocabulary. I think
Google Refine provided reconciliation to Freebase; that could
definitely be an option.
* If you want to find additional entities contained in
Freebase/DBpedia, configuring another entity linking engine in Stanbol
makes complete sense.

Not sure what you mean by "infer interesting aspects using reasoning".

>
> *4. Enhance detected entities by annotation*
>
> Suppose, opennlp-ner detected an entity xyz. If I want to annotate this
> entity with additional attributes/fields using different custom
> vocabularies, what are the dev. steps I need to take?
>

If you just want to link Named Entities with a controlled vocabulary,
you can use the FST linking engine [4] with the Linking Mode set to
NER (see Linking Mode in the engine's documentation). In short, you
will want to configure an "Apache Stanbol Enhancer Engine: FST Linking:
Named Entities" for the vocabulary you want to link against.


> *5. Previous demo project(s)*
>
>  At the same time, any luck with restoring demo project(s) within 0.12
> branch ? I believe, it demonstrates various aspects and it would be great
> to have it restored.
>

I hope those are still functional in the 0.12 branch. No immediate
plans to move them to 1.0.0 (mainly because of lack of time).
Contributions are very welcome.

Hope this helps
best
Rupert

> Thanks in advance,
> Rajan


[1] http://stanbol.apache.org/docs/trunk/components/enhancer/enhancementstructure#fiseenhancement
[2] https://stanbol.apache.org/docs/trunk/components/enhancer/engines/opennlpcustomner
[3] http://svn.apache.org/repos/asf/stanbol/branches/release-0.12/demos/ehealth/src/main/resources/datafiles/
[4] http://stanbol.apache.org/docs/trunk/components/enhancer/engines/lucenefstlinking

