Posted to dev@stanbol.apache.org by Rafa Haro <rh...@zaizi.com> on 2012/12/18 15:46:13 UTC
Using Disambiguation-mlt with the new EntityHub Linking Engine
Hi all,
I have been trying to use the disambiguation-mlt engine with the new
EntityHub Linking Engine for Spanish. My goal is to link and
disambiguate against any kind of entity in the EntityHub, not only
Named Entities. So I have configured a new Enhancement Chain that includes
only language detection, OpenNlpSentenceDetectionEngine,
OpenNlpTokenizerEngine, EntityLinkingEngine and Disambiguation-mlt
(installing the bundle version 0.10). After a few tests, the
disambiguation engine runs but does not disambiguate
anything. After removing the disambiguation engine from the Enhancement
Chain, we found that only one candidate is returned for each detected
entity. I therefore think the disambiguation engine may be working
fine but has nothing to disambiguate, because only one
candidate is passed to it by the EntityHub Linking Engine.
What could be happening? Our suggestions parameter is set to 5.
Thanks. Regards
This message should be regarded as confidential. If you have received this email in error please notify the sender and destroy it immediately. Statements of intent shall only become binding when confirmed in hard copy by an authorised signatory.
Zaizi Ltd is registered in England and Wales with the registration number 6440931. The Registered Office is 222 Westbourne Studios, 242 Acklam Road, London W10 5JJ, UK.
Re: Using Disambiguation-mlt with the new EntityHub Linking Engine
Posted by Rupert Westenthaler <ru...@gmail.com>.
On Wed, Dec 19, 2012 at 1:08 PM, Rafa Haro <rh...@zaizi.com> wrote:
> Hi Rupert,
>
> Thanks. It is now working perfectly.
>
> By the way, is the POS tagger model for Spanish installed in Stanbol? I want
> to know if it is possible to restrict the disambiguation to nouns only.
>
Yes, it is installed. As the Spanish POS model does not provide
ProperNouns, the configuration enables all nouns by default. So yes, the
results of the EntityhubLinkingEngine will provide suggestions for all
nouns.
best
Rupert
--
| Rupert Westenthaler rupert.westenthaler@gmail.com
| Bodenlehenstraße 11 ++43-699-11108907
| A-5500 Bischofshofen
Re: Using Disambiguation-mlt with the new EntityHub Linking Engine
Posted by Rafa Haro <rh...@zaizi.com>.
Hi Rupert,
Thanks. It is now working perfectly.
By the way, is the POS tagger model for Spanish installed in Stanbol? I
want to know if it is possible to restrict the disambiguation to nouns only.
Thanks
On 19/12/12 12:27, Rupert Westenthaler wrote:
> enhancer.engines.linking.suggestions="20"
> enhancer.engines.linking.minFoundTokens="1"
> enhancer.engines.linking.minLabelScore="0.33"
> enhancer.engines.linking.minTextScore="0.33"
> enhancer.engines.linking.minMatchScore="0.2"
Re: Using Disambiguation-mlt with the new EntityHub Linking Engine
Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi,
My fault. You are probably getting:
org.apache.sling.installer.core.impl.OsgiInstallerImpl Cannot create
InternalResource (resource will be ignored):InstallableResource,
priority=100, id={path} java.io.IOException: Unable to read dictionary
from input stream: {path}
[..]
Caused by: java.io.IOException: Unexpected token 78; expected: 61
(line=19, pos=48)
at org.apache.felix.cm.file.ConfigurationHandler.readFailure(ConfigurationHandler.java:650)
at org.apache.felix.cm.file.ConfigurationHandler.readInternal(ConfigurationHandler.java:274)
at org.apache.felix.cm.file.ConfigurationHandler.read(ConfigurationHandler.java:237)
at org.apache.sling.installer.core.impl.InternalResource.readDictionary(InternalResource.java:243)
at org.apache.sling.installer.core.impl.InternalResource.create(InternalResource.java:98)
... 6 more
The reason is that config files require the format
{key}=[{data-type}]"{value}"
so my earlier example, with unquoted values, was an illegally formatted config file.
Changing it to
enhancer.engines.linking.suggestions="20"
enhancer.engines.linking.minFoundTokens="1"
enhancer.engines.linking.minLabelScore="0.33"
enhancer.engines.linking.minTextScore="0.33"
enhancer.engines.linking.minMatchScore="0.2"
solved the problem for me
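For anyone generating such a file from a script, the quoted form above can be produced mechanically. This is only a minimal Python sketch: the helper name render_felix_config is made up for illustration and is not part of Stanbol or Felix, and it assumes plain string values without quotes, backslashes or newlines, so no escaping is attempted.

```python
def render_felix_config(properties):
    """Render a dict as {key}="{value}" lines, one property per line."""
    lines = []
    for key, value in properties.items():
        text = str(value)
        # Assumption: values need no escaping; reject the characters that would.
        if '"' in text or "\\" in text or "\n" in text:
            raise ValueError(f"unsupported character in value for {key!r}")
        lines.append(f'{key}="{text}"')
    return "\n".join(lines) + "\n"


settings = {
    "enhancer.engines.linking.suggestions": 20,
    "enhancer.engines.linking.minFoundTokens": 1,
    "enhancer.engines.linking.minLabelScore": 0.33,
    "enhancer.engines.linking.minTextScore": 0.33,
    "enhancer.engines.linking.minMatchScore": 0.2,
}
config_text = render_felix_config(settings)
print(config_text, end="")
```

The printed lines have the same shape as the five settings listed above; the values are written as untyped strings, with no data-type prefix.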
best
Rupert
Re: Using Disambiguation-mlt with the new EntityHub Linking Engine
Posted by Rafa Haro <rh...@zaizi.com>.
Hi Rupert,
Thanks for the instructions. I have tried to change the configuration
manually and I'm seeing some weird behaviour. Creating a config
file with a custom name in the fileinstall directory doesn't have any
effect. After doing that, I can't see a new EntityHub Linking Engine
instance in the Felix Console. Maybe it isn't supposed to appear at all,
I don't know.
So I then tried to create a new instance with the Felix Console
and change its configuration manually. When I create a new instance,
the Felix Console assigns it a concrete name, for instance:
org.apache.stanbol.enhancer.engines.entityhublinking.EntityhubLinkingEngine.2449f404-1cce-4655-84ce-ae4235d42009
Looking at the fileinstall directory in my Stanbol working directory, a new
file with the configuration of this instance appears:
/var/zaizi/workspace/stanbol/launchers/full/target/stanbol/fileinstall/org.apache.stanbol.enhancer.engines.entityhublinking.EntityhubLinkingEngine.org.apache.stanbol.enhancer.engines.entityhublinking.EntityhubLinkingEngine.2449f404-1cce-4655-84ce-ae4235d42009.config
Changing this file manually doesn't have any effect either. If you set, for
instance, "Suggestions" to 50 (a parameter that you can actually
see in the Felix Console), the change doesn't appear in the Felix
Console form for our instance, which always keeps the last value
that was set directly in the console. We have tried restarting the engine,
the bundle and even Stanbol itself, but the configuration doesn't change.
We therefore thought that Felix must be loading the
configuration from another file in the filesystem. We have found more
files with the same configuration:
config/org/apache/stanbol/enhancer/engines/entityhublinking/EntityhubLinkingEngine/org/apache/stanbol/enhancer/engines/entityhublinking/EntityhubLinkingEngine/2449f404/ee7315a4-3139-4b0d-ad41-fbbf4deb7fe3.config
AND
config/org/apache/stanbol/enhancer/engines/entityhublinking/EntityhubLinkingEngine/2449f404-1cce-4655-84ce-ae4235d42009.config
AND
fileinstall/org.apache.stanbol.enhancer.engines.entityhublinking.EntityhubLinkingEngine.org.apache.stanbol.enhancer.engines.entityhublinking.EntityhubLinkingEngine.2449f404-1cce-4655-84ce-ae4235d42009.config
We have tried changing all these files manually, but without success.
The configuration of the engine is always the last configuration set
directly in the console. The values in these files even get overwritten
when you restart Stanbol.
Any ideas?
Thanks again, Rupert
Re: Using Disambiguation-mlt with the new EntityHub Linking Engine
Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi,
Those properties are not available in the Felix Webconsole. You can
only configure them by using OSGi config files. The
EntityLinkingEngine simply has too many configuration parameters to
include them all in the form of the Felix Webconsole.
The best approach is to use the default configuration for the dbpedia
EntityhubLinkingEngine [1] as a template and adapt it to your needs,
e.g. by
adding
enhancer.engines.linking.minFoundTokens=1
enhancer.engines.linking.minLabelScore=0.33
enhancer.engines.linking.minTextScore=0.33
enhancer.engines.linking.minMatchScore=0.2
You will also need to increase the value of
"enhancer.engines.linking.suggestions".
Note that you do NOT need to use the datatypes (e.g. {key}=I"1" for
Integer). The Engine is implemented in a way that it also supports
string values, as long as it can parse the expected numeric values from
the provided values.
The file name must follow the pattern
"org.apache.stanbol.enhancer.engines.entityhublinking.EntityhubLinkingEngine-{instance_name}.config".
You can use the Sling Fileinstaller to activate your configuration
file. Simply create the {stanbol-working-dir}/stanbol/fileinstall
directory and copy the config file into this directory.
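The naming and copy steps can be scripted. The following Python sketch is only an illustration under stated assumptions: the function name install_engine_config and the instance name "spanish" are hypothetical, and a temporary directory stands in for the real Stanbol working directory.

```python
import shutil
import tempfile
from pathlib import Path

# Fixed prefix of the file name pattern given above.
ENGINE_NAME = "org.apache.stanbol.enhancer.engines.entityhublinking.EntityhubLinkingEngine"


def install_engine_config(source, working_dir, instance_name):
    """Copy a config file into {working_dir}/stanbol/fileinstall under the expected name."""
    target_dir = Path(working_dir) / "stanbol" / "fileinstall"
    target_dir.mkdir(parents=True, exist_ok=True)  # create the directory if it is missing
    target = target_dir / f"{ENGINE_NAME}-{instance_name}.config"
    shutil.copyfile(source, target)
    return target


# Demo against a throwaway directory standing in for the Stanbol working directory.
with tempfile.TemporaryDirectory() as working_dir:
    source = Path(working_dir) / "draft.config"
    source.write_text("# placeholder content\n")
    installed = install_engine_config(source, working_dir, "spanish")
    installed_name = installed.name
    print(installed_name)
```

The script only places the file; whether the installer then picks it up still depends on the file contents being valid.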
best
Rupert
P.S. In my last mail I used outdated keys. The documentation on
the Stanbol website also listed the wrong keys. I have corrected this in
the meantime.
[1] http://svn.apache.org/repos/asf/stanbol/trunk/data/defaultconfig/src/main/resources/config/org.apache.stanbol.enhancer.engines.entityhublinking.EntityhubLinkingEngine-dbpedia.config
Re: Using Disambiguation-mlt with the new EntityHub Linking Engine
Posted by Rafa Haro <rh...@zaizi.com>.
Hi Rupert,
In which revision is it possible to configure such parameters? We are
working with revision 1421282 and I can't see these options in the
Engine Configuration Dialogue.
Regards
Re: Using Disambiguation-mlt with the new EntityHub Linking Engine
Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Rafa
To use the disambiguation engine you will need to tweak the parameters
for the EntityhubLinkingEngine. The relevant parameters are
* Min Label Match Score
"org.apache.stanbol.enhancer.engines.keywordextraction.minLabelMatchFactor"
* Min Matched Tokens
"org.apache.stanbol.enhancer.engines.keywordextraction.minFoundTokens"
See [1] for the documentation.
From the documentation:
If used in combination with a disambiguation engine, one might want to
consider suggesting entities where only a single token of a multi-token
label matches. In such cases a configuration like Min Matched
Tokens=1 and Min Label Match Score <= 0.5 (e.g. 0.4) might be
considered. In such scenarios users will also want to considerably
increase the value for Max Suggestions (typically values > 10).
I would suggest that you start off with "minLabelMatchFactor=0.33" and
"minFoundTokens=1". In addition I would set the number of suggestions
to ~20.
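To make the effect of these thresholds concrete, here is a toy model in Python. It is not the EntityhubLinkingEngine's actual scoring: it assumes, purely for illustration, that a label's score is the fraction of its tokens found in the mention. The function name suggest and the example labels are made up.

```python
def suggest(mention_tokens, labels, min_found_tokens, min_label_score, max_suggestions):
    """Toy candidate filter: keep labels that pass both thresholds, best score first."""
    mention = {token.lower() for token in mention_tokens}
    scored = []
    for label in labels:
        tokens = label.lower().split()
        found = sum(1 for token in tokens if token in mention)
        score = found / len(tokens)  # assumed scoring: matched share of the label
        if found >= min_found_tokens and score >= min_label_score:
            scored.append((score, label))
    scored.sort(key=lambda pair: -pair[0])  # stable sort: ties keep input order
    return [label for _, label in scored[:max_suggestions]]


labels = ["Paris", "Paris Hilton", "Paris Saint Germain"]
strict = suggest(["Paris"], labels, 1, 0.6, 20)
loose = suggest(["Paris"], labels, 1, 0.33, 20)
print(strict)  # ['Paris']
print(loose)   # ['Paris', 'Paris Hilton', 'Paris Saint Germain']
```

In this toy setting the stricter threshold leaves a single candidate, so a later disambiguation step would have nothing to choose between, while the looser one passes partial label matches through.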
best
Rupert
[1] http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#entity-linker-configuration