Posted to dev@stanbol.apache.org by Johannes Goslar <jo...@dkd.de> on 2014/02/05 14:54:43 UTC

Re: Extracting english words in german texts

Hi Rupert,
yes, moving to a managed site did help.

Looking through the logs, the failed SPARQL queries look like:
FILTER(regex(str(?v_7),"^Global$","i") || regex(str(?v_7),"^Toy$","i") && ((lang(?v_7) = "de") || (lang(?v_7) = "en"))) . 
So the query builder is wrongly inserting ^ and $ around the search terms somewhere.
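
To illustrate the effect of the anchors, here is a minimal sketch. It uses Python's re module as a stand-in for the regex engine of the SPARQL endpoint (which may differ in details), and the label "Global Toy Company" is a made-up example, not data from this thread.

```python
import re

# Hypothetical label; the thread does not show the actual data.
label = "Global Toy Company"


def matches(pattern, text):
    """Case-insensitive search, similar in spirit to SPARQL regex(..., "i")."""
    return re.search(pattern, text, re.IGNORECASE) is not None


# Anchored pattern as seen in the logged query: only a label that consists
# of exactly this one word can match.
anchored = matches(r"^Global$", label)

# Word-boundary pattern: the word may appear anywhere inside the label.
bounded = matches(r"\bGlobal\b", label)

print(anchored, bounded)
```

With the anchors in place a multi-word label is never suggested for a single-word search term, which would fit the behaviour described above.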

best
Johnny

-- 
Johannes Goslar

dkd Internet Service GmbH 
development // kommunikation // design 
Kaiserstraße 73 
60329 Frankfurt am Main 

Contact: 
- email: johannes.goslar@dkd.de 
- phone: +49 69 2475218-0 
- fax: +49 69 2475218-99
- web: http://www.dkd.de
- social media: http://social.dkd.de

Current projects:
- http://j.mp/SehBiS-App - iPhone app: visual impairment simulator
- http://www.ellen-wille.de - website launch (TYPO3)
- http://www.vgf-ffm.de - website relaunch (TYPO3)

Managing directors: O. Dobberkau, S. Schaffstein, G. Wegenast, C. Zabanski 
Register court: Amtsgericht Frankfurt am Main 
Registration number: HRB 45590



On 28.01.2014, at 15:21, Rupert Westenthaler <ru...@gmail.com> wrote:

> Hi Johnny
> 
> On Mon, Jan 27, 2014 at 5:58 PM, Johannes Goslar <jo...@dkd.de> wrote:
>> Hi Rupert,
>> the docs are really interesting but sadly did not lead me to a solution.
>> I removed the config except for the *, but Stanbol still behaved the same way.
>> Extra models were not installed by hand.
>> At the moment the chain is using a Referenced Site pointing to the Sesame
>> SPARQL interface.
> 
> So the most likely cause is that the Yard does not suggest the Entity,
> which points to the SPARQL query that the Entityhub Linking Engine
> generates for the Entity Lookup.
> 
> I will try to replicate this, but I will not have time to do it this
> week as I am traveling. In the meantime you could try to upload your
> RDF data to a ManagedSite backed by a SolrYard.
> 
> best
> Rupert
> 
>> 
>> Cheers
>> Johnny
>> 
> 
> 
> 
> -- 
> | Rupert Westenthaler             rupert.westenthaler@gmail.com
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen


Re: Extracting english words in german texts

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Johannes,

I implemented a fix for STANBOL-1277 today. So when using a Stanbol
version later than r1587849 [1], your issue should be resolved.

Queries for a TextConstraint with {text1} or {text2} in the languages
{lang1} or {lang2} are expected to look like:

    select ?entity ?label where {
        ?entity rdfs:label ?label
        FILTER((regex(str(?label),"\\b{text1}\\b","i") ||
                regex(str(?label),"\\b{text2}\\b","i"))
            && ((lang(?label) = "{lang1}") || (lang(?label) = "{lang2}"))) .
    }
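
As a rough sketch of how a filter of this shape could be assembled, the following Python helper joins the text alternatives and the language alternatives into two parenthesised groups. The function name build_text_filter and its parameters are hypothetical and are not taken from the Stanbol code base; the sketch also assumes the texts are plain words without regex metacharacters or quotes.

```python
def build_text_filter(var, texts, langs):
    """Build a FILTER in the shape shown above (hypothetical helper).

    Assumes plain-word texts: no regex metacharacters, no quotes.
    """
    # One regex() test per text, wrapped in word boundaries and OR-ed together.
    # "\\\\b" in this Python literal yields \\b in the query string, which the
    # SPARQL string literal in turn unescapes to the regex token \b.
    text_part = " || ".join(
        f'regex(str(?{var}),"\\\\b{t}\\\\b","i")' for t in texts
    )
    # One language test per language, OR-ed together.
    lang_part = " || ".join(f'(lang(?{var}) = "{l}")' for l in langs)
    # Parenthesise both groups so that && combines the two groups as a whole.
    return f"FILTER(({text_part}) && ({lang_part}))"


print(build_text_filter("label", ["Global", "Toy"], ["de", "en"]))
```

The parentheses around the text alternatives matter because && binds more tightly than || in SPARQL. Without them the language restriction would attach only to the last regex() test.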

best
Rupert


[1] http://svn.apache.org/r1587849


-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/

Re: Extracting english words in german texts

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Johannes,

Thanks for the report. I created STANBOL-1277 [1] for this.

best
Rupert

[1] https://issues.apache.org/jira/browse/STANBOL-1277




