Posted to dev@stanbol.apache.org by srecko joksimovic <sr...@gmail.com> on 2012/01/10 14:01:09 UTC

Annotating using DBPedia ontology

Hi,

Until now I have used my own ontology when I wanted to annotate a document (or
text). Now I would like to use the DBPedia ontology. Do I have to download the
ontology and configure Stanbol like I did before, using

curl -v -X POST -H "Content-Type: application/rdf+xml" --data "@acm-ccs_proton.owl" http://localhost:8080/entityhub/entity

or is there another procedure? Does Stanbol use the DBPedia ontology by
default, or do I have to configure it the same way as for any other
ontology?

RE: Annotating using DBPedia ontology

Posted by Srecko Joksimovic <sr...@gmail.com>.

Thank you Rupert.

I will try these options. I didn't have problems like this with Linux. Does
this have something to do with Windows?

Best,
Srecko


Re: Annotating using DBPedia ontology

Posted by Rupert Westenthaler <ru...@gmail.com>.

Hi Srecko

[1] mentions that setting the system property

    java.nio.debug=pipe 

might help to get some additional information on why this is happening. 

In addition I found [2] that says that the system properties

    org.apache.felix.http.nio
    org.apache.felix.https.nio

can be used to deactivate the use of NIO for the http and https service of Apache Felix. 

[1] http://publib.boulder.ibm.com/infocenter/javasdk/v6r0/topic/com.ibm.java.doc.user.win32.60/user/limitations.html
[2] https://issues.apache.org/jira/browse/FELIX-2398
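
As a rough sketch of how the properties from [1] and [2] could be passed at startup: the snippet below only assembles and prints a launch command, it does not start anything. The jar name `stanbol-launcher.jar` is a placeholder, and the value `false` for the two Felix properties is an assumption about how NIO is switched off. `java.nio.debug=pipe` is described in [1] for the IBM JDK, so another JVM may simply ignore it.

```shell
# Placeholder: replace with the actual Stanbol launcher jar (the name is an assumption).
STANBOL_JAR="stanbol-launcher.jar"

# Debug property mentioned in [1]; possibly only honoured by the IBM JDK.
JAVA_OPTS="-Djava.nio.debug=pipe"

# Properties named in [2]; "false" is assumed to disable NIO for http and https.
JAVA_OPTS="$JAVA_OPTS -Dorg.apache.felix.http.nio=false"
JAVA_OPTS="$JAVA_OPTS -Dorg.apache.felix.https.nio=false"

# Print the command instead of running it, so it can be reviewed first.
LAUNCH_CMD="java $JAVA_OPTS -jar $STANBOL_JAR"
echo "$LAUNCH_CMD"
```

The same `-D` options should work unchanged from a Windows command prompt, since they are arguments to `java` rather than to the shell.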


best
Rupert


RE: Annotating using DBPedia ontology

Posted by Srecko Joksimovic <sr...@gmail.com>.

Hi Rupert,

I thought so, but I checked the firewall and turned it off. You are right, it is
Windows now. I don't have problems like this one when I use Linux, but now I
need Windows.

I don't have any firewall other than the default one, and that one is turned off.

Best,
Srecko

-----Original Message-----
From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com] 
Sent: Wednesday, January 11, 2012 20:28
To: Srecko Joksimovic
Cc: stanbol-dev@incubator.apache.org
Subject: Re: Annotating using DBPedia ontology

Hi Srecko

I googled for this exception and 90%+ of all pages had to do with Firewall
configurations on Windows machines.

The best description I found was on
http://weblogs.java.net/blog/binod/archive/2006/12/glassfish_and_w.html

About the enhancement result you posted: This is what the result looks like
if only the Metaxa and the LangId engines are active. So I assume that the
other engines were not activated correctly, maybe because of the
IOException.

Can you please check if you use a firewall that could cause this? Are you
running Stanbol on Windows?

best
Rupert
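
One quick way to check which engines actually contributed to a result, sketched under the assumption that the RDF/XML response has been saved to a local file: the helper below lists every distinct class name from the `org.apache.stanbol.enhancer.engines` package that occurs in that file. The function name `list_engines` and the file name in the usage note are made up for illustration. It relies on `grep -o`, which is widely available but not required by POSIX.

```shell
# list_engines FILE
# Print each distinct engine class name (package
# org.apache.stanbol.enhancer.engines.*) that occurs in FILE.
# In the enhancer output these names appear as the creator values.
list_engines() {
    grep -o 'org\.apache\.stanbol\.enhancer\.engines\.[A-Za-z0-9_.]*' "$1" | sort -u
}
```

Calling `list_engines result.rdf` on the result quoted in this thread would list only the Metaxa and LangId engines, which matches the diagnosis above.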



On 11.01.2012, at 19:46, Srecko Joksimovic wrote:

> Hi Rupert,
> 
> I configured Stanbol, and I thought everything is alright because I could
> access Stanbol at http://localhost:8080.
> But, I noticed that during the startup I'm getting this error:
> 
> [WARNING] failed org.mortbay.jetty.nio.SelectChannelConnector$1@29978933: java.io.IOException: Unable to establish loopback connection
> [WARNING] failed SelectChannelConnector@0.0.0.0:8080: java.io.IOException: Unable to establish loopback connection
> [WARNING] failed Server@62d844a9: java.io.IOException: Unable to establish loopback connection
> [ERROR] Exception while initializing Jetty.
> java.io.IOException: Unable to establish loopback connection
>        at sun.nio.ch.PipeImpl$Initializer.run(Unknown Source)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at sun.nio.ch.PipeImpl.<init>(Unknown Source)
>        at sun.nio.ch.SelectorProviderImpl.openPipe(Unknown Source)
>        at java.nio.channels.Pipe.open(Unknown Source)
>        at sun.nio.ch.WindowsSelectorImpl.<init>(Unknown Source)
>        at sun.nio.ch.WindowsSelectorProvider.openSelector(Unknown Source)
>        at java.nio.channels.Selector.open(Unknown Source)
>        at org.mortbay.io.nio.SelectorManager$SelectSet.<init>(SelectorManager.java:312)
>        at org.mortbay.io.nio.SelectorManager.doStart(SelectorManager.java:223)
>        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>        at org.mortbay.jetty.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:314)
>        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>        at org.mortbay.jetty.Server.doStart(Server.java:235)
>        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>        at org.apache.felix.http.jetty.internal.JettyService.initializeJetty(JettyService.java:164)
>        at org.apache.felix.http.jetty.internal.JettyService.startJetty(JettyService.java:115)
>        at org.apache.felix.http.jetty.internal.JettyService.run(JettyService.java:290)
>        at java.lang.Thread.run(Unknown Source)
> Caused by: java.nio.channels.ClosedByInterruptException
>        at java.nio.channels.spi.AbstractInterruptibleChannel.end(Unknown Source)
>        at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
>        at java.nio.channels.SocketChannel.open(Unknown Source)
>        ... 19 more
> 
> There is another thing. When I try to annotate text from application, or
> using web interface, I'm getting something like this:
> 
> <rdf:RDF
>    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>    xmlns:j.0="http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
>    xmlns:j.1="http://purl.org/dc/terms/"
>    xmlns:j.2="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
>    xmlns:j.3="http://fise.iks-project.eu/ontology/" >
>  <rdf:Description rdf:about="urn:enhancement-39c09311-3095-fbb1-0dfe-551f6fba2baa">
>    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>    <j.3:extracted-from rdf:resource="urn:content-item-sha1-322650339df64c4e5acd17a81af29bd8fed3ba91"/>
>    <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-11T18:44:11.271Z</j.1:created>
>    <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine</j.1:creator>
>    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>    <j.3:confidence rdf:datatype="http://www.w3.org/2001/XMLSchema#double">1.0</j.3:confidence>
>  </rdf:Description>
>  <rdf:Description rdf:about="urn:enhancement-9e659b3e-8978-7191-eb8b-fa7030c2ff68">
>    <j.1:language>en</j.1:language>
>    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>    <j.3:extracted-from rdf:resource="urn:content-item-sha1-322650339df64c4e5acd17a81af29bd8fed3ba91"/>
>    <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-11T18:44:11.278Z</j.1:created>
>    <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine</j.1:creator>
>    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>  </rdf:Description>
>  <rdf:Description rdf:about="urn:content-item-sha1-322650339df64c4e5acd17a81af29bd8fed3ba91">
>    <rdf:type rdf:resource="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#PlainTextDocument"/>
>    <j.0:plainTextContent>The Web's children became parents. They use tools
> which can limit the access and the spreading of the information by their
> children. So, the parents can see at any time the web's logs of their
> children but they also have a net which is going to filter their "private"
> identity before it is broadcasted on the network. For example, a third-part
> trust entity, along with their mobile telephone provider, the post office
> and the bank, will possess the consumer's identity so as to mask the address
> of delivery and the payment of this consumer. A public identity also exists
> to spread a resume (CV), a blog or an avatar for example but the data remain
> the property of the owner of the server who hosts this data. So, the mobile
> telephone provider offers a personal server who will contain one public zone
> who will automatically be copied on the network after every modification. If
> I want that my resume is not any longer on the network, I just have to erase
> it of my public zone from my server. So, the mobile telephone provider
> creates a controllable silo of information for every public
> profile.</j.0:plainTextContent>
>  </rdf:Description>
> </rdf:RDF>
> 
> I am not sure that this is the content I should get.
> Please, help :)
> 
> Best,
> Srecko
> 
> -----Original Message-----
> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com] 
> Sent: Tuesday, January 10, 2012 15:33
> To: srecko joksimovic
> Cc: stanbol-dev@incubator.apache.org
> Subject: Re: Annotating using DBPedia ontology
> 
> Hi Srecko
> 
>> 
>> curl -v -X POST -H "Content-Type: application/rdf+xml" --data "@acm-ccs_proton.owl" http://localhost:8080/entityhub/entity
>> 
> 
> No, I would not recommend uploading the dbpedia dataset by using POST to
> the entityhub. This is fine for small and medium sized datasets, but will
> not work for dbpedia.
> 
> Stanbol already comes with a small sample set of DBPedia. This is also used
> for enhancing documents with the default configuration.
> 
> This sample dataset contains the 43k DBPedia.org entities with the most
> incoming links, including some often used properties: labels in about
> 10 languages, the English comments, types, redirects stored as rdfs:seeAlso,
> lat/long, populations, birth/death dates, home pages, and category
> assignments stored in dc-terms:subject.
> 
> You can easily upgrade this index to a bigger version by downloading the
> dbpedia.solrindex.zip file from [1] and copying it into the /sling/datafiles
> folder within the directory where your Stanbol server is running. After some
> minutes (the time your computer needs to extract a file with ~3 GByte) the
> bigger index will replace the sample set included in the launcher.
> 
> If you need additional fields, languages, etc., you can also create your own
> index by using the indexing tool for dbpedia [2]. See the README.md file for
> instructions.
> 
> best
> Rupert
> 
> [1] http://dev.iks-project.eu/downloads/stanbol-indices/dbpedia-3.7/
> [2] https://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/indexing/dbpedia/
> 
> On 10.01.2012, at 14:01, srecko joksimovic wrote:
> 
>> Hi,
>> 
>> Until now I used my ontology when I wanted to annotate document (or text).
>> Now I would like to use DBPedia ontology. Do I have to download ontology and
>> configure Stanbol like I did before, using
>> 
>> curl -v -X POST -H "Content-Type: application/rdf+xml" --data "@acm-ccs_proton.owl" http://localhost:8080/entityhub/entity
>> 
>> or there is another procedure? Does Stanbol use DBPedia ontology by
>> default, or I have to configure something similar like when I use another
>> ontology?
>> 
>
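
The index upgrade described in Rupert's quoted reply above (copy the downloaded dbpedia.solrindex.zip into the sling/datafiles folder) could be scripted roughly as follows. This is a sketch only: the function name `install_index` and both paths in the usage note are hypothetical, and the script assumes the archive has already been downloaded by hand and that Stanbol has been started at least once, so that the datafiles folder exists.

```shell
# install_index ZIP STANBOL_DIR
# Copy an already downloaded index archive into STANBOL_DIR/sling/datafiles.
# STANBOL_DIR is the directory the Stanbol server is running in.
install_index() {
    zip_file="$1"
    datafiles="$2/sling/datafiles"

    if [ ! -f "$zip_file" ]; then
        echo "index archive not found: $zip_file" >&2
        return 1
    fi
    if [ ! -d "$datafiles" ]; then
        # A missing folder most likely means the wrong directory was given.
        echo "datafiles folder not found: $datafiles" >&2
        return 1
    fi

    cp "$zip_file" "$datafiles/"
}
```

For example, `install_index "$HOME/Downloads/dbpedia.solrindex.zip" /opt/stanbol`. According to the reply above, the bigger index then replaces the sample set after a few minutes.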

Re: Annotating using DBPedia ontology

Posted by srecko joksimovic <sr...@gmail.com>.

Hi Rupert,

There is also a problem with web pages. Plain text works just fine,
but I have problems with documents and web pages. When I try to annotate
a web page, this is what I get:

http://en.wikipedia.org/wiki/Semantic_Web // URL to annotate
**** 174030  // file length - this means that you were right about GetMethod, and it works now
174030 text/html // file length and content type just before calling annotate method
ERROR <html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
<title>Error 500 INTERNAL_SERVER_ERROR</title>
</head>
<body><h2>HTTP ERROR 500</h2>
<p>Problem accessing /engines. Reason:
<pre>    INTERNAL_SERVER_ERROR</pre></p><h3>Caused
by:</h3><pre>org.apache.stanbol.enhancer.servicesapi.EngineException
at
org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(MetaxaEngine.java:191)
at
org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceContent(WeightedJobManager.java:80)
at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBuildResponse(EnginesRootResource.java:175)
at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromData(EnginesRootResource.java:167)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
at
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
at
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
at
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
at
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1465)
at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1396)
at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1345)
at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1335)
at
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at
org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletHandler.java:96)
at
org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHandler.java:79)
at
org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletPipeline.java:42)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:49)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter(QueryHeadersFilter.java:75)
at
org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(FilterHandler.java:88)
at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:76)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:47)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:78)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:47)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(FilterPipeline.java:48)
at
org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.java:39)
at
org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServlet.java:67)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:943)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: org.semanticdesktop.aperture.extractor.ExtractorException
at
org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(XsltExtractor.java:147)
at
org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extract(IksHtmlExtractor.java:123)
at
org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(MetaxaCore.java:120)
at
org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(MetaxaEngine.java:157)
... 51 more
Caused by: java.io.IOException
at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:661)
at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:652)
at
org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(XsltExtractor.java:140)
... 54 more
Caused by: org.openrdf.rio.RDFParseException: Not a valid (absolute) URI:
//creativecommons.org/licenses/by-sa/3.0/ [line 4, column 179]
at
org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:533)
at
org.openrdf.rio.rdfxml.RDFXMLParser.reportFatalError(RDFXMLParser.java:1068)
at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:285)
at org.openrdf.rio.helpers.RDFParserBase.resolveURI(RDFParserBase.java:272)
at
org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(RDFXMLParser.java:751)
at
org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(RDFXMLParser.java:674)
at org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(RDFXMLParser.java:378)
at org.openrdf.rio.rdfxml.SAXFilter.endElement(SAXFilter.java:359)
at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown
Source)
at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:260)
at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:244)
at
org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(RepositoryConnectionBase.java:357)
at
org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:312)
at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:659)
... 56 more
Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) URI:
//creativecommons.org/licenses/by-sa/3.0/
at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)
at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)
at
org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java:345)
at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)
... 78 more
</pre>
<h3>Caused
by:</h3><pre>org.semanticdesktop.aperture.extractor.ExtractorException
at
org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(XsltExtractor.java:147)
at
org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extract(IksHtmlExtractor.java:123)
at
org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(MetaxaCore.java:120)
at
org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(MetaxaEngine.java:157)
at
org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceContent(WeightedJobManager.java:80)
at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBuildResponse(EnginesRootResource.java:175)
at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromData(EnginesRootResource.java:167)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
at
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
at
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
at
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
at
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1465)
at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1396)
at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1345)
at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1335)
at
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at
org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletHandler.java:96)
at
org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHandler.java:79)
at
org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletPipeline.java:42)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:49)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter(QueryHeadersFilter.java:75)
at
org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(FilterHandler.java:88)
at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:76)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:47)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:78)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:47)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(FilterPipeline.java:48)
at
org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.java:39)
at
org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServlet.java:67)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:943)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.io.IOException
at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:661)
at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:652)
at
org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(XsltExtractor.java:140)
... 54 more
Caused by: org.openrdf.rio.RDFParseException: Not a valid (absolute) URI:
//creativecommons.org/licenses/by-sa/3.0/ [line 4, column 179]
at
org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:533)
at
org.openrdf.rio.rdfxml.RDFXMLParser.reportFatalError(RDFXMLParser.java:1068)
at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:285)
at org.openrdf.rio.helpers.RDFParserBase.resolveURI(RDFParserBase.java:272)
at
org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(RDFXMLParser.java:751)
at
org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(RDFXMLParser.java:674)
at org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(RDFXMLParser.java:378)
at org.openrdf.rio.rdfxml.SAXFilter.endElement(SAXFilter.java:359)
at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown
Source)
at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:260)
at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:244)
at
org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(RepositoryConnectionBase.java:357)
at
org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:312)
at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:659)
... 56 more
Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) URI:
//creativecommons.org/licenses/by-sa/3.0/
at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)
at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)
at
org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java:345)
at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)
... 78 more
</pre>
<h3>Caused by:</h3><pre>java.io.IOException
at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:661)
at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:652)
at
org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(XsltExtractor.java:140)
at
org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extract(IksHtmlExtractor.java:123)
at
org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(MetaxaCore.java:120)
at
org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(MetaxaEngine.java:157)
at
org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceContent(WeightedJobManager.java:80)
at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBuildResponse(EnginesRootResource.java:175)
at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromData(EnginesRootResource.java:167)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
at
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
at
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
at
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
at
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1465)
at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1396)
at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1345)
at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1335)
at
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at
org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletHandler.java:96)
at
org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHandler.java:79)
at
org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletPipeline.java:42)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:49)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter(QueryHeadersFilter.java:75)
at
org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(FilterHandler.java:88)
at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:76)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:47)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:78)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:47)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(FilterPipeline.java:48)
at
org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.java:39)
at
org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServlet.java:67)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:943)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: org.openrdf.rio.RDFParseException: Not a valid (absolute) URI:
//creativecommons.org/licenses/by-sa/3.0/ [line 4, column 179]
at
org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:533)
at
org.openrdf.rio.rdfxml.RDFXMLParser.reportFatalError(RDFXMLParser.java:1068)
at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:285)
at org.openrdf.rio.helpers.RDFParserBase.resolveURI(RDFParserBase.java:272)
at
org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(RDFXMLParser.java:751)
at
org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(RDFXMLParser.java:674)
at org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(RDFXMLParser.java:378)
at org.openrdf.rio.rdfxml.SAXFilter.endElement(SAXFilter.java:359)
at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown
Source)
at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:260)
at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:244)
at
org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(RepositoryConnectionBase.java:357)
at
org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:312)
at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:659)
... 56 more
Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) URI:
//creativecommons.org/licenses/by-sa/3.0/
at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)
at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)
at
org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java:345)
at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)
... 78 more
</pre>
<h3>Caused by:</h3><pre>org.openrdf.rio.RDFParseException: Not a valid
(absolute) URI: //creativecommons.org/licenses/by-sa/3.0/ [line 4, column
179]
at
org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:533)
at
org.openrdf.rio.rdfxml.RDFXMLParser.reportFatalError(RDFXMLParser.java:1068)
at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:285)
at org.openrdf.rio.helpers.RDFParserBase.resolveURI(RDFParserBase.java:272)
at
org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(RDFXMLParser.java:751)
at
org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(RDFXMLParser.java:674)
at org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(RDFXMLParser.java:378)
at org.openrdf.rio.rdfxml.SAXFilter.endElement(SAXFilter.java:359)
at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown
Source)
at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:260)
at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:244)
at
org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(RepositoryConnectionBase.java:357)
at
org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:312)
at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:659)
at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:652)
at
org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(XsltExtractor.java:140)
at
org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extract(IksHtmlExtractor.java:123)
at
org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(MetaxaCore.java:120)
at
org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(MetaxaEngine.java:157)
at
org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceContent(WeightedJobManager.java:80)
at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBuildResponse(EnginesRootResource.java:175)
at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromData(EnginesRootResource.java:167)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
at
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
at
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
at
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
at
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1465)
at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1396)
at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1345)
at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1335)
at
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at
org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletHandler.java:96)
at
org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHandler.java:79)
at
org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletPipeline.java:42)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:49)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter(QueryHeadersFilter.java:75)
at
org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(FilterHandler.java:88)
at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:76)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:47)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:78)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:47)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(FilterPipeline.java:48)
at
org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.java:39)
at
org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServlet.java:67)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:943)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) URI:
//creativecommons.org/licenses/by-sa/3.0/
at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)
at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)
at
org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java:345)
at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)
... 78 more
</pre>
<h3>Caused by:</h3><pre>java.lang.IllegalArgumentException: Not a valid
(absolute) URI: //creativecommons.org/licenses/by-sa/3.0/
at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)
at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)
at
org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java:345)
at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)
at org.openrdf.rio.helpers.RDFParserBase.resolveURI(RDFParserBase.java:272)
at
org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(RDFXMLParser.java:751)
at
org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(RDFXMLParser.java:674)
at org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(RDFXMLParser.java:378)
at org.openrdf.rio.rdfxml.SAXFilter.endElement(SAXFilter.java:359)
at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown
Source)
at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown
Source)
at
com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown
Source)
at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:260)
at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:244)
at
org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(RepositoryConnectionBase.java:357)
at
org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:312)
at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:659)
at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:652)
at
org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(XsltExtractor.java:140)
at
org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extract(IksHtmlExtractor.java:123)
at
org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(MetaxaCore.java:120)
at
org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(MetaxaEngine.java:157)
at
org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceContent(WeightedJobManager.java:80)
at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBuildResponse(EnginesRootResource.java:175)
at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromData(EnginesRootResource.java:167)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
at
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
at
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
at
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
at
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1465)
at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1396)
at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1345)
at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1335)
at
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at
org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletHandler.java:96)
at
org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHandler.java:79)
at
org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletPipeline.java:42)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:49)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter(QueryHeadersFilter.java:75)
at
org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(FilterHandler.java:88)
at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:76)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:47)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:78)
at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:47)
at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
at
org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(FilterPipeline.java:48)
at
org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.java:39)
at
org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServlet.java:67)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:943)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
</pre>
<hr /><i><small>Powered by Jetty://</small></i><br/>


</body>
</html>

Best,
Srecko
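
For readers who hit the same trace: the root cause reported above is that the
RDF parser rejected //creativecommons.org/licenses/by-sa/3.0/, a reference that
carries a host but no scheme. The following is only a minimal sketch of that
condition using java.net.URI. The class name, the method names and the
example.org base URL are made up for illustration, and this is not the fix
that was applied in Stanbol (the thread does not show it).

```java
import java.net.URI;

class RelativeUriDemo {

    /** Returns true when the reference carries its own scheme, e.g. "http:". */
    static boolean isAbsolute(String reference) {
        return URI.create(reference).isAbsolute();
    }

    /** Resolves a possibly scheme-less reference against the URL of the page it was found on. */
    static String resolveAgainst(String baseUrl, String reference) {
        return URI.create(baseUrl).resolve(reference).toString();
    }

    public static void main(String[] args) {
        // Hypothetical base URL; the thread does not say which page was being annotated here.
        String page = "http://example.org/page.html";
        // The reference taken from the exception message.
        String license = "//creativecommons.org/licenses/by-sa/3.0/";

        System.out.println("absolute? " + isAbsolute(license));
        System.out.println("resolved: " + resolveAgainst(page, license));
    }
}
```

A reference of this kind only becomes absolute once it is resolved against a
base URL, so a parser that is given no usable base has nothing to take the
scheme from.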

On Thu, Jan 12, 2012 at 6:40 PM, srecko joksimovic <
sreckojoksimovic@gmail.com> wrote:

> Hi Rupert,
>
> I have another question, and I will finish soon.
>
> I tried to annotate a PDF document, and I didn't get the result I expected.
> Then I put the string you sent me
> "John Smith works for the Apple Inc. in Cupertino, California."
> in an MS Word document, and this is the result I got:
>
> <rdf:RDF
>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>     xmlns:j.0="http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
>     xmlns:j.1="http://purl.org/dc/terms/"
>     xmlns:j.2="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
>     xmlns:j.3="http://fise.iks-project.eu/ontology/" >
>   <rdf:Description rdf:about="urn:enhancement-55016818-eb97-7b98-521a-422e3742173b">
>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>     <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine</j.1:creator>
>     <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.288Z</j.1:created>
>     <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>     <j.1:language>fr</j.1:language>
>   </rdf:Description>
>   <rdf:Description rdf:about="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f">
>     <rdf:type rdf:resource="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#PaginatedTextDocument"/>
>     <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
> srecko</j.0:plainTextContent>
>   </rdf:Description>
>   <rdf:Description rdf:about="urn:enhancement-0644a1ed-f1d8-334d-d4e9-690a0446cba8">
>     <j.3:confidence rdf:datatype="http://www.w3.org/2001/XMLSchema#double">1.0</j.3:confidence>
>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>     <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine</j.1:creator>
>     <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.273Z</j.1:created>
>     <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>   </rdf:Description>
> </rdf:RDF>
>
>
> and this is the code:
>
> public List<String> Annotate(byte[] _stream_to_annotate,
>         ServiceUtils.MIMETypes _content_type, String _encoding)
> {
>     List<String> _return_list = new ArrayList<String>();
>     try
>     {
>         URL url = new URL(ServiceUtils.SERVICE_URL);
>         HttpURLConnection con = (HttpURLConnection) url.openConnection();
>
>         con.setDoOutput(true);
>         con.setRequestMethod("POST");
>         con.setRequestProperty("Accept", "application/rdf+xml");
>         con.setRequestProperty("Content-type", _content_type.getValue());
>
>         java.io.OutputStream out = con.getOutputStream();
>         IOUtils.write(_stream_to_annotate, out);
>         IOUtils.closeQuietly(out);
>
>         con.connect(); //send the request
>
>         if (con.getResponseCode() > 299)
>         {
>             java.io.InputStream errorStream = con.getErrorStream();
>             if (errorStream != null)
>             {
>                 String errorMessage = IOUtils.toString(errorStream);
>                 IOUtils.closeQuietly(errorStream);
>             }
>             else
>             {
>                 //no error data
>                 //write default error message with the status code
>             }
>         }
>         else
>         {
>             Model model = ModelFactory.createDefaultModel();
>             java.io.InputStream enhancementResults = con.getInputStream();
>             model.read(enhancementResults, null);
>
>             String queryStringForGraph =
>                 "PREFIX t: <http://fise.iks-project.eu/ontology/> " +
>                 "SELECT ?label WHERE {?alias t:entity-reference ?label}";
>
>             Query query = QueryFactory.create(queryStringForGraph);
>             QueryExecution qe = QueryExecutionFactory.create(query, model);
>
>             ResultSet results = qe.execSelect();
>             while (results.hasNext())
>             {
>                 _return_list.add(results.next().toString());
>             }
>         }
>     }
>     catch (Exception ex)
>     {
>         System.out.println(ex.getMessage());
>     }
>     return _return_list;
> }
>
> On Thu, Jan 12, 2012 at 8:32 AM, srecko joksimovic <
> sreckojoksimovic@gmail.com> wrote:
>
>>
>> Hi Rupert,
>>
>> Thank you for the answer. I've probably missed that.
>>
>> Best,
>> Srecko
>>
>>
>> On Thu, Jan 12, 2012 at 6:12 AM, Rupert Westenthaler <
>> rupert.westenthaler@gmail.com> wrote:
>>
>>> Hi Srecko
>>>
>>> I think the last time I directly used this API is about 3-4 years ago,
>>> but after a look at the http client tutorial [1] I think the reason for
>>> your problem is that you do not execute the GetMethod.
>>>
>>> Based on this tutorial the code should look like
>>>
>>>    // Create an instance of HttpClient.
>>>    HttpClient client = new HttpClient();
>>>    GetMethod get = new GetMethod(url);
>>>    try {
>>>        // Execute the method.
>>>        int statusCode = client.executeMethod(get);
>>>        if (statusCode != HttpStatus.SC_OK) {
>>>            //handle the error
>>>        }
>>>        InputStream t_is = get.getResponseBodyAsStream();
>>>        //read the data of the stream
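>>>    } finally {
>>>        // Release the connection once the stream has been read.
>>>        get.releaseConnection();
>>>    }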
>>>
>>> In addition you should not use a Reader if you want to read byte
>>> oriented data from the input stream.
>>>
>>> hope this helps
>>> best
>>> Rupert
>>>
>>> [1] http://hc.apache.org/httpclient-3.x/tutorial.html
>>>
>>> On 11.01.2012, at 22:34, Srecko Joksimovic wrote:
>>>
>>> > That's it. Thank you!
>>> > I have already configured KeywordLinkingEngine when I used my own
>>> > ontology. I think I'm familiar with that and I will try that option too.
>>> >
>>> > In the meantime I found another interesting problem. I tried to annotate
>>> > a document and a web page. With the web page, I tried
>>> > IOUtils.write(byte[], out), so I had to convert the URL to byte[]:
>>> >
>>> > public static byte[] GetBytesFromURL(String _url) throws IOException
>>> > {
>>> >       GetMethod get = new GetMethod(_url);
>>> >       InputStream t_is = get.getResponseBodyAsStream();
>>> >       byte[] buffer = new byte[1024];
>>> >       int count = -1;
>>> >       Reader t_url_reader = new BufferedReader(new InputStreamReader(t_is));
>>> >       byte[] t_bytes = IOUtils.toByteArray(t_url_reader, "UTF-8");
>>> >
>>> >       return t_bytes;
>>> > }
>>> >
>>> > But, the problem is that I'm getting null for InputStream.
>>> >
>>> > Any ideas?
>>> >
>>> > Best,
>>> > Srecko
>>> >
>>> >
>>> >
>>> > -----Original Message-----
>>> > From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
>>> > Sent: Wednesday, January 11, 2012 22:08
>>> > To: Srecko Joksimovic
>>> > Cc: stanbol-dev@incubator.apache.org
>>> > Subject: Re: Annotating using DBPedia ontology
>>> >
>>> >
>>> > On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
>>> >> Hi Rupert,
>>> >>
>>> >> When I load localhost:8080/engines it says this:
>>> >>
>>> >> There are currently 5 active engines.
>>> >> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
>>> >> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
>>> >> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
>>> >> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
>>> >> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
>>> >>
>>> >> Maybe this could tell you something?
>>> >>
>>> >
>>> > These are exactly the 5 engines that are expected to run with the default
>>> > configuration. Based on this, the Stanbol Enhancer should work fine.
>>> >
>>> > After looking at the text you enhanced, however, I noticed that it does
>>> > not mention any named entities such as Persons, Organizations and Places.
>>> > So I checked it with my local Stanbol version, and no entities were
>>> > detected there either.
>>> >
>>> > So to check if Stanbol works as expected you should try another text
>>> > that mentions some Named Entities, such as
>>> >
>>> >    "John Smith works for the Apple Inc. in Cupertino, California."
>>> >
>>> >
>>> > If you also want to search for entities like "Bank", "Blog", "Consumer",
>>> > "Telephone", you also need to configure a KeywordLinkingEngine for
>>> > dbpedia. Part B of [3] provides more information on how to do that.
>>> >
>>> > But let me mention that the KeywordLinkingEngine is more useful when used
>>> > in combination with your own domain-specific thesaurus rather than a
>>> > global data set like dbpedia. When used with dbpedia you will also get a
>>> > lot of false positives.
>>> >
>>> > best
>>> > Rupert
>>> >
>>> > [3] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
>>> >
>>>
>>>
>>
>
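
Rupert's reply quoted above makes two points: the GetMethod has to be executed
before its response stream is available, and byte-oriented data should not go
through a Reader. The second point can be sketched with java.io alone. The
class and method names below are hypothetical, and the ByteArrayInputStream
merely stands in for the stream that get.getResponseBodyAsStream() would
return after client.executeMethod(get) has been called.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class StreamBytes {

    /** Copies every byte of the stream into an array without any charset decoding. */
    static byte[] toBytes(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int count;
        // read() returns -1 once the end of the stream is reached
        while ((count = in.read(buffer)) != -1) {
            out.write(buffer, 0, count);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Arbitrary sample bytes, including values that are not valid text on their own.
        byte[] original = {(byte) 0xFF, (byte) 0xD8, 0x00, 0x41};
        byte[] copy = toBytes(new ByteArrayInputStream(original));
        System.out.println("identical: " + Arrays.equals(original, copy));
    }
}
```

Because no Reader is involved, the bytes reach the enhancer exactly as the
server sent them, which matters for binary content such as PDF or Word files.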

RE: Annotating using DBPedia ontology

Posted by Srecko Joksimovic <sr...@gmail.com>.

Hi Olivier,
Thank you for the answer. The problem was that one of my machines didn't have the latest Stanbol version. Everything is OK now.

Best,
Srecko

-----Original Message-----
From: olivier.grisel@gmail.com [mailto:olivier.grisel@gmail.com] On Behalf Of Olivier Grisel
Sent: Tuesday, January 24, 2012 19:06
To: stanbol-dev@incubator.apache.org
Cc: sreckojoksimovic@gmail.com
Subject: Re: Annotating using DBPedia ontology

2012/1/22 Srecko Joksimovic <sr...@gmail.com>:
> Hi,
>
>
>
> Recently I asked about annotating a web resource, and it turned out to be a
> bug that you fixed. Now something similar happened when I tried to annotate
> this URL

Can you please open a jira issue and attach the stacktrace as a text
file on the jira? The formatting of your message is broken in gmail.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

Re: Annotating using DBPedia ontology

Posted by Olivier Grisel <ol...@ensta.org>.

2012/1/22 Srecko Joksimovic <sr...@gmail.com>:
> Hi,
>
>
>
> Recently I asked about annotating a web resource, and it turned out to be a
> bug that you fixed. Now something similar happened when I tried to annotate
> this URL

Can you please open a jira issue and attach the stacktrace as a text
file on the jira? The formatting of your message is broken in gmail.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

Re: Annotating using DBPedia ontology

Posted by srecko joksimovic <sr...@gmail.com>.

Dear Walter,

As I said, it does look like the previous one, and I thought I had the
right version.
I will check again. I apologize for wasting your time; I probably did
something wrong.

Best,
Srecko

On Mon, Jan 23, 2012 at 9:23 AM, Walter Kasper <wk...@apache.org> wrote:

> Dear Srecko,
>
> The error looks much like the one we fixed. We cannot reproduce it on our
> Stanbol installation for the URL you gave. Are you sure you have the right
> version?
>
> Best regards,
>
> Walter
>
> Srecko Joksimovic wrote:
>
>>
>> Hi,
>>
>> Recently I asked about annotating a web resource, and it turned out to be a
>> bug that you fixed. Now something similar happened when I tried to annotate
>> this URL
>>
>> http://en.wikipedia.org/wiki/Software_design_pattern
>>
>> This is the output I got:
>>
>> http://en.wikipedia.org/wiki/Software_design_pattern
>>
>> ERROR <html>
>>
>> <head>
>>
>> <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
>>
>> <title>Error 500 INTERNAL_SERVER_ERROR</title>
>>
>> </head>
>>
>> <body><h2>HTTP ERROR 500</h2>
>>
>> <p>Problem accessing /engines. Reason:
>>
>> <pre>    INTERNAL_SERVER_ERROR</pre></p><h3>Caused by:</h3><pre>org.apache.stanbol.enhancer.servicesapi.EngineException
>>
>>       at org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(MetaxaEngine.java:191)
>>       at org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceContent(WeightedJobManager.java:80)
>>       at org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBuildResponse(EnginesRootResource.java:175)
>>       at org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromData(EnginesRootResource.java:167)
>>       at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
>>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>>       at java.lang.reflect.Method.invoke(Unknown Source)
>>       at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
>>       at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
>>       at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
>>       at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
>>       at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
>>       at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>>       at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
>>       at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1465)
>>       at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1396)
>>       at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1345)
>>       at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1335)
>>       at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
>>       at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
>>       at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
>>       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>>       at org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletHandler.java:96)
>>       at org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHandler.java:79)
>>       at org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletPipeline.java:42)
>>       at org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:49)
>>       at org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
>>       at org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter(QueryHeadersFilter.java:75)
>>       at org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(FilterHandler.java:88)
>>       at org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:76)
>>       at org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:47)
>>       at org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
>>       at org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:78)
>>       at org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:47)
>>       at org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
>>       at org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(FilterPipeline.java:48)
>>       at org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.java:39)
>>       at org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServlet.java:67)
>>       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>>       at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>>       at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
>>       at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>>       at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
>>       at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>>       at org.mortbay.jetty.Server.handle(Server.java:326)
>>       at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>>       at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:943)
>>       at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
>>       at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
>>       at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>>       at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
>>       at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
>>
>> Caused by: org.semanticdesktop.aperture.extractor.ExtractorException
>>       at org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(XsltExtractor.java:147)
>>       at org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extract(IksHtmlExtractor.java:123)
>>       at org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(MetaxaCore.java:120)
>>       at org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(MetaxaEngine.java:157)
>>       ... 51 more
>>
>> Caused by: java.io.IOException
>>       at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:661)
>>       at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:652)
>>       at org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(XsltExtractor.java:140)
>>       ... 54 more
>>
>> Caused by: org.openrdf.rio.RDFParseException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/ [line 4, column 179]
>>       at org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:533)
>>       at org.openrdf.rio.rdfxml.RDFXMLParser.reportFatalError(RDFXMLParser.java:1068)
>>       at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:285)
>>       at org.openrdf.rio.helpers.RDFParserBase.resolveURI(RDFParserBase.java:272)
>>       at org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(RDFXMLParser.java:751)
>>       at org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(RDFXMLParser.java:674)
>>       at org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(RDFXMLParser.java:378)
>>       at org.openrdf.rio.rdfxml.SAXFilter.endElement(SAXFilter.java:359)
>>       at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unknown Source)
>>       at com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown Source)
>>       at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
>>       at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
>>       at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
>>       at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
>>       at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
>>       at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
>>       at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
>>       at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
>>       at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
>>       at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
>>       at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:260)
>>       at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:244)
>>       at org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(RepositoryConnectionBase.java:357)
>>       at org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:312)
>>       at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:659)
>>       ... 56 more
>>
>> Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/
>>       at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)
>>       at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)
>>       at org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java:345)
>>       at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)
>>       ... 78 more
>>
>> </pre>
>>
>> <h3>Caused by:</h3><pre>org.semanticdesktop.aperture.extractor.ExtractorException
>>       at org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(XsltExtractor.java:147)
>>       at org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extract(IksHtmlExtractor.java:123)
>>       at org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(MetaxaCore.java:120)
>>       at org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(MetaxaEngine.java:157)
>>       at org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceContent(WeightedJobManager.java:80)
>>       at org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBuildResponse(EnginesRootResource.java:175)
>>       at org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromData(EnginesRootResource.java:167)
>>       at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
>>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>>       at java.lang.reflect.Method.invoke(Unknown Source)
>>       at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
>>       at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
>>       at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
>>       at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
>>       at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
>>       at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>>       at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
>>       at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1465)
>>       at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1396)
>>       at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1345)
>>       at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1335)
>>       at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
>>       at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
>>       at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
>>       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>>       at org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletHandler.java:96)
>>       at org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHandler.java:79)
>>       at org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletPipeline.java:42)
>>       at org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(InvocationFilterChain.java:49)
>>       at org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFilterChain.java:33)
>>
>>       at org.apache.stanbol.commons.**httpqueryheaders.impl.**
>> QueryHeadersFilter.doFilter(_**QueryHeadersFilter.java:75_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> FilterHandler.doHandle(_**FilterHandler.java:88_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> FilterHandler.handle(_**FilterHandler.java:76_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> InvocationFilterChain.**doFilter(_**InvocationFilterChain.java:47_**)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> HttpFilterChain.doFilter(_**HttpFilterChain.java:33_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> FilterHandler.handle(_**FilterHandler.java:78_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> InvocationFilterChain.**doFilter(_**InvocationFilterChain.java:47_**)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> HttpFilterChain.doFilter(_**HttpFilterChain.java:33_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> FilterPipeline.dispatch(_**FilterPipeline.java:48_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.Dispatcher.**
>> dispatch(_Dispatcher.java:39_)
>>
>>       at org.apache.felix.http.base.**internal.DispatcherServlet.**
>> service(_DispatcherServlet.**java:67_)
>>
>>       at javax.servlet.http.**HttpServlet.service(_**
>> HttpServlet.java:820_)
>>
>>       at org.mortbay.jetty.servlet.**ServletHolder.handle(_**
>> ServletHolder.java:511_)
>>
>>       at org.mortbay.jetty.servlet.**ServletHandler.handle(_**
>> ServletHandler.java:390_)
>>
>>       at org.mortbay.jetty.servlet.**SessionHandler.handle(_**
>> SessionHandler.java:182_)
>>
>>       at org.mortbay.jetty.handler.**ContextHandler.handle(_**
>> ContextHandler.java:765_)
>>
>>       at org.mortbay.jetty.handler.**HandlerWrapper.handle(_**
>> HandlerWrapper.java:152_)
>>
>>       at org.mortbay.jetty.Server.**handle(_Server.java:326_)
>>
>>       at org.mortbay.jetty.**HttpConnection.handleRequest(_**
>> HttpConnection.java:542_)
>>
>>       at org.mortbay.jetty.**HttpConnection$RequestHandler.**
>> content(_HttpConnection.java:**943_)
>>
>>       at org.mortbay.jetty.HttpParser.**parseNext(_HttpParser.java:**
>> 756_)
>>
>>       at org.mortbay.jetty.HttpParser.**parseAvailable(_HttpParser.**
>> java:212_)
>>
>>       at org.mortbay.jetty.**HttpConnection.handle(_**
>> HttpConnection.java:404_)
>>
>>       at org.mortbay.io.nio.**SelectChannelEndPoint.run(_**
>> SelectChannelEndPoint.java:**410_)
>>
>>       at org.mortbay.thread.**QueuedThreadPool$PoolThread.**
>> run(_QueuedThreadPool.java:**582_)
>>
>> Caused by: java.io.IOException
>>
>>       at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:661)
>>
>>       at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:652)
>>
>>       at org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(XsltExtractor.java:140)
>>
>>       ... 54 more
>>
>> Caused by: org.openrdf.rio.RDFParseException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/ [line 4, column 179]
>>
>>       at org.openrdf.rio.helpers.**RDFParserBase.**reportFatalError(_**
>> RDFParserBase.java:533_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.reportFatalError(**
>> _RDFXMLParser.java:1068_)
>>
>>       at org.openrdf.rio.helpers.**RDFParserBase.createURI(_**
>> RDFParserBase.java:285_)
>>
>>       at org.openrdf.rio.helpers.**RDFParserBase.resolveURI(_**
>> RDFParserBase.java:272_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.**getPropertyResource(_**
>> RDFXMLParser.java:751_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.**processPropertyElt(_**
>> RDFXMLParser.java:674_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.emptyElement(_**
>> RDFXMLParser.java:378_)
>>
>>       at org.openrdf.rio.rdfxml.**SAXFilter.endElement(_**
>> SAXFilter.java:359_)
>>
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> AbstractSAXParser.endElement(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> AbstractXMLDocumentParser.**emptyElement(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLNSDocumentScannerImpl.**scanStartElement(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLDocumentFragmentScannerImpl**$FragmentContentDriver.next(**Unknown
>> Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLDocumentScannerImpl.next(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLDocumentFragmentScannerImpl**.scanDocument(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> XML11Configuration.parse(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> XML11Configuration.parse(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.XMLParser.**parse(Unknown
>> Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> AbstractSAXParser.parse(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.jaxp.SAXParserImpl$**JAXPSAXParser.parse(Unknown
>> Source)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.parse(_**
>> RDFXMLParser.java:260_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.parse(_**
>> RDFXMLParser.java:244_)
>>
>>       at org.openrdf.repository.base.**RepositoryConnectionBase.**
>> addInputStreamOrReader(_**RepositoryConnectionBase.java:**357_)
>>
>>       at org.openrdf.repository.base.**RepositoryConnectionBase.add(_**
>> RepositoryConnectionBase.java:**312_)
>>
>>       at org.openrdf.rdf2go.**RepositoryModel.readFrom(_**
>> RepositoryModel.java:659_)
>>
>>       ... 56 more
>>
>> Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/
>>
>>       at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)
>>
>>       at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)
>>
>>       at org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java:345)
>>
>>       at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)
>>
>>       ... 78 more
>>
>> </pre>
>>
>> <h3>Caused by:</h3><pre>java.io.IOException
>>
>>       at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:661)
>>
>>       at org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:652)
>>
>>       at org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(XsltExtractor.java:140)
>>
>>       at org.apache.stanbol.enhancer.**engines.metaxa.core.html.**
>> IksHtmlExtractor.extract(_**IksHtmlExtractor.java:123_)
>>
>>       at org.apache.stanbol.enhancer.**engines.metaxa.core.**
>> MetaxaCore.extract(_**MetaxaCore.java:120_)
>>
>>       at org.apache.stanbol.enhancer.**engines.metaxa.MetaxaEngine.**
>> computeEnhancements(_**MetaxaEngine.java:157_)
>>
>>       at org.apache.stanbol.enhancer.**jobmanager.impl.**
>> WeightedJobManager.**enhanceContent(_**WeightedJobManager.java:80_)
>>
>>       at org.apache.stanbol.enhancer.**jersey.resource.**
>> EnginesRootResource.**enhanceAndBuildResponse(_**
>> EnginesRootResource.java:175_)
>>
>>       at org.apache.stanbol.enhancer.**jersey.resource.**
>> EnginesRootResource.**enhanceFromData(_**EnginesRootResource.java:167_)
>>
>>
>>       at sun.reflect.**GeneratedMethodAccessor35.**invoke(Unknown Source)
>>
>>       at sun.reflect.**DelegatingMethodAccessorImpl.**invoke(Unknown
>> Source)
>>
>>       at java.lang.reflect.Method.**invoke(Unknown Source)
>>
>>       at com.sun.jersey.spi.container.**JavaMethodInvokerFactory$1.**
>> invoke(_**JavaMethodInvokerFactory.java:**60_)
>>
>>       at com.sun.jersey.server.impl.**model.method.dispatch.**
>> AbstractResourceMethodDispatch**Provider$ResponseOutInvoker._**dispatch(_
>> **AbstractResourceMethodDispatch**Provider.java:205_)
>>
>>       at com.sun.jersey.server.impl.**model.method.dispatch.**
>> ResourceJavaMethodDispatcher.**dispatch(_**ResourceJavaMethodDispatcher.*
>> *java:75_)
>>
>>       at com.sun.jersey.server.impl.**uri.rules.HttpMethodRule.**
>> accept(_HttpMethodRule.java:**288_)
>>
>>       at com.sun.jersey.server.impl.**uri.rules.ResourceClassRule.**
>> accept(_ResourceClassRule.**java:108_)
>>
>>       at com.sun.jersey.server.impl.**uri.rules.RightHandPathRule.**
>> accept(_RightHandPathRule.**java:147_)
>>
>>       at com.sun.jersey.server.impl.**uri.rules.**
>> RootResourceClassesRule.**accept(_**RootResourceClassesRule.java:**84_)
>>
>>       at com.sun.jersey.server.impl.**application.**WebApplicationImpl._*
>> *handleRequest(_**WebApplicationImpl.java:1465_)
>>
>>       at com.sun.jersey.server.impl.**application.**WebApplicationImpl._*
>> *handleRequest(_**WebApplicationImpl.java:1396_)
>>
>>       at com.sun.jersey.server.impl.**application.**WebApplicationImpl.**
>> handleRequest(_**WebApplicationImpl.java:1345_)
>>
>>       at com.sun.jersey.server.impl.**application.**WebApplicationImpl.**
>> handleRequest(_**WebApplicationImpl.java:1335_)
>>
>>       at com.sun.jersey.spi.container.**servlet.WebComponent.service(_**
>> WebComponent.java:416_)
>>
>>       at com.sun.jersey.spi.container.**servlet.ServletContainer.**
>> service(_ServletContainer.**java:537_)
>>
>>       at com.sun.jersey.spi.container.**servlet.ServletContainer.**
>> service(_ServletContainer.**java:699_)
>>
>>       at javax.servlet.http.**HttpServlet.service(_**
>> HttpServlet.java:820_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> ServletHandler.doHandle(_**ServletHandler.java:96_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> ServletHandler.handle(_**ServletHandler.java:79_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> ServletPipeline.handle(_**ServletPipeline.java:42_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> InvocationFilterChain.**doFilter(_**InvocationFilterChain.java:49_**)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> HttpFilterChain.doFilter(_**HttpFilterChain.java:33_)
>>
>>       at org.apache.stanbol.commons.**httpqueryheaders.impl.**
>> QueryHeadersFilter.doFilter(_**QueryHeadersFilter.java:75_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> FilterHandler.doHandle(_**FilterHandler.java:88_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> FilterHandler.handle(_**FilterHandler.java:76_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> InvocationFilterChain.**doFilter(_**InvocationFilterChain.java:47_**)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> HttpFilterChain.doFilter(_**HttpFilterChain.java:33_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> FilterHandler.handle(_**FilterHandler.java:78_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> InvocationFilterChain.**doFilter(_**InvocationFilterChain.java:47_**)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> HttpFilterChain.doFilter(_**HttpFilterChain.java:33_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> FilterPipeline.dispatch(_**FilterPipeline.java:48_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.Dispatcher.**
>> dispatch(_Dispatcher.java:39_)
>>
>>       at org.apache.felix.http.base.**internal.DispatcherServlet.**
>> service(_DispatcherServlet.**java:67_)
>>
>>       at javax.servlet.http.**HttpServlet.service(_**
>> HttpServlet.java:820_)
>>
>>       at org.mortbay.jetty.servlet.**ServletHolder.handle(_**
>> ServletHolder.java:511_)
>>
>>       at org.mortbay.jetty.servlet.**ServletHandler.handle(_**
>> ServletHandler.java:390_)
>>
>>       at org.mortbay.jetty.servlet.**SessionHandler.handle(_**
>> SessionHandler.java:182_)
>>
>>       at org.mortbay.jetty.handler.**ContextHandler.handle(_**
>> ContextHandler.java:765_)
>>
>>       at org.mortbay.jetty.handler.**HandlerWrapper.handle(_**
>> HandlerWrapper.java:152_)
>>
>>       at org.mortbay.jetty.Server.**handle(_Server.java:326_)
>>
>>       at org.mortbay.jetty.**HttpConnection.handleRequest(_**
>> HttpConnection.java:542_)
>>
>>       at org.mortbay.jetty.**HttpConnection$RequestHandler.**
>> content(_HttpConnection.java:**943_)
>>
>>       at org.mortbay.jetty.HttpParser.**parseNext(_HttpParser.java:**
>> 756_)
>>
>>       at org.mortbay.jetty.HttpParser.**parseAvailable(_HttpParser.**
>> java:212_)
>>
>>       at org.mortbay.jetty.**HttpConnection.handle(_**
>> HttpConnection.java:404_)
>>
>>       at org.mortbay.io.nio.**SelectChannelEndPoint.run(_**
>> SelectChannelEndPoint.java:**410_)
>>
>>       at org.mortbay.thread.**QueuedThreadPool$PoolThread.**
>> run(_QueuedThreadPool.java:**582_)
>>
>> Caused by: org.openrdf.rio.RDFParseException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/ [line 4, column 179]
>>
>>       at org.openrdf.rio.helpers.**RDFParserBase.**reportFatalError(_**
>> RDFParserBase.java:533_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.reportFatalError(**
>> _RDFXMLParser.java:1068_)
>>
>>       at org.openrdf.rio.helpers.**RDFParserBase.createURI(_**
>> RDFParserBase.java:285_)
>>
>>       at org.openrdf.rio.helpers.**RDFParserBase.resolveURI(_**
>> RDFParserBase.java:272_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.**getPropertyResource(_**
>> RDFXMLParser.java:751_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.**processPropertyElt(_**
>> RDFXMLParser.java:674_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.emptyElement(_**
>> RDFXMLParser.java:378_)
>>
>>       at org.openrdf.rio.rdfxml.**SAXFilter.endElement(_**
>> SAXFilter.java:359_)
>>
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> AbstractSAXParser.endElement(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> AbstractXMLDocumentParser.**emptyElement(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLDocumentFragmentScannerImpl**$FragmentContentDriver.next(**Unknown
>> Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLDocumentScannerImpl.next(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLNSDocumentScannerImpl.next(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLDocumentFragmentScannerImpl**.scanDocument(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> XML11Configuration.parse(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.XMLParser.**parse(Unknown
>> Source)
>>
>>       at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.jaxp.SAXParserImpl$**JAXPSAXParser.parse(Unknown
>> Source)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.parse(_**
>> RDFXMLParser.java:260_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.parse(_**
>> RDFXMLParser.java:244_)
>>
>>       at org.openrdf.repository.base.**RepositoryConnectionBase.**
>> addInputStreamOrReader(_**RepositoryConnectionBase.java:**357_)
>>
>>       at org.openrdf.repository.base.**RepositoryConnectionBase.add(_**
>> RepositoryConnectionBase.java:**312_)
>>
>>       at org.openrdf.rdf2go.**RepositoryModel.readFrom(_**
>> RepositoryModel.java:659_)
>>
>>       ... 56 more
>>
>> Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/
>>
>>       at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)
>>
>>       at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)
>>
>>       at org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java:345)
>>
>>       at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)
>>
>>       ... 78 more
>>
>> </pre>
>>
>> <h3>Caused by:</h3><pre>org.openrdf.rio.RDFParseException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/ [line 4, column 179]
>>
>>       at org.openrdf.rio.helpers.**RDFParserBase.**reportFatalError(_**
>> RDFParserBase.java:533_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.reportFatalError(**
>> _RDFXMLParser.java:1068_)
>>
>>       at org.openrdf.rio.helpers.**RDFParserBase.createURI(_**
>> RDFParserBase.java:285_)
>>
>>       at org.openrdf.rio.helpers.**RDFParserBase.resolveURI(_**
>> RDFParserBase.java:272_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.**getPropertyResource(_**
>> RDFXMLParser.java:751_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.**processPropertyElt(_**
>> RDFXMLParser.java:674_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.emptyElement(_**
>> RDFXMLParser.java:378_)
>>
>>       at org.openrdf.rio.rdfxml.**SAXFilter.endElement(_**
>> SAXFilter.java:359_)
>>
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> AbstractSAXParser.endElement(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> AbstractXMLDocumentParser.**emptyElement(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLNSDocumentScannerImpl.**scanStartElement(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLDocumentFragmentScannerImpl**$FragmentContentDriver.next(**Unknown
>> Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLDocumentScannerImpl.next(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLNSDocumentScannerImpl.next(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLDocumentFragmentScannerImpl**.scanDocument(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> XML11Configuration.parse(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> XML11Configuration.parse(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.XMLParser.**parse(Unknown
>> Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> AbstractSAXParser.parse(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.jaxp.SAXParserImpl$**JAXPSAXParser.parse(Unknown
>> Source)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.parse(_**
>> RDFXMLParser.java:260_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.parse(_**
>> RDFXMLParser.java:244_)
>>
>>       at org.openrdf.repository.base.**RepositoryConnectionBase.**
>> addInputStreamOrReader(_**RepositoryConnectionBase.java:**357_)
>>
>>       at org.openrdf.repository.base.**RepositoryConnectionBase.add(_**
>> RepositoryConnectionBase.java:**312_)
>>
>>       at org.openrdf.rdf2go.**RepositoryModel.readFrom(_**
>> RepositoryModel.java:659_)
>>
>>       at org.openrdf.rdf2go.**RepositoryModel.readFrom(_**
>> RepositoryModel.java:652_)
>>
>>       at org.apache.stanbol.enhancer.**engines.metaxa.core.html.**
>> XsltExtractor.extract(_**XsltExtractor.java:140_)
>>
>>       at org.apache.stanbol.enhancer.**engines.metaxa.core.html.**
>> IksHtmlExtractor.extract(_**IksHtmlExtractor.java:123_)
>>
>>       at org.apache.stanbol.enhancer.**engines.metaxa.core.**
>> MetaxaCore.extract(_**MetaxaCore.java:120_)
>>
>>       at org.apache.stanbol.enhancer.**engines.metaxa.MetaxaEngine.**
>> computeEnhancements(_**MetaxaEngine.java:157_)
>>
>>       at org.apache.stanbol.enhancer.**jobmanager.impl.**
>> WeightedJobManager.**enhanceContent(_**WeightedJobManager.java:80_)
>>
>>       at org.apache.stanbol.enhancer.**jersey.resource.**
>> EnginesRootResource.**enhanceAndBuildResponse(_**
>> EnginesRootResource.java:175_)
>>
>>       at org.apache.stanbol.enhancer.**jersey.resource.**
>> EnginesRootResource.**enhanceFromData(_**EnginesRootResource.java:167_)
>>
>>
>>       at sun.reflect.**GeneratedMethodAccessor35.**invoke(Unknown Source)
>>
>>       at sun.reflect.**DelegatingMethodAccessorImpl.**invoke(Unknown
>> Source)
>>
>>       at java.lang.reflect.Method.**invoke(Unknown Source)
>>
>>       at com.sun.jersey.spi.container.**JavaMethodInvokerFactory$1.**
>> invoke(_**JavaMethodInvokerFactory.java:**60_)
>>
>>       at com.sun.jersey.server.impl.**model.method.dispatch.**
>> AbstractResourceMethodDispatch**Provider$ResponseOutInvoker._**dispatch(_
>> **AbstractResourceMethodDispatch**Provider.java:205_)
>>
>>       at com.sun.jersey.server.impl.**model.method.dispatch.**
>> ResourceJavaMethodDispatcher.**dispatch(_**ResourceJavaMethodDispatcher.*
>> *java:75_)
>>
>>       at com.sun.jersey.server.impl.**uri.rules.HttpMethodRule.**
>> accept(_HttpMethodRule.java:**288_)
>>
>>       at com.sun.jersey.server.impl.**uri.rules.ResourceClassRule.**
>> accept(_ResourceClassRule.**java:108_)
>>
>>       at com.sun.jersey.server.impl.**uri.rules.RightHandPathRule.**
>> accept(_RightHandPathRule.**java:147_)
>>
>>       at com.sun.jersey.server.impl.**uri.rules.**
>> RootResourceClassesRule.**accept(_**RootResourceClassesRule.java:**84_)
>>
>>       at com.sun.jersey.server.impl.**application.**WebApplicationImpl._*
>> *handleRequest(_**WebApplicationImpl.java:1465_)
>>
>>       at com.sun.jersey.server.impl.**application.**WebApplicationImpl._*
>> *handleRequest(_**WebApplicationImpl.java:1396_)
>>
>>       at com.sun.jersey.server.impl.**application.**WebApplicationImpl.**
>> handleRequest(_**WebApplicationImpl.java:1345_)
>>
>>       at com.sun.jersey.server.impl.**application.**WebApplicationImpl.**
>> handleRequest(_**WebApplicationImpl.java:1335_)
>>
>>       at com.sun.jersey.spi.container.**servlet.WebComponent.service(_**
>> WebComponent.java:416_)
>>
>>       at com.sun.jersey.spi.container.**servlet.ServletContainer.**
>> service(_ServletContainer.**java:537_)
>>
>>       at com.sun.jersey.spi.container.**servlet.ServletContainer.**
>> service(_ServletContainer.**java:699_)
>>
>>       at javax.servlet.http.**HttpServlet.service(_**
>> HttpServlet.java:820_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> ServletHandler.doHandle(_**ServletHandler.java:96_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> ServletHandler.handle(_**ServletHandler.java:79_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> ServletPipeline.handle(_**ServletPipeline.java:42_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> InvocationFilterChain.**doFilter(_**InvocationFilterChain.java:49_**)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> HttpFilterChain.doFilter(_**HttpFilterChain.java:33_)
>>
>>       at org.apache.stanbol.commons.**httpqueryheaders.impl.**
>> QueryHeadersFilter.doFilter(_**QueryHeadersFilter.java:75_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> FilterHandler.doHandle(_**FilterHandler.java:88_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> FilterHandler.handle(_**FilterHandler.java:76_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> InvocationFilterChain.**doFilter(_**InvocationFilterChain.java:47_**)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> HttpFilterChain.doFilter(_**HttpFilterChain.java:33_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> FilterHandler.handle(_**FilterHandler.java:78_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> InvocationFilterChain.**doFilter(_**InvocationFilterChain.java:47_**)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> HttpFilterChain.doFilter(_**HttpFilterChain.java:33_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> FilterPipeline.dispatch(_**FilterPipeline.java:48_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.Dispatcher.**
>> dispatch(_Dispatcher.java:39_)
>>
>>       at org.apache.felix.http.base.**internal.DispatcherServlet.**
>> service(_DispatcherServlet.**java:67_)
>>
>>       at javax.servlet.http.**HttpServlet.service(_**
>> HttpServlet.java:820_)
>>
>>       at org.mortbay.jetty.servlet.**ServletHolder.handle(_**
>> ServletHolder.java:511_)
>>
>>       at org.mortbay.jetty.servlet.**ServletHandler.handle(_**
>> ServletHandler.java:390_)
>>
>>       at org.mortbay.jetty.servlet.**SessionHandler.handle(_**
>> SessionHandler.java:182_)
>>
>>       at org.mortbay.jetty.handler.**ContextHandler.handle(_**
>> ContextHandler.java:765_)
>>
>>       at org.mortbay.jetty.handler.**HandlerWrapper.handle(_**
>> HandlerWrapper.java:152_)
>>
>>       at org.mortbay.jetty.Server.**handle(_Server.java:326_)
>>
>>       at org.mortbay.jetty.**HttpConnection.handleRequest(_**
>> HttpConnection.java:542_)
>>
>>       at org.mortbay.jetty.**HttpConnection$RequestHandler.**
>> content(_HttpConnection.java:**943_)
>>
>>       at org.mortbay.jetty.HttpParser.**parseNext(_HttpParser.java:**
>> 756_)
>>
>>       at org.mortbay.jetty.HttpParser.**parseAvailable(_HttpParser.**
>> java:212_)
>>
>>       at org.mortbay.jetty.**HttpConnection.handle(_**
>> HttpConnection.java:404_)
>>
>>       at org.mortbay.io.nio.**SelectChannelEndPoint.run(_**
>> SelectChannelEndPoint.java:**410_)
>>
>>       at org.mortbay.thread.**QueuedThreadPool$PoolThread.**
>> run(_QueuedThreadPool.java:**582_)
>>
>> Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/
>>
>>       at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)
>>
>>       at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)
>>
>>       at org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java:345)
>>
>>       at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)
>>
>>       ... 78 more
>>
>> </pre>
>>
>> <h3>Caused by:</h3><pre>java.lang.IllegalArgumentException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/
>>
>>       at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)
>>
>>       at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)
>>
>>       at org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java:345)
>>
>>       at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)
>>
>>       at org.openrdf.rio.helpers.**RDFParserBase.resolveURI(_**
>> RDFParserBase.java:272_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.**getPropertyResource(_**
>> RDFXMLParser.java:751_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.**processPropertyElt(_**
>> RDFXMLParser.java:674_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.emptyElement(_**
>> RDFXMLParser.java:378_)
>>
>>       at org.openrdf.rio.rdfxml.**SAXFilter.endElement(_**
>> SAXFilter.java:359_)
>>
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> AbstractSAXParser.endElement(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> AbstractXMLDocumentParser.**emptyElement(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLDocumentFragmentScannerImpl**$FragmentContentDriver.next(**Unknown
>> Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLDocumentScannerImpl.next(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLNSDocumentScannerImpl.next(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.impl.**
>> XMLDocumentFragmentScannerImpl**.scanDocument(Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> XML11Configuration.parse(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> XML11Configuration.parse(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.XMLParser.**parse(Unknown
>> Source)
>>
>>       at com.sun.org.apache.xerces.**internal.parsers.**
>> AbstractSAXParser.parse(**Unknown Source)
>>
>>       at com.sun.org.apache.xerces.**internal.jaxp.SAXParserImpl$**JAXPSAXParser.parse(Unknown
>> Source)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.parse(_**
>> RDFXMLParser.java:260_)
>>
>>       at org.openrdf.rio.rdfxml.**RDFXMLParser.parse(_**
>> RDFXMLParser.java:244_)
>>
>>       at org.openrdf.repository.base.**RepositoryConnectionBase.**
>> addInputStreamOrReader(_**RepositoryConnectionBase.java:**357_)
>>
>>       at org.openrdf.repository.base.**RepositoryConnectionBase.add(_**
>> RepositoryConnectionBase.java:**312_)
>>
>>       at org.openrdf.rdf2go.**RepositoryModel.readFrom(_**
>> RepositoryModel.java:659_)
>>
>>       at org.openrdf.rdf2go.**RepositoryModel.readFrom(_**
>> RepositoryModel.java:652_)
>>
>>       at org.apache.stanbol.enhancer.**engines.metaxa.core.html.**
>> XsltExtractor.extract(_**XsltExtractor.java:140_)
>>
>>       at org.apache.stanbol.enhancer.**engines.metaxa.core.html.**
>> IksHtmlExtractor.extract(_**IksHtmlExtractor.java:123_)
>>
>>       at org.apache.stanbol.enhancer.**engines.metaxa.core.**
>> MetaxaCore.extract(_**MetaxaCore.java:120_)
>>
>>       at org.apache.stanbol.enhancer.**engines.metaxa.MetaxaEngine.**
>> computeEnhancements(_**MetaxaEngine.java:157_)
>>
>>       at org.apache.stanbol.enhancer.**jobmanager.impl.**
>> WeightedJobManager.**enhanceContent(_**WeightedJobManager.java:80_)
>>
>>       at org.apache.stanbol.enhancer.**jersey.resource.**
>> EnginesRootResource.**enhanceAndBuildResponse(_**
>> EnginesRootResource.java:175_)
>>
>>       at org.apache.stanbol.enhancer.**jersey.resource.**
>> EnginesRootResource.**enhanceFromData(_**EnginesRootResource.java:167_)
>>
>>
>>       at sun.reflect.**GeneratedMethodAccessor35.**invoke(Unknown Source)
>>
>>       at sun.reflect.**DelegatingMethodAccessorImpl.**invoke(Unknown
>> Source)
>>
>>       at java.lang.reflect.Method.**invoke(Unknown Source)
>>
>>       at com.sun.jersey.spi.container.**JavaMethodInvokerFactory$1.**
>> invoke(_**JavaMethodInvokerFactory.java:**60_)
>>
>>       at com.sun.jersey.server.impl.**model.method.dispatch.**
>> AbstractResourceMethodDispatch**Provider$ResponseOutInvoker._**dispatch(_
>> **AbstractResourceMethodDispatch**Provider.java:205_)
>>
>>       at com.sun.jersey.server.impl.**model.method.dispatch.**
>> ResourceJavaMethodDispatcher.**dispatch(_**ResourceJavaMethodDispatcher.*
>> *java:75_)
>>
>>       at com.sun.jersey.server.impl.**uri.rules.HttpMethodRule.**
>> accept(_HttpMethodRule.java:**288_)
>>
>>       at com.sun.jersey.server.impl.**uri.rules.ResourceClassRule.**
>> accept(_ResourceClassRule.**java:108_)
>>
>>       at com.sun.jersey.server.impl.**uri.rules.RightHandPathRule.**
>> accept(_RightHandPathRule.**java:147_)
>>
>>       at com.sun.jersey.server.impl.**uri.rules.**
>> RootResourceClassesRule.**accept(_**RootResourceClassesRule.java:**84_)
>>
>>       at com.sun.jersey.server.impl.**application.**WebApplicationImpl._*
>> *handleRequest(_**WebApplicationImpl.java:1465_)
>>
>>       at com.sun.jersey.server.impl.**application.**WebApplicationImpl._*
>> *handleRequest(_**WebApplicationImpl.java:1396_)
>>
>>       at com.sun.jersey.server.impl.**application.**WebApplicationImpl.**
>> handleRequest(_**WebApplicationImpl.java:1345_)
>>
>>       at com.sun.jersey.server.impl.**application.**WebApplicationImpl.**
>> handleRequest(_**WebApplicationImpl.java:1335_)
>>
>>       at com.sun.jersey.spi.container.**servlet.WebComponent.service(_**
>> WebComponent.java:416_)
>>
>>       at com.sun.jersey.spi.container.**servlet.ServletContainer.**
>> service(_ServletContainer.**java:537_)
>>
>>       at com.sun.jersey.spi.container.**servlet.ServletContainer.**
>> service(_ServletContainer.**java:699_)
>>
>>       at javax.servlet.http.**HttpServlet.service(_**
>> HttpServlet.java:820_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> ServletHandler.doHandle(_**ServletHandler.java:96_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> ServletHandler.handle(_**ServletHandler.java:79_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> ServletPipeline.handle(_**ServletPipeline.java:42_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> InvocationFilterChain.**doFilter(_**InvocationFilterChain.java:49_**)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> HttpFilterChain.doFilter(_**HttpFilterChain.java:33_)
>>
>>       at org.apache.stanbol.commons.**httpqueryheaders.impl.**
>> QueryHeadersFilter.doFilter(_**QueryHeadersFilter.java:75_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> FilterHandler.doHandle(_**FilterHandler.java:88_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> FilterHandler.handle(_**FilterHandler.java:76_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> InvocationFilterChain.**doFilter(_**InvocationFilterChain.java:47_**)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> HttpFilterChain.doFilter(_**HttpFilterChain.java:33_)
>>
>>       at org.apache.felix.http.base.**internal.handler.**
>> FilterHandler.handle(_**FilterHandler.java:78_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> InvocationFilterChain.**doFilter(_**InvocationFilterChain.java:47_**)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> HttpFilterChain.doFilter(_**HttpFilterChain.java:33_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.**
>> FilterPipeline.dispatch(_**FilterPipeline.java:48_)
>>
>>       at org.apache.felix.http.base.**internal.dispatch.Dispatcher.**
>> dispatch(_Dispatcher.java:39_)
>>
>>       at org.apache.felix.http.base.**internal.DispatcherServlet.**
>> service(_DispatcherServlet.**java:67_)
>>
>>       at javax.servlet.http.**HttpServlet.service(_**
>> HttpServlet.java:820_)
>>
>>       at org.mortbay.jetty.servlet.**ServletHolder.handle(_**
>> ServletHolder.java:511_)
>>
>>       at org.mortbay.jetty.servlet.**ServletHandler.handle(_**
>> ServletHandler.java:390_)
>>
>>       at org.mortbay.jetty.servlet.**SessionHandler.handle(_**
>> SessionHandler.java:182_)
>>
>>       at org.mortbay.jetty.handler.**ContextHandler.handle(_**
>> ContextHandler.java:765_)
>>
>>       at org.mortbay.jetty.handler.**HandlerWrapper.handle(_**
>> HandlerWrapper.java:152_)
>>
>>       at org.mortbay.jetty.Server.**handle(_Server.java:326_)
>>
>>       at org.mortbay.jetty.**HttpConnection.handleRequest(_**
>> HttpConnection.java:542_)
>>
>>       at org.mortbay.jetty.**HttpConnection$RequestHandler.**
>> content(_HttpConnection.java:**943_)
>>
>>       at org.mortbay.jetty.HttpParser.**parseNext(_HttpParser.java:**
>> 756_)
>>
>>       at org.mortbay.jetty.HttpParser.**parseAvailable(_HttpParser.**
>> java:212_)
>>
>>       at org.mortbay.jetty.**HttpConnection.handle(_**
>> HttpConnection.java:404_)
>>
>>       at org.mortbay.io.nio.**SelectChannelEndPoint.run(_**
>> SelectChannelEndPoint.java:**410_)
>>
>>       at org.mortbay.thread.**QueuedThreadPool$PoolThread.**
>> run(_QueuedThreadPool.java:**582_)
>>
>> </pre>
>>
>> <hr /><i><small>Powered by Jetty://</small></i><br/>
>>
>> <br/>
>>
>> </body>
>>
>> </html>
>>
>> From: srecko joksimovic [mailto:sreckojoksimovic@gmail.com]
>> Sent: Monday, January 16, 2012 12:43
>> To: Walter Kasper
>> Cc: stanbol-dev@incubator.apache.org
>> Subject: Re: Annotating using DBPedia ontology
>>
>> Hi,
>>
>> Thank you for your time and your answer.
>> Could you please explain how you extract the annotated concepts from the
>> generated model? I usually generate RDF output, run a SPARQL query over it,
>> and process the results. Could you send me an example based on one of the
>> two documents you sent to me?
>>
>> Best,
>> Srecko
>>
>> On Mon, Jan 16, 2012 at 10:51 AM, Walter Kasper <wkasper@apache.org> wrote:
>>
>> Hi,
>>
>> Both of the PDF documents you sent work fine for us and the annotations
>> look as expected. Find attached output from our test, for the 22 pages doc
>> as well as for the 558 pages doc.
>> I have no explanation of why you apparently got different annotations.
>>
>> Best regards,
>>
>> Walter
>>
>>
>>
>> Srecko Joksimovic wrote:
>>
>> Hi,
>>
>> I have to say that everything works great for string, txt and doc files, as
>> well as web pages. The only thing that confuses me concerns PDF documents.
>> I attached the results for this document:
>> http://www.gtbit.org/downloads/dwdmsem6/dwdmsem6lman.pdf
>> I didn't access the document through this URL; I downloaded it and accessed
>> it as C:\temp\weka.pdf. I get a valid response, I mean, there are no errors,
>> but I think there should be more annotated concepts.
>>
>> Could you please try to annotate this document and compare results? I am
>> sure you will be able to find what I did wrong.
>>
>> Best,
>> Srecko
>>
>>
>>
>>
>
>

Re: Annotating using DBPedia ontology

Posted by Walter Kasper <wk...@apache.org>.

Dear Srecko,

The error looks much like the one we fixed. We cannot reproduce it on 
our Stanbol installation for the URL you gave. Are you sure you have 
the right version?

Best regards,

Walter
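
For reference, the innermost exception in the quoted trace below rejects the link
//creativecommons.org/licenses/by-sa/3.0/ because it has no scheme and so is not
an absolute URI. The following is only a minimal sketch of how such a
protocol-relative link can be made absolute with the standard java.net.URI
class; it is not the fix that went into Stanbol. The class name
ProtocolRelativeLink and the helper toAbsolute are hypothetical, and the sketch
assumes the link should be resolved against the URL of the page being annotated.

```java
import java.net.URI;

public class ProtocolRelativeLink {

    // Hypothetical helper (not part of Stanbol, Metaxa or Sesame): resolves a
    // link found in a page against the URL the page was fetched from.
    static String toAbsolute(String pageUrl, String link) {
        // URI.resolve applies the standard reference resolution rules: a
        // reference that has an authority but no scheme inherits the scheme
        // of the base URI; an already absolute reference is returned as is.
        return URI.create(pageUrl).resolve(link).toString();
    }

    public static void main(String[] args) {
        String page = "http://en.wikipedia.org/wiki/Software_design_pattern";
        String link = "//creativecommons.org/licenses/by-sa/3.0/";
        // Prints the link with the scheme of the page URL prepended.
        System.out.println(toAbsolute(page, link));
    }
}
```

With the page URL from this thread the link inherits the "http" scheme, which
is the form the RDF parser in the trace would accept as an absolute URI.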

Srecko Joksimovic wrote:
>
> Hi,
>
> Recently I asked about annotating a web resource, and it turned out to be a 
> bug that you fixed. Now something similar happened when I tried to annotate this URL:
>
> http://en.wikipedia.org/wiki/Software_design_pattern
>
> This is the output I got:
>
> http://en.wikipedia.org/wiki/Software_design_pattern
>
> ERROR <html>
>
> <head>
>
> <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
>
> <title>Error 500 INTERNAL_SERVER_ERROR</title>
>
> </head>
>
> <body><h2>HTTP ERROR 500</h2>
>
> <p>Problem accessing /engines. Reason:
>
> <pre>    INTERNAL_SERVER_ERROR</pre></p><h3>Caused 
> _by:</h3><pre>org.apache.stanbol.enhancer.servicesapi.EngineException_
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(_MetaxaEngine.java:191_)
>
>        at 
> org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceContent(_WeightedJobManager.java:80_)
>
>        at 
> org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBuildResponse(_EnginesRootResource.java:175_)
>
>        at 
> org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromData(_EnginesRootResource.java:167_)
>
>        at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
>
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>
>        at java.lang.reflect.Method.invoke(Unknown Source)
>
>        at 
> com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(_JavaMethodInvokerFactory.java:60_)
>
>        at 
> com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(_AbstractResourceMethodDispatchProvider.java:205_)
>
>        at 
> com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(_ResourceJavaMethodDispatcher.java:75_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(_HttpMethodRule.java:288_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(_ResourceClassRule.java:108_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(_RightHandPathRule.java:147_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(_RootResourceClassesRule.java:84_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(_WebApplicationImpl.java:1465_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(_WebApplicationImpl.java:1396_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(_WebApplicationImpl.java:1345_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(_WebApplicationImpl.java:1335_)
>
>        at 
> com.sun.jersey.spi.container.servlet.WebComponent.service(_WebComponent.java:416_)
>
>        at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(_ServletContainer.java:537_)
>
>        at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(_ServletContainer.java:699_)
>
>        at javax.servlet.http.HttpServlet.service(_HttpServlet.java:820_)
>
>        at 
> org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(_ServletHandler.java:96_)
>
>        at 
> org.apache.felix.http.base.internal.handler.ServletHandler.handle(_ServletHandler.java:79_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(_ServletPipeline.java:42_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:49_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter(_QueryHeadersFilter.java:75_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(_FilterHandler.java:88_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.handle(_FilterHandler.java:76_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:47_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.handle(_FilterHandler.java:78_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:47_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(_FilterPipeline.java:48_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(_Dispatcher.java:39_)
>
>        at 
> org.apache.felix.http.base.internal.DispatcherServlet.service(_DispatcherServlet.java:67_)
>
>        at javax.servlet.http.HttpServlet.service(_HttpServlet.java:820_)
>
>        at 
> org.mortbay.jetty.servlet.ServletHolder.handle(_ServletHolder.java:511_)
>
>        at 
> org.mortbay.jetty.servlet.ServletHandler.handle(_ServletHandler.java:390_)
>
>        at 
> org.mortbay.jetty.servlet.SessionHandler.handle(_SessionHandler.java:182_)
>
>        at 
> org.mortbay.jetty.handler.ContextHandler.handle(_ContextHandler.java:765_)
>
>        at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(_HandlerWrapper.java:152_)
>
>        at org.mortbay.jetty.Server.handle(_Server.java:326_)
>
>        at 
> org.mortbay.jetty.HttpConnection.handleRequest(_HttpConnection.java:542_)
>
>        at 
> org.mortbay.jetty.HttpConnection$RequestHandler.content(_HttpConnection.java:943_)
>
>        at org.mortbay.jetty.HttpParser.parseNext(_HttpParser.java:756_)
>
>        at 
> org.mortbay.jetty.HttpParser.parseAvailable(_HttpParser.java:212_)
>
>        at 
> org.mortbay.jetty.HttpConnection.handle(_HttpConnection.java:404_)
>
>        at 
> org.mortbay.io.nio.SelectChannelEndPoint.run(_SelectChannelEndPoint.java:410_)
>
>        at 
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(_QueuedThreadPool.java:582_)
>
> Caused by: _org.semanticdesktop.aperture.extractor.ExtractorException_
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(_XsltExtractor.java:147_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extract(_IksHtmlExtractor.java:123_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(_MetaxaCore.java:120_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(_MetaxaEngine.java:157_)
>
>        ... 51 more
>
> Caused by: _java.io.IOException_
>
>        at 
> org.openrdf.rdf2go.RepositoryModel.readFrom(_RepositoryModel.java:661_)
>
>        at 
> org.openrdf.rdf2go.RepositoryModel.readFrom(_RepositoryModel.java:652_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(_XsltExtractor.java:140_)
>
>        ... 54 more
>
> Caused by: _org.openrdf.rio.RDFParseException_: Not a valid (absolute) 
> URI: //creativecommons.org/licenses/by-sa/3.0/ [line 4, column 179]
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.reportFatalError(_RDFParserBase.java:533_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.reportFatalError(_RDFXMLParser.java:1068_)
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.createURI(_RDFParserBase.java:285_)
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.resolveURI(_RDFParserBase.java:272_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(_RDFXMLParser.java:751_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(_RDFXMLParser.java:674_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(_RDFXMLParser.java:378_)
>
>        at 
> org.openrdf.rio.rdfxml.SAXFilter.endElement(_SAXFilter.java:359_)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown 
> Source)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.parse(_RDFXMLParser.java:260_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.parse(_RDFXMLParser.java:244_)
>
>        at 
> org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(_RepositoryConnectionBase.java:357_)
>
>        at 
> org.openrdf.repository.base.RepositoryConnectionBase.add(_RepositoryConnectionBase.java:312_)
>
>        at 
> org.openrdf.rdf2go.RepositoryModel.readFrom(_RepositoryModel.java:659_)
>
>        ... 56 more
>
> Caused by: _java.lang.IllegalArgumentException_: Not a valid 
> (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/
>
>        at org.openrdf.model.impl.URIImpl.setURIString(_URIImpl.java:68_)
>
>        at org.openrdf.model.impl.URIImpl.&lt;init&gt;(_URIImpl.java:57_)
>
>        at 
> org.openrdf.sail.memory.model.MemValueFactory.createURI(_MemValueFactory.java:345_)
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.createURI(_RDFParserBase.java:282_)
>
>        ... 78 more
>
> </pre>
>
> <h3>Caused 
> _by:</h3><pre>org.semanticdesktop.aperture.extractor.ExtractorException_
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(_XsltExtractor.java:147_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extract(_IksHtmlExtractor.java:123_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(_MetaxaCore.java:120_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(_MetaxaEngine.java:157_)
>
>        at 
> org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceContent(_WeightedJobManager.java:80_)
>
>        at 
> org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBuildResponse(_EnginesRootResource.java:175_)
>
>        at 
> org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromData(_EnginesRootResource.java:167_)
>
>        at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
>
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>
>        at java.lang.reflect.Method.invoke(Unknown Source)
>
>        at 
> com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(_JavaMethodInvokerFactory.java:60_)
>
>        at 
> com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(_AbstractResourceMethodDispatchProvider.java:205_)
>
>        at 
> com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(_ResourceJavaMethodDispatcher.java:75_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(_HttpMethodRule.java:288_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(_ResourceClassRule.java:108_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(_RightHandPathRule.java:147_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(_RootResourceClassesRule.java:84_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(_WebApplicationImpl.java:1465_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(_WebApplicationImpl.java:1396_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(_WebApplicationImpl.java:1345_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(_WebApplicationImpl.java:1335_)
>
>        at 
> com.sun.jersey.spi.container.servlet.WebComponent.service(_WebComponent.java:416_)
>
>        at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(_ServletContainer.java:537_)
>
>        at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(_ServletContainer.java:699_)
>
>        at javax.servlet.http.HttpServlet.service(_HttpServlet.java:820_)
>
>        at 
> org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(_ServletHandler.java:96_)
>
>        at 
> org.apache.felix.http.base.internal.handler.ServletHandler.handle(_ServletHandler.java:79_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(_ServletPipeline.java:42_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:49_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter(_QueryHeadersFilter.java:75_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(_FilterHandler.java:88_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.handle(_FilterHandler.java:76_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:47_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.handle(_FilterHandler.java:78_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:47_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(_FilterPipeline.java:48_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(_Dispatcher.java:39_)
>
>        at 
> org.apache.felix.http.base.internal.DispatcherServlet.service(_DispatcherServlet.java:67_)
>
>        at javax.servlet.http.HttpServlet.service(_HttpServlet.java:820_)
>
>        at 
> org.mortbay.jetty.servlet.ServletHolder.handle(_ServletHolder.java:511_)
>
>        at 
> org.mortbay.jetty.servlet.ServletHandler.handle(_ServletHandler.java:390_)
>
>        at 
> org.mortbay.jetty.servlet.SessionHandler.handle(_SessionHandler.java:182_)
>
>        at 
> org.mortbay.jetty.handler.ContextHandler.handle(_ContextHandler.java:765_)
>
>        at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(_HandlerWrapper.java:152_)
>
>        at org.mortbay.jetty.Server.handle(_Server.java:326_)
>
>        at 
> org.mortbay.jetty.HttpConnection.handleRequest(_HttpConnection.java:542_)
>
>        at 
> org.mortbay.jetty.HttpConnection$RequestHandler.content(_HttpConnection.java:943_)
>
>        at org.mortbay.jetty.HttpParser.parseNext(_HttpParser.java:756_)
>
>        at 
> org.mortbay.jetty.HttpParser.parseAvailable(_HttpParser.java:212_)
>
>        at 
> org.mortbay.jetty.HttpConnection.handle(_HttpConnection.java:404_)
>
>        at 
> org.mortbay.io.nio.SelectChannelEndPoint.run(_SelectChannelEndPoint.java:410_)
>
>        at 
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(_QueuedThreadPool.java:582_)
>
> Caused by: _java.io.IOException_
>
>        at 
> org.openrdf.rdf2go.RepositoryModel.readFrom(_RepositoryModel.java:661_)
>
>        at 
> org.openrdf.rdf2go.RepositoryModel.readFrom(_RepositoryModel.java:652_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(_XsltExtractor.java:140_)
>
>        ... 54 more
>
> Caused by: _org.openrdf.rio.RDFParseException_: Not a valid (absolute) 
> URI: //creativecommons.org/licenses/by-sa/3.0/ [line 4, column 179]
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.reportFatalError(_RDFParserBase.java:533_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.reportFatalError(_RDFXMLParser.java:1068_)
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.createURI(_RDFParserBase.java:285_)
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.resolveURI(_RDFParserBase.java:272_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(_RDFXMLParser.java:751_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(_RDFXMLParser.java:674_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(_RDFXMLParser.java:378_)
>
>        at 
> org.openrdf.rio.rdfxml.SAXFilter.endElement(_SAXFilter.java:359_)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(UnknownSource)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown 
> Source)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.parse(_RDFXMLParser.java:260_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.parse(_RDFXMLParser.java:244_)
>
>        at 
> org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(_RepositoryConnectionBase.java:357_)
>
>        at 
> org.openrdf.repository.base.RepositoryConnectionBase.add(_RepositoryConnectionBase.java:312_)
>
>        at 
> org.openrdf.rdf2go.RepositoryModel.readFrom(_RepositoryModel.java:659_)
>
>        ... 56 more
>
> Caused by: _java.lang.IllegalArgumentException_: Not a valid 
> (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/
>
>        at org.openrdf.model.impl.URIImpl.setURIString(_URIImpl.java:68_)
>
>        at org.openrdf.model.impl.URIImpl.&lt;init&gt;(_URIImpl.java:57_)
>
>        at 
> org.openrdf.sail.memory.model.MemValueFactory.createURI(_MemValueFactory.java:345_)
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.createURI(_RDFParserBase.java:282_)
>
>        ... 78 more
>
> </pre>
>
> <h3>Caused _by:</h3><pre>java.io.IOException_
>
>        at 
> org.openrdf.rdf2go.RepositoryModel.readFrom(_RepositoryModel.java:661_)
>
>        at 
> org.openrdf.rdf2go.RepositoryModel.readFrom(_RepositoryModel.java:652_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(_XsltExtractor.java:140_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extract(_IksHtmlExtractor.java:123_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(_MetaxaCore.java:120_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(_MetaxaEngine.java:157_)
>
>        at 
> org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceContent(_WeightedJobManager.java:80_)
>
>        at 
> org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBuildResponse(_EnginesRootResource.java:175_)
>
>        at 
> org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromData(_EnginesRootResource.java:167_)
>
>        at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
>
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>
>        at java.lang.reflect.Method.invoke(Unknown Source)
>
>        at 
> com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(_JavaMethodInvokerFactory.java:60_)
>
>        at 
> com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(_AbstractResourceMethodDispatchProvider.java:205_)
>
>        at 
> com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(_ResourceJavaMethodDispatcher.java:75_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(_HttpMethodRule.java:288_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(_ResourceClassRule.java:108_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(_RightHandPathRule.java:147_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(_RootResourceClassesRule.java:84_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(_WebApplicationImpl.java:1465_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(_WebApplicationImpl.java:1396_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(_WebApplicationImpl.java:1345_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(_WebApplicationImpl.java:1335_)
>
>        at 
> com.sun.jersey.spi.container.servlet.WebComponent.service(_WebComponent.java:416_)
>
>        at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(_ServletContainer.java:537_)
>
>        at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(_ServletContainer.java:699_)
>
>        at javax.servlet.http.HttpServlet.service(_HttpServlet.java:820_)
>
>        at 
> org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(_ServletHandler.java:96_)
>
>        at 
> org.apache.felix.http.base.internal.handler.ServletHandler.handle(_ServletHandler.java:79_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(_ServletPipeline.java:42_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:49_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter(_QueryHeadersFilter.java:75_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(_FilterHandler.java:88_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.handle(_FilterHandler.java:76_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:47_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.handle(_FilterHandler.java:78_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:47_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(_FilterPipeline.java:48_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(_Dispatcher.java:39_)
>
>        at 
> org.apache.felix.http.base.internal.DispatcherServlet.service(_DispatcherServlet.java:67_)
>
>        at javax.servlet.http.HttpServlet.service(_HttpServlet.java:820_)
>
>        at 
> org.mortbay.jetty.servlet.ServletHolder.handle(_ServletHolder.java:511_)
>
>        at 
> org.mortbay.jetty.servlet.ServletHandler.handle(_ServletHandler.java:390_)
>
>        at 
> org.mortbay.jetty.servlet.SessionHandler.handle(_SessionHandler.java:182_)
>
>        at 
> org.mortbay.jetty.handler.ContextHandler.handle(_ContextHandler.java:765_)
>
>        at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(_HandlerWrapper.java:152_)
>
>        at org.mortbay.jetty.Server.handle(_Server.java:326_)
>
>        at 
> org.mortbay.jetty.HttpConnection.handleRequest(_HttpConnection.java:542_)
>
>        at 
> org.mortbay.jetty.HttpConnection$RequestHandler.content(_HttpConnection.java:943_)
>
>        at org.mortbay.jetty.HttpParser.parseNext(_HttpParser.java:756_)
>
>        at 
> org.mortbay.jetty.HttpParser.parseAvailable(_HttpParser.java:212_)
>
>        at 
> org.mortbay.jetty.HttpConnection.handle(_HttpConnection.java:404_)
>
>        at 
> org.mortbay.io.nio.SelectChannelEndPoint.run(_SelectChannelEndPoint.java:410_)
>
>        at 
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(_QueuedThreadPool.java:582_)
>
> Caused by: org.openrdf.rio.RDFParseException: Not a valid (absolute) 
> URI: //creativecommons.org/licenses/by-sa/3.0/ [line 4, column 179]
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.reportFatalError(_RDFParserBase.java:533_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.reportFatalError(_RDFXMLParser.java:1068_)
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.createURI(_RDFParserBase.java:285_)
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.resolveURI(_RDFParserBase.java:272_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(_RDFXMLParser.java:751_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(_RDFXMLParser.java:674_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(_RDFXMLParser.java:378_)
>
>        at 
> org.openrdf.rio.rdfxml.SAXFilter.endElement(_SAXFilter.java:359_)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(UnknownSource)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(UnknownSource)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(UnknownSource)
>
>        at 
> com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown 
> Source)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.parse(_RDFXMLParser.java:260_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.parse(_RDFXMLParser.java:244_)
>
>        at 
> org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(_RepositoryConnectionBase.java:357_)
>
>        at 
> org.openrdf.repository.base.RepositoryConnectionBase.add(_RepositoryConnectionBase.java:312_)
>
>        at 
> org.openrdf.rdf2go.RepositoryModel.readFrom(_RepositoryModel.java:659_)
>
>        ... 56 more
>
> Caused by: java.lang.IllegalArgumentException: Not a valid 
> (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/
>
>        at org.openrdf.model.impl.URIImpl.setURIString(_URIImpl.java:68_)
>
>        at org.openrdf.model.impl.URIImpl.&lt;init&gt;(_URIImpl.java:57_)
>
>        at 
> org.openrdf.sail.memory.model.MemValueFactory.createURI(_MemValueFactory.java:345_)
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.createURI(_RDFParserBase.java:282_)
>
>        ... 78 more
>
> </pre>
>
> <h3>Caused by:</h3><pre>org.openrdf.rio.RDFParseException: Not a 
> valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/ [line 
> 4, column 179]
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.reportFatalError(_RDFParserBase.java:533_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.reportFatalError(_RDFXMLParser.java:1068_)
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.createURI(_RDFParserBase.java:285_)
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.resolveURI(_RDFParserBase.java:272_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(_RDFXMLParser.java:751_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(_RDFXMLParser.java:674_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(_RDFXMLParser.java:378_)
>
>        at 
> org.openrdf.rio.rdfxml.SAXFilter.endElement(_SAXFilter.java:359_)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown 
> Source)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.parse(_RDFXMLParser.java:260_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.parse(_RDFXMLParser.java:244_)
>
>        at 
> org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(_RepositoryConnectionBase.java:357_)
>
>        at 
> org.openrdf.repository.base.RepositoryConnectionBase.add(_RepositoryConnectionBase.java:312_)
>
>        at 
> org.openrdf.rdf2go.RepositoryModel.readFrom(_RepositoryModel.java:659_)
>
>        at 
> org.openrdf.rdf2go.RepositoryModel.readFrom(_RepositoryModel.java:652_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(_XsltExtractor.java:140_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extract(_IksHtmlExtractor.java:123_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(_MetaxaCore.java:120_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(_MetaxaEngine.java:157_)
>
>        at 
> org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceContent(_WeightedJobManager.java:80_)
>
>        at 
> org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBuildResponse(_EnginesRootResource.java:175_)
>
>        at 
> org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromData(_EnginesRootResource.java:167_)
>
>        at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
>
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>
>        at java.lang.reflect.Method.invoke(Unknown Source)
>
>        at 
> com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(_JavaMethodInvokerFactory.java:60_)
>
>        at 
> com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(_AbstractResourceMethodDispatchProvider.java:205_)
>
>        at 
> com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(_ResourceJavaMethodDispatcher.java:75_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(_HttpMethodRule.java:288_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(_ResourceClassRule.java:108_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(_RightHandPathRule.java:147_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(_RootResourceClassesRule.java:84_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(_WebApplicationImpl.java:1465_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(_WebApplicationImpl.java:1396_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(_WebApplicationImpl.java:1345_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(_WebApplicationImpl.java:1335_)
>
>        at 
> com.sun.jersey.spi.container.servlet.WebComponent.service(_WebComponent.java:416_)
>
>        at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(_ServletContainer.java:537_)
>
>        at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(_ServletContainer.java:699_)
>
>        at javax.servlet.http.HttpServlet.service(_HttpServlet.java:820_)
>
>        at 
> org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(_ServletHandler.java:96_)
>
>        at 
> org.apache.felix.http.base.internal.handler.ServletHandler.handle(_ServletHandler.java:79_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(_ServletPipeline.java:42_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:49_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter(_QueryHeadersFilter.java:75_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(_FilterHandler.java:88_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.handle(_FilterHandler.java:76_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:47_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.handle(_FilterHandler.java:78_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:47_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(_FilterPipeline.java:48_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(_Dispatcher.java:39_)
>
>        at 
> org.apache.felix.http.base.internal.DispatcherServlet.service(_DispatcherServlet.java:67_)
>
>        at javax.servlet.http.HttpServlet.service(_HttpServlet.java:820_)
>
>        at 
> org.mortbay.jetty.servlet.ServletHolder.handle(_ServletHolder.java:511_)
>
>        at 
> org.mortbay.jetty.servlet.ServletHandler.handle(_ServletHandler.java:390_)
>
>        at 
> org.mortbay.jetty.servlet.SessionHandler.handle(_SessionHandler.java:182_)
>
>        at 
> org.mortbay.jetty.handler.ContextHandler.handle(_ContextHandler.java:765_)
>
>        at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(_HandlerWrapper.java:152_)
>
>        at org.mortbay.jetty.Server.handle(_Server.java:326_)
>
>        at 
> org.mortbay.jetty.HttpConnection.handleRequest(_HttpConnection.java:542_)
>
>        at 
> org.mortbay.jetty.HttpConnection$RequestHandler.content(_HttpConnection.java:943_)
>
>        at org.mortbay.jetty.HttpParser.parseNext(_HttpParser.java:756_)
>
>        at 
> org.mortbay.jetty.HttpParser.parseAvailable(_HttpParser.java:212_)
>
>        at 
> org.mortbay.jetty.HttpConnection.handle(_HttpConnection.java:404_)
>
>        at 
> org.mortbay.io.nio.SelectChannelEndPoint.run(_SelectChannelEndPoint.java:410_)
>
>        at 
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(_QueuedThreadPool.java:582_)
>
> Caused by: java.lang.IllegalArgumentException: Not a valid 
> (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/
>
>        at org.openrdf.model.impl.URIImpl.setURIString(_URIImpl.java:68_)
>
>        at org.openrdf.model.impl.URIImpl.&lt;init&gt;(_URIImpl.java:57_)
>
>        at 
> org.openrdf.sail.memory.model.MemValueFactory.createURI(_MemValueFactory.java:345_)
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.createURI(_RDFParserBase.java:282_)
>
>        ... 78 more
>
> </pre>
>
> <h3>Caused by:</h3><pre>java.lang.IllegalArgumentException: Not a 
> valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/
>
>        at org.openrdf.model.impl.URIImpl.setURIString(_URIImpl.java:68_)
>
>        at org.openrdf.model.impl.URIImpl.&lt;init&gt;(_URIImpl.java:57_)
>
>        at 
> org.openrdf.sail.memory.model.MemValueFactory.createURI(_MemValueFactory.java:345_)
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.createURI(_RDFParserBase.java:282_)
>
>        at 
> org.openrdf.rio.helpers.RDFParserBase.resolveURI(_RDFParserBase.java:272_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(_RDFXMLParser.java:751_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(_RDFXMLParser.java:674_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(_RDFXMLParser.java:378_)
>
>        at 
> org.openrdf.rio.rdfxml.SAXFilter.endElement(_SAXFilter.java:359_)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(UnknownSource)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
>
>        at 
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown 
> Source)
>
>        at 
> com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown 
> Source)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.parse(_RDFXMLParser.java:260_)
>
>        at 
> org.openrdf.rio.rdfxml.RDFXMLParser.parse(_RDFXMLParser.java:244_)
>
>        at 
> org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(_RepositoryConnectionBase.java:357_)
>
>        at 
> org.openrdf.repository.base.RepositoryConnectionBase.add(_RepositoryConnectionBase.java:312_)
>
>        at 
> org.openrdf.rdf2go.RepositoryModel.readFrom(_RepositoryModel.java:659_)
>
>        at 
> org.openrdf.rdf2go.RepositoryModel.readFrom(_RepositoryModel.java:652_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(_XsltExtractor.java:140_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extract(_IksHtmlExtractor.java:123_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(_MetaxaCore.java:120_)
>
>        at 
> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(_MetaxaEngine.java:157_)
>
>        at 
> org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceContent(_WeightedJobManager.java:80_)
>
>        at 
> org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBuildResponse(_EnginesRootResource.java:175_)
>
>        at 
> org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromData(_EnginesRootResource.java:167_)
>
>        at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
>
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>
>        at java.lang.reflect.Method.invoke(Unknown Source)
>
>        at 
> com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(_JavaMethodInvokerFactory.java:60_)
>
>        at 
> com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(_AbstractResourceMethodDispatchProvider.java:205_)
>
>        at 
> com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(_ResourceJavaMethodDispatcher.java:75_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(_HttpMethodRule.java:288_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(_ResourceClassRule.java:108_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(_RightHandPathRule.java:147_)
>
>        at 
> com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(_RootResourceClassesRule.java:84_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(_WebApplicationImpl.java:1465_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(_WebApplicationImpl.java:1396_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(_WebApplicationImpl.java:1345_)
>
>        at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(_WebApplicationImpl.java:1335_)
>
>        at 
> com.sun.jersey.spi.container.servlet.WebComponent.service(_WebComponent.java:416_)
>
>        at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(_ServletContainer.java:537_)
>
>        at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(_ServletContainer.java:699_)
>
>        at javax.servlet.http.HttpServlet.service(_HttpServlet.java:820_)
>
>        at 
> org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(_ServletHandler.java:96_)
>
>        at 
> org.apache.felix.http.base.internal.handler.ServletHandler.handle(_ServletHandler.java:79_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(_ServletPipeline.java:42_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:49_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter(_QueryHeadersFilter.java:75_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(_FilterHandler.java:88_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.handle(_FilterHandler.java:76_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:47_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.felix.http.base.internal.handler.FilterHandler.handle(_FilterHandler.java:78_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(_InvocationFilterChain.java:47_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(_HttpFilterChain.java:33_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(_FilterPipeline.java:48_)
>
>        at 
> org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(_Dispatcher.java:39_)
>
>        at 
> org.apache.felix.http.base.internal.DispatcherServlet.service(_DispatcherServlet.java:67_)
>
>        at javax.servlet.http.HttpServlet.service(_HttpServlet.java:820_)
>
>        at 
> org.mortbay.jetty.servlet.ServletHolder.handle(_ServletHolder.java:511_)
>
>        at 
> org.mortbay.jetty.servlet.ServletHandler.handle(_ServletHandler.java:390_)
>
>        at 
> org.mortbay.jetty.servlet.SessionHandler.handle(_SessionHandler.java:182_)
>
>        at 
> org.mortbay.jetty.handler.ContextHandler.handle(_ContextHandler.java:765_)
>
>        at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(_HandlerWrapper.java:152_)
>
>        at org.mortbay.jetty.Server.handle(_Server.java:326_)
>
>        at 
> org.mortbay.jetty.HttpConnection.handleRequest(_HttpConnection.java:542_)
>
>        at 
> org.mortbay.jetty.HttpConnection$RequestHandler.content(_HttpConnection.java:943_)
>
>        at org.mortbay.jetty.HttpParser.parseNext(_HttpParser.java:756_)
>
>        at 
> org.mortbay.jetty.HttpParser.parseAvailable(_HttpParser.java:212_)
>
>        at 
> org.mortbay.jetty.HttpConnection.handle(_HttpConnection.java:404_)
>
>        at 
> org.mortbay.io.nio.SelectChannelEndPoint.run(_SelectChannelEndPoint.java:410_)
>
>        at 
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(_QueuedThreadPool.java:582_)
>
> </pre>
>
> <hr /><i><small>Powered by Jetty://</small></i><br/>
>
> </body>
>
> </html>
>
> *From:*srecko joksimovic [mailto:sreckojoksimovic@gmail.com]
> *Sent:* Monday, January 16, 2012 12:43
> *To:* Walter Kasper
> *Cc:* stanbol-dev@incubator.apache.org
> *Subject:* Re: Annotating using DBPedia ontology
>
> Hi,
>
> Thank you for your time and your answer.
> Could you please explain how you extract the annotated concepts from 
> the generated model? I usually generate RDF output, run a SPARQL query 
> against it, and process the results. Could you send me an example based 
> on one of the two documents you sent me?
>
> Best,
> Srecko
>
> On Mon, Jan 16, 2012 at 10:51 AM, Walter Kasper <wkasper@apache.org> wrote:
>
> Hi,
>
> Both of the PDF documents you sent work fine for us and the 
> annotations look as expected. Find attached output from our test, for 
> the 22 pages doc as well as for the 558 pages doc.
> I have no explanation of why you apparently got different annotations.
>
> Best regards,
>
> Walter
>
>
>
> Srecko Joksimovic wrote:
>
> Hi,
>
> I have to say that everything works great for strings, txt and doc
> files, as well as web pages. My only confusion concerns PDF documents.
> I attached the results for this document:
> http://www.gtbit.org/downloads/dwdmsem6/dwdmsem6lman.pdf. I didn't access
> the document through this URL, but downloaded it and accessed it as
> C:\temp\weka.pdf. I get a valid response, that is, there are no errors,
> but I think there should be more annotated concepts.
>
> Could you please try to annotate this document and compare the results?
> I am sure you will be able to find what I did wrong.
>
> Best,
> Srecko
>
>
>
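
The parser error in the trace quoted above ("Not a valid (absolute) URI:
//creativecommons.org/licenses/by-sa/3.0/") concerns a reference that starts
with "//" and therefore carries no scheme. The following is only a minimal
sketch of that distinction, using Python's standard urllib.parse rather than
anything from Stanbol or Sesame. The base URL and the helper name is_absolute
are assumptions chosen for illustration, and this is not how the Metaxa engine
handles such references.

```python
from urllib.parse import urljoin, urlsplit

# Reference reported by the RDF parser in the quoted stack trace.
reference = "//creativecommons.org/licenses/by-sa/3.0/"

# Assumption: the annotated page is the base the reference is relative to.
base = "http://en.wikipedia.org/wiki/Software_design_pattern"


def is_absolute(uri: str) -> bool:
    """Hypothetical helper: treat a URI as absolute only if it has a scheme."""
    return bool(urlsplit(uri).scheme)


# Resolving against the base borrows the base's scheme ("http").
resolved = urljoin(base, reference)

print(is_absolute(reference))  # False: no scheme, only an authority and path
print(resolved)                # http://creativecommons.org/licenses/by-sa/3.0/
print(is_absolute(resolved))   # True
```

An RDF parser that is given no base URI has nothing to resolve such a
reference against, which would be consistent with the exception shown in the
trace.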

RE: Annotating using DBPedia ontology

Posted by Srecko Joksimovic <sr...@gmail.com>.

Hi,

 

Recently I asked about annotating a web resource, and it turned out to be a
bug that you fixed. Now something similar happened when I tried to annotate
this URL:

http://en.wikipedia.org/wiki/Software_design_pattern

This is the output I got:

 

http://en.wikipedia.org/wiki/Software_design_pattern

ERROR <html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>

<title>Error 500 INTERNAL_SERVER_ERROR</title>

</head>

<body><h2>HTTP ERROR 500</h2>

<p>Problem accessing /engines. Reason:

<pre>    INTERNAL_SERVER_ERROR</pre></p><h3>Caused
by:</h3><pre>org.apache.stanbol.enhancer.servicesapi.EngineException

       at
org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(
MetaxaEngine.java:191)

       at
org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceConten
t(WeightedJobManager.java:80)

       at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBu
ildResponse(EnginesRootResource.java:175)

       at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromD
ata(EnginesRootResource.java:167)

       at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)

       at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)

       at java.lang.reflect.Method.invoke(Unknown Source)

       at
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInv
okerFactory.java:60)

       at
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispa
tchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvi
der.java:205)

       at
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatche
r.dispatch(ResourceJavaMethodDispatcher.java:75)

       at
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.ja
va:288)

       at
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassR
ule.java:108)

       at
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathR
ule.java:147)

       at
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootReso
urceClassesRule.java:84)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(Web
ApplicationImpl.java:1465)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(Web
ApplicationImpl.java:1396)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebA
pplicationImpl.java:1345)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebA
pplicationImpl.java:1335)

       at
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:
416)

       at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContain
er.java:537)

       at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContain
er.java:699)

       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)

       at
org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletH
andler.java:96)

       at
org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHan
dler.java:79)

       at
org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletP
ipeline.java:42)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:49)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter
(QueryHeadersFilter.java:75)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(FilterHan
dler.java:88)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandl
er.java:76)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:47)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandl
er.java:78)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:47)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(FilterP
ipeline.java:48)

       at
org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.
java:39)

       at
org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServ
let.java:67)

       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)

       at
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)

       at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)

       at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)

       at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)

       at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)

       at org.mortbay.jetty.Server.handle(Server.java:326)

       at
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)

       at
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:
943)

       at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)

       at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)

       at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)

       at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)

       at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582
)

Caused by: org.semanticdesktop.aperture.extractor.ExtractorException

       at
org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(X
sltExtractor.java:147)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extrac
t(IksHtmlExtractor.java:123)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(MetaxaCor
e.java:120)

       at
org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(
MetaxaEngine.java:157)

       ... 51 more

Caused by: java.io.IOException

       at
org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:661)

       at
org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:652)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(X
sltExtractor.java:140)

       ... 54 more

Caused by: org.openrdf.rio.RDFParseException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/ [line 4, column 179]

       at
org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:53
3)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.reportFatalError(RDFXMLParser.java:1068)

       at
org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:285)

       at
org.openrdf.rio.helpers.RDFParserBase.resolveURI(RDFParserBase.java:272)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(RDFXMLParser.java:75
1)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(RDFXMLParser.java:674
)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(RDFXMLParser.java:378)

       at org.openrdf.rio.rdfxml.SAXFilter.endElement(SAXFilter.java:359)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unkn
own Source)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyEl
ement(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartEl
ement(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$Fragm
entContentDriver.next(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknow
n Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanD
ocument(Unknown Source)

       at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)

       at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Un
known Source)

       at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:260)

       at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:244)

       at
org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(
RepositoryConnectionBase.java:357)

       at
org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectio
nBase.java:312)

       at
org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:659)

       ... 56 more

Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/

       at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)

       at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)

       at
org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java
:345)

       at
org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)

       ... 78 more

</pre>

<h3>Caused by:</h3><pre>org.semanticdesktop.aperture.extractor.ExtractorException

       at
org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(X
sltExtractor.java:147)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extrac
t(IksHtmlExtractor.java:123)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(MetaxaCor
e.java:120)

       at
org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(
MetaxaEngine.java:157)

       at
org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceConten
t(WeightedJobManager.java:80)

       at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBu
ildResponse(EnginesRootResource.java:175)

       at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromD
ata(EnginesRootResource.java:167)

       at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)

       at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)

       at java.lang.reflect.Method.invoke(Unknown Source)

       at
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInv
okerFactory.java:60)

       at
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispa
tchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvi
der.java:205)

       at
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatche
r.dispatch(ResourceJavaMethodDispatcher.java:75)

       at
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.ja
va:288)

       at
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassR
ule.java:108)

       at
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathR
ule.java:147)

       at
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootReso
urceClassesRule.java:84)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(Web
ApplicationImpl.java:1465)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(Web
ApplicationImpl.java:1396)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebA
pplicationImpl.java:1345)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebA
pplicationImpl.java:1335)

       at
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:
416)

       at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContain
er.java:537)

       at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContain
er.java:699)

       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)

       at
org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletH
andler.java:96)

       at
org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHan
dler.java:79)

       at
org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletP
ipeline.java:42)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:49)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter
(QueryHeadersFilter.java:75)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(FilterHan
dler.java:88)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandl
er.java:76)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:47)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandl
er.java:78)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:47)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(FilterP
ipeline.java:48)

       at
org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.
java:39)

       at
org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServ
let.java:67)

       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)

       at
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)

       at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)

       at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)

       at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)

       at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)

       at org.mortbay.jetty.Server.handle(Server.java:326)

       at
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)

       at
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:
943)

       at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)

       at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)

       at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)

       at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)

       at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582
)

Caused by: java.io.IOException

       at
org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:661)

       at
org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:652)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(X
sltExtractor.java:140)

       ... 54 more

Caused by: org.openrdf.rio.RDFParseException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/ [line 4, column 179]

       at
org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:53
3)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.reportFatalError(RDFXMLParser.java:1068)

       at
org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:285)

       at
org.openrdf.rio.helpers.RDFParserBase.resolveURI(RDFParserBase.java:272)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(RDFXMLParser.java:75
1)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(RDFXMLParser.java:674
)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(RDFXMLParser.java:378)

       at org.openrdf.rio.rdfxml.SAXFilter.endElement(SAXFilter.java:359)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unkn
own Source)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyEl
ement(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartEl
ement(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$Fragm
entContentDriver.next(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknow
n Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanD
ocument(Unknown Source)

       at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)

       at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Un
known Source)

       at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:260)

       at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:244)

       at
org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(
RepositoryConnectionBase.java:357)

       at
org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectio
nBase.java:312)

       at
org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:659)

       ... 56 more

Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/

       at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)

       at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)

       at
org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java
:345)

       at
org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)

       ... 78 more

</pre>

<h3>Caused by:</h3><pre>java.io.IOException

       at
org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:661)

       at
org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:652)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(X
sltExtractor.java:140)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extrac
t(IksHtmlExtractor.java:123)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(MetaxaCor
e.java:120)

       at
org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(
MetaxaEngine.java:157)

       at
org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceConten
t(WeightedJobManager.java:80)

       at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBu
ildResponse(EnginesRootResource.java:175)

       at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromD
ata(EnginesRootResource.java:167)

       at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)

       at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)

       at java.lang.reflect.Method.invoke(Unknown Source)

       at
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInv
okerFactory.java:60)

       at
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispa
tchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvi
der.java:205)

       at
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatche
r.dispatch(ResourceJavaMethodDispatcher.java:75)

       at
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.ja
va:288)

       at
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassR
ule.java:108)

       at
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathR
ule.java:147)

       at
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootReso
urceClassesRule.java:84)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(Web
ApplicationImpl.java:1465)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(Web
ApplicationImpl.java:1396)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebA
pplicationImpl.java:1345)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebA
pplicationImpl.java:1335)

       at
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:
416)

       at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContain
er.java:537)

       at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContain
er.java:699)

       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)

       at
org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletH
andler.java:96)

       at
org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHan
dler.java:79)

       at
org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletP
ipeline.java:42)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:49)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter
(QueryHeadersFilter.java:75)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(FilterHan
dler.java:88)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandl
er.java:76)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:47)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandl
er.java:78)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:47)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(FilterP
ipeline.java:48)

       at
org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.
java:39)

       at
org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServ
let.java:67)

       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)

       at
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)

       at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)

       at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)

       at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)

       at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)

       at org.mortbay.jetty.Server.handle(Server.java:326)

       at
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)

       at
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:
943)

       at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)

       at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)

       at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)

       at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)

       at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582
)

Caused by: org.openrdf.rio.RDFParseException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/ [line 4, column 179]

       at
org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:53
3)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.reportFatalError(RDFXMLParser.java:1068)

       at
org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:285)

       at
org.openrdf.rio.helpers.RDFParserBase.resolveURI(RDFParserBase.java:272)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(RDFXMLParser.java:75
1)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(RDFXMLParser.java:674
)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(RDFXMLParser.java:378)

       at org.openrdf.rio.rdfxml.SAXFilter.endElement(SAXFilter.java:359)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unkn
own Source)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyEl
ement(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartEl
ement(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$Fragm
entContentDriver.next(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknow
n Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanD
ocument(Unknown Source)

       at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)

       at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Un
known Source)

       at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:260)

       at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:244)

       at
org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(
RepositoryConnectionBase.java:357)

       at
org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectio
nBase.java:312)

       at
org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:659)

       ... 56 more

Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/

       at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)

       at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)

       at
org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java
:345)

       at
org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)

       ... 78 more

</pre>

<h3>Caused by:</h3><pre>org.openrdf.rio.RDFParseException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/ [line 4, column 179]

       at
org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:53
3)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.reportFatalError(RDFXMLParser.java:1068)

       at
org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:285)

       at
org.openrdf.rio.helpers.RDFParserBase.resolveURI(RDFParserBase.java:272)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(RDFXMLParser.java:75
1)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(RDFXMLParser.java:674
)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(RDFXMLParser.java:378)

       at org.openrdf.rio.rdfxml.SAXFilter.endElement(SAXFilter.java:359)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unkn
own Source)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyEl
ement(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartEl
ement(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$Fragm
entContentDriver.next(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknow
n Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanD
ocument(Unknown Source)

       at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)

       at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Un
known Source)

       at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:260)

       at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:244)

       at
org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(
RepositoryConnectionBase.java:357)

       at
org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectio
nBase.java:312)

       at
org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:659)

       at
org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:652)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(X
sltExtractor.java:140)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extrac
t(IksHtmlExtractor.java:123)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(MetaxaCor
e.java:120)

       at
org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(
MetaxaEngine.java:157)

       at
org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceConten
t(WeightedJobManager.java:80)

       at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBu
ildResponse(EnginesRootResource.java:175)

       at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromD
ata(EnginesRootResource.java:167)

       at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)

       at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)

       at java.lang.reflect.Method.invoke(Unknown Source)

       at
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInv
okerFactory.java:60)

       at
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispa
tchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvi
der.java:205)

       at
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatche
r.dispatch(ResourceJavaMethodDispatcher.java:75)

       at
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.ja
va:288)

       at
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassR
ule.java:108)

       at
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathR
ule.java:147)

       at
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootReso
urceClassesRule.java:84)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(Web
ApplicationImpl.java:1465)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(Web
ApplicationImpl.java:1396)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebA
pplicationImpl.java:1345)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebA
pplicationImpl.java:1335)

       at
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:
416)

       at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContain
er.java:537)

       at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContain
er.java:699)

       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)

       at
org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletH
andler.java:96)

       at
org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHan
dler.java:79)

       at
org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletP
ipeline.java:42)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:49)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter
(QueryHeadersFilter.java:75)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(FilterHan
dler.java:88)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandl
er.java:76)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:47)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandl
er.java:78)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:47)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(FilterP
ipeline.java:48)

       at
org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.
java:39)

       at
org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServ
let.java:67)

       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)

       at
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)

       at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)

       at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)

       at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)

       at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)

       at org.mortbay.jetty.Server.handle(Server.java:326)

       at
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)

       at
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:
943)

       at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)

       at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)

       at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)

       at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)

       at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582
)

Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/

       at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)

       at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)

       at
org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java
:345)

       at
org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)

       ... 78 more

</pre>

<h3>Caused by:</h3><pre>java.lang.IllegalArgumentException: Not a valid (absolute) URI: //creativecommons.org/licenses/by-sa/3.0/

       at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)

       at org.openrdf.model.impl.URIImpl.&lt;init&gt;(URIImpl.java:57)

       at
org.openrdf.sail.memory.model.MemValueFactory.createURI(MemValueFactory.java
:345)

       at
org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:282)

       at
org.openrdf.rio.helpers.RDFParserBase.resolveURI(RDFParserBase.java:272)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.getPropertyResource(RDFXMLParser.java:75
1)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.processPropertyElt(RDFXMLParser.java:674
)

       at
org.openrdf.rio.rdfxml.RDFXMLParser.emptyElement(RDFXMLParser.java:378)

       at org.openrdf.rio.rdfxml.SAXFilter.endElement(SAXFilter.java:359)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unkn
own Source)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyEl
ement(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartEl
ement(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$Fragm
entContentDriver.next(Unknown Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknow
n Source)

       at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanD
ocument(Unknown Source)

       at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown
Source)

       at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown
Source)

       at
com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Un
known Source)

       at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:260)

       at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:244)

       at
org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(
RepositoryConnectionBase.java:357)

       at
org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectio
nBase.java:312)

       at
org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:659)

       at
org.openrdf.rdf2go.RepositoryModel.readFrom(RepositoryModel.java:652)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.html.XsltExtractor.extract(X
sltExtractor.java:140)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.html.IksHtmlExtractor.extrac
t(IksHtmlExtractor.java:123)

       at
org.apache.stanbol.enhancer.engines.metaxa.core.MetaxaCore.extract(MetaxaCor
e.java:120)

       at
org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine.computeEnhancements(
MetaxaEngine.java:157)

       at
org.apache.stanbol.enhancer.jobmanager.impl.WeightedJobManager.enhanceConten
t(WeightedJobManager.java:80)

       at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceAndBu
ildResponse(EnginesRootResource.java:175)

       at
org.apache.stanbol.enhancer.jersey.resource.EnginesRootResource.enhanceFromD
ata(EnginesRootResource.java:167)

       at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)

       at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)

       at java.lang.reflect.Method.invoke(Unknown Source)

       at
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInv
okerFactory.java:60)

       at
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispa
tchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvi
der.java:205)

       at
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatche
r.dispatch(ResourceJavaMethodDispatcher.java:75)

       at
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.ja
va:288)

       at
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassR
ule.java:108)

       at
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathR
ule.java:147)

       at
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootReso
urceClassesRule.java:84)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(Web
ApplicationImpl.java:1465)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(Web
ApplicationImpl.java:1396)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebA
pplicationImpl.java:1345)

       at
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebA
pplicationImpl.java:1335)

       at
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:
416)

       at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContain
er.java:537)

       at
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContain
er.java:699)

       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)

       at
org.apache.felix.http.base.internal.handler.ServletHandler.doHandle(ServletH
andler.java:96)

       at
org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHan
dler.java:79)

       at
org.apache.felix.http.base.internal.dispatch.ServletPipeline.handle(ServletP
ipeline.java:42)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:49)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.stanbol.commons.httpqueryheaders.impl.QueryHeadersFilter.doFilter
(QueryHeadersFilter.java:75)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.doHandle(FilterHan
dler.java:88)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandl
er.java:76)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:47)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandl
er.java:78)

       at
org.apache.felix.http.base.internal.dispatch.InvocationFilterChain.doFilter(
InvocationFilterChain.java:47)

       at
org.apache.felix.http.base.internal.dispatch.HttpFilterChain.doFilter(HttpFi
lterChain.java:33)

       at
org.apache.felix.http.base.internal.dispatch.FilterPipeline.dispatch(FilterP
ipeline.java:48)

       at
org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.
java:39)

       at
org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServ
let.java:67)

       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)

       at
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)

       at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)

       at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)

       at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)

       at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)

       at org.mortbay.jetty.Server.handle(Server.java:326)

       at
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)

       at
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:
943)

       at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)

       at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)

       at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)

       at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)

       at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582
)

</pre>

<hr /><i><small>Powered by Jetty://</small></i><br/>



 

</body>

</html>

 

 

From: srecko joksimovic [mailto:sreckojoksimovic@gmail.com] 
Sent: Monday, January 16, 2012 12:43
To: Walter Kasper
Cc: stanbol-dev@incubator.apache.org
Subject: Re: Annotating using DBPedia ontology

 

Hi,

Thank you for your time and your answer.
Could you please explain how you extract the annotated concepts from the
generated model? I usually generate RDF output, run a SPARQL query against
it, and process the results. Could you send me an example based on one of
the two documents you sent me?

Best,
Srecko

On Mon, Jan 16, 2012 at 10:51 AM, Walter Kasper <wk...@apache.org> wrote:

Hi,

Both of the PDF documents you sent work fine for us and the annotations look
as expected. Find attached output from our test, for the 22 pages doc as
well as for the 558 pages doc.
I have no explanation of why you apparently got different annotations.

Best regards,

Walter



Srecko Joksimovic wrote:

Hi,

Everything works great for strings, txt and doc files, as well as web pages.
My only confusion is with PDF documents. I attached the results for this
document: http://www.gtbit.org/downloads/dwdmsem6/dwdmsem6lman.pdf. I didn't
access the document through this URL; I downloaded it and accessed it as
C:\temp\weka.pdf. I get a valid response, I mean, there are no errors, but I
think there should be more annotated concepts.

Could you please try to annotate this document and compare the results? I am
sure you will be able to find what I did wrong.

Best,
Srecko

Re: Annotating using DBPedia ontology

Posted by Walter Kasper <wk...@apache.org>.

srecko joksimovic wrote:
> Hi,
>
> Thank you for your time and your answer.
> Could you please explain how you extract the annotated concepts from the
> generated model? I usually generate RDF output, run a SPARQL query against
> it, and process the results.

This is exactly the way, I would do it, too.

Best regards,

Walter
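
For readers without an RDF library at hand, the extraction step can be approximated with the JDK alone. The following is a minimal sketch, not the method used in this thread (which runs a SPARQL query through Jena): it assumes the enhancer returned RDF/XML in which each entity-reference property is serialized as an element with an rdf:resource attribute. The class name EntityReferenceSketch and the sample URI in main are made up for illustration; the two namespaces are the ones that appear in the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: collects the targets of entity-reference properties
// from an RDF/XML response using plain DOM parsing instead of SPARQL.
class EntityReferenceSketch {
    static final String FISE_NS = "http://fise.iks-project.eu/ontology/";
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    static List<String> extract(InputStream rdfXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match elements by namespace
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(rdfXml);

        List<String> uris = new ArrayList<String>();
        NodeList refs = doc.getElementsByTagNameNS(FISE_NS, "entity-reference");
        for (int i = 0; i < refs.getLength(); i++) {
            Element ref = (Element) refs.item(i);
            // getAttributeNS returns an empty string when the attribute is absent
            String uri = ref.getAttributeNS(RDF_NS, "resource");
            if (!uri.isEmpty()) {
                uris.add(uri);
            }
        }
        return uris;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample input, only to show the expected shape of the data.
        String sample =
            "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\" xmlns:fise=\"" + FISE_NS + "\">"
            + "<rdf:Description rdf:about=\"urn:enhancement-example\">"
            + "<fise:entity-reference rdf:resource=\"http://dbpedia.org/resource/Cupertino\"/>"
            + "</rdf:Description></rdf:RDF>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(extract(in));
    }
}
```

This only covers one serialization form of RDF/XML; a real RDF parser combined with a SPARQL query, as described above, does not depend on how the triples are written out.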

Re: Annotating using DBPedia ontology

Posted by srecko joksimovic <sr...@gmail.com>.

Hi,

Thank you for your time and your answer.
Could you please explain how you extract the annotated concepts from the
generated model? I usually generate RDF output, run a SPARQL query against
it, and process the results. Could you send me an example based on one of
the two documents you sent me?

Best,
Srecko

On Mon, Jan 16, 2012 at 10:51 AM, Walter Kasper <wk...@apache.org> wrote:

> Hi,
>
> Both of the PDF documents you sent work fine for us and the annotations
> look as expected. Find attached output from our test, for the 22 pages doc
> as well as for the 558 pages doc.
> I have no explanation of why you apparently got different annotations.
>
> Best regards,
>
> Walter
>
>
> Srecko Joksimovic wrote:
>
>> Hi,
>>
>> Everything works great for strings, txt and doc files, as well as web
>> pages. My only confusion is with PDF documents. I attached the results
>> for this document:
>> http://www.gtbit.org/downloads/dwdmsem6/dwdmsem6lman.pdf.
>> I didn't access the document through this URL; I downloaded it and
>> accessed it as C:\temp\weka.pdf. I get a valid response, I mean, there
>> are no errors, but I think there should be more annotated concepts.
>>
>> Could you please try to annotate this document and compare the results?
>> I am sure you will be able to find what I did wrong.
>>
>> Best,
>> Srecko
>>
>
>
>
>

Re: Annotating using DBPedia ontology

Posted by Walter Kasper <wk...@apache.org>.

Hi,

Both of the PDF documents you sent work fine for us and the annotations 
look as expected. Find attached output from our test, for the 22 pages 
doc as well as for the 558 pages doc.
I have no explanation of why you apparently got different annotations.

Best regards,

Walter

Srecko Joksimovic wrote:
> Hi,
>
> Everything works great for strings, txt and doc files, as well as web
> pages. My only confusion is with PDF documents. I attached the results for
> this document: http://www.gtbit.org/downloads/dwdmsem6/dwdmsem6lman.pdf.
> I didn't access the document through this URL; I downloaded it and accessed
> it as C:\temp\weka.pdf. I get a valid response, I mean, there are no
> errors, but I think there should be more annotated concepts.
>
> Could you please try to annotate this document and compare the results? I
> am sure you will be able to find what I did wrong.
>
> Best,
> Srecko

RE: Annotating using DBPedia ontology

Posted by Srecko Joksimovic <sr...@gmail.com>.

Hi,

Everything works great for strings, txt and doc files, as well as web pages.
My only confusion is with PDF documents. I attached the results for this
document: http://www.gtbit.org/downloads/dwdmsem6/dwdmsem6lman.pdf. I didn't
access the document through this URL; I downloaded it and accessed it as
C:\temp\weka.pdf. I get a valid response, I mean, there are no errors, but I
think there should be more annotated concepts.

Could you please try to annotate this document and compare the results? I am
sure you will be able to find what I did wrong.

Best,
Srecko

Re: Annotating using DBPedia ontology

Posted by srecko joksimovic <sr...@gmail.com>.

Thank you very much!

Best,
Srecko

On Fri, Jan 13, 2012 at 2:41 PM, Walter Kasper <wk...@apache.org> wrote:

> Hi,
>
> Here are recognized standard mime types:
>
> pdf: application/pdf
> txt: text/plain
> ppt: application/vnd.ms-powerpoint
> xls: application/vnd.ms-excel
> odt: application/vnd.oasis.opendocument.text
>
> Regards,
>
> Walter
>
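
The list above can be kept in a small lookup table so that the Content-type header is chosen from the file name. This is only a sketch: the class name StanbolMimeTypes and the method forFileName are hypothetical, and the table holds just the five types quoted in this message.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps a file extension to the MIME types listed above.
class StanbolMimeTypes {
    private static final Map<String, String> BY_EXTENSION = new HashMap<String, String>();
    static {
        BY_EXTENSION.put("pdf", "application/pdf");
        BY_EXTENSION.put("txt", "text/plain");
        BY_EXTENSION.put("ppt", "application/vnd.ms-powerpoint");
        BY_EXTENSION.put("xls", "application/vnd.ms-excel");
        BY_EXTENSION.put("odt", "application/vnd.oasis.opendocument.text");
    }

    // Returns the MIME type for the file's extension, or null if it is unknown.
    static String forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension present
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.get(ext);
    }

    public static void main(String[] args) {
        System.out.println(forFileName("C:\\temp\\weka.pdf"));
    }
}
```

Returning null for unknown extensions leaves it to the caller to decide whether to skip the file or fall back to a default type.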
> srecko joksimovic wrote:
>
>> Hi,
>>
>> Thank you! I will check out the latest version.
>> I'm using application/msword, because I thought that was the right one.
>> Could you please send me the correct MIME types for pdf, txt, ppt, xls
>> and odt?
>>
>> Best,
>> Srecko
>>
>> On Fri, Jan 13, 2012 at 1:34 PM, Walter Kasper <wkasper@apache.org> wrote:
>>
>>    Hi,
>>
>>    We fixed the problem with unresolved relative URL from HTML
>>    documents. In the case of your Wikipedia page it came from an
>>    embedded rel-license microformat. If you are interested only in
>>    text extraction you can also just disable the RDFa and Microformat
>>    extractors in the configuration for the html extraction.
>>
>>    We tested also Word documents with your test sentence. Everything
>>    worked fine for us. Did you use the correct mime type? The correct
>>    ones for Word documents are:
>>
>>    doc-Format (<= Word-2003): application/vnd.ms-word
>>    docx-Format (Word-2007):
>>    application/vnd.openxmlformats-officedocument.wordprocessingml
>>
>>    Best regards,
>>
>>    Walter
>>
>>    srecko joksimovic wrote:
>>
>>        Hi Walter,
>>
>>        The Word document is nothing special, just one line of text:
>>
>>        "John Smith works for the Apple Inc. in Cupertino, California."
>>
>>        Rupert suggested this sentence in order to test text
>>        annotation. As I know the result of annotating this string, I
>>        thought I would create a Word document with the same content
>>        for test purposes.
>>
>>        The error with your HTML page apparently arises from a bug in
>>        resolving
>>        relative URLs in one of the HTML extractors. We will fix that.
>>
>>        Does this mean that I can't annotate HTML pages at the moment,
>>        or does it depend on the page?
>>
>>        Best,
>>        Srecko
>>
>>        On Fri, Jan 13, 2012 at 9:51 AM, Walter Kasper
>>        <wkasper@apache.org> wrote:
>>
>>
>>            Hi Srecko,
>>
>>            I don't know what the problem with your Word document
>>            could have been.
>>            Could you send it to me for testing?
>>
>>            The error with your HTML page apparently arises from a bug
>>            in resolving
>>            relative URLs in one of the HTML extractors. We will fix that.
>>
>>            Best regards,
>>
>>            Walter
>>
>>
>>            Srecko Joksimovic wrote:
>>
>>                Thank you Rupert!
>>
>>                It is probably something that I missed.
>>
>>                Best,
>>                Srecko
>>
>>                -----Original Message-----
>>                From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
>>                Sent: Thursday, January 12, 2012 20:16
>>                To: Srecko Joksimovic; wkasper@apache.org
>>                Cc: stanbol-dev@incubator.apache.org
>>
>>                Subject: Re: Annotating using DBPedia ontology
>>
>>                Hi Srecko
>>
>>                It seems that both cases are related to the Metaxa
>>                Engine. My knowledge about the libs used by this engine
>>                to extract the textual content is very limited, so I
>>                might not be the right person to look into that.
>>
>>                In the first example I think Metaxa was not able to
>>                extract the text from the Word document, because the
>>                only plainTextContent triple noted is
>>
>>                <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
>>                srecko</j.0:plainTextContent>
>>
>>                The second example looks like an issue within the RDF
>>                metadata generation in Aperture.
>>
>>                I sent this reply also directly to Walter Kasper. He
>>                is the one who contributed this engine and should be
>>                able to provide more information.
>>
>>                best
>>                Rupert
>>
>>                On 12.01.2012, at 18:40, srecko joksimovic wrote:
>>
>>                 Hi Rupert,
>>
>>                    I have another question, and I will finish soon.
>>
>>                    I tried to annotate a PDF document, and I didn't get
>>                    the result I expected. Then I put the string you sent
>>                    me
>>
>>                    "John Smith works for the Apple Inc. in Cupertino,
>>                    California."
>>
>>                    in an MS Word document, and this is the result I got:
>>
>>                    <rdf:RDF
>>                        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>                        xmlns:j.0="http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
>>                        xmlns:j.1="http://purl.org/dc/terms/"
>>                        xmlns:j.2="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
>>                        xmlns:j.3="http://fise.iks-project.eu/ontology/">
>>                      <rdf:Description rdf:about="urn:enhancement-55016818-eb97-7b98-521a-422e3742173b">
>>                        <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>                        <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine</j.1:creator>
>>                        <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.288Z</j.1:created>
>>                        <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
>>                        <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>                        <j.1:language>fr</j.1:language>
>>                      </rdf:Description>
>>                      <rdf:Description rdf:about="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f">
>>                        <rdf:type rdf:resource="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#PaginatedTextDocument"/>
>>                        <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
>>                    srecko</j.0:plainTextContent>
>>                      </rdf:Description>
>>                      <rdf:Description rdf:about="urn:enhancement-0644a1ed-f1d8-334d-d4e9-690a0446cba8">
>>                        <j.3:confidence rdf:datatype="http://www.w3.org/2001/XMLSchema#double">1.0</j.3:confidence>
>>                        <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>                        <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine</j.1:creator>
>>                        <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.273Z</j.1:created>
>>                        <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
>>                        <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>                      </rdf:Description>
>>                    </rdf:RDF>
>>
>>
>>                    and this is the code:
>>
>>                    public List<String> Annotate(byte[] _stream_to_annotate,
>>                            ServiceUtils.MIMETypes _content_type, String _encoding)
>>                    {
>>                        List<String> _return_list = new ArrayList<String>();
>>                        try
>>                        {
>>                            URL url = new URL(ServiceUtils.SERVICE_URL);
>>                            HttpURLConnection con = (HttpURLConnection)url.openConnection();
>>                            con.setDoOutput(true);
>>                            con.setRequestMethod("POST");
>>                            con.setRequestProperty("Accept", "application/rdf+xml");
>>                            con.setRequestProperty("Content-type", _content_type.getValue());
>>                            java.io.OutputStream out = con.getOutputStream();
>>                            IOUtils.write(_stream_to_annotate, out);
>>                            IOUtils.closeQuietly(out);
>>                            con.connect(); //send the request
>>                            if(con.getResponseCode() > 299)
>>                            {
>>                                java.io.InputStream errorStream = con.getErrorStream();
>>                                if(errorStream != null)
>>                                {
>>                                    String errorMessage = IOUtils.toString(errorStream);
>>                                    IOUtils.closeQuietly(errorStream);
>>                                }
>>                                else
>>                                {
>>                                    //no error data
>>                                    //write default error message with the status code
>>                                }
>>                            }
>>                            else
>>                            {
>>                                Model model = ModelFactory.createDefaultModel();
>>                                java.io.InputStream enhancementResults = con.getInputStream();
>>                                model.read(enhancementResults, null);
>>                                String queryStringForGraph =
>>                                    "PREFIX t: <http://fise.iks-project.eu/ontology/> " +
>>                                    "SELECT ?label WHERE {?alias t:entity-reference ?label}";
>>                                Query query = QueryFactory.create(queryStringForGraph);
>>                                QueryExecution qe = QueryExecutionFactory.create(query, model);
>>                                ResultSet results = qe.execSelect();
>>                                while(results.hasNext())
>>                                {
>>                                    _return_list.add(results.next().toString());
>>                                }
>>                            }
>>                        }
>>                        catch(Exception ex)
>>                        {
>>                            System.out.println(ex.getMessage());
>>                        }
>>                        return _return_list;
>>                    }
>>
>>                    On Thu, Jan 12, 2012 at 8:32 AM, srecko joksimovic
>>                    <sreckojoksimovic@gmail.com> wrote:
>>
>>                    Hi Rupert,
>>
>>                    Thank you for the answer. I've probably missed that.
>>
>>                    Best,
>>                    Srecko
>>
>>
>>                    On Thu, Jan 12, 2012 at 6:12 AM, Rupert Westenthaler
>>                    <rupert.westenthaler@gmail.com> wrote:
>>
>>                    Hi Srecko
>>
>>                    I think the last time I directly used this API was
>>                    about 3-4 years ago, but after a look at the HttpClient
>>                    tutorial [1] I think the reason for your problem is
>>                    that you do not execute the GetMethod.
>>
>>                    Based on this tutorial the code should look like
>>
>>                        // Create an instance of HttpClient.
>>                        HttpClient client = new HttpClient();
>>                        GetMethod get = new GetMethod(url);
>>                        try {
>>                            // Execute the method.
>>                            int statusCode = client.executeMethod(get);
>>                            if (statusCode != HttpStatus.SC_OK) {
>>                                //handle the error
>>                            }
>>                            InputStream t_is = get.getResponseBodyAsStream();
>>                            //read the data of the stream
>>                        } finally {
>>                            // Release the connection once the stream is read.
>>                            get.releaseConnection();
>>                        }
>>
>>                    In addition you should not use a Reader if you want to
>>                    read byte oriented data from the input stream.
>>
>>                    hope this helps
>>                    best
>>                    Rupert
>>
>>                    [1] http://hc.apache.org/httpclient-3.x/tutorial.html
>>
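
The advice about not using a Reader for byte-oriented data can be illustrated with the JDK alone. This is a minimal sketch under the assumption that the caller already has an InputStream (for example the one returned by get.getResponseBodyAsStream() after client.executeMethod(get)); the class name StreamBytes and the method readAll are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: copies a stream into a byte array without any
// character decoding, so binary content such as PDF is left untouched.
class StreamBytes {
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int count;
        // read() returns -1 once the end of the stream is reached
        while ((count = in.read(buffer)) != -1) {
            out.write(buffer, 0, count);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {(byte) 0xFF, 0x00, 0x25, 0x50, 0x44, 0x46};
        byte[] copy = readAll(new ByteArrayInputStream(data));
        System.out.println(copy.length);
    }
}
```

Going through an InputStreamReader instead would decode the bytes as characters first, which can alter any byte sequence that is not valid in the chosen encoding.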
>>
>>
>>                    On 11.01.2012, at 22:34, Srecko Joksimovic wrote:
>>
>>                        That's it. Thank you!
>>
>>                        I have already configured KeywordLinkingEngine
>>                        when I used my own ontology. I think I'm familiar
>>                        with that and I will try that option too.
>>
>>                        In the meantime I found another interesting
>>                        problem. I tried to annotate a document and a web
>>                        page. With the web page, I tried
>>                        IOUtils.write(byte[], out) and I had to convert
>>                        the URL to byte[]:
>>
>>                        public static byte[] GetBytesFromURL(String _url)
>>                                throws IOException
>>                        {
>>                            GetMethod get = new GetMethod(_url);
>>                            InputStream t_is = get.getResponseBodyAsStream();
>>                            byte[] buffer = new byte[1024];
>>                            int count = -1;
>>                            Reader t_url_reader = new BufferedReader(
>>                                    new InputStreamReader(t_is));
>>                            byte[] t_bytes =
>>                                    IOUtils.toByteArray(t_url_reader, "UTF-8");
>>
>>                            return t_bytes;
>>                        }
>>
>>                        But, the problem is that I'm getting null for
>>                        InputStream.
>>
>>                        Any ideas?
>>
>>                        Best,
>>                        Srecko
>>
>>
>>
>>                        -----Original Message-----
>>                        From: Rupert Westenthaler
>>                        [mailto:rupert.westenthaler@gmail.com]
>>                        Sent: Wednesday, January 11, 2012 22:08
>>                        To: Srecko Joksimovic
>>                        Cc: stanbol-dev@incubator.apache.org
>>                        Subject: Re: Annotating using DBPedia ontology
>>
>>
>>                        On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
>>
>>                            Hi Rupert,
>>
>>                            When I load localhost:8080/engines it says
>>                            this:
>>
>>                            There are currently 5 active engines.
>>                            org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
>>                            org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
>>                            org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
>>                            org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
>>                            org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
>>
>>                            Maybe this could tell you something?
>>
>>                        These are exactly the 5 engines that are
>>                        expected to run with the default configuration.
>>                        Based on this the Stanbol Enhancer should just
>>                        work fine.
>>
>>                        After looking at the text you enhanced I
>>                        noticed however that it does not mention
>>                        any named entities such as Persons,
>>                        Organizations and Places. So I checked it with
>>                        my local Stanbol version and it also did not
>>                        detect any entities.
>>
>>                        So to check if Stanbol works as expected you
>>                        should try another text that
>>                        mentions some Named Entities, such as
>>
>>                           "John Smith works for the Apple Inc. in Cupertino, California."
>>
>>                        If you also want to search for entities like
>>                        "Bank", "Blog", "Consumer", "Telephone",
>>                        you also need to configure a
>>                        KeywordLinkingEngine for dbpedia. Part B of [3]
>>                        provides more information on how to do that.
>>
>>                        But let me mention that the
>>                        KeywordLinkingEngine is more useful if used in
>>                        combination with your own domain-specific thesaurus
>>                        rather than a global data set like dbpedia. When
>>                        used with dbpedia you will also get a lot of
>>                        false positives.
>>
>>                        best
>>                        Rupert
>>
>>                        [3]
>>                        http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>

Re: Annotating using DBPedia ontology

Posted by Walter Kasper <wk...@apache.org>.

Hi,

Here are the recognized standard MIME types:

pdf: application/pdf
txt: text/plain
ppt: application/vnd.ms-powerpoint
xls: application/vnd.ms-excel
odt: application/vnd.oasis.opendocument.text
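
For a client that picks the Content-Type header from a file name, the list above could be held in a small lookup table. This is only a minimal sketch: the class MimeTypes and the method forFileName are hypothetical names for illustration and are not part of Stanbol or Metaxa.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not a Stanbol API): maps a file extension to the
// MIME type from the list above, for use as the Content-Type header.
final class MimeTypes {
    private static final Map<String, String> BY_EXTENSION = new HashMap<String, String>();
    static {
        BY_EXTENSION.put("pdf", "application/pdf");
        BY_EXTENSION.put("txt", "text/plain");
        BY_EXTENSION.put("ppt", "application/vnd.ms-powerpoint");
        BY_EXTENSION.put("xls", "application/vnd.ms-excel");
        BY_EXTENSION.put("odt", "application/vnd.oasis.opendocument.text");
    }

    private MimeTypes() {}

    /** Returns the MIME type for the file name, or null if the extension is not listed. */
    static String forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension at all
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.get(ext);
    }
}
```

A caller would then do something like con.setRequestProperty("Content-type", MimeTypes.forFileName(name)) and handle the null case for unlisted extensions itself.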

Regards,

Walter

srecko joksimovic wrote:
> Hi,
>
> Thank you! I will check out the latest version.
> I'm using application/msword, because I thought that was the right one.
> Could you please send me the correct MIME types for the pdf, txt, ppt,
> xls and odt formats?
>
> Best,
> Srecko
>
> On Fri, Jan 13, 2012 at 1:34 PM, Walter Kasper <wkasper@apache.org> wrote:
>
>     Hi,
>
>     We fixed the problem with unresolved relative URL from HTML
>     documents. In the case of your Wikipedia page it came from an
>     embedded rel-license microformat. If you are interested only in
>     text extraction you can also just disable the RDFa and Microformat
>     extractors in the configuration for the html extraction.
>
>     We tested also Word documents with your test sentence. Everything
>     worked fine for us. Did you use the correct mime type? The correct
>     ones for Word documents are:
>
>     doc-Format (<= Word-2003): application/vnd.ms-word
>     docx-Format (Word-2007):
>     application/vnd.openxmlformats-officedocument.wordprocessingml
>
>     Best regards,
>
>     Walter
>
>     srecko joksimovic wrote:
>
>         Hi Walter,
>
>         Word document is nothing special, just one line of text:
>
>         "John Smith works for the Apple Inc. in Cupertino, California."
>
>         Rupert suggested this sentence in order to test text
>         annotation. As I know the
>         result of annotating this string, I thought I would create a Word
>         document with
>         the same content for test purposes.
>
>         The error with your HTML page apparently arises from a bug in
>         resolving
>         relative URLs in one of the HTML extractors. We will fix that.
>
>         Does it mean that I can't annotate HTML pages at the moment,
>         or does that
>         depend on the page?
>
>         Best,
>         Srecko
>
>         On Fri, Jan 13, 2012 at 9:51 AM, Walter Kasper <wkasper@apache.org> wrote:
>
>             Hi Srecko,
>
>             I don't know what the problem with your Word document
>             could have been.
>             Could you send it to me for testing?
>
>             The error with your HTML page apparently arises from a bug
>             in resolving
>             relative URLs in one of the HTML extractors. We will fix that.
>
>             Best regards,
>
>             Walter
>
>
>             Srecko Joksimovic wrote:
>
>                 Thank you Rupert!
>
>                 It is probably something that I missed.
>
>                 Best,
>                 Srecko
>
>                 -----Original Message-----
>                 From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
>                 Sent: Thursday, January 12, 2012 20:16
>                 To: Srecko Joksimovic; wkasper@apache.org
>                 Cc: stanbol-dev@incubator.apache.org
>                 Subject: Re: Annotating using DBPedia ontology
>
>                 Hi Srecko
>
>                 It seems that both cases are related to the Metaxa
>                 Engine. My knowledge about
>                 the libs used by this engine to extract the textual
>                 content is very limited.
>                 So I might not be the right person to look into that.
>
>                 In the first example I think Metaxa was not able to
>                 extract the text from
>                 the Word document because the only plainTextContent
>                 triple noted is
>
>                 <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
>                 srecko</j.0:plainTextContent>
>
>                 The second example looks like an issue within the RDF
>                 metadata generation
>                 in Aperture.
>
>                 I sent this reply also directly to Walter Kasper. He
>                 is the one who
>                 contributed this engine and should be able to provide
>                 more information.
>
>                 best
>                 Rupert
>
>                 On 12.01.2012, at 18:40, srecko joksimovic wrote:
>
>                  Hi Rupert,
>
>                     I have another question, and I will finish soon.
>
>                     I tried to annotate a pdf document, and I didn't get
>                     the result I expected. Then I put the string you sent me
>
>                     "John Smith works for the Apple Inc. in Cupertino,
>                     California."
>
>                     in an MS Word document, and this is the result I got:
>
>                     <rdf:RDF
>                         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>                         xmlns:j.0="http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
>                         xmlns:j.1="http://purl.org/dc/terms/"
>                         xmlns:j.2="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
>                         xmlns:j.3="http://fise.iks-project.eu/ontology/">
>                       <rdf:Description rdf:about="urn:enhancement-55016818-eb97-7b98-521a-422e3742173b">
>                         <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>                         <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine</j.1:creator>
>                         <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.288Z</j.1:created>
>                         <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
>                         <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>                         <j.1:language>fr</j.1:language>
>                       </rdf:Description>
>                       <rdf:Description rdf:about="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f">
>                         <rdf:type rdf:resource="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#PaginatedTextDocument"/>
>                         <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
>                     srecko</j.0:plainTextContent>
>                       </rdf:Description>
>                       <rdf:Description rdf:about="urn:enhancement-0644a1ed-f1d8-334d-d4e9-690a0446cba8">
>                         <j.3:confidence rdf:datatype="http://www.w3.org/2001/XMLSchema#double">1.0</j.3:confidence>
>                         <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>                         <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine</j.1:creator>
>                         <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.273Z</j.1:created>
>                         <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
>                         <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>                       </rdf:Description>
>                     </rdf:RDF>
>
>
>                     and this is the code:
>
>                     public List<String> Annotate(byte[] _stream_to_annotate,
>                             ServiceUtils.MIMETypes _content_type, String _encoding)
>                     {
>                         List<String> _return_list = new ArrayList<String>();
>                         try
>                         {
>                             URL url = new URL(ServiceUtils.SERVICE_URL);
>                             HttpURLConnection con = (HttpURLConnection) url.openConnection();
>                             con.setDoOutput(true);
>                             con.setRequestMethod("POST");
>                             con.setRequestProperty("Accept", "application/rdf+xml");
>                             con.setRequestProperty("Content-type", _content_type.getValue());
>                             java.io.OutputStream out = con.getOutputStream();
>                             IOUtils.write(_stream_to_annotate, out);
>                             IOUtils.closeQuietly(out);
>                             con.connect(); //send the request
>                             if (con.getResponseCode() > 299)
>                             {
>                                 java.io.InputStream errorStream = con.getErrorStream();
>                                 if (errorStream != null)
>                                 {
>                                     String errorMessage = IOUtils.toString(errorStream);
>                                     IOUtils.closeQuietly(errorStream);
>                                 }
>                                 else
>                                 {
>                                     //no error data
>                                     //write default error message with the status code
>                                 }
>                             }
>                             else
>                             {
>                                 Model model = ModelFactory.createDefaultModel();
>                                 java.io.InputStream enhancementResults = con.getInputStream();
>                                 model.read(enhancementResults, null);
>                                 String queryStringForGraph = "PREFIX t: <http://fise.iks-project.eu/ontology/> " +
>                                         "SELECT ?label WHERE {?alias t:entity-reference ?label}";
>                                 Query query = QueryFactory.create(queryStringForGraph);
>                                 QueryExecution qe = QueryExecutionFactory.create(query, model);
>                                 ResultSet results = qe.execSelect();
>                                 while (results.hasNext())
>                                 {
>                                     _return_list.add(results.next().toString());
>                                 }
>                             }
>                         }
>                         catch (Exception ex)
>                         {
>                             System.out.println(ex.getMessage());
>                         }
>                         return _return_list;
>                     }
>
>                     On Thu, Jan 12, 2012 at 8:32 AM, srecko joksimovic <sreckojoksimovic@gmail.com> wrote:
>
>                     Hi Rupert,
>
>                     Thank you for the answer. I've probably missed that.
>
>                     Best,
>                     Srecko
>
>
>                     On Thu, Jan 12, 2012 at 6:12 AM, Rupert Westenthaler <rupert.westenthaler@gmail.com> wrote:
>
>                     Hi Srecko
>
>                     I think the last time I directly used this API was
>                     about 3-4 years ago, but
>                     after a look at the http client tutorial [1] I think
>                     the reason for your
>                     problem is that you do not execute the GetMethod.
>
>                     Based on this tutorial the code should look like
>
>                        // Create an instance of HttpClient.
>                        HttpClient client = new HttpClient();
>                        GetMethod get = new GetMethod(url);
>                        try {
>                            // Execute the method.
>                            int statusCode = client.executeMethod(get);
>                            if (statusCode != HttpStatus.SC_OK) {
>                                //handle the error
>                            }
>                            InputStream t_is = get.getResponseBodyAsStream();
>                            //read the data of the stream
>                        } finally {
>                            // Release the connection.
>                            get.releaseConnection();
>                        }
>
>                     In addition you should not use a Reader if you
>                     want to read byte oriented
>                     data from the input stream.
>
>                     hope this helps
>                     best
>                     Rupert
>
>                     [1]
>                     http://hc.apache.org/httpclient-3.x/tutorial.html
>
>
>
>                     On 11.01.2012, at 22:34, Srecko Joksimovic wrote:
>
>                      That's it. Thank you!
>
>                         I have already configured KeywordLinkingEngine
>                         when I used my own ontology.
>                         I think I'm familiar with that and I will try that
>                         option too.
>
>                         In the meantime I found another interesting
>                         problem. I tried to annotate a
>                         document and a web page. With the web page, I tried
>                         IOUtils.write(byte[], out) and I had to
>                         convert the URL to byte[]:
>
>                         public static byte[] GetBytesFromURL(String _url) throws IOException
>                         {
>                               GetMethod get = new GetMethod(_url);
>                               InputStream t_is = get.getResponseBodyAsStream();
>                               byte[] buffer = new byte[1024];
>                               int count = -1;
>                               Reader t_url_reader = new BufferedReader(new InputStreamReader(t_is));
>                               byte[] t_bytes = IOUtils.toByteArray(t_url_reader, "UTF-8");
>
>                               return t_bytes;
>                         }
>
>                         But, the problem is that I'm getting null for
>                         InputStream.
>
>                         Any ideas?
>
>                         Best,
>                         Srecko
>
>
>
>                         -----Original Message-----
>                         From: Rupert Westenthaler
>                         [mailto:rupert.westenthaler@gmail.com]
>                         Sent: Wednesday, January 11, 2012 22:08
>                         To: Srecko Joksimovic
>                         Cc: stanbol-dev@incubator.apache.org
>                         Subject: Re: Annotating using DBPedia ontology
>
>
>                         On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
>
>                             Hi Rupert,
>
>                             When I load localhost:8080/engines it says
>                             this:
>
>                             There are currently 5 active engines.
>                             org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
>                             org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
>                             org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
>                             org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
>                             org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
>
>                             Maybe this could tell you something?
>
>                         These are exactly the 5 engines that are
>                         expected to run with the default configuration.
>                         Based on this the Stanbol Enhancer should just
>                         work fine.
>
>                         After looking at the text you enhanced I
>                         noticed however that it does not mention
>                         any named entities such as Persons,
>                         Organizations and Places. So I checked it with
>                         my local Stanbol version and it also did not
>                         detect any entities.
>
>                         So to check if Stanbol works as expected you
>                         should try another text that
>                         mentions some Named Entities, such as
>
>                            "John Smith works for the Apple Inc. in Cupertino, California."
>
>                         If you also want to search for entities like
>                         "Bank", "Blog", "Consumer", "Telephone",
>                         you also need to configure a
>                         KeywordLinkingEngine for dbpedia. Part B of [3]
>                         provides more information on how to do that.
>
>                         But let me mention that the
>                         KeywordLinkingEngine is more useful if used in
>                         combination with your own domain-specific thesaurus
>                         rather than a global data set like dbpedia. When
>                         used with dbpedia you will also get a lot of
>                         false positives.
>
>                         best
>                         Rupert
>
>                         [3]
>                         http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
>
>
>
>
>
>
>
>
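
The advice quoted above, not to wrap the response stream in a Reader when bytes are wanted, could look roughly like the sketch below. It uses only java.io; the class StreamBytes and the method readFully are hypothetical names for illustration, not part of HttpClient or Commons IO. It assumes the stream was obtained after the request was actually executed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: copies an InputStream into a byte[] without going
// through a Reader, so no character decoding can alter the bytes.
final class StreamBytes {
    private StreamBytes() {}

    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int count;
        // read() returns -1 once the end of the stream is reached
        while ((count = in.read(buffer)) != -1) {
            out.write(buffer, 0, count);
        }
        return out.toByteArray();
    }
}
```

With HttpClient 3.x this would be called as StreamBytes.readFully(get.getResponseBodyAsStream()) after client.executeMethod(get) has returned, and before the connection is released.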

Re: Annotating using DBPedia ontology

Posted by srecko joksimovic <sr...@gmail.com>.

Hi,

Thank you! I will check out the latest version.
I'm using application/msword, because I thought that was the right one.
Could you please send me the correct MIME types for the pdf, txt, ppt, xls
and odt formats?

Best,
Srecko

On Fri, Jan 13, 2012 at 1:34 PM, Walter Kasper <wk...@apache.org> wrote:

> Hi,
>
> We fixed the problem with unresolved relative URL from HTML documents. In
> the case of your Wikipedia page it came from an embedded rel-license
> microformat. If you are interested only in text extraction you can also
> just disable the RDFa and Microformat extractors in the configuration for
> the html extraction.
>
> We tested also Word documents with your test sentence. Everything worked
> fine for us. Did you use the correct mime type? The correct ones for Word
> documents are:
>
> doc-Format (<= Word-2003): application/vnd.ms-word
> docx-Format (Word-2007): application/vnd.openxmlformats-officedocument.wordprocessingml
>
> Best regards,
>
> Walter
>
> srecko joksimovic wrote:
>
>> Hi Walter,
>>
>> Word document is nothing special, just one line of text:
>>
>> "John Smith works for the Apple Inc. in Cupertino, California."
>>
>> Rupert suggested this sentence in order to test text annotation. As I know
>> the result of annotating this string, I thought I would create a Word
>> document with the same content for test purposes.
>>
>> The error with your HTML page apparently arises from a bug in resolving
>> relative URLs in one of the HTML extractors. We will fix that.
>>
>> Does it mean that I can't annotate HTML pages at the moment, or does that
>> depend on the page?
>>
>> Best,
>> Srecko
>>
>> On Fri, Jan 13, 2012 at 9:51 AM, Walter Kasper<wk...@apache.org>
>>  wrote:
>>
>>  Hi Srecko,
>>>
>>> I don't know what the problem with your Word document could have been.
>>> Could you send it to me for testing?
>>>
>>> The error with your HTML page apparently arises from a bug in resolving
>>> relative URLs in one of the HTML extractors. We will fix that.
>>>
>>> Best regards,
>>>
>>> Walter
>>>
>>>
>>> Srecko Joksimovic wrote:
>>>
>>>  Thank you Rupert!
>>>>
>>>> It is probably something that I missed.
>>>>
>>>> Best,
>>>> Srecko
>>>>
>>>> -----Original Message-----
>>>> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
>>>> Sent: Thursday, January 12, 2012 20:16
>>>> To: Srecko Joksimovic; wkasper@apache.org
>>>> Cc: stanbol-dev@incubator.apache.org
>>>> Subject: Re: Annotating using DBPedia ontology
>>>>
>>>> Hi Srecko
>>>>
>>>> It seems that both cases are related to the Metaxa Engine. My knowledge
>>>> about the libs used by this engine to extract the textual content is very
>>>> limited.
>>>> So I might not be the right person to look into that.
>>>>
>>>> In the first example I think Metaxa was not able to extract the text
>>>> from
>>>> the Word document because the only plainTextContent triple noted is
>>>>
>>>> <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
>>>> srecko</j.0:plainTextContent>
>>>>
>>>> The second example looks like an issue within the RDF metadata
>>>> generation
>>>> in Aperture.
>>>>
>>>> I sent this reply also directly to Walter Kasper. He is the one who
>>>> contributed this engine and should be able to provide more
>>>> information.
>>>>
>>>> best
>>>> Rupert
>>>>
>>>> On 12.01.2012, at 18:40, srecko joksimovic wrote:
>>>>
>>>>  Hi Rupert,
>>>>
>>>>> I have another question, and I will finish soon.
>>>>>
>>>>> I tried to annotate a pdf document, and I didn't get the result I expected.
>>>>> Then I put the string you sent me
>>>>>
>>>>> "John Smith works for the Apple Inc. in Cupertino, California."
>>>>>
>>>>> in an MS Word document, and this is the result I got:
>>>>>
>>>>> <rdf:RDF
>>>>>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>     xmlns:j.0="http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
>>>>>     xmlns:j.1="http://purl.org/dc/terms/"
>>>>>     xmlns:j.2="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
>>>>>     xmlns:j.3="http://fise.iks-project.eu/ontology/">
>>>>>   <rdf:Description rdf:about="urn:enhancement-55016818-eb97-7b98-521a-422e3742173b">
>>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>>>     <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine</j.1:creator>
>>>>>     <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.288Z</j.1:created>
>>>>>     <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
>>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>>>     <j.1:language>fr</j.1:language>
>>>>>   </rdf:Description>
>>>>>   <rdf:Description rdf:about="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f">
>>>>
>>>>      <rdf:type
>>>>>
>>>>>  rdf:resource="http://www.**sem**anticdesktop.org/**<http://semanticdesktop.org/**>
>>>> ontologies/2007/03/22/nfo#****Pagin<http://www.**semanticdesktop.org/**
>>>> ontologies/2007/03/22/nfo#**Pagin<http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Pagin>
>>>> >
>>>> atedTextDocument"/>
>>>>
>>>>      <j.0:plainTextContent>****Microsoft Word-Dokument&#xD;
>>>>>
>>>>> srecko</j.0:plainTextContent>
>>>>>   </rdf:Description>
>>>>>   <rdf:Description
>>>>>
>>>>>  rdf:about="urn:enhancement-****0644a1ed-f1d8-334d-d4e9-****
>>>> 690a0446cba8">
>>>>
>>>>      <j.3:confidence
>>>>>
>>>>>  rdf:datatype="http://www.w3.****org/2001/XMLSchema#double<http**
>>>> ://www.w3.org/2001/XMLSchema#**double<http://www.w3.org/2001/XMLSchema#double>
>>>> >
>>>> ">1.**0</j.3:confidence>
>>>>
>>>>      <rdf:type
>>>>>
>>>>>  rdf:resource="http://fise.iks-****project.eu/ontology/****
>>>> TextAnnotation <http://project.eu/ontology/**TextAnnotation><
>>>> http://fise.**iks-project.eu/ontology/**TextAnnotation<http://fise.iks-project.eu/ontology/TextAnnotation>
>>>> >
>>>> "/>
>>>>
>>>>      <j.1:creator
>>>>>
>>>>>  rdf:datatype="http://www.w3.****org/2001/XMLSchema#string<http**
>>>> ://www.w3.org/2001/XMLSchema#**string<http://www.w3.org/2001/XMLSchema#string>
>>>> >
>>>> ">**org.apache.stanbol.en
>>>> hancer.engines.metaxa.****MetaxaEngine</j.1:creator>
>>>>
>>>>      <j.1:created
>>>>>
>>>>>  rdf:datatype="http://www.w3.****org/2001/XMLSchema#dateTime<ht**
>>>> tp://www.w3.org/2001/**XMLSchema#dateTime<http://www.w3.org/2001/XMLSchema#dateTime>
>>>> >
>>>> ">**2012-01-12T17:34:20
>>>>
>>>> .273Z</j.1:created>
>>>>
>>>>      <j.3:extracted-from
>>>>>
>>>>>  rdf:resource="urn:content-****item-sha1-****
>>>> 835c8a5397d9b376a268b7bb5d3c8b****
>>>> 4ab7e8b81f
>>>> "/>
>>>>
>>>>      <rdf:type
>>>>>
>>>>>  rdf:resource="http://fise.iks-****project.eu/ontology/****Enhancement<http://project.eu/ontology/**Enhancement>
>>>> <http://fise.iks-**project.eu/ontology/**Enhancement<http://fise.iks-project.eu/ontology/Enhancement>
>>>> >
>>>>
>>>> "/>
>>>>
>>>>    </rdf:Description>
>>>>> </rdf:RDF>
>>>>>
>>>>>
>>>>> and this is the code:
>>>>>
>>>>>     public List<String> Annotate(byte[] _stream_to_annotate,
>>>>>             ServiceUtils.MIMETypes _content_type, String _encoding)
>>>>>     {
>>>>>         List<String> _return_list = new ArrayList<String>();
>>>>>         try
>>>>>         {
>>>>>             URL url = new URL(ServiceUtils.SERVICE_URL);
>>>>>             HttpURLConnection con = (HttpURLConnection)url.openConnection();
>>>>>             con.setDoOutput(true);
>>>>>             con.setRequestMethod("POST");
>>>>>             con.setRequestProperty("Accept", "application/rdf+xml");
>>>>>             con.setRequestProperty("Content-type", _content_type.getValue());
>>>>>             java.io.OutputStream out = con.getOutputStream();
>>>>>
>>>>>             IOUtils.write(_stream_to_annotate, out);
>>>>>             IOUtils.closeQuietly(out);
>>>>>
>>>>>             con.connect(); //send the request
>>>>>
>>>>>             if(con.getResponseCode() > 299)
>>>>>             {
>>>>>                 java.io.InputStream errorStream = con.getErrorStream();
>>>>>                 if(errorStream != null)
>>>>>                 {
>>>>>                     String errorMessage = IOUtils.toString(errorStream);
>>>>>                     IOUtils.closeQuietly(errorStream);
>>>>>                 }
>>>>>                 else
>>>>>                 {
>>>>>                     //no error data
>>>>>                     //write default error message with the status code
>>>>>                 }
>>>>>             }
>>>>>             else
>>>>>             {
>>>>>                 Model model = ModelFactory.createDefaultModel();
>>>>>                 java.io.InputStream enhancementResults = con.getInputStream();
>>>>>                 model.read(enhancementResults, null);
>>>>>                 String queryStringForGraph = "PREFIX t: <http://fise.iks-project.eu/ontology/> " +
>>>>>                         "SELECT ?label WHERE {?alias t:entity-reference ?label}";
>>>>>                 Query query = QueryFactory.create(queryStringForGraph);
>>>>>                 QueryExecution qe = QueryExecutionFactory.create(query, model);
>>>>>
>>>>>                 ResultSet results = qe.execSelect();
>>>>>                 while(results.hasNext())
>>>>>                 {
>>>>>                     _return_list.add(results.next().toString());
>>>>>                 }
>>>>>             }
>>>>>         }
>>>>>         catch(Exception ex)
>>>>>         {
>>>>>             System.out.println(ex.getMessage());
>>>>>         }
>>>>>         return _return_list;
>>>>>     }
>>>>>
>>>>>
>>>>> On Thu, Jan 12, 2012 at 8:32 AM, srecko joksimovic <sr...@gmail.com> wrote:
>>>>>
>>>>> Hi Rupert,
>>>>>
>>>>> Thank you for the answer. I've probably missed that.
>>>>>
>>>>> Best,
>>>>> Srecko
>>>>>
>>>>>
>>>>> On Thu, Jan 12, 2012 at 6:12 AM, Rupert Westenthaler <rupert.westenthaler@gmail.com> wrote:
>>>>>
>>>>> Hi Srecko
>>>>>
>>>>> I think the last time I directly used this API was about 3-4 years ago, but
>>>>> after a look at the HttpClient tutorial [1] I think the reason for your
>>>>> problem is that you do not execute the GetMethod.
>>>>>
>>>>> Based on this tutorial the code should look like
>>>>>
>>>>>    // Create an instance of HttpClient.
>>>>>    HttpClient client = new HttpClient();
>>>>>    GetMethod get = new GetMethod(url);
>>>>>    try {
>>>>>        // Execute the method.
>>>>>        int statusCode = client.executeMethod(get);
>>>>>        if (statusCode != HttpStatus.SC_OK) {
>>>>>            //handle the error
>>>>>        }
>>>>>        InputStream t_is = get.getResponseBodyAsStream();
>>>>>        //read the data of the stream
>>>>>    }
>>>>>
>>>>> In addition, you should not use a Reader if you want to read byte-oriented
>>>>> data from the input stream.
>>>>>
>>>>> hope this helps
>>>>> best
>>>>> Rupert
>>>>>
>>>>> [1] http://hc.apache.org/httpclient-3.x/tutorial.html
>>>>>
>>>>>
>>>>> On 11.01.2012, at 22:34, Srecko Joksimovic wrote:
>>>>>
>>>>>> That's it. Thank you!
>>>>>>
>>>>>> I have already configured KeywordLinkingEngine when I used my own
>>>>>> ontology. I think I'm familiar with that and I will try that option too.
>>>>>>
>>>>>> In the meantime I found another interesting problem. I tried to annotate
>>>>>> a document and a web page. With the web page, I tried
>>>>>> IOUtils.write(byte[], out) and I had to convert the URL to byte[]:
>>>>>>
>>>>>> public static byte[] GetBytesFromURL(String _url) throws IOException
>>>>>> {
>>>>>>       GetMethod get = new GetMethod(_url);
>>>>>>       InputStream t_is = get.getResponseBodyAsStream();
>>>>>>       byte[] buffer = new byte[1024];
>>>>>>       int count = -1;
>>>>>>       Reader t_url_reader = new BufferedReader(new InputStreamReader(t_is));
>>>>>>       byte[] t_bytes = IOUtils.toByteArray(t_url_reader, "UTF-8");
>>>>>>
>>>>>>       return t_bytes;
>>>>>> }
>>>>>>
>>>>>> But the problem is that I'm getting null for InputStream.
>>>>>>
>>>>>> Any ideas?
>>>>>>
>>>>>> Best,
>>>>>> Srecko
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
>>>>>> Sent: Wednesday, January 11, 2012 22:08
>>>>>> To: Srecko Joksimovic
>>>>>> Cc: stanbol-dev@incubator.apache.org
>>>>>> Subject: Re: Annotating using DBPedia ontology
>>>>>>
>>>>>>
>>>>>> On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
>>>>>>
>>>>>>> Hi Rupert,
>>>>>>>
>>>>>>> When I load localhost:8080/engines it says this:
>>>>>>>
>>>>>>> There are currently 5 active engines.
>>>>>>> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
>>>>>>> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
>>>>>>> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
>>>>>>> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
>>>>>>> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
>>>>>>>
>>>>>>> Maybe this could tell you something?
>>>>>>
>>>>>> These are exactly the 5 engines that are expected to run with the default
>>>>>> configuration. Based on this the Stanbol Enhancer should work just fine.
>>>>>>
>>>>>> After looking at the text you enhanced, however, I noticed that it does
>>>>>> not mention any named entities such as Persons, Organizations and Places.
>>>>>> So I checked it with my local Stanbol version, which also did not detect
>>>>>> any entities.
>>>>>>
>>>>>> So to check if Stanbol works as expected you should try another text
>>>>>> that mentions some named entities, such as
>>>>>>
>>>>>>    "John Smith works for the Apple Inc. in Cupertino, California."
>>>>>>
>>>>>> If you also want to search for entities like "Bank", "Blog", "Consumer",
>>>>>> "Telephone", you also need to configure a KeywordLinkingEngine for
>>>>>> dbpedia. Part B of [3] provides more information on how to do that.
>>>>>>
>>>>>> But let me mention that the KeywordLinkingEngine is more useful if used
>>>>>> in combination with your own domain-specific thesaurus rather than a
>>>>>> global data set like dbpedia. When used with dbpedia you will also get
>>>>>> a lot of false positives.
>>>>>>
>>>>>> best
>>>>>> Rupert
>>>>>>
>>>>>> [3] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>
>
>

Re: Annotating using DBPedia ontology

Posted by Walter Kasper <wk...@apache.org>.

Hi,

We fixed the problem with unresolved relative URLs in HTML documents.
In the case of your Wikipedia page it came from an embedded rel-license
microformat. If you are interested only in text extraction, you can also
just disable the RDFa and Microformat extractors in the configuration
for the HTML extraction.

We also tested Word documents with your test sentence. Everything worked
fine for us. Did you use the correct MIME type? The correct ones for
Word documents are:

doc-Format (<= Word-2003): application/vnd.ms-word
docx-Format (Word-2007): application/vnd.openxmlformats-officedocument.wordprocessingml
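
As a rough illustration only, the type could be picked from the file name before the document is posted to the enhancer. The class and method names below are made up for this sketch and are not part of Stanbol or Metaxa; the two MIME strings are the ones listed above.

```java
import java.util.Locale;

/** Hypothetical helper: picks one of the two Word MIME types listed above. */
final class WordMimeTypes {
    // Values copied from the list above.
    static final String DOC = "application/vnd.ms-word";
    static final String DOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml";

    private WordMimeTypes() {}

    /** Returns the MIME type for a .doc or .docx file name, or null for anything else. */
    static String forFileName(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        // Check .docx first; the two extensions are distinct, but this keeps the intent obvious.
        if (lower.endsWith(".docx")) {
            return DOCX;
        }
        if (lower.endsWith(".doc")) {
            return DOC;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(forFileName("test.doc"));
        System.out.println(forFileName("test.docx"));
    }
}
```

The returned value would then be used as the Content-type request header of the POST to the enhancer.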

Best regards,

Walter

srecko joksimovic wrote:
> Hi Walter,
>
> The Word document is nothing special, just one line of text:
>
> "John Smith works for the Apple Inc. in Cupertino, California."
>
> Rupert suggested this sentence in order to test text annotation. As I know
> the result of annotating this string, I thought I would create a Word document
> with the same content for test purposes.
>
> > The error with your HTML page apparently arises from a bug in resolving
> > relative URLs in one of the HTML extractors. We will fix that.
>
> Does it mean that I can't annotate HTML pages at the moment, or does that
> depend on the page?
>
> Best,
> Srecko
>
> On Fri, Jan 13, 2012 at 9:51 AM, Walter Kasper<wk...@apache.org>  wrote:
>
>> Hi Srecko,
>>
>> I don't know what the problem with your Word document could have been.
>> Could you send it to me for testing?
>>
>> The error with your HTML page apparently arises from a bug in resolving
>> relative URLs in one of the HTML extractors. We will fix that.
>>
>> Best regards,
>>
>> Walter
>>
>>
>> Srecko Joksimovic wrote:
>>
>>> Thank you Rupert!
>>>
>>> It is probably something that I missed.
>>>
>>> Best,
>>> Srecko
>>>
>>> -----Original Message-----
>>> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
>>> Sent: Thursday, January 12, 2012 20:16
>>> To: Srecko Joksimovic; wkasper@apache.org
>>> Cc: stanbol-dev@incubator.apache.org
>>> Subject: Re: Annotating using DBPedia ontology
>>>
>>> Hi Srecko
>>>
>>> It seems that both cases are related to the Metaxa Engine. My knowledge
>>> about the libs used by this engine to extract the textual content is very
>>> limited, so I might not be the right person to look into that.
>>>
>>> In the first example I think Metaxa was not able to extract the text from
>>> the Word document, because the only plainTextContent triple present is
>>>
>>> <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
>>> srecko</j.0:plainTextContent>
>>>
>>> The second example looks like an issue within the RDF metadata generation
>>> in Aperture.
>>>
>>> I also sent this reply directly to Walter Kasper. He is the one who
>>> contributed this engine and should be able to provide more information.
>>>
>>> best
>>> Rupert
>>>
>>> On 12.01.2012, at 18:40, srecko joksimovic wrote:
>>>
>>>> Hi Rupert,
>>>>
>>>> I have another question, and I will finish soon.
>>>>
>>>> I tried to annotate a PDF document, and I didn't get the result I expected.
>>>> Then I put the string you sent me,
>>>> "John Smith works for the Apple Inc. in Cupertino, California."
>>>> in an MS Word document, and this is the result I got:
>>>>
>>>> <rdf:RDF
>>>>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>     xmlns:j.0="http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
>>>>     xmlns:j.1="http://purl.org/dc/terms/"
>>>>     xmlns:j.2="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
>>>>     xmlns:j.3="http://fise.iks-project.eu/ontology/">
>>>>   <rdf:Description rdf:about="urn:enhancement-55016818-eb97-7b98-521a-422e3742173b">
>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>>     <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine</j.1:creator>
>>>>     <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.288Z</j.1:created>
>>>>     <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>>     <j.1:language>fr</j.1:language>
>>>>   </rdf:Description>
>>>>   <rdf:Description rdf:about="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f">
>>>>     <rdf:type rdf:resource="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#PaginatedTextDocument"/>
>>>>     <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
>>>> srecko</j.0:plainTextContent>
>>>>   </rdf:Description>
>>>>   <rdf:Description rdf:about="urn:enhancement-0644a1ed-f1d8-334d-d4e9-690a0446cba8">
>>>>     <j.3:confidence rdf:datatype="http://www.w3.org/2001/XMLSchema#double">1.0</j.3:confidence>
>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>>     <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine</j.1:creator>
>>>>     <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.273Z</j.1:created>
>>>>     <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>>   </rdf:Description>
>>>> </rdf:RDF>
>>>>
>>>>
>>>> and this is the code:
>>>>
>>>>     public List<String> Annotate(byte[] _stream_to_annotate,
>>>>             ServiceUtils.MIMETypes _content_type, String _encoding)
>>>>     {
>>>>         List<String> _return_list = new ArrayList<String>();
>>>>         try
>>>>         {
>>>>             URL url = new URL(ServiceUtils.SERVICE_URL);
>>>>             HttpURLConnection con = (HttpURLConnection)url.openConnection();
>>>>             con.setDoOutput(true);
>>>>             con.setRequestMethod("POST");
>>>>             con.setRequestProperty("Accept", "application/rdf+xml");
>>>>             con.setRequestProperty("Content-type", _content_type.getValue());
>>>>             java.io.OutputStream out = con.getOutputStream();
>>>>
>>>>             IOUtils.write(_stream_to_annotate, out);
>>>>             IOUtils.closeQuietly(out);
>>>>
>>>>             con.connect(); //send the request
>>>>
>>>>             if(con.getResponseCode() > 299)
>>>>             {
>>>>                 java.io.InputStream errorStream = con.getErrorStream();
>>>>                 if(errorStream != null)
>>>>                 {
>>>>                     String errorMessage = IOUtils.toString(errorStream);
>>>>                     IOUtils.closeQuietly(errorStream);
>>>>                 }
>>>>                 else
>>>>                 {
>>>>                     //no error data
>>>>                     //write default error message with the status code
>>>>                 }
>>>>             }
>>>>             else
>>>>             {
>>>>                 Model model = ModelFactory.createDefaultModel();
>>>>                 java.io.InputStream enhancementResults = con.getInputStream();
>>>>                 model.read(enhancementResults, null);
>>>>                 String queryStringForGraph = "PREFIX t: <http://fise.iks-project.eu/ontology/> " +
>>>>                         "SELECT ?label WHERE {?alias t:entity-reference ?label}";
>>>>                 Query query = QueryFactory.create(queryStringForGraph);
>>>>                 QueryExecution qe = QueryExecutionFactory.create(query, model);
>>>>
>>>>                 ResultSet results = qe.execSelect();
>>>>                 while(results.hasNext())
>>>>                 {
>>>>                     _return_list.add(results.next().toString());
>>>>                 }
>>>>             }
>>>>         }
>>>>         catch(Exception ex)
>>>>         {
>>>>             System.out.println(ex.getMessage());
>>>>         }
>>>>         return _return_list;
>>>>     }
>>>>
>>>> On Thu, Jan 12, 2012 at 8:32 AM, srecko joksimovic <sr...@gmail.com> wrote:
>>>>
>>>> Hi Rupert,
>>>>
>>>> Thank you for the answer. I've probably missed that.
>>>>
>>>> Best,
>>>> Srecko
>>>>
>>>>
>>>> On Thu, Jan 12, 2012 at 6:12 AM, Rupert Westenthaler <rupert.westenthaler@gmail.com> wrote:
>>>>
>>>> Hi Srecko
>>>>
>>>> I think the last time I directly used this API was about 3-4 years ago, but
>>>> after a look at the HttpClient tutorial [1] I think the reason for your
>>>> problem is that you do not execute the GetMethod.
>>>>
>>>> Based on this tutorial the code should look like
>>>>
>>>>     // Create an instance of HttpClient.
>>>>     HttpClient client = new HttpClient();
>>>>     GetMethod get = new GetMethod(url);
>>>>     try {
>>>>         // Execute the method.
>>>>         int statusCode = client.executeMethod(get);
>>>>         if (statusCode != HttpStatus.SC_OK) {
>>>>             //handle the error
>>>>         }
>>>>         InputStream t_is = get.getResponseBodyAsStream();
>>>>         //read the data of the stream
>>>>     }
>>>>
>>>> In addition, you should not use a Reader if you want to read byte-oriented
>>>> data from the input stream.
>>>>
>>>> hope this helps
>>>> best
>>>> Rupert
>>>>
>>>> [1] http://hc.apache.org/httpclient-3.x/tutorial.html
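
Rupert's remark about not using a Reader for byte-oriented data could look roughly like the helper below, which copies the stream byte for byte with plain JDK classes. This is only a sketch: the class and method names are invented here, and in the setup discussed in the thread the stream would be the one returned by get.getResponseBodyAsStream() after client.executeMethod(get) has been called.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: reads an InputStream fully into a byte array, without any Reader. */
final class StreamBytes {
    private StreamBytes() {}

    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int count;
        // read() returns -1 at end of stream; copy exactly the bytes that were read.
        while ((count = in.read(buffer)) != -1) {
            out.write(buffer, 0, count);
        }
        return out.toByteArray();
    }
}
```

Because no character decoding takes place, the bytes sent to the enhancer are the same bytes the server returned, whatever the page encoding is.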
>>>>
>>>> On 11.01.2012, at 22:34, Srecko Joksimovic wrote:
>>>>
>>>>> That's it. Thank you!
>>>>>
>>>>> I have already configured KeywordLinkingEngine when I used my own
>>>>> ontology. I think I'm familiar with that and I will try that option too.
>>>>>
>>>>> In the meantime I found another interesting problem. I tried to annotate
>>>>> a document and a web page. With the web page, I tried
>>>>> IOUtils.write(byte[], out) and I had to convert the URL to byte[]:
>>>>>
>>>>> public static byte[] GetBytesFromURL(String _url) throws IOException
>>>>> {
>>>>>        GetMethod get = new GetMethod(_url);
>>>>>        InputStream t_is = get.getResponseBodyAsStream();
>>>>>        byte[] buffer = new byte[1024];
>>>>>        int count = -1;
>>>>>        Reader t_url_reader = new BufferedReader(new InputStreamReader(t_is));
>>>>>        byte[] t_bytes = IOUtils.toByteArray(t_url_reader, "UTF-8");
>>>>>
>>>>>        return t_bytes;
>>>>> }
>>>>>
>>>>> But the problem is that I'm getting null for InputStream.
>>>>>
>>>>> Any ideas?
>>>>>
>>>>> Best,
>>>>> Srecko
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
>>>>> Sent: Wednesday, January 11, 2012 22:08
>>>>> To: Srecko Joksimovic
>>>>> Cc: stanbol-dev@incubator.apache.org
>>>>> Subject: Re: Annotating using DBPedia ontology
>>>>>
>>>>>
>>>>> On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
>>>>>
>>>>>> Hi Rupert,
>>>>>>
>>>>>> When I load localhost:8080/engines it says this:
>>>>>>
>>>>>> There are currently 5 active engines.
>>>>>> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
>>>>>> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
>>>>>> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
>>>>>> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
>>>>>> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
>>>>>>
>>>>>> Maybe this could tell you something?
>>>>>
>>>>> These are exactly the 5 engines that are expected to run with the default
>>>>> configuration. Based on this the Stanbol Enhancer should work just fine.
>>>>>
>>>>> After looking at the text you enhanced, however, I noticed that it does
>>>>> not mention any named entities such as Persons, Organizations and Places.
>>>>> So I checked it with my local Stanbol version, which also did not detect
>>>>> any entities.
>>>>>
>>>>> So to check if Stanbol works as expected you should try another text
>>>>> that mentions some named entities, such as
>>>>>
>>>>>     "John Smith works for the Apple Inc. in Cupertino, California."
>>>>>
>>>>> If you also want to search for entities like "Bank", "Blog", "Consumer",
>>>>> "Telephone", you also need to configure a KeywordLinkingEngine for
>>>>> dbpedia. Part B of [3] provides more information on how to do that.
>>>>>
>>>>> But let me mention that the KeywordLinkingEngine is more useful if used
>>>>> in combination with your own domain-specific thesaurus rather than a
>>>>> global data set like dbpedia. When used with dbpedia you will also get
>>>>> a lot of false positives.
>>>>>
>>>>> best
>>>>> Rupert
>>>>>
>>>>> [3] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
>>>>>
>>>>>
>>>>
>>

Re: Annotating using DBPedia ontology

Posted by srecko joksimovic <sr...@gmail.com>.

Hi Walter,

The Word document is nothing special, just one line of text:

"John Smith works for the Apple Inc. in Cupertino, California."

Rupert suggested this sentence in order to test text annotation. As I know
the result of annotating this string, I thought I would create a Word document
with the same content for test purposes.

> The error with your HTML page apparently arises from a bug in resolving
> relative URLs in one of the HTML extractors. We will fix that.

Does it mean that I can't annotate HTML pages at the moment, or does that
depend on the page?

Best,
Srecko

On Fri, Jan 13, 2012 at 9:51 AM, Walter Kasper <wk...@apache.org> wrote:

> Hi Srecko,
>
> I don't know what the problem with your Word document could have been.
> Could you send it to me for testing?
>
> The error with your HTML page apparently arises from a bug in resolving
> relative URLs in one of the HTML extractors. We will fix that.
>
> Best regards,
>
> Walter
>
>
> Srecko Joksimovic wrote:
>
>> Thank you Rupert!
>>
>> It is probably something that I missed.
>>
>> Best,
>> Srecko
>>
>> -----Original Message-----
>> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
>> Sent: Thursday, January 12, 2012 20:16
>> To: Srecko Joksimovic; wkasper@apache.org
>> Cc: stanbol-dev@incubator.apache.org
>> Subject: Re: Annotating using DBPedia ontology
>>
>> Hi Srecko
>>
>> It seems that both cases are related to the Metaxa Engine. My knowledge
>> about the libs used by this engine to extract the textual content is very
>> limited, so I might not be the right person to look into that.
>>
>> In the first example I think Metaxa was not able to extract the text from
>> the Word document, because the only plainTextContent triple present is
>>
>> <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
>> srecko</j.0:plainTextContent>
>>
>> The second example looks like an issue within the RDF metadata generation
>> in Aperture.
>>
>> I also sent this reply directly to Walter Kasper. He is the one who
>> contributed this engine and should be able to provide more information.
>>
>> best
>> Rupert
>>
>> On 12.01.2012, at 18:40, srecko joksimovic wrote:
>>
>>> Hi Rupert,
>>>
>>> I have another question, and I will finish soon.
>>>
>>> I tried to annotate a PDF document, and I didn't get the result I expected.
>>> Then I put the string you sent me,
>>> "John Smith works for the Apple Inc. in Cupertino, California."
>>> in an MS Word document, and this is the result I got:
>>>
>>> <rdf:RDF
>>>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>     xmlns:j.0="http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
>>>     xmlns:j.1="http://purl.org/dc/terms/"
>>>     xmlns:j.2="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
>>>     xmlns:j.3="http://fise.iks-project.eu/ontology/">
>>>   <rdf:Description rdf:about="urn:enhancement-55016818-eb97-7b98-521a-422e3742173b">
>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>     <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine</j.1:creator>
>>>     <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.288Z</j.1:created>
>>>     <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>     <j.1:language>fr</j.1:language>
>>>   </rdf:Description>
>>>   <rdf:Description rdf:about="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f">
>>>     <rdf:type rdf:resource="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#PaginatedTextDocument"/>
>>>     <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
>>> srecko</j.0:plainTextContent>
>>>   </rdf:Description>
>>>   <rdf:Description rdf:about="urn:enhancement-0644a1ed-f1d8-334d-d4e9-690a0446cba8">
>>>     <j.3:confidence rdf:datatype="http://www.w3.org/2001/XMLSchema#double">1.0</j.3:confidence>
>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>     <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine</j.1:creator>
>>>     <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.273Z</j.1:created>
>>>     <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>   </rdf:Description>
>>> </rdf:RDF>
>>>
>>>
>>> and this is the code:
>>>
>>>    public List<String> Annotate(byte[] _stream_to_annotate,
>>>            ServiceUtils.MIMETypes _content_type, String _encoding)
>>>    {
>>>        List<String> _return_list = new ArrayList<String>();
>>>        try
>>>        {
>>>            URL url = new URL(ServiceUtils.SERVICE_URL);
>>>            HttpURLConnection con = (HttpURLConnection)url.openConnection();
>>>            con.setDoOutput(true);
>>>            con.setRequestMethod("POST");
>>>            con.setRequestProperty("Accept", "application/rdf+xml");
>>>            con.setRequestProperty("Content-type", _content_type.getValue());
>>>
>>>            java.io.OutputStream out = con.getOutputStream();
>>>
>>>            IOUtils.write(_stream_to_annotate, out);
>>>            IOUtils.closeQuietly(out);
>>>
>>>            con.connect(); //send the request
>>>
>>>            if(con.getResponseCode() > 299)
>>>            {
>>>                java.io.InputStream errorStream = con.getErrorStream();
>>>                if(errorStream != null)
>>>                {
>>>                    String errorMessage = IOUtils.toString(errorStream);
>>>                    IOUtils.closeQuietly(errorStream);
>>>                }
>>>                else
>>>                {
>>>                    //no error data
>>>                    //write default error message with the status code
>>>                }
>>>            }
>>>            else
>>>            {
>>>                Model model = ModelFactory.createDefaultModel();
>>>                java.io.InputStream enhancementResults = con.getInputStream();
>>>                model.read(enhancementResults, null);
>>>                String queryStringForGraph = "PREFIX t: <http://fise.iks-project.eu/ontology/> " +
>>>                        "SELECT ?label WHERE {?alias t:entity-reference ?label}";
>>>                Query query = QueryFactory.create(queryStringForGraph);
>>>                QueryExecution qe = QueryExecutionFactory.create(query, model);
>>>
>>>                ResultSet results = qe.execSelect();
>>>                while(results.hasNext())
>>>                {
>>>                    _return_list.add(results.next().toString());
>>>                }
>>>            }
>>>        }
>>>        catch(Exception ex)
>>>        {
>>>            System.out.println(ex.getMessage());
>>>        }
>>>        return _return_list;
>>>    }
>>>
>>> On Thu, Jan 12, 2012 at 8:32 AM, srecko joksimovic
>>> <sr...@gmail.com> wrote:
>>>
>>> Hi Rupert,
>>>
>>> Thank you for the answer. I've probably missed that.
>>>
>>> Best,
>>> Srecko
>>>
>>>
>>> On Thu, Jan 12, 2012 at 6:12 AM, Rupert Westenthaler
>>> <rupert.westenthaler@gmail.com> wrote:
>>>
>>> Hi Srecko
>>>
>>> I think the last time I directly used this API was about 3-4 years ago,
>>> but after a look at the HttpClient tutorial [1] I think the reason for your
>>> problem is that you do not execute the GetMethod.
>>>
>>> Based on this tutorial the code should look like
>>>
>>>    // Create an instance of HttpClient.
>>>    HttpClient client = new HttpClient();
>>>    GetMethod get = new GetMethod(url);
>>>    try {
>>>        // Execute the method.
>>>        int statusCode = client.executeMethod(get);
>>>        if (statusCode != HttpStatus.SC_OK) {
>>>            //handle the error
>>>        }
>>>        InputStream t_is = get.getResponseBodyAsStream();
>>>        //read the data of the stream
>>>    } finally {
>>>        // Release the connection once the stream has been read.
>>>        get.releaseConnection();
>>>    }
>>>
>>> In addition you should not use a Reader if you want to read byte-oriented
>>> data from the input stream.
>>>
>>> hope this helps
>>> best
>>> Rupert
>>>
>>> [1] http://hc.apache.org/httpclient-3.x/tutorial.html
>>>
>>> On 11.01.2012, at 22:34, Srecko Joksimovic wrote:
>>>
>>>> That's it. Thank you!
>>>> I have already configured KeywordLinkingEngine when I used my own
>>>> ontology. I think I'm familiar with that and I will try that option too.
>>>>
>>>> In meanwhile I found another interesting problem. I tried to annotate
>>>> document and web page. With web page, I tried
>>>> IOUtils.write(byte[], out) and I had to convert URL to byte[]:
>>>>
>>>> public static byte[] GetBytesFromURL(String _url) throws IOException
>>>> {
>>>>       GetMethod get = new GetMethod(_url);
>>>>       InputStream t_is = get.getResponseBodyAsStream();
>>>>       byte[] buffer = new byte[1024];
>>>>       int count = -1;
>>>>       Reader t_url_reader = new BufferedReader(new
>>>> InputStreamReader(t_is));
>>>>       byte[] t_bytes = IOUtils.toByteArray(t_url_reader, "UTF-8");
>>>>
>>>>       return t_bytes;
>>>> }
>>>>
>>>> But, the problem is that I'm getting null for InputStream.
>>>>
>>>> Any ideas?
>>>>
>>>> Best,
>>>> Srecko
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
>>>> Sent: Wednesday, January 11, 2012 22:08
>>>> To: Srecko Joksimovic
>>>> Cc: stanbol-dev@incubator.apache.org
>>>> Subject: Re: Annotating using DBPedia ontology
>>>>
>>>>
>>>> On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
>>>>
>>>>> Hi Rupert,
>>>>>
>>>>> When I load localhost:8080/engines it says this:
>>>>>
>>>>> There are currently 5 active engines.
>>>>> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
>>>>> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
>>>>> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
>>>>> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
>>>>> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
>>>>>
>>>>> Maybe this could tell you something?
>>>>>
>>>> These are exactly the 5 engines that are expected to run with the
>>>> default configuration.
>>>> Based on this the Stanbol Enhancer should work just fine.
>>>>
>>>> After looking at the text you enhanced I noticed however that it does
>>>> not mention any named entities such as Persons, Organizations and Places.
>>>> So I checked it with my local Stanbol version and it also did not detect
>>>> any entities.
>>>>
>>>> So to check if Stanbol works as expected you should try to use another
>>>> text that mentions some Named Entities such as
>>>>
>>>>    "John Smith works for the Apple Inc. in Cupertino, California."
>>>>
>>>>
>>>> If you also want to search for entities like "Bank", "Blog", "Consumer",
>>>> "Telephone", you need to also configure a KeywordLinkingEngine for
>>>> dbpedia. Part B of [3] provides more information on how to do that.
>>>>
>>>> But let me mention that the KeywordLinkingEngine is more useful if used
>>>> in combination with your own domain-specific thesaurus rather than a
>>>> global data set like dbpedia. When used with dbpedia you will also get a
>>>> lot of false positives.
>>>>
>>>> best
>>>> Rupert
>>>>
>>>> [3] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
>>>>
>>>>
>>>
>>>
>
>

Re: Annotating using DBPedia ontology

Posted by Walter Kasper <wk...@apache.org>.

Hi Srecko,

I don't know what the problem with your Word document could have been. 
Could you send it to me for testing?

The error with your HTML page apparently arises from a bug in resolving 
relative URLs in one of the HTML extractors. We will fix that.

Best regards,

Walter

Srecko Joksimovic wrote:
> Thank you Rupert!
>
> It is probably something that I missed.
>
> Best,
> Srecko
>
> -----Original Message-----
> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
> Sent: Thursday, January 12, 2012 20:16
> To: Srecko Joksimovic; wkasper@apache.org
> Cc: stanbol-dev@incubator.apache.org
> Subject: Re: Annotating using DBPedia ontology
>
> Hi Srecko
>
> It seems that both cases are related to the Metaxa Engine. My knowledge about
> the libs used by this engine to extract the textual content is very limited.
> So I might not be the right person to look into that.
>
> In the first example I think Metaxa was not able to extract the text from
> the Word document, because the only plainTextContent triple present is
>
> <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
> srecko</j.0:plainTextContent>
>
> The second example looks like an issue within the RDF metadata generation
> in Aperture.
>
> I also sent this reply directly to Walter Kasper. He is the one who
> contributed this engine and should be able to provide more information.
>
> best
> Rupert
>
> On 12.01.2012, at 18:40, srecko joksimovic wrote:
>
>> Hi Rupert,
>>
>> I have another question, and I will finish soon.
>>
>> I tried to annotate pdf document, and I didn't get result I expected. Then
> I put string you sent to me
>> "John Smith works for the Apple Inc. in Cupertino, California."
>> in MS Word document, and this is the result I got:
>>
>> <rdf:RDF
>>      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>      xmlns:j.0="http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
>>      xmlns:j.1="http://purl.org/dc/terms/"
>>      xmlns:j.2="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
>>      xmlns:j.3="http://fise.iks-project.eu/ontology/">
>>    <rdf:Description
> rdf:about="urn:enhancement-55016818-eb97-7b98-521a-422e3742173b">
>>      <rdf:type
> rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>      <j.1:creator
> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.en
> hancer.engines.langid.LangIdEnhancementEngine</j.1:creator>
>>      <j.1:created
> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20
> .288Z</j.1:created>
>>      <j.3:extracted-from
> rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f
> "/>
>>      <rdf:type
> rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>      <j.1:language>fr</j.1:language>
>>    </rdf:Description>
>>    <rdf:Description
> rdf:about="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f">
>>      <rdf:type
> rdf:resource="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Pagin
> atedTextDocument"/>
>>      <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
>> srecko</j.0:plainTextContent>
>>    </rdf:Description>
>>    <rdf:Description
> rdf:about="urn:enhancement-0644a1ed-f1d8-334d-d4e9-690a0446cba8">
>>      <j.3:confidence
> rdf:datatype="http://www.w3.org/2001/XMLSchema#double">1.0</j.3:confidence>
>>      <rdf:type
> rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>      <j.1:creator
> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.en
> hancer.engines.metaxa.MetaxaEngine</j.1:creator>
>>      <j.1:created
> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20
> .273Z</j.1:created>
>>      <j.3:extracted-from
> rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f
> "/>
>>      <rdf:type
> rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>    </rdf:Description>
>> </rdf:RDF>
>>
>>
>> and this is the code:
>>
>> 	public List<String>  Annotate(byte[] _stream_to_annotate,
> ServiceUtils.MIMETypes _content_type, String _encoding)
>> 	{	
>> 		List<String>  _return_list = new ArrayList<String>();
>> 		try
>> 		{			
>> 			URL url = new URL(ServiceUtils.SERVICE_URL);
>> 			HttpURLConnection con =
> (HttpURLConnection)url.openConnection();
>> 			con.setDoOutput(true);
>> 			con.setRequestMethod("POST");
>> 			con.setRequestProperty("Accept",
> "application/rdf+xml");
>> 			con.setRequestProperty("Content-type",
> _content_type.getValue());
>> 			
>> 			java.io.OutputStream out = con.getOutputStream();
>>
>> 			IOUtils.write(_stream_to_annotate, out);
>> 			IOUtils.closeQuietly(out);
>>
>> 			con.connect(); //send the request
>>
>> 			if(con.getResponseCode()>  299)
>> 			{
>> 				java.io.InputStream errorStream =
> con.getErrorStream();
>> 				if(errorStream != null)
>> 				{
>> 					String errorMessage =
> IOUtils.toString(errorStream);
>> 					IOUtils.closeQuietly(errorStream);
>> 				}
>> 				else
>> 				{
>> 					//no error data
>> 					//write default error message with
> the status code
>> 				}
>> 			}
>> 			else
>> 			{
>> 				Model model =
> ModelFactory.createDefaultModel();
>
>> 				java.io.InputStream enhancementResults =
> con.getInputStream();
>
>> 				model.read(enhancementResults, null);
>> 				String queryStringForGraph =  "PREFIX t:
> <http://fise.iks-project.eu/ontology/>  " +
>> 						"SELECT ?label WHERE {?alias
> t:entity-reference ?label}";
>> 				Query query =
> QueryFactory.create(queryStringForGraph);
>> 				QueryExecution qe =
> QueryExecutionFactory.create(query, model);				
>>              			
>> 				ResultSet results = qe.execSelect();
>> 				while(results.hasNext())
>> 				{
> _return_list.add(results.next().toString());
>> 				}
>> 			}
>> 		}
>> 		catch(Exception ex)
>> 		{
>> 			System.out.println(ex.getMessage());
>> 		}        	
>> 		return _return_list;
>> 	}
>>
>> On Thu, Jan 12, 2012 at 8:32 AM, srecko joksimovic
> <sr...@gmail.com>  wrote:
>> Hi Rupert,
>>
>> Thank you for the answer. I've probably missed that.
>>
>> Best,
>> Srecko
>>
>>
>> On Thu, Jan 12, 2012 at 6:12 AM, Rupert Westenthaler
> <ru...@gmail.com>  wrote:
>> Hi Srecko
>>
>> I think the last time I directly used this API is about 3-4 years ago, but
> after a look at the http client tutorial [1] I think the reason for your
> problem is that you do not execute the GetMethod.
>> Based on this tutorial the code should look like
>>
>>     // Create an instance of HttpClient.
>>     HttpClient client = new HttpClient();
>>     GetMethod get = new GetMethod(url);
>>     try {
>>         // Execute the method.
>>         int statusCode = client.executeMethod(get);
>>         if (statusCode != HttpStatus.SC_OK) {
>>             //handle the error
>>         }
>>         InputStream t_is = get.getResponseBodyAsStream();
>>         //read the data of the stream
>>     } finally {
>>         // Release the connection once the stream has been read.
>>         get.releaseConnection();
>>     }
>>
>> In addition you should not use a Reader if you want to read byte oriented
> data from the input stream.
>> hope this helps
>> best
>> Rupert
>>
>> [1] http://hc.apache.org/httpclient-3.x/tutorial.html
>>
>> On 11.01.2012, at 22:34, Srecko Joksimovic wrote:
>>
>>> That's it. Thank you!
>>> I have already configured KeywordLinkingEngine when I used my own
> ontology.
>>> I think I'm familiar with that and I will try that option too.
>>>
>>> In meanwhile I found another interesting problem. I tried to annotate
>>> document and web page. With web page, I tried
>>> IOUtils.write(byte[], out) and I had to convert URL to byte[]:
>>>
>>> public static byte[] GetBytesFromURL(String _url) throws IOException
>>> {
>>>        GetMethod get = new GetMethod(_url);
>>>        InputStream t_is = get.getResponseBodyAsStream();
>>>        byte[] buffer = new byte[1024];
>>>        int count = -1;
>>>        Reader t_url_reader = new BufferedReader(new
>>> InputStreamReader(t_is));
>>>        byte[] t_bytes = IOUtils.toByteArray(t_url_reader, "UTF-8");
>>>
>>>        return t_bytes;
>>> }
>>>
>>> But, the problem is that I'm getting null for InputStream.
>>>
>>> Any ideas?
>>>
>>> Best,
>>> Srecko
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
>>> Sent: Wednesday, January 11, 2012 22:08
>>> To: Srecko Joksimovic
>>> Cc: stanbol-dev@incubator.apache.org
>>> Subject: Re: Annotating using DBPedia ontology
>>>
>>>
>>> On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
>>>> Hi Rupert,
>>>>
>>>> When I load localhost:8080/engines it says this:
>>>>
>>>> There are currently 5 active engines.
>>>> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
>>>> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
>>>>
> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhanc
>>>> ementEngine
>>>>
> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEng
>>>> ine
>>>>
> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEng
>>>> ine
>>>>
>>>> Maybe this could tell you something?
>>>>
>>> These are exactly the 5 engines that are expected to run with the default
>>> configuration.
>>> Based on this the Stanbol Enhancer should work just fine.
>>>
>>> After looking at the text you enhanced I noticed however that it does
>>> not mention any named entities such as Persons, Organizations and Places.
>>> So I checked it with my local Stanbol version and it also did not detect
>>> any entities.
>>>
>>> So to check if Stanbol works as expected you should try to use another
>>> text that mentions some Named Entities such as
>>>
>>>     "John Smith works for the Apple Inc. in Cupertino, California."
>>>
>>>
>>> If you also want to search for entities like "Bank", "Blog", "Consumer",
>>> "Telephone", you need to also configure a KeywordLinkingEngine for
>>> dbpedia. Part B of [3] provides more information on how to do that.
>>>
>>> But let me mention that the KeywordLinkingEngine is more useful if used
>>> in combination with your own domain-specific thesaurus rather than a
>>> global data set like dbpedia. When used with dbpedia you will also get a
>>> lot of false positives.
>>>
>>> best
>>> Rupert
>>>
>>> [3] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
>>>
>>
>>

RE: Annotating using DBPedia ontology

Posted by Srecko Joksimovic <sr...@gmail.com>.

Thank you Rupert!

It is probably something that I missed.

Best,
Srecko

-----Original Message-----
From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com] 
Sent: Thursday, January 12, 2012 20:16
To: Srecko Joksimovic; wkasper@apache.org
Cc: stanbol-dev@incubator.apache.org
Subject: Re: Annotating using DBPedia ontology

Hi Srecko

It seems that both cases are related to the Metaxa Engine. My knowledge about
the libs used by this engine to extract the textual content is very limited.
So I might not be the right person to look into that.

In the first example I think Metaxa was not able to extract the text from
the Word document, because the only plainTextContent triple present is

<j.0:plainTextContent>Microsoft Word-Dokument&#xD;
srecko</j.0:plainTextContent>

The second example looks like an issue within the RDF metadata generation
in Aperture.

I also sent this reply directly to Walter Kasper. He is the one who
contributed this engine and should be able to provide more information.

best
Rupert

On 12.01.2012, at 18:40, srecko joksimovic wrote:

> Hi Rupert,
> 
> I have another question, and I will finish soon.
> 
> I tried to annotate pdf document, and I didn't get result I expected. Then
I put string you sent to me 
> "John Smith works for the Apple Inc. in Cupertino, California."
> in MS Word document, and this is the result I got:
> 
> <rdf:RDF
>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>     xmlns:j.0="http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
>     xmlns:j.1="http://purl.org/dc/terms/"
>     xmlns:j.2="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
>     xmlns:j.3="http://fise.iks-project.eu/ontology/" > 
>   <rdf:Description
rdf:about="urn:enhancement-55016818-eb97-7b98-521a-422e3742173b">
>     <rdf:type
rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>     <j.1:creator
rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.en
hancer.engines.langid.LangIdEnhancementEngine</j.1:creator>
>     <j.1:created
rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20
.288Z</j.1:created>
>     <j.3:extracted-from
rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f
"/>
>     <rdf:type
rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>     <j.1:language>fr</j.1:language>
>   </rdf:Description>
>   <rdf:Description
rdf:about="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f">
>     <rdf:type
rdf:resource="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Pagin
atedTextDocument"/>
>     <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
> srecko</j.0:plainTextContent>
>   </rdf:Description>
>   <rdf:Description
rdf:about="urn:enhancement-0644a1ed-f1d8-334d-d4e9-690a0446cba8">
>     <j.3:confidence
rdf:datatype="http://www.w3.org/2001/XMLSchema#double">1.0</j.3:confidence>
>     <rdf:type
rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>     <j.1:creator
rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.en
hancer.engines.metaxa.MetaxaEngine</j.1:creator>
>     <j.1:created
rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20
.273Z</j.1:created>
>     <j.3:extracted-from
rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f
"/>
>     <rdf:type
rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>   </rdf:Description>
> </rdf:RDF>
> 
> 
> and this is the code:
> 
> 	public List<String> Annotate(byte[] _stream_to_annotate,
ServiceUtils.MIMETypes _content_type, String _encoding)
> 	{	 
> 		List<String> _return_list = new ArrayList<String>();
> 		try             
> 		{			   
> 			URL url = new URL(ServiceUtils.SERVICE_URL);

> 			HttpURLConnection con =
(HttpURLConnection)url.openConnection();                    
> 			con.setDoOutput(true);

> 			con.setRequestMethod("POST");                    
> 			con.setRequestProperty("Accept",
"application/rdf+xml");                    
> 			con.setRequestProperty("Content-type",
_content_type.getValue());
> 			              
> 			java.io.OutputStream out = con.getOutputStream();
>                  
> 			IOUtils.write(_stream_to_annotate, out);

> 			IOUtils.closeQuietly(out);
>                  
> 			con.connect(); //send the request           
>          
> 			if(con.getResponseCode() > 299)         
> 			{ 
> 				java.io.InputStream errorStream =
con.getErrorStream();                            
> 				if(errorStream != null)             
> 				{                                 
> 					String errorMessage =
IOUtils.toString(errorStream);                                   
> 					IOUtils.closeQuietly(errorStream);

> 				}              
> 				else              
> 				{ 
> 					//no error data                
> 					//write default error message with
the status code                            
> 				}                    
> 			}          
> 			else                     
> 			{   
> 				Model model =
ModelFactory.createDefaultModel();

> 				java.io.InputStream enhancementResults =
con.getInputStream();

> 				model.read(enhancementResults, null);

> 				String queryStringForGraph =  "PREFIX t:
<http://fise.iks-project.eu/ontology/> " +
> 						"SELECT ?label WHERE {?alias
t:entity-reference ?label}";                            
> 				Query query =
QueryFactory.create(queryStringForGraph);                            
> 				QueryExecution qe =
QueryExecutionFactory.create(query, model);				
>             			
> 				ResultSet results = qe.execSelect();
> 				while(results.hasNext())

> 				{

>
_return_list.add(results.next().toString());
> 				}

> 			}                 
> 		}                 
> 		catch(Exception ex)                            
> 		{                 
> 			System.out.println(ex.getMessage());

> 		}        	
> 		return _return_list;
> 	}
> 
> On Thu, Jan 12, 2012 at 8:32 AM, srecko joksimovic
<sr...@gmail.com> wrote:
> 
> Hi Rupert,
> 
> Thank you for the answer. I've probably missed that. 
> 
> Best,
> Srecko
> 
> 
> On Thu, Jan 12, 2012 at 6:12 AM, Rupert Westenthaler
<ru...@gmail.com> wrote:
> Hi Srecko
> 
> I think the last time I directly used this API is about 3-4 years ago, but
after a look at the http client tutorial [1] I think the reason for your
problem is that you do not execute the GetMethod.
> 
> Based on this tutorial the code should look like
> 
>    // Create an instance of HttpClient.
>    HttpClient client = new HttpClient();
>    GetMethod get = new GetMethod(url);
>    try {
>        // Execute the method.
>        int statusCode = client.executeMethod(get);
>        if (statusCode != HttpStatus.SC_OK) {
>            //handle the error
>        }
>        InputStream t_is = get.getResponseBodyAsStream();
>        //read the data of the stream
>    } finally {
>        // Release the connection once the stream has been read.
>        get.releaseConnection();
>    }
> 
> In addition you should not use a Reader if you want to read byte oriented
data from the input stream.
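
The advice above about avoiding a Reader can be sketched with plain JDK classes. This is only a minimal illustration, not code from the thread: the class and method names are hypothetical, and the InputStream is assumed to be the one returned by getResponseBodyAsStream() after the method has been executed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only.
class ByteStreams {

    // Copies the stream into a byte array without any character decoding,
    // so binary content (PDF, Word, images) is passed on unchanged.
    static byte[] toBytes(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int count;
        try {
            // read() returns -1 once the end of the stream is reached
            while ((count = in.read(buffer)) != -1) {
                out.write(buffer, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

Going through a Reader would decode the bytes as characters first, which can corrupt anything that is not text in the assumed encoding; copying bytes directly avoids that.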
> 
> hope this helps
> best
> Rupert
> 
> [1] http://hc.apache.org/httpclient-3.x/tutorial.html
> 
> On 11.01.2012, at 22:34, Srecko Joksimovic wrote:
> 
> > That's it. Thank you!
> > I have already configured KeywordLinkingEngine when I used my own
ontology.
> > I think I'm familiar with that and I will try that option too.
> >
> > In meanwhile I found another interesting problem. I tried to annotate
> > document and web page. With web page, I tried
> > IOUtils.write(byte[], out) and I had to convert URL to byte[]:
> >
> > public static byte[] GetBytesFromURL(String _url) throws IOException
> > {
> >       GetMethod get = new GetMethod(_url);
> >       InputStream t_is = get.getResponseBodyAsStream();
> >       byte[] buffer = new byte[1024];
> >       int count = -1;
> >       Reader t_url_reader = new BufferedReader(new
> > InputStreamReader(t_is));
> >       byte[] t_bytes = IOUtils.toByteArray(t_url_reader, "UTF-8");
> >
> >       return t_bytes;
> > }
> >
> > But, the problem is that I'm getting null for InputStream.
> >
> > Any ideas?
> >
> > Best,
> > Srecko
> >
> >
> >
> > -----Original Message-----
> > From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
> > Sent: Wednesday, January 11, 2012 22:08
> > To: Srecko Joksimovic
> > Cc: stanbol-dev@incubator.apache.org
> > Subject: Re: Annotating using DBPedia ontology
> >
> >
> > On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
> >> Hi Rupert,
> >>
> >> When I load localhost:8080/engines it says this:
> >>
> >> There are currently 5 active engines.
> >> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
> >> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
> >>
> >
org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhanc
> >> ementEngine
> >>
> >
org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEng
> >> ine
> >>
> >
org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEng
> >> ine
> >>
> >> Maybe this could tell you something?
> >>
> >
> > These are exactly the 5 engines that are expected to run with the default
> > configuration.
> > Based on this the Stanbol Enhancer should work just fine.
> >
> > After looking at the text you enhanced I noticed however that it does
> > not mention any named entities such as Persons, Organizations and Places.
> > So I checked it with my local Stanbol version and it also did not detect
> > any entities.
> >
> > So to check if Stanbol works as expected you should try to use another
> > text that mentions some Named Entities such as
> >
> >    "John Smith works for the Apple Inc. in Cupertino, California."
> >
> >
> > If you also want to search for entities like "Bank", "Blog", "Consumer",
> > "Telephone", you need to also configure a KeywordLinkingEngine for
> > dbpedia. Part B of [3] provides more information on how to do that.
> >
> > But let me mention that the KeywordLinkingEngine is more useful if used
> > in combination with your own domain-specific thesaurus rather than a
> > global data set like dbpedia. When used with dbpedia you will also get a
> > lot of false positives.
> >
> > best
> > Rupert
> >
> > [3] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
> >
> 
> 
>

Re: Annotating using DBPedia ontology

Posted by Rupert Westenthaler <ru...@gmail.com>.

Hi Srecko

It seems that both cases are related to the Metaxa Engine. My knowledge about the libs used by this engine to extract the textual content is very limited. So I might not be the right person to look into that.

In the first example I think Metaxa was not able to extract the text from the Word document, because the only plainTextContent triple present is

<j.0:plainTextContent>Microsoft Word-Dokument&#xD;
srecko</j.0:plainTextContent>
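
A quick way to check what was actually extracted, without reading the whole response by eye, is to pull the plainTextContent values out of the returned RDF/XML and look for the sentence that was put into the document. The following is a minimal sketch under stated assumptions: the class and method names are made up for illustration, and it uses the JDK's DOM parser as a plain XML scan rather than an RDF parser such as Jena, so it only works when the property is serialized as an element, as in the snippet above.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Stanbol or Metaxa.
class PlainTextContentCheck {

    // Namespace bound to the j.0 prefix in the enhancement result (NIE ontology).
    static final String NIE_NS =
            "http://www.semanticdesktop.org/ontologies/2007/01/19/nie#";

    // Returns the text of every plainTextContent element in the RDF/XML stream.
    static List<String> plainTextContents(InputStream rdfXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getElementsByTagNameNS
            Document doc = factory.newDocumentBuilder().parse(rdfXml);
            NodeList nodes = doc.getElementsByTagNameNS(NIE_NS, "plainTextContent");
            List<String> texts = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the RDF/XML response", e);
        }
    }

    // True if any extracted text contains the phrase that was put into the document.
    static boolean containsPhrase(List<String> texts, String phrase) {
        for (String text : texts) {
            if (text.contains(phrase)) {
                return true;
            }
        }
        return false;
    }
}
```

If containsPhrase(...) returns false for "John Smith", the text never reached the later engines, which would explain why no entities were found.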

The second example looks like an issue within the RDF metadata generation in Aperture.

I also sent this reply directly to Walter Kasper. He is the one who contributed this engine and should be able to provide more information.

best
Rupert

On 12.01.2012, at 18:40, srecko joksimovic wrote:

> Hi Rupert,
> 
> I have another question, and I will finish soon.
> 
> I tried to annotate pdf document, and I didn't get result I expected. Then I put string you sent to me 
> "John Smith works for the Apple Inc. in Cupertino, California."
> in MS Word document, and this is the result I got:
> 
> <rdf:RDF
>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>     xmlns:j.0="http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
>     xmlns:j.1="http://purl.org/dc/terms/"
>     xmlns:j.2="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
>     xmlns:j.3="http://fise.iks-project.eu/ontology/" > 
>   <rdf:Description rdf:about="urn:enhancement-55016818-eb97-7b98-521a-422e3742173b">
>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>     <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine</j.1:creator>
>     <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.288Z</j.1:created>
>     <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>     <j.1:language>fr</j.1:language>
>   </rdf:Description>
>   <rdf:Description rdf:about="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f">
>     <rdf:type rdf:resource="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#PaginatedTextDocument"/>
>     <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
> srecko</j.0:plainTextContent>
>   </rdf:Description>
>   <rdf:Description rdf:about="urn:enhancement-0644a1ed-f1d8-334d-d4e9-690a0446cba8">
>     <j.3:confidence rdf:datatype="http://www.w3.org/2001/XMLSchema#double">1.0</j.3:confidence>
>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>     <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine</j.1:creator>
>     <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.273Z</j.1:created>
>     <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>   </rdf:Description>
> </rdf:RDF>
> 
> 
> and this is the code:
> 
> 	public List<String> Annotate(byte[] _stream_to_annotate, ServiceUtils.MIMETypes _content_type, String _encoding)
> 	{	 
> 		List<String> _return_list = new ArrayList<String>();
> 		try             
> 		{			   
> 			URL url = new URL(ServiceUtils.SERVICE_URL);	                            
> 			HttpURLConnection con = (HttpURLConnection)url.openConnection();                    
> 			con.setDoOutput(true);                               
> 			con.setRequestMethod("POST");                    
> 			con.setRequestProperty("Accept", "application/rdf+xml");                    
> 			con.setRequestProperty("Content-type", _content_type.getValue());
> 			              
> 			java.io.OutputStream out = con.getOutputStream();
>                  
> 			IOUtils.write(_stream_to_annotate, out);                   
> 			IOUtils.closeQuietly(out);
>                  
> 			con.connect(); //send the request           
>          
> 			if(con.getResponseCode() > 299)         
> 			{ 
> 				java.io.InputStream errorStream = con.getErrorStream();                            
> 				if(errorStream != null)             
> 				{                                 
> 					String errorMessage = IOUtils.toString(errorStream);                                   
> 					IOUtils.closeQuietly(errorStream);			
> 				}              
> 				else              
> 				{ 
> 					//no error data                
> 					//write default error message with the status code                            
> 				}                    
> 			}          
> 			else                     
> 			{   
> 				Model model = ModelFactory.createDefaultModel();	    	                               
> 				java.io.InputStream enhancementResults = con.getInputStream();                                           			
> 				model.read(enhancementResults, null);	    	   	                                
> 				String queryStringForGraph =  "PREFIX t: <http://fise.iks-project.eu/ontology/> " +
> 						"SELECT ?label WHERE {?alias t:entity-reference ?label}";                            
> 				Query query = QueryFactory.create(queryStringForGraph);                            
> 				QueryExecution qe = QueryExecutionFactory.create(query, model);				
>             			
> 				ResultSet results = qe.execSelect();
> 				while(results.hasNext())                            
> 				{                 					
> 					_return_list.add(results.next().toString());
> 				}                            		       				                  
> 			}                 
> 		}                 
> 		catch(Exception ex)                            
> 		{                 
> 			System.out.println(ex.getMessage());                 
> 		}        	
> 		return _return_list;
> 	}
> 
> On Thu, Jan 12, 2012 at 8:32 AM, srecko joksimovic <sr...@gmail.com> wrote:
> 
> Hi Rupert,
> 
> Thank you for the answer. I've probably missed that. 
> 
> Best,
> Srecko
> 
> 
> On Thu, Jan 12, 2012 at 6:12 AM, Rupert Westenthaler <ru...@gmail.com> wrote:
> Hi Srecko
> 
> I think the last time I directly used this API is about 3-4 years ago, but after a look at the http client tutorial [1] I think the reason for your problem is that you do not execute the GetMethod.
> 
> Based on this tutorial the code should look like
> 
>    // Create an instance of HttpClient.
>    HttpClient client = new HttpClient();
>    GetMethod get = new GetMethod(url);
>    try {
>        // Execute the method.
>        int statusCode = client.executeMethod(get);
>        if (statusCode != HttpStatus.SC_OK) {
>            //handle the error
>        }
>        InputStream t_is = get.getResponseBodyAsStream();
>        //read the data of the stream
>    }
> 
> In addition you should not use a Reader if you want to read byte oriented data from the input stream.
> 
> hope this helps
> best
> Rupert
> 
> [1] http://hc.apache.org/httpclient-3.x/tutorial.html
> 
> On 11.01.2012, at 22:34, Srecko Joksimovic wrote:
> 
> > That's it. Thank you!
> > I have already configured KeywordLinkingEngine when I used my own ontology.
> > I think I'm familiar with that and I will try that option too.
> >
> > In meanwhile I found another interesting problem. I tried to annotate
> > document and web page. With web page, I tried
> > IOUtils.write(byte[], out) and I had to convert URL to byte[]:
> >
> > public static byte[] GetBytesFromURL(String _url) throws IOException
> > {
> >       GetMethod get = new GetMethod(_url);
> >       InputStream t_is = get.getResponseBodyAsStream();
> >       byte[] buffer = new byte[1024];
> >       int count = -1;
> >       Reader t_url_reader = new BufferedReader(new
> > InputStreamReader(t_is));
> >       byte[] t_bytes = IOUtils.toByteArray(t_url_reader, "UTF-8");
> >
> >       return t_bytes;
> > }
> >
> > But, the problem is that I'm getting null for InputStream.
> >
> > Any ideas?
> >
> > Best,
> > Srecko
> >
> >
> >
> > -----Original Message-----
> > From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
> > Sent: Wednesday, January 11, 2012 22:08
> > To: Srecko Joksimovic
> > Cc: stanbol-dev@incubator.apache.org
> > Subject: Re: Annotating using DBPedia ontology
> >
> >
> > On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
> >> Hi Rupert,
> >>
> >> When I load localhost:8080/engines it says this:
> >>
> >> There are currently 5 active engines.
> >> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
> >> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
> >>
> > org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhanc
> >> ementEngine
> >>
> > org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEng
> >> ine
> >>
> > org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEng
> >> ine
> >>
> >> Maybe this could tell you something?
> >>
> >
> > This are exactly the 5 engines that are expected to run with the default
> > configuration.
> > Based on this the Stanbol Enhnacer should just work fine.
> >
> > After looking at the the text you enhanced I noticed however that is does
> > not mention
> > any named entities such as Persons, Organizations and Places. So I checked
> > it with
> > my local Stanbol version and was also not any detected entities.
> >
> > So to check if Stanbol works as expected you should try to use an other text
> > the
> > mentions some Named Entities such as
> >
> >    "John Smith works for the Apple Inc. in Cupertino, California."
> >
> >
> > If you want to search also for entities like "Bank", "Blog", "Consumer",
> > "Telephone" .
> > you need to also configure a KeywordLinkingEngine for dbpedia. Part B or [3]
> > provides
> > more information on how to do that.
> >
> > But let me mention that the KeywordLinkingEngine is more useful if used in
> > combination
> > with an own domain specific thesaurus rather than a global data set like
> > dbpedia. When
> > used with dbpedia you will also get a lot of false positives.
> >
> > best
> > Rupert
> >
> > [3] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
> >
> 
> 
>

Re: Annotating using DBPedia ontology

Posted by srecko joksimovic <sr...@gmail.com>.

Hi Rupert,

I have another question, and I will finish soon.

I tried to annotate a PDF document, and I didn't get the result I expected. Then
I put the string you sent me
"John Smith works for the Apple Inc. in Cupertino, California."
into an MS Word document, and this is the result I got:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j.0="http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
    xmlns:j.1="http://purl.org/dc/terms/"
    xmlns:j.2="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
    xmlns:j.3="http://fise.iks-project.eu/ontology/" >
  <rdf:Description rdf:about="urn:enhancement-55016818-eb97-7b98-521a-422e3742173b">
    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
    <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine</j.1:creator>
    <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.288Z</j.1:created>
    <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
    <j.1:language>fr</j.1:language>
  </rdf:Description>
  <rdf:Description rdf:about="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f">
    <rdf:type rdf:resource="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#PaginatedTextDocument"/>
    <j.0:plainTextContent>Microsoft Word-Dokument&#xD;
srecko</j.0:plainTextContent>
  </rdf:Description>
  <rdf:Description rdf:about="urn:enhancement-0644a1ed-f1d8-334d-d4e9-690a0446cba8">
    <j.3:confidence rdf:datatype="http://www.w3.org/2001/XMLSchema#double">1.0</j.3:confidence>
    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
    <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine</j.1:creator>
    <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-12T17:34:20.273Z</j.1:created>
    <j.3:extracted-from rdf:resource="urn:content-item-sha1-835c8a5397d9b376a268b7bb5d3c8b4ab7e8b81f"/>
    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
  </rdf:Description>
</rdf:RDF>


and this is the code:

public List<String> Annotate(byte[] _stream_to_annotate, ServiceUtils.MIMETypes _content_type, String _encoding)
{
    List<String> _return_list = new ArrayList<String>();
    try
    {
        URL url = new URL(ServiceUtils.SERVICE_URL);
        HttpURLConnection con = (HttpURLConnection)url.openConnection();
        con.setDoOutput(true);
        con.setRequestMethod("POST");
        con.setRequestProperty("Accept", "application/rdf+xml");
        con.setRequestProperty("Content-type", _content_type.getValue());

        java.io.OutputStream out = con.getOutputStream();

        IOUtils.write(_stream_to_annotate, out);
        IOUtils.closeQuietly(out);

        con.connect(); //send the request

        if(con.getResponseCode() > 299)
        {
            java.io.InputStream errorStream = con.getErrorStream();
            if(errorStream != null)
            {
                String errorMessage = IOUtils.toString(errorStream);
                IOUtils.closeQuietly(errorStream);
            }
            else
            {
                //no error data
                //write default error message with the status code
            }
        }
        else
        {
            Model model = ModelFactory.createDefaultModel();
            java.io.InputStream enhancementResults = con.getInputStream();
            model.read(enhancementResults, null);
            String queryStringForGraph = "PREFIX t: <http://fise.iks-project.eu/ontology/> " +
                    "SELECT ?label WHERE {?alias t:entity-reference ?label}";
            Query query = QueryFactory.create(queryStringForGraph);
            QueryExecution qe = QueryExecutionFactory.create(query, model);

            ResultSet results = qe.execSelect();
            while(results.hasNext())
            {
                _return_list.add(results.next().toString());
            }
        }
    }
    catch(Exception ex)
    {
        System.out.println(ex.getMessage());
    }
    return _return_list;
}

On Thu, Jan 12, 2012 at 8:32 AM, srecko joksimovic <
sreckojoksimovic@gmail.com> wrote:

>
> Hi Rupert,
>
> Thank you for the answer. I've probably missed that.
>
> Best,
> Srecko
>
>
> On Thu, Jan 12, 2012 at 6:12 AM, Rupert Westenthaler <
> rupert.westenthaler@gmail.com> wrote:
>
>> Hi Srecko
>>
>> I think the last time I directly used this API is about 3-4 years ago,
>> but after a look at the http client tutorial [1] I think the reason for
>> your problem is that you do not execute the GetMethod.
>>
>> Based on this tutorial the code should look like
>>
>>    // Create an instance of HttpClient.
>>    HttpClient client = new HttpClient();
>>    GetMethod get = new GetMethod(url);
>>    try {
>>        // Execute the method.
>>        int statusCode = client.executeMethod(get);
>>        if (statusCode != HttpStatus.SC_OK) {
>>            //handle the error
>>        }
>>        InputStream t_is = get.getResponseBodyAsStream();
>>        //read the data of the stream
>>    }
>>
>> In addition you should not use a Reader if you want to read byte oriented
>> data from the input stream.
>>
>> hope this helps
>> best
>> Rupert
>>
>> [1] http://hc.apache.org/httpclient-3.x/tutorial.html
>>
>> On 11.01.2012, at 22:34, Srecko Joksimovic wrote:
>>
>> > That's it. Thank you!
>> > I have already configured KeywordLinkingEngine when I used my own
>> ontology.
>> > I think I'm familiar with that and I will try that option too.
>> >
>> > In meanwhile I found another interesting problem. I tried to annotate
>> > document and web page. With web page, I tried
>> > IOUtils.write(byte[], out) and I had to convert URL to byte[]:
>> >
>> > public static byte[] GetBytesFromURL(String _url) throws IOException
>> > {
>> >       GetMethod get = new GetMethod(_url);
>> >       InputStream t_is = get.getResponseBodyAsStream();
>> >       byte[] buffer = new byte[1024];
>> >       int count = -1;
>> >       Reader t_url_reader = new BufferedReader(new
>> > InputStreamReader(t_is));
>> >       byte[] t_bytes = IOUtils.toByteArray(t_url_reader, "UTF-8");
>> >
>> >       return t_bytes;
>> > }
>> >
>> > But, the problem is that I'm getting null for InputStream.
>> >
>> > Any ideas?
>> >
>> > Best,
>> > Srecko
>> >
>> >
>> >
>> > -----Original Message-----
>> > From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
>> > Sent: Wednesday, January 11, 2012 22:08
>> > To: Srecko Joksimovic
>> > Cc: stanbol-dev@incubator.apache.org
>> > Subject: Re: Annotating using DBPedia ontology
>> >
>> >
>> > On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
>> >> Hi Rupert,
>> >>
>> >> When I load localhost:8080/engines it says this:
>> >>
>> >> There are currently 5 active engines.
>> >> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
>> >> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
>> >>
>> >
>> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhanc
>> >> ementEngine
>> >>
>> >
>> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEng
>> >> ine
>> >>
>> >
>> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEng
>> >> ine
>> >>
>> >> Maybe this could tell you something?
>> >>
>> >
>> > This are exactly the 5 engines that are expected to run with the default
>> > configuration.
>> > Based on this the Stanbol Enhnacer should just work fine.
>> >
>> > After looking at the the text you enhanced I noticed however that is
>> does
>> > not mention
>> > any named entities such as Persons, Organizations and Places. So I
>> checked
>> > it with
>> > my local Stanbol version and was also not any detected entities.
>> >
>> > So to check if Stanbol works as expected you should try to use an other
>> text
>> > the
>> > mentions some Named Entities such as
>> >
>> >    "John Smith works for the Apple Inc. in Cupertino, California."
>> >
>> >
>> > If you want to search also for entities like "Bank", "Blog", "Consumer",
>> > "Telephone" .
>> > you need to also configure a KeywordLinkingEngine for dbpedia. Part B
>> or [3]
>> > provides
>> > more information on how to do that.
>> >
>> > But let me mention that the KeywordLinkingEngine is more useful if used
>> in
>> > combination
>> > with an own domain specific thesaurus rather than a global data set like
>> > dbpedia. When
>> > used with dbpedia you will also get a lot of false positives.
>> >
>> > best
>> > Rupert
>> >
>> > [3]
>> http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
>> >
>>
>>
>

Re: Annotating using DBPedia ontology

Posted by srecko joksimovic <sr...@gmail.com>.

Hi Rupert,

Thank you for the answer. I've probably missed that.

Best,
Srecko

On Thu, Jan 12, 2012 at 6:12 AM, Rupert Westenthaler <
rupert.westenthaler@gmail.com> wrote:

> Hi Srecko
>
> I think the last time I directly used this API is about 3-4 years ago, but
> after a look at the http client tutorial [1] I think the reason for your
> problem is that you do not execute the GetMethod.
>
> Based on this tutorial the code should look like
>
>    // Create an instance of HttpClient.
>    HttpClient client = new HttpClient();
>    GetMethod get = new GetMethod(url);
>    try {
>        // Execute the method.
>        int statusCode = client.executeMethod(get);
>        if (statusCode != HttpStatus.SC_OK) {
>            //handle the error
>        }
>        InputStream t_is = get.getResponseBodyAsStream();
>        //read the data of the stream
>    }
>
> In addition you should not use a Reader if you want to read byte oriented
> data from the input stream.
>
> hope this helps
> best
> Rupert
>
> [1] http://hc.apache.org/httpclient-3.x/tutorial.html
>
> On 11.01.2012, at 22:34, Srecko Joksimovic wrote:
>
> > That's it. Thank you!
> > I have already configured KeywordLinkingEngine when I used my own
> ontology.
> > I think I'm familiar with that and I will try that option too.
> >
> > In meanwhile I found another interesting problem. I tried to annotate
> > document and web page. With web page, I tried
> > IOUtils.write(byte[], out) and I had to convert URL to byte[]:
> >
> > public static byte[] GetBytesFromURL(String _url) throws IOException
> > {
> >       GetMethod get = new GetMethod(_url);
> >       InputStream t_is = get.getResponseBodyAsStream();
> >       byte[] buffer = new byte[1024];
> >       int count = -1;
> >       Reader t_url_reader = new BufferedReader(new
> > InputStreamReader(t_is));
> >       byte[] t_bytes = IOUtils.toByteArray(t_url_reader, "UTF-8");
> >
> >       return t_bytes;
> > }
> >
> > But, the problem is that I'm getting null for InputStream.
> >
> > Any ideas?
> >
> > Best,
> > Srecko
> >
> >
> >
> > -----Original Message-----
> > From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
> > Sent: Wednesday, January 11, 2012 22:08
> > To: Srecko Joksimovic
> > Cc: stanbol-dev@incubator.apache.org
> > Subject: Re: Annotating using DBPedia ontology
> >
> >
> > On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
> >> Hi Rupert,
> >>
> >> When I load localhost:8080/engines it says this:
> >>
> >> There are currently 5 active engines.
> >> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
> >> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
> >>
> >
> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhanc
> >> ementEngine
> >>
> >
> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEng
> >> ine
> >>
> >
> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEng
> >> ine
> >>
> >> Maybe this could tell you something?
> >>
> >
> > This are exactly the 5 engines that are expected to run with the default
> > configuration.
> > Based on this the Stanbol Enhnacer should just work fine.
> >
> > After looking at the the text you enhanced I noticed however that is does
> > not mention
> > any named entities such as Persons, Organizations and Places. So I
> checked
> > it with
> > my local Stanbol version and was also not any detected entities.
> >
> > So to check if Stanbol works as expected you should try to use an other
> text
> > the
> > mentions some Named Entities such as
> >
> >    "John Smith works for the Apple Inc. in Cupertino, California."
> >
> >
> > If you want to search also for entities like "Bank", "Blog", "Consumer",
> > "Telephone" .
> > you need to also configure a KeywordLinkingEngine for dbpedia. Part B or
> [3]
> > provides
> > more information on how to do that.
> >
> > But let me mention that the KeywordLinkingEngine is more useful if used
> in
> > combination
> > with an own domain specific thesaurus rather than a global data set like
> > dbpedia. When
> > used with dbpedia you will also get a lot of false positives.
> >
> > best
> > Rupert
> >
> > [3] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
> >
>
>

Re: Annotating using DBPedia ontology

Posted by Rupert Westenthaler <ru...@gmail.com>.

Hi Srecko

I think the last time I directly used this API is about 3-4 years ago, but after a look at the http client tutorial [1] I think the reason for your problem is that you do not execute the GetMethod.

Based on this tutorial the code should look like

    // Create an instance of HttpClient.
    HttpClient client = new HttpClient();
    GetMethod get = new GetMethod(url);
    try {
        // Execute the method.
        int statusCode = client.executeMethod(get);
        if (statusCode != HttpStatus.SC_OK) {
            //handle the error
        }
        InputStream t_is = get.getResponseBodyAsStream();
        //read the data of the stream
    } finally {
        // Release the connection once the stream has been read.
        get.releaseConnection();
    }

In addition, you should not use a Reader if you want to read byte-oriented data from the input stream.

hope this helps
best
Rupert

[1] http://hc.apache.org/httpclient-3.x/tutorial.html
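
To illustrate the advice about not using a Reader: a Reader decodes bytes into characters with some charset, which can corrupt binary content such as PDF or Word files. Below is a minimal sketch that copies a stream byte for byte using only the JDK. The class and method names (StreamBytes, toBytes) are hypothetical and chosen for this example only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not part of HttpClient or Stanbol.
class StreamBytes {
    /** Copies the whole stream into a byte array without any character decoding. */
    static byte[] toBytes(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int count;
        // read() returns -1 once the end of the stream is reached
        while ((count = in.read(buffer)) != -1) {
            out.write(buffer, 0, count);
        }
        return out.toByteArray();
    }
}
```

With Commons IO already on the classpath, IOUtils.toByteArray(InputStream) should serve the same purpose; the point is to pass the InputStream directly rather than wrapping it in an InputStreamReader.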

On 11.01.2012, at 22:34, Srecko Joksimovic wrote:

> That's it. Thank you!
> I have already configured KeywordLinkingEngine when I used my own ontology.
> I think I'm familiar with that and I will try that option too.
> 
> In meanwhile I found another interesting problem. I tried to annotate
> document and web page. With web page, I tried 
> IOUtils.write(byte[], out) and I had to convert URL to byte[]:
> 
> public static byte[] GetBytesFromURL(String _url) throws IOException
> {
> 	GetMethod get = new GetMethod(_url);		
> 	InputStream t_is = get.getResponseBodyAsStream();		
> 	byte[] buffer = new byte[1024];
> 	int count = -1;			
> 	Reader t_url_reader = new BufferedReader(new
> InputStreamReader(t_is));		
> 	byte[] t_bytes = IOUtils.toByteArray(t_url_reader, "UTF-8");
> 
> 	return t_bytes;
> }
> 
> But, the problem is that I'm getting null for InputStream. 
> 
> Any ideas?
> 
> Best,
> Srecko
> 
> 
> 
> -----Original Message-----
> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com] 
> Sent: Wednesday, January 11, 2012 22:08
> To: Srecko Joksimovic
> Cc: stanbol-dev@incubator.apache.org
> Subject: Re: Annotating using DBPedia ontology
> 
> 
> On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
>> Hi Rupert,
>> 
>> When I load localhost:8080/engines it says this:
>> 
>> There are currently 5 active engines.
>> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
>> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
>> 
> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhanc
>> ementEngine
>> 
> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEng
>> ine
>> 
> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEng
>> ine
>> 
>> Maybe this could tell you something?
>> 
> 
> This are exactly the 5 engines that are expected to run with the default
> configuration.
> Based on this the Stanbol Enhnacer should just work fine.
> 
> After looking at the the text you enhanced I noticed however that is does
> not mention
> any named entities such as Persons, Organizations and Places. So I checked
> it with
> my local Stanbol version and was also not any detected entities.
> 
> So to check if Stanbol works as expected you should try to use an other text
> the
> mentions some Named Entities such as 
> 
>    "John Smith works for the Apple Inc. in Cupertino, California."
> 
> 
> If you want to search also for entities like "Bank", "Blog", "Consumer",
> "Telephone" .
> you need to also configure a KeywordLinkingEngine for dbpedia. Part B or [3]
> provides
> more information on how to do that.
> 
> But let me mention that the KeywordLinkingEngine is more useful if used in
> combination
> with an own domain specific thesaurus rather than a global data set like
> dbpedia. When
> used with dbpedia you will also get a lot of false positives.
> 
> best
> Rupert
> 
> [3] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
>

RE: Annotating using DBPedia ontology

Posted by Srecko Joksimovic <sr...@gmail.com>.

That's it. Thank you!
I have already configured KeywordLinkingEngine when I used my own ontology.
I think I'm familiar with that and I will try that option too.

In the meantime I found another interesting problem. I tried to annotate
a document and a web page. With the web page, I tried
IOUtils.write(byte[], out) and I had to convert URL to byte[]:

public static byte[] GetBytesFromURL(String _url) throws IOException
{
	GetMethod get = new GetMethod(_url);		
	InputStream t_is = get.getResponseBodyAsStream();		
	byte[] buffer = new byte[1024];
	int count = -1;			
	Reader t_url_reader = new BufferedReader(new InputStreamReader(t_is));
	byte[] t_bytes = IOUtils.toByteArray(t_url_reader, "UTF-8");

	return t_bytes;
}

But, the problem is that I'm getting null for InputStream. 

Any ideas?

Best,
Srecko

-----Original Message-----
From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com] 
Sent: Wednesday, January 11, 2012 22:08
To: Srecko Joksimovic
Cc: stanbol-dev@incubator.apache.org
Subject: Re: Annotating using DBPedia ontology

On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
> Hi Rupert,
> 
> When I load localhost:8080/engines it says this:
> 
> There are currently 5 active engines.
> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
> 
> Maybe this could tell you something?
> 

These are exactly the 5 engines that are expected to run with the default configuration.
Based on this, the Stanbol Enhancer should work just fine.

After looking at the text you enhanced, however, I noticed that it does not mention
any named entities such as Persons, Organizations and Places. So I checked it with
my local Stanbol version, and it also did not detect any entities.

So to check if Stanbol works as expected, you should try another text that
mentions some Named Entities, such as

    "John Smith works for the Apple Inc. in Cupertino, California."

If you also want to search for entities like "Bank", "Blog", "Consumer", "Telephone" …
you also need to configure a KeywordLinkingEngine for dbpedia. Part B of [3] provides
more information on how to do that.

But let me mention that the KeywordLinkingEngine is more useful if used in combination
with your own domain-specific thesaurus rather than a global data set like dbpedia. When
used with dbpedia you will also get a lot of false positives.

best
Rupert

[3] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html

Re: Annotating using DBPedia ontology

Posted by Rupert Westenthaler <ru...@gmail.com>.

On 11.01.2012, at 21:41, Srecko Joksimovic wrote:
> Hi Rupert,
> 
> When I load localhost:8080/engines it says this:
> 
> There are currently 5 active engines.
> org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
> org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
> 
> Maybe this could tell you something?
> 

These are exactly the 5 engines that are expected to run with the default configuration.
Based on this, the Stanbol Enhancer should work just fine.

After looking at the text you enhanced, however, I noticed that it does not mention
any named entities such as Persons, Organizations and Places. So I checked it with
my local Stanbol version, and it also did not detect any entities.

So to check if Stanbol works as expected, you should try another text that
mentions some Named Entities, such as

    "John Smith works for the Apple Inc. in Cupertino, California."

If you also want to search for entities like "Bank", "Blog", "Consumer", "Telephone" …
you also need to configure a KeywordLinkingEngine for dbpedia. Part B of [3] provides
more information on how to do that.

But let me mention that the KeywordLinkingEngine is more useful if used in combination
with your own domain-specific thesaurus rather than a global data set like dbpedia. When
used with dbpedia you will also get a lot of false positives.

best
Rupert

[3] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
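
The suggested check with the sample sentence can be sketched as a small stand-alone Java program using only the JDK. This is an illustration under assumptions, not a reference client: the endpoint http://localhost:8080/engines is taken from this thread and may differ in other installations, and the class and method names (EnhancerSmokeTest, enhance) are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Stanbol.
class EnhancerSmokeTest {
    /** POSTs plain text to the given endpoint and returns the response body as a string. */
    static String enhance(String endpoint, String text) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(endpoint).openConnection();
        con.setDoOutput(true);
        con.setRequestMethod("POST");
        con.setRequestProperty("Accept", "application/rdf+xml");
        con.setRequestProperty("Content-Type", "text/plain; charset=UTF-8");
        try (OutputStream out = con.getOutputStream()) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        // Asking for the status code triggers the request and waits for the response.
        int status = con.getResponseCode();
        if (status > 299) {
            throw new IOException("Enhancer returned HTTP " + status);
        }
        try (InputStream in = con.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            con.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Endpoint assumed from this thread; adjust it to your installation.
        String rdf = enhance("http://localhost:8080/engines",
                "John Smith works for the Apple Inc. in Cupertino, California.");
        System.out.println(rdf);
    }
}
```

Sending plain text first takes document parsing out of the picture, so missing entities in the output would point at the engines rather than at text extraction.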

RE: Annotating using DBPedia ontology

Posted by Srecko Joksimovic <sr...@gmail.com>.

Hi Rupert,

When I load localhost:8080/engines it says this:

There are currently 5 active engines.
org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine
org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine

Maybe this could tell you something?

Best,
Srecko 

-----Original Message-----
From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com] 
Sent: Wednesday, January 11, 2012 20:28
To: Srecko Joksimovic
Cc: stanbol-dev@incubator.apache.org
Subject: Re: Annotating using DBPedia ontology

Hi Srecko

I googled for this exception and 90%+ of all pages had to do with Firewall
configurations on Windows machines.

The best description I found was on
http://weblogs.java.net/blog/binod/archive/2006/12/glassfish_and_w.html

About the enhancement result you posted: This is what the result looks like
if only the Metaxa and the LangId Engine are active. So I assume that the
other engines were not activated correctly, maybe because of the
IOException.

Can you please check whether you use a firewall that could cause this? Are you
running Stanbol on Windows?

best
Rupert
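
The stack trace quoted below shows the exception being thrown from Selector.open(), before Jetty itself does any work. One way to narrow this down might be to open a selector in a tiny stand-alone program on the same machine and with the same JVM. This is only a diagnostic sketch: the class and method names (SelectorCheck, tryOpenSelector) are invented for the example, and it assumes the failure reproduces outside of Stanbol.

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Hypothetical diagnostic for illustration; not part of Stanbol, Felix or Jetty.
class SelectorCheck {
    /** Returns null if a NIO selector can be opened, otherwise the failure message. */
    static String tryOpenSelector() {
        try (Selector selector = Selector.open()) {
            return null; // opened and closed without problems
        } catch (IOException e) {
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        String failure = tryOpenSelector();
        System.out.println(failure == null
                ? "Selector opened fine"
                : "Selector failed: " + failure);
    }
}
```

If this small program fails with the same message, the problem lies with the JVM or the machine's network and firewall setup rather than with the Stanbol configuration.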



On 11.01.2012, at 19:46, Srecko Joksimovic wrote:

> Hi Rupert,
> 
> I configured Stanbol, and I thought everything is alright because I could
> access Stanbol at http://localhost:8080.
> But, I noticed that during the startup I'm getting this error:
> 
> [WARNING] failed org.mortbay.jetty.nio.SelectChannelConnector$1@29978933:
> java.i
> o.IOException: Unable to establish loopback connection
> [WARNING] failed SelectChannelConnector@0.0.0.0:8080: java.io.IOException:
> Unabl
> e to establish loopback connection
> [WARNING] failed Server@62d844a9: java.io.IOException: Unable to establish
> loopb
> ack connection
> [ERROR] Exception while initializing Jetty.
> java.io.IOException: Unable to establish loopback connection
>        at sun.nio.ch.PipeImpl$Initializer.run(Unknown Source)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at sun.nio.ch.PipeImpl.<init>(Unknown Source)
>        at sun.nio.ch.SelectorProviderImpl.openPipe(Unknown Source)
>        at java.nio.channels.Pipe.open(Unknown Source)
>        at sun.nio.ch.WindowsSelectorImpl.<init>(Unknown Source)
>        at sun.nio.ch.WindowsSelectorProvider.openSelector(Unknown Source)
>        at java.nio.channels.Selector.open(Unknown Source)
>        at
> org.mortbay.io.nio.SelectorManager$SelectSet.<init>(SelectorManager.j
> ava:312)
>        at
> org.mortbay.io.nio.SelectorManager.doStart(SelectorManager.java:223)
>        at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:
> 50)
>        at
> org.mortbay.jetty.nio.SelectChannelConnector.doStart(SelectChannelCon
> nector.java:314)
>        at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:
> 50)
>        at org.mortbay.jetty.Server.doStart(Server.java:235)
>        at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:
> 50)
>        at
> org.apache.felix.http.jetty.internal.JettyService.initializeJetty(Jet
> tyService.java:164)
>        at
> org.apache.felix.http.jetty.internal.JettyService.startJetty(JettySer
> vice.java:115)
>        at
> org.apache.felix.http.jetty.internal.JettyService.run(JettyService.ja
> va:290)
>        at java.lang.Thread.run(Unknown Source)
> Caused by: java.nio.channels.ClosedByInterruptException
>        at java.nio.channels.spi.AbstractInterruptibleChannel.end(Unknown
> Source
> )
>        at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
>        at java.nio.channels.SocketChannel.open(Unknown Source)
>        ... 19 more
> 
> There is another thing. When I try to annotate text from application, or
> using web interface, I'm getting something like this:
> 
> <rdf:RDF
>    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>    xmlns:j.0="http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
>    xmlns:j.1="http://purl.org/dc/terms/"
>    xmlns:j.2="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
>    xmlns:j.3="http://fise.iks-project.eu/ontology/" > 
>  <rdf:Description
> rdf:about="urn:enhancement-39c09311-3095-fbb1-0dfe-551f6fba2baa">
>    <rdf:type
> rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>    <j.3:extracted-from
>
rdf:resource="urn:content-item-sha1-322650339df64c4e5acd17a81af29bd8fed3ba91
> "/>
>    <j.1:created
>
rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-11T18:44:11
> .271Z</j.1:created>
>    <j.1:creator
>
rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.en
> hancer.engines.metaxa.MetaxaEngine</j.1:creator>
>    <rdf:type
> rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>    <j.3:confidence
>
rdf:datatype="http://www.w3.org/2001/XMLSchema#double">1.0</j.3:confidence>
>  </rdf:Description>
>  <rdf:Description
> rdf:about="urn:enhancement-9e659b3e-8978-7191-eb8b-fa7030c2ff68">
>    <j.1:language>en</j.1:language>
>    <rdf:type
> rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>    <j.3:extracted-from
>
rdf:resource="urn:content-item-sha1-322650339df64c4e5acd17a81af29bd8fed3ba91
> "/>
>    <j.1:created
>
rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-11T18:44:11
> .278Z</j.1:created>
>    <j.1:creator
>
rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.en
> hancer.engines.langid.LangIdEnhancementEngine</j.1:creator>
>    <rdf:type
> rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>  </rdf:Description>
>  <rdf:Description
>
rdf:about="urn:content-item-sha1-322650339df64c4e5acd17a81af29bd8fed3ba91">
>    <rdf:type
>
rdf:resource="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Plain
> TextDocument"/>
>    <j.0:plainTextContent>The Web's children became parents. They use tools
> which can limit the access and the spreading of the information by their
> children. So, the parents can see at any time the web's logs of their
> children but they also have a net which is going to filter their "private"
> identity before it is broadcasted on the network. For example, a
third-part
> trust entity, along with their mobile telephone provider, the post office
> and the bank, will possess the consumer's identity so as to mask the
address
> of delivery and the payment of this consumer. A public identity also
exists
> to spread a resume (CV), a blog or an avatar for example but the data
remain
> the property of the owner of the server who hosts this data. So, the
mobile
> telephone provider offers a personal server who will contain one public
zone
> who will automatically be copied on the network after every modification.
If
> I want that my resume is not any longer on the network, I just have to
erase
> it of my public zone from my server. So, the mobile telephone provider
> creates a controllable silo of information for every public
> profile.</j.0:plainTextContent>
>  </rdf:Description>
> </rdf:RDF>
> 
> I am not sure that this is the content I should get.
> Please, help :)
> 
> Best,
> Srecko
> 
> -----Original Message-----
> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com] 
> Sent: Tuesday, January 10, 2012 15:33
> To: srecko joksimovic
> Cc: stanbol-dev@incubator.apache.org
> Subject: Re: Annotating using DBPedia ontology
> 
> Hi Srecko
> 
>> 
>> curl -v -X POST -H "Content-Type: application/rdf+xml" --data
> "@acm-ccs_proton.owl" http://localhost:8080/entityhub/entity
>> 
> 
> No I would not propose you to upload the dbpedia dataset by using POST to
> the entityhub. This is fine for small and medium sized datasets, but will
> not work for dbpedia.
> 
> Stanbol comes already with a small sample set of DBPedia. This is also
used
> for enhancing documents with the default configuration.
> 
> This sample dataset contains the 43k DBPedia.org entities with the most
> incoming links including some often used properties includinglabels in
about
> 10 languages, the english comments, types, redirects stored as
rdf:seeAlso,
> lat/long, populations, birth/death dates, home pages, and category
> assignments stored in dc-terms:subject.
> 
> You can easily upgrade this index to a bigger version by downloading the
> dbpedia.solrindex.zip file form [1] and copying it into the
/sling/datafiles
> folder within the directory where your Stanbol server is running. After
some
> minutes (the time your computer needs to extract a file with ~3GByte) the
> bigger index will replace the sample set included in the launcher.
> 
> If you need some additional fields, languages . you can also create your
own
> index by using the indexing tool for dbpedia [2]. See the README.md file
for
> instructions.
> 
> best
> Rupert
> 
> [1] http://dev.iks-project.eu/downloads/stanbol-indices/dbpedia-3.7/
> [2]
>
https://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/indexing/
> dbpedia/
> 
> On 10.01.2012, at 14:01, srecko joksimovic wrote:
> 
>> Hi,
>> 
>> Until now I used my ontology when I wanted to annotate document (or
text).
> Now I would like to use DBPedia ontology. Do I have to download ontology
and
> configure Stanbol like I did before, using
>> 
>> curl -v -X POST -H "Content-Type: application/rdf+xml" --data
> "@acm-ccs_proton.owl" http://localhost:8080/entityhub/entity
>> 
>> or there is another procedure? Does Stanbol use DBPedia ontology by
> default, or I have to configure something similar like when I use another
> ontology?
>> 
>

Re: Annotating using DBPedia ontology

Posted by Rupert Westenthaler <ru...@gmail.com>.

Hi Srecko

I googled this exception, and more than 90% of the pages I found had to do with firewall configurations on Windows machines.

The best description I found was on http://weblogs.java.net/blog/binod/archive/2006/12/glassfish_and_w.html

About the enhancement result you posted: this is what the result looks like if only the Metaxa and the LangId engines are active. So I assume that the other engines were not activated correctly, maybe because of the IOException.
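
One way to check which engines actually contributed to a result is to pull the creator class names out of the saved RDF/XML response. The following is only a sketch: the function name is made up for illustration, and it assumes the response was saved to a file with each creator class name on a single line (not hard-wrapped by a mail client).

```shell
# Hypothetical helper (not part of Stanbol): list the distinct engine class
# names that appear as creators in a saved enhancement result.
list_enhancement_engines() {
    # $1: path to the saved RDF/XML response
    # grep -o prints only the matching class names; sort -u removes repeats.
    grep -o 'org\.apache\.stanbol\.enhancer\.engines\.[A-Za-z0-9_.]*' "$1" | sort -u
}
```

Called as "list_enhancement_engines result.rdf" (result.rdf being whatever file the response was saved to), it should list only the Metaxa and LangId classes when the other engines did not run.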

Can you please check whether you are using a firewall that could cause this? Are you running Stanbol on Windows?

best
Rupert




RE: Annotating using DBPedia ontology

Posted by Srecko Joksimovic <sr...@gmail.com>.

Hi Rupert,

I configured Stanbol, and I thought everything was all right because I could
access Stanbol at http://localhost:8080.
But I noticed that during startup I'm getting this error:

[WARNING] failed org.mortbay.jetty.nio.SelectChannelConnector$1@29978933: java.io.IOException: Unable to establish loopback connection
[WARNING] failed SelectChannelConnector@0.0.0.0:8080: java.io.IOException: Unable to establish loopback connection
[WARNING] failed Server@62d844a9: java.io.IOException: Unable to establish loopback connection
[ERROR] Exception while initializing Jetty.
java.io.IOException: Unable to establish loopback connection
        at sun.nio.ch.PipeImpl$Initializer.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.nio.ch.PipeImpl.<init>(Unknown Source)
        at sun.nio.ch.SelectorProviderImpl.openPipe(Unknown Source)
        at java.nio.channels.Pipe.open(Unknown Source)
        at sun.nio.ch.WindowsSelectorImpl.<init>(Unknown Source)
        at sun.nio.ch.WindowsSelectorProvider.openSelector(Unknown Source)
        at java.nio.channels.Selector.open(Unknown Source)
        at org.mortbay.io.nio.SelectorManager$SelectSet.<init>(SelectorManager.java:312)
        at org.mortbay.io.nio.SelectorManager.doStart(SelectorManager.java:223)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
        at org.mortbay.jetty.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:314)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
        at org.mortbay.jetty.Server.doStart(Server.java:235)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
        at org.apache.felix.http.jetty.internal.JettyService.initializeJetty(JettyService.java:164)
        at org.apache.felix.http.jetty.internal.JettyService.startJetty(JettyService.java:115)
        at org.apache.felix.http.jetty.internal.JettyService.run(JettyService.java:290)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.nio.channels.ClosedByInterruptException
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(Unknown Source)
        at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
        at java.nio.channels.SocketChannel.open(Unknown Source)
        ... 19 more

There is another thing. When I try to annotate text from an application, or
through the web interface, I get something like this:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j.0="http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
    xmlns:j.1="http://purl.org/dc/terms/"
    xmlns:j.2="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
    xmlns:j.3="http://fise.iks-project.eu/ontology/" > 
  <rdf:Description rdf:about="urn:enhancement-39c09311-3095-fbb1-0dfe-551f6fba2baa">
    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
    <j.3:extracted-from rdf:resource="urn:content-item-sha1-322650339df64c4e5acd17a81af29bd8fed3ba91"/>
    <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-11T18:44:11.271Z</j.1:created>
    <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.metaxa.MetaxaEngine</j.1:creator>
    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
    <j.3:confidence rdf:datatype="http://www.w3.org/2001/XMLSchema#double">1.0</j.3:confidence>
  </rdf:Description>
  <rdf:Description rdf:about="urn:enhancement-9e659b3e-8978-7191-eb8b-fa7030c2ff68">
    <j.1:language>en</j.1:language>
    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
    <j.3:extracted-from rdf:resource="urn:content-item-sha1-322650339df64c4e5acd17a81af29bd8fed3ba91"/>
    <j.1:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-01-11T18:44:11.278Z</j.1:created>
    <j.1:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine</j.1:creator>
    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
  </rdf:Description>
  <rdf:Description rdf:about="urn:content-item-sha1-322650339df64c4e5acd17a81af29bd8fed3ba91">
    <rdf:type rdf:resource="http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#PlainTextDocument"/>
    <j.0:plainTextContent>The Web's children became parents. They use tools
which can limit the access and the spreading of the information by their
children. So, the parents can see at any time the web's logs of their
children but they also have a net which is going to filter their "private"
identity before it is broadcasted on the network. For example, a third-part
trust entity, along with their mobile telephone provider, the post office
and the bank, will possess the consumer's identity so as to mask the address
of delivery and the payment of this consumer. A public identity also exists
to spread a resume (CV), a blog or an avatar for example but the data remain
the property of the owner of the server who hosts this data. So, the mobile
telephone provider offers a personal server who will contain one public zone
who will automatically be copied on the network after every modification. If
I want that my resume is not any longer on the network, I just have to erase
it of my public zone from my server. So, the mobile telephone provider
creates a controllable silo of information for every public
profile.</j.0:plainTextContent>
  </rdf:Description>
</rdf:RDF>

I am not sure that this is the content I should get.
Please, help :)

Best,
Srecko


Re: Annotating using DBPedia ontology

Posted by Rupert Westenthaler <ru...@gmail.com>.

Hi Srecko

> 
> curl -v -X POST -H "Content-Type: application/rdf+xml" --data "@acm-ccs_proton.owl" http://localhost:8080/entityhub/entity
> 

No, I would not recommend uploading the dbpedia dataset by using POST to the entityhub. This is fine for small and medium-sized datasets, but will not work for dbpedia.

Stanbol already ships with a small sample set of DBPedia, which is also used for enhancing documents with the default configuration.

This sample dataset contains the 43k DBPedia.org entities with the most incoming links, together with some often-used properties: labels in about 10 languages, the English comments, types, redirects stored as rdf:seeAlso, lat/long, populations, birth/death dates, home pages, and category assignments stored in dc-terms:subject.

You can easily upgrade this index to a bigger version by downloading the dbpedia.solrindex.zip file from [1] and copying it into the /sling/datafiles folder within the directory where your Stanbol server is running. After some minutes (the time your computer needs to extract a file of ~3 GByte) the bigger index will replace the sample set included in the launcher.
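
The copy step described above could be scripted roughly as follows. This is a sketch only: the function name and its arguments are made up for illustration, and it assumes the archive has already been downloaded from [1] and that the second argument is the directory from which the Stanbol launcher was started.

```shell
# Hypothetical helper (not part of Stanbol): copy a downloaded index archive
# into the sling/datafiles folder of a Stanbol launcher directory.
install_stanbol_index() {
    # $1: path to the downloaded dbpedia.solrindex.zip
    # $2: directory where the Stanbol server is running (assumption)
    index_zip=$1
    datafiles_dir=$2/sling/datafiles

    # Fail early instead of guessing when either path looks wrong.
    [ -f "$index_zip" ] || { echo "index archive not found: $index_zip" >&2; return 1; }
    [ -d "$datafiles_dir" ] || { echo "datafiles folder not found: $datafiles_dir" >&2; return 1; }

    cp "$index_zip" "$datafiles_dir/"
}
```

For example: install_stanbol_index ~/Downloads/dbpedia.solrindex.zip /path/to/stanbol (both paths are placeholders). The function refuses to create a missing datafiles folder, since a missing folder most likely means the launcher directory was given incorrectly.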

If you need additional fields, languages, and so on, you can also create your own index by using the indexing tool for dbpedia [2]. See the README.md file for instructions.

best
Rupert

[1] http://dev.iks-project.eu/downloads/stanbol-indices/dbpedia-3.7/
[2] https://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/indexing/dbpedia/
