Posted to dev@any23.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2013/01/08 01:53:28 UTC
Re: any23 problem
Hi Iosif,
I know that this was a while back but are you still struggling with this?
Best
Lewis
On Sat, Dec 15, 2012 at 8:05 AM, Iosif Viktoratos <vi...@econ.auth.gr> wrote:
> Hello guys, I am a PhD student and I use Any23 for a work of mine. I have
> embedded it into a JSP server and I used the following lines of code:
>
> ......
> some imports
>
> <%@ page import="org.apache.any23.*"%>
> <%@ page import="org.apache.any23.Any23.*"%>
> <%@ page import="org.apache.any23.extractor.*"%>
> <%@ page import="org.apache.any23.extractor.rdf.*"%>
> <%@ page import="org.apache.any23.extractor.rdfa.*"%>
> <%@ page import="org.apache.any23.extractor.xpath.*"%>
> <%@ page import="org.apache.any23.extractor.html.*"%>
> <%@ page import="org.apache.any23.extractor.html.annotations.*"%>
> <%@ page import="org.apache.any23.http.*"%>
> <%@ page import="org.apache.any23.writer.*"%>
> <%@ page import="org.apache.any23.source.*"%>
> <%@ page import="org.apache.any23.validator.*"%>
> <%@ page import="org.apache.any23.validator.rule.*"%>
> <%@ page import="org.apache.any23.vocab.*"%>
> <%@ page import="org.apache.any23.util.*"%>
> <%@ page import="org.apache.any23.mime.*"%>
> <%@ page import="org.apache.any23.servlet.*"%>
> <%@ page import="org.apache.any23.plugin.*"%>
> <%@ page import="org.apache.any23.plugin.htmlscraper.*"%>
> <%@ page import="net.rootdev.javardfa.*"%>
> <%
>
> Any23 runner = new Any23();
> runner.setHTTPUserAgent("test-user-agent");
> HTTPClient httpClient = runner.getHTTPClient();
> DocumentSource source = new HTTPDocumentSource(httpClient, "http://www.contra.gr");
> ByteArrayOutputStream out1 = new ByteArrayOutputStream();
> TripleHandler handler = new NTriplesWriter(out1);
> try {
> runner.extract(source, handler);
> } finally {
> handler.close();
> }
> String n3 = out1.toString("UTF-8");
> out.print(n3);
> ....
>
>
>
> Well, it works fine if I parse a direct source of RDF data, but when
> I parse an HTML website (www.contra.gr, for example) I get the
> following error:
> java.lang.IllegalArgumentException: Illegal XPath expression:
> //*/h:head/h:base[position()=1]/@href
>
> Can you help me? Can you also send me examples of use (or links), because I
> can't find any examples of how to use it.
>
> Thanks for your help
>
>
>
--
Lewis