Posted to dev@any23.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2013/01/08 01:53:28 UTC

Re: any23 problem

Hi Iosif,

I know that this was a while back but are you still struggling with this?
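
If so, one thing that might help narrow it down is checking whether the XPath expression from the error message compiles at all in the JVM your JSP container runs on. Below is a minimal sketch using only the JDK's javax.xml APIs, not Any23 itself. The class name BaseHrefXPathCheck, the sample document, and the guess that the prefix "h" is bound to the XHTML namespace are all assumptions on my part, so treat it as a diagnostic aid rather than an explanation of the failure.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of Any23.
class BaseHrefXPathCheck {
    // Assumption: the "h" prefix in the reported expression means XHTML.
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";
    // The expression copied from the reported exception message.
    static final String EXPR = "//*/h:head/h:base[position()=1]/@href";

    static String baseHref(String xhtml) throws Exception {
        // Parse with namespace awareness so that h:head can match <head>.
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Without a namespace context, the "h" prefix cannot be resolved.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "h".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String uri) {
                return XHTML_NS.equals(uri) ? "h" : null;
            }

            @Override
            public Iterator<String> getPrefixes(String uri) {
                return XHTML_NS.equals(uri)
                        ? Collections.singletonList("h").iterator()
                        : Collections.<String>emptyList().iterator();
            }
        });

        // Shows which XPath implementation the classpath actually provides.
        System.out.println("XPath implementation: " + xpath.getClass().getName());
        return xpath.evaluate(EXPR, doc);
    }

    public static void main(String[] args) throws Exception {
        String sample = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<head><base href=\"http://example.org/\"/></head>"
                + "<body/></html>";
        System.out.println("base href: " + baseHref(sample));
    }
}
```

If this runs cleanly as a standalone program but the same expression is rejected inside your web application, comparing the XPath implementation class printed in both environments would be a reasonable next step.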

Best

Lewis

On Sat, Dec 15, 2012 at 8:05 AM, Iosif Viktoratos <vi...@econ.auth.gr> wrote:

> Hello guys, I am a PhD student and I use Any23 for a project of mine. I have
> embedded it into a JSP server and I used the following lines of code:
>
> ......
> some imports
>
> <%@ page import="org.apache.any23.*"%>
> <%@ page import="org.apache.any23.Any23.*"%>
> <%@ page import="org.apache.any23.extractor.*"%>
> <%@ page import="org.apache.any23.extractor.rdf.*"%>
> <%@ page import="org.apache.any23.extractor.rdfa.*"%>
> <%@ page import="org.apache.any23.extractor.xpath.*"%>
> <%@ page import="org.apache.any23.extractor.html.*"%>
> <%@ page import="org.apache.any23.extractor.html.annotations.*"%>
> <%@ page import="org.apache.any23.http.*"%>
> <%@ page import="org.apache.any23.writer.*"%>
> <%@ page import="org.apache.any23.source.*"%>
> <%@ page import="org.apache.any23.validator.*"%>
> <%@ page import="org.apache.any23.validator.rule.*"%>
> <%@ page import="org.apache.any23.vocab.*"%>
> <%@ page import="org.apache.any23.util.*"%>
> <%@ page import="org.apache.any23.mime.*"%>
> <%@ page import="org.apache.any23.servlet.*"%>
> <%@ page import="org.apache.any23.plugin.*"%>
> <%@ page import="org.apache.any23.plugin.htmlscraper.*"%>
> <%@ page import="net.rootdev.javardfa.*"%>
> <%
>
> Any23 runner = new Any23();
> runner.setHTTPUserAgent("test-user-agent");
> HTTPClient httpClient = runner.getHTTPClient();
> DocumentSource source = new HTTPDocumentSource(httpClient, "http://www.contra.gr");
> ByteArrayOutputStream out1 = new ByteArrayOutputStream();
> TripleHandler handler = new NTriplesWriter(out1);
> try {
>     runner.extract(source, handler);
> } finally {
>     handler.close();
> }
> String n3 = out1.toString("UTF-8");
> out.print(n3);
> ....
>
>
>
> Well, it works fine if I parse a direct source of RDF data, but when
> I parse an HTML website (www.contra.gr, for example) I get the
> following error:
> java.lang.IllegalArgumentException: Illegal XPath expression:
> //*/h:head/h:base[position()=1]/@href
>
> Can you help me? Can you also send me examples of use (or links)? I
> can't find any examples of how to use it.
>
> Thanks for your help
>
>
>


-- 
*Lewis*