Posted to user@any23.apache.org by al...@libero.it on 2017/11/30 14:48:01 UTC

parse broken uri

Hi users, I'm using Any23 version 2.0 in my project, and I have tested the extraction of RDF microformats from HTML pages. One of these HTML pages contains a malformed URI with no protocol specified (for example, //any23.apache.org instead of https://any23.apache.org).

The library logs the following:

WARN rdf.Any23ValueFactoryWrapper: Not a valid (absolute) IRI:

INFO extractor.SingleDocumentExtraction: Processing null

I see that the method fixIRIWithException fixes some potentially broken relative or absolute URIs, but it does not fix this case. Would it be possible to integrate a patch to solve this problem? Thanks
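
To illustrate the behaviour I would expect, here is a minimal sketch that uses only java.net.URI from the JDK. It is not Any23 code: the class name ProtocolRelativeFix, the method resolveAgainstBase, and the document address https://example.org/page.html are all hypothetical and only serve the example. The idea is that a reference starting with // could inherit the scheme of the document it was found in.

```java
import java.net.URI;

// Hypothetical helper, NOT part of Any23; names are for illustration only.
final class ProtocolRelativeFix {

    // Resolves a reference that may lack a scheme (e.g. "//host/path")
    // against the IRI of the document it appears in. java.net.URI applies
    // the standard relative-reference resolution, so a reference that
    // starts with "//" keeps its own authority and path and takes only
    // the scheme from the base. Absolute references are returned unchanged.
    static String resolveAgainstBase(String documentIri, String reference) {
        URI base = URI.create(documentIri);
        URI ref = URI.create(reference.trim());
        return base.resolve(ref).toString();
    }

    public static void main(String[] args) {
        // Assumed document address, used only for this example.
        String doc = "https://example.org/page.html";
        System.out.println(resolveAgainstBase(doc, "//any23.apache.org"));
        System.out.println(resolveAgainstBase(doc, "http://any23.apache.org/"));
    }
}
```

With the assumed document address above, the first call should yield https://any23.apache.org, while the second reference already has a scheme and is left as it is. Whether something like this belongs inside fixIRIWithException or somewhere earlier in the extraction is a question for the developers.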

Best regards,

Alfonso