Posted to user@tika.apache.org by "Dotan N." <di...@gmail.com> on 2012/01/15 10:15:09 UTC

boilerpipe - usage with images (or other content)

Hi All,
I'm interested in using Tika/Boilerpipe to extract images or other
content from a web page. The page could be any kind of web page (news, blog
post, etc.).
I've had good experience with the ImageExtractor in Boilerpipe's trunk, and
I'm wondering what the workflow is for extending or using Boilerpipe in
general (say I also want video or embed tags).


Thanks!
--
Dotan, @jondot <http://twitter.com/jondot>