Posted to dev@stanbol.apache.org by Barnabas Szasz <bs...@gmail.com> on 2013/11/25 10:37:05 UTC
Stanbol and scraper integration
Dear Stanbol community,
I have a use case where I need to extract metadata from structures (mostly HTML or XML), where the position determines the meaning. Since entity recognition is part of the task (so the extracted strings should be resolved against vocabularies) I am considering Stanbol for this job.
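To make the use case concrete, here is a minimal sketch of position-based extraction followed by a vocabulary lookup. Everything in it is an assumption made for illustration: the sample table, the `FIELDS_BY_POSITION` mapping, and the `VOCABULARY` dictionary are hypothetical, and the dictionary only stands in for a real entity lookup. It does not use any Stanbol API.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the meaning of each <td> depends on its column position.
DOC = """
<table>
  <tr><td>Paris</td><td>France</td></tr>
  <tr><td>Budapest</td><td>Hungary</td></tr>
</table>
"""

# Hypothetical mapping from column position to metadata field name.
FIELDS_BY_POSITION = {0: "city", 1: "country"}

# Toy vocabulary standing in for a real entity lookup (assumption, not a Stanbol call).
VOCABULARY = {
    "Paris": "http://example.org/entity/Paris",
    "France": "http://example.org/entity/France",
    "Budapest": "http://example.org/entity/Budapest",
}


def extract(doc):
    """Return one record per row, keyed by the field assigned to each position."""
    root = ET.fromstring(doc)
    records = []
    for row in root.iter("tr"):
        record = {}
        for position, cell in enumerate(row.findall("td")):
            field = FIELDS_BY_POSITION.get(position)
            if field is None:
                continue  # no meaning assigned to this position
            text = (cell.text or "").strip()
            # Resolve the extracted string; None means no vocabulary match.
            record[field] = {"text": text, "entity": VOCABULARY.get(text)}
        records.append(record)
    return records
```

Calling `extract(DOC)` yields one dictionary per row, where each field carries both the raw string and the resolved entity URI (or `None` when the string is not in the vocabulary).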
My question is where a scraper (like scraperwiki.com) would fit in such an architecture. Should I implement a wrapper for the scraper as an enhancement engine? In that case, if an engine in the chain adds an annotation to the document, would the next engine in the chain be able to perform entity recognition on that annotation?
Or would you recommend a different approach?
Thanks,
Barna
Re: Stanbol and scraper integration
Posted by Rafa Haro <rh...@apache.org>.
Hi Barnabas,
Could you provide more details about how your documents are structured
and what the format of the strings extracted by the scraper would be?
Regards,
Rafa
On 25/11/13 10:37, Barnabas Szasz wrote:
> Dear Stanbol community,
>
> I have a use case where I need to extract metadata from structures (mostly HTML or XML), where the position determines the meaning. Since entity recognition is part of the task (so the extracted strings should be resolved against vocabularies) I am considering Stanbol for this job.
> My question is where a scraper (like scraperwiki.com) would fit in such an architecture. Should I implement a wrapper for the scraper as an enhancement engine? In that case, if an engine in the chain adds an annotation to the document, would the next engine in the chain be able to perform entity recognition on that annotation?
> Or would you recommend a different approach?
>
> Thanks,
> Barna
>