Posted to solr-user@lucene.apache.org by ram_sj <rp...@gmail.com> on 2010/01/28 20:48:01 UTC

Knowledge about contents of a page

Hi,

My question is about crawling. I know it is not strictly relevant here, but I
asked the Nutch people and didn't get any response, so I thought of posting here.

I'm trying to crawl reviews for businesses.

a. Is there any way to tell whether the content of a web page is a review or not?
Is it possible to do this in an automated fashion?

b. How could we map a block of text to a particular business? For example, like
Google reviews.

Thanks
Ram
-- 
View this message in context: http://old.nabble.com/Knowledge-about-contents-of-a-page-tp27358779p27358779.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Knowledge about contents of a page

Posted by Avlesh Singh <av...@gmail.com>.
Classification? - http://en.wikipedia.org/wiki/Document_classification
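
To make the pointer concrete for question (a), here is a minimal sketch of a
document classifier, assuming a small hand-labeled set of example pages is
available. It uses a multinomial naive Bayes model with add-one smoothing,
which is just one common choice and not something this thread prescribes. The
training sentences, the labels "review" and "other", and the names tokenize,
NaiveBayes and clf are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of letters/apostrophes; digits are dropped.
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def classify(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data; a real crawl would need far more examples.
clf = NaiveBayes()
clf.train("I loved this place, the food was great and the service was friendly", "review")
clf.train("Terrible experience, I would not recommend this restaurant to anyone", "review")
clf.train("Great service, we will come back, highly recommend", "review")
clf.train("Opening hours are Monday to Friday, contact us by phone or email", "other")
clf.train("Our company was founded in 1999 and provides plumbing services", "other")
clf.train("Click here to view the menu and directions to our location", "other")

print(clf.classify("I would recommend it, the food was great"))
```

In a crawler, a classifier like this would run on the extracted text of each
fetched page (or each block of a page), and only blocks labeled as reviews
would be kept. It does not address question (b), mapping text to a business.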

Cheers
Avlesh
