Posted to dev@nutch.apache.org by xingjian <xi...@gmail.com> on 2007/11/13 06:37:34 UTC
takes the URI info, Content, headers, ect into a MYSQL database.
Instead of writing to disk, I'd like to pull the content of each page and
create a method that stores the URI info, content, headers, etc. in a MySQL
database. Does anyone have any suggestions on how to do this, or where I
should look to place my methods?
Re: takes the URI info, Content, headers, ect into a MYSQL
database.
Posted by xingjian <xi...@gmail.com>.
Do I need to extend FetcherOutputFormat? Do you have a simple example? Thanks.
Sagar Naik-2 wrote:
>
> Hey
> AFAIK, FetcherOutputFormat is the class to look at.
> the getRecordWriter function,
> FILE : new file is opened
> DB : Instantiate the db conn
>
> In the RecordWriter class's write function
> FILE : Contents are written on disk
> DB : insert into db
>
> In the RecordWriter class's close function
> FILE : Close file
> DB : close the db conn
>
> You will also have to look at ParseOutputFormat along same lines
Re: takes the URI info, Content, headers, ect into a MYSQL database.
Posted by Sagar Naik <sa...@visvo.com>.
Hey,
AFAIK, FetcherOutputFormat is the class to look at.
In the getRecordWriter function:
FILE : a new file is opened
DB : instantiate the db conn
In the RecordWriter class's write function:
FILE : contents are written to disk
DB : insert into the db
In the RecordWriter class's close function:
FILE : close the file
DB : close the db conn
You will also have to look at ParseOutputFormat along the same lines.
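
The open / write / close pattern described above could be sketched roughly as follows. This is only an illustration under assumptions, not Nutch's actual code: PageWriter is a hypothetical stand-in for the write-and-close shape of Hadoop's RecordWriter (whose real type parameters and method signatures depend on the Hadoop version), and JdbcPageWriter, the pages table, and its column names are invented for the example. Only standard java.sql calls are used.

```java
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical stand-in for the write/close shape of a RecordWriter.
// The real Hadoop interface has different parameters.
interface PageWriter {
    void write(String url, String headers, byte[] content) throws IOException;
    void close() throws IOException;
}

// Stores each fetched page as one row. Table and column names are assumptions.
class JdbcPageWriter implements PageWriter {
    private final Connection conn;
    private final PreparedStatement insert;

    // Plays the role of getRecordWriter: open the db connection once per writer.
    JdbcPageWriter(String jdbcUrl, String user, String password) throws SQLException {
        this(DriverManager.getConnection(jdbcUrl, user, password));
    }

    // Accepts an existing connection, which also makes the class easy to test.
    JdbcPageWriter(Connection conn) throws SQLException {
        this.conn = conn;
        this.insert = conn.prepareStatement(
            "INSERT INTO pages (url, headers, content) VALUES (?, ?, ?)");
    }

    // write: insert into the db instead of appending to a file.
    @Override
    public void write(String url, String headers, byte[] content) throws IOException {
        try {
            insert.setString(1, url);
            insert.setString(2, headers);
            insert.setBytes(3, content);
            insert.executeUpdate();
        } catch (SQLException e) {
            throw new IOException("insert failed for " + url, e);
        }
    }

    // close: release the statement and the connection instead of closing a file.
    @Override
    public void close() throws IOException {
        try {
            insert.close();
            conn.close();
        } catch (SQLException e) {
            throw new IOException("closing the db connection failed", e);
        }
    }
}
```

With a MySQL JDBC driver on the classpath, such a writer might be created with a URL like "jdbc:mysql://localhost/nutch" (a placeholder, as are any credentials). In a real job, the custom OutputFormat's getRecordWriter would return a class of this kind adapted to the actual RecordWriter interface.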