Posted to dev@nutch.apache.org by xingjian <xi...@gmail.com> on 2007/11/13 06:37:34 UTC

takes the URI info, Content, headers, etc. into a MySQL database.

Instead of writing to disk, I'd like to take the content of each fetched
page and create a method that stores the URI info, content, headers, etc.
in a MySQL database. Does anyone have any suggestions on how to do this,
and where I should look to place my methods?

-- 
View this message in context: http://www.nabble.com/takes-the-URI-info%2C-Content%2C-headers%2C-ect-into-a-MYSQL-database.-tf4795882.html#a13720124
Sent from the Nutch - Dev mailing list archive at Nabble.com.


Re: takes the URI info, Content, headers, etc. into a MySQL database.

Posted by xingjian <xi...@gmail.com>.
Do I need to extend FetcherOutputFormat? Do you have a simple example? Thanks.


Sagar Naik-2 wrote:
> 
> Hey,
> AFAIK, FetcherOutputFormat is the class to look at.
> In the getRecordWriter function:
> FILE : a new file is opened
> DB : instantiate the db connection
> 
> In the RecordWriter class's write function:
> FILE : contents are written to disk
> DB : insert into the db
> 
> In the RecordWriter class's close function:
> FILE : close the file
> DB : close the db connection
> 
> You will also have to look at ParseOutputFormat along the same lines.
> 
>  
> 
> 
> 
> -- 
> This message has been scanned for viruses and
> dangerous content and is believed to be clean.
> 
> 
> 



Re: takes the URI info, Content, headers, etc. into a MySQL database.

Posted by Sagar Naik <sa...@visvo.com>.
Hey,
AFAIK, FetcherOutputFormat is the class to look at.
In the getRecordWriter function:
FILE : a new file is opened
DB : instantiate the db connection

In the RecordWriter class's write function:
FILE : contents are written to disk
DB : insert into the db

In the RecordWriter class's close function:
FILE : close the file
DB : close the db connection

You will also have to look at ParseOutputFormat along the same lines.
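
A minimal sketch of the DB side of the three steps above, using plain JDBC.
It leaves out the Nutch/Hadoop wiring, since the exact getRecordWriter and
RecordWriter signatures depend on the version in use. The class name
PageDbWriter, the table "pages", and its columns are hypothetical, and a
MySQL JDBC driver is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/**
 * Hypothetical helper (not part of Nutch) that mirrors the three
 * RecordWriter steps: open a connection, insert per record, close.
 */
class PageDbWriter implements AutoCloseable {
    // Hypothetical table and column names; adjust to your own schema.
    static final String INSERT_SQL =
        "INSERT INTO pages (url, headers, content) VALUES (?, ?, ?)";

    private final Connection conn;
    private final PreparedStatement insert;

    // Step 1 (getRecordWriter): take a db connection and prepare the insert once.
    PageDbWriter(Connection conn) throws SQLException {
        this.conn = conn;
        this.insert = conn.prepareStatement(INSERT_SQL);
    }

    // Convenience factory: opens the connection from a JDBC URL.
    static PageDbWriter open(String jdbcUrl, String user, String password)
            throws SQLException {
        return new PageDbWriter(DriverManager.getConnection(jdbcUrl, user, password));
    }

    // Step 2 (write): insert one fetched page into the db.
    void write(String url, String headers, byte[] content) throws SQLException {
        insert.setString(1, url);
        insert.setString(2, headers);
        insert.setBytes(3, content);
        insert.executeUpdate();
    }

    // Step 3 (close): close the statement, then the db connection.
    @Override
    public void close() throws SQLException {
        try {
            insert.close();
        } finally {
            conn.close();
        }
    }
}
```

The idea would be to create the writer in getRecordWriter, call write(...)
from the RecordWriter's write function, and call close() from its close
function. A JDBC URL such as jdbc:mysql://localhost:3306/nutch would be
passed to open(...); that URL is only an illustrative assumption.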

 


