Posted to user@nutch.apache.org by xingjian <xi...@gmail.com> on 2007/11/07 01:51:14 UTC

how can i get the document object in Nutch.

Hello everyone:
How can I turn off Nutch's indexing step? I only want to use Nutch's
fetch function.

I get segments from the fetch. These segments contain document objects,
and each document contains field objects. How can I get the document object?
I want to read the fields from the document object so that I can save them
to my database, and then index the fields myself with Lucene.

    Thanks, everyone.
-----------------------------
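I only need Nutch's fetch function, not its indexing function. How can I get the document objects out of the segments that Nutch generates?
I want to write each document's fields directly to a database and then index the fields myself with Lucene.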
-- 
View this message in context: http://www.nabble.com/how-can-i-get-the-document-object-in-Nutch.-tf4762011.html#a13619299
Sent from the Nutch - User mailing list archive at Nabble.com.


Re: how can i get the document object in Nutch.

Posted by Chee Wu <ch...@gmail.com>.
Try taking a look at SegmentReader. Hopefully you can find some clues
there.
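
In case it helps: SegmentReader is the class behind the "bin/nutch readseg" command, so one possible route is to run the fetch on its own, dump the segment to text with "bin/nutch readseg -dump <segment_dir> <output_dir>", and pull the fields out of that dump yourself. Below is a minimal Java sketch of the last step only. It assumes the dump starts each record with "Recno::" and "URL::" lines and uses "ParseData::" and "ParseText::" section headers with a "Title:" line inside ParseData; please check that against the dump your Nutch version actually writes. The class name SegmentDumpFields and the field names url, title, and content are made up for illustration and are not part of Nutch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Nutch: turns a text dump of a segment
// into one field map per fetched page.
class SegmentDumpFields {

    // Section headers assumed to appear in the dump (an assumption, verify locally).
    private static final Set<String> SECTIONS = new HashSet<>(
            Arrays.asList("CrawlDatum::", "Content::", "ParseData::", "ParseText::"));

    static List<Map<String, String>> parse(BufferedReader in) throws IOException {
        List<Map<String, String>> records = new ArrayList<>();
        Map<String, String> current = null;
        String section = "";
        String line;
        while ((line = in.readLine()) != null) {
            if (line.startsWith("Recno:: ")) {
                // A new record begins; start a fresh field map.
                current = new LinkedHashMap<>();
                records.add(current);
                section = "";
            } else if (current == null) {
                continue; // ignore anything before the first record
            } else if (section.isEmpty() && line.startsWith("URL:: ")) {
                current.put("url", line.substring("URL:: ".length()).trim());
            } else if (SECTIONS.contains(line.trim())) {
                section = line.trim();
            } else if (section.equals("ParseData::") && line.startsWith("Title: ")) {
                current.put("title", line.substring("Title: ".length()).trim());
            } else if (section.equals("ParseText::") && !line.trim().isEmpty()) {
                // Join the non-blank lines of the parsed text with single spaces.
                current.merge("content", line.trim(), (a, b) -> a + " " + b);
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SegmentDumpFields <dump-file>");
            return;
        }
        try (BufferedReader in =
                     Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            for (Map<String, String> record : parse(in)) {
                System.out.println(record);
            }
        }
    }
}
```

Each map could then be written to a database row, or turned into a Lucene document by adding one field per entry, without ever running the Nutch index step.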

On Nov 7, 2007 4:22 PM, xingjian <xi...@gmail.com> wrote:

>
> Can anyone help me? Thanks.
>
>

Re: how can i get the document object in Nutch.

Posted by xingjian <xi...@gmail.com>.
Can anyone help me? Thanks.

