Posted to user@nutch.apache.org by Yong-gang Cao <ch...@gmail.com> on 2005/06/28 15:37:19 UTC

Be careful of your anti-virus software while indexing

After checking carefully, I don't think the file _ykxc.prx is a virus.
It was just detected as one and quarantined.
It was reported as the Mad.5131 virus,
although I can't find any detailed information about this virus on the Internet.
In our case, the *.prx file is just a term proximity data file. If it
had been infected, it would fail to work, but it works.
Aha, what a coincidence! It's time to think about the weaknesses of
pattern-based anti-virus software.
Be careful with your anti-virus software. It can ruin a whole day's work.
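
For anyone who wants to check an index directory for files that went missing this way, here is a minimal sketch in plain Java (no Lucene or Nutch calls). The class name IndexFileCheck, the method findMissing, and the list of expected extensions are my own assumptions for illustration. Only .prx is named in this thread, so pass whatever extensions your index format actually uses.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene or Nutch.
class IndexFileCheck {

    /**
     * Groups files named "_<segment>.<ext>" by segment name and returns the
     * "<segment>.<ext>" names that are expected but missing or unreadable.
     * expectedExts is supplied by the caller (an assumption, not read from the index).
     */
    public static List<String> findMissing(Path indexDir, Set<String> expectedExts)
            throws IOException {
        Map<String, Set<String>> extsBySegment = new TreeMap<>();
        // The glob only matches names that start with '_' and contain a dot.
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir, "_*.*")) {
            for (Path f : files) {
                String name = f.getFileName().toString();
                int dot = name.lastIndexOf('.');
                String segment = name.substring(0, dot);
                String ext = name.substring(dot + 1);
                Set<String> exts =
                        extsBySegment.computeIfAbsent(segment, k -> new TreeSet<>());
                // A file that exists but cannot be read is treated like a missing one.
                if (Files.isReadable(f)) {
                    exts.add(ext);
                }
            }
        }
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : extsBySegment.entrySet()) {
            for (String ext : new TreeSet<>(expectedExts)) {
                if (!e.getValue().contains(ext)) {
                    missing.add(e.getKey() + "." + ext);
                }
            }
        }
        return missing;
    }

    // Usage: java IndexFileCheck.java <indexDir> [ext ...]
    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: IndexFileCheck <indexDir> [ext ...]");
            return;
        }
        Set<String> exts = new TreeSet<>(Arrays.asList(args).subList(1, args.length));
        if (exts.isEmpty()) {
            exts.add("prx"); // the only extension mentioned in this thread
        }
        for (String name : findMissing(Paths.get(args[0]), exts)) {
            System.out.println("missing or unreadable: " + name);
        }
    }
}
```

This only notices a gap when at least one other file of the same segment is still present, so a segment whose files were all removed would go unreported. It also says nothing about why a file is gone; the anti-virus quarantine log is still the place to confirm that.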

On 6/28/05, Yong-gang Cao <ch...@gmail.com> wrote:
> Sorry, I got it.
> It was deleted by anti-virus software.
> Damn virus!
> 
> On 6/28/05, Yong-gang Cao <ch...@gmail.com> wrote:
> > I've tried to index a large number of web pages (about 6 million pages),
> > and I encountered the following exception after indexing 1.34 million records.
> >
> > [java] 050625 121642 Processed 1340000 records (30.161636 rec/s)
> > [java] java.io.FileNotFoundException: D:\DynamicDisk\webdb\segments\20050620232113\index\_ykxc.prx (access denied)
> >
> > I tried to open the incomplete index using Luke, and Luke also reports
> > that _ykxc.prx is not found.
> > Why was it lost?
> > There was no manual interference during indexing.
> > I've encountered this kind of issue more than once. Is it a bug in
> > Lucene or Nutch?
> > Any clue about this?
> > Thanks very much!
> > --
> > Best wishes to all diligent guys!
> >
> 
> 
> --
> Best wishes to all diligent guys!
> 


--