Posted to dev@lucenenet.apache.org by Wen Gao <sa...@gmail.com> on 2011/02/16 10:35:05 UTC

duplicate records in index

Hi,

I am creating an index from my database. However, the .cfs files
contain duplicate records, e.g.

"book1", 1, "susan", 1
"book1", 1, "susan", 1, 03/01/2010
"book2", 2, "tom",
"book2", 2, "tom", 2, 03/02/2010
...

 

The data comes from several tables, and I am sure that the SQL query
returns each record only once. When I debug the code, each record also
appears to be added only once, so I am confused about why the data is
duplicated in the index.

I build each document for the index as follows:

////////////////////////////////////////////////////////////////////
doc.Add(new Lucene.Net.Documents.Field(
    "lmname",
    readerreader1["lmname"].ToString(),
    //new System.IO.StringReader(readerreader["cname"].ToString()),
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.TOKENIZED));

//lmid
doc.Add(new Lucene.Net.Documents.Field(
    "lmid",
    readerreader1["lmid"].ToString(),
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.UN_TOKENIZED));

// nick name of user
doc.Add(new Lucene.Net.Documents.Field(
    "nickName",
    readerreader1["nickName"].ToString(),
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.UN_TOKENIZED));

// uid
doc.Add(new Lucene.Net.Documents.Field(
    "uid",
    readerreader1["uid"].ToString(),
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.UN_TOKENIZED));
writer.AddDocument(doc);

// acttime
doc.Add(new Lucene.Net.Documents.Field(
    "acttime",
    readerreader1["acttime"].ToString(),
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.UN_TOKENIZED));
writer.AddDocument(doc);
//////////////////////////////////////////////////////////////////

 

Any ideas?

 

Thanks,

Wen Gao

 

 


Re: duplicate records in index

Posted by Wen Gao <sa...@gmail.com>.
I see it now. That was careless of me.
Thanks.

Wen Gao

2011/2/16 Digy <di...@gmail.com>

> You are adding the same doc twice.
> (See how you add "acttime" )
>
> DIGY

RE: duplicate records in index

Posted by Digy <di...@gmail.com>.
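You are adding the same doc twice: writer.AddDocument(doc) is called once
after "uid" and again after "acttime", so the first copy is indexed without
"acttime". (See how you add "acttime".)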
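
A minimal sketch of the fix, assuming the fields are filled from an ADO.NET reader as in the original post: add every field, including "acttime", first, and call AddDocument exactly once per row. The names BookIndexer, BuildDocument and IndexRows are hypothetical and not from the original code. Field.Index.TOKENIZED and UN_TOKENIZED follow the original post; later Lucene.Net releases name them ANALYZED and NOT_ANALYZED.

```csharp
using System.Data;
using Lucene.Net.Documents;
using Lucene.Net.Index;

// Hypothetical helper class, for illustration only.
public static class BookIndexer
{
    // Builds one complete document from the current row.
    // All five fields are added before the document is handed to the writer.
    public static Document BuildDocument(IDataRecord row)
    {
        var doc = new Document();
        doc.Add(new Field("lmname", row["lmname"].ToString(),
            Field.Store.YES, Field.Index.TOKENIZED));
        doc.Add(new Field("lmid", row["lmid"].ToString(),
            Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.Add(new Field("nickName", row["nickName"].ToString(),
            Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.Add(new Field("uid", row["uid"].ToString(),
            Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.Add(new Field("acttime", row["acttime"].ToString(),
            Field.Store.YES, Field.Index.UN_TOKENIZED));
        return doc;
    }

    // Indexes every row of the reader, one AddDocument call per row.
    public static void IndexRows(IndexWriter writer, IDataReader reader)
    {
        while (reader.Read())
        {
            // A fresh Document per row, added to the index exactly once.
            writer.AddDocument(BuildDocument(reader));
        }
    }
}
```

Creating a new Document for each row also avoids carrying fields over from the previous row when the same doc instance is reused.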

DIGY
