Posted to dev@lucenenet.apache.org by Wen Gao <sa...@gmail.com> on 2011/02/16 10:35:05 UTC
duplicate records in index
Hi,
I am creating an index from my database; however, the .cfs files
contain duplicate records, e.g.
"book1", 1, "susan", 1
"book1", 1,"susan",1, 03/01/2010
"book2", 2,"tom",
"book2",2,"tom", 2,03/02/2010
..
I get the data from several tables, and I am sure that the SQL generates only
one record. Also, when I debug the code, the record is added only once.
So I am confused about why the data is duplicated in the index.
I define my index in the following format:
////////////////////////////////////////////////////////////////////
doc.Add(new Lucene.Net.Documents.Field(
    "lmname",
    readerreader1["lmname"].ToString(),
    //new System.IO.StringReader(readerreader["cname"].ToString()),
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.TOKENIZED));
//lmid
doc.Add(new Lucene.Net.Documents.Field(
    "lmid",
    readerreader1["lmid"].ToString(),
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.UN_TOKENIZED));
// nick name of user
doc.Add(new Lucene.Net.Documents.Field(
    "nickName",
    readerreader1["nickName"].ToString(),
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.UN_TOKENIZED));
// uid
doc.Add(new Lucene.Net.Documents.Field(
    "uid",
    readerreader1["uid"].ToString(),
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.UN_TOKENIZED));
writer.AddDocument(doc);
// acttime
doc.Add(new Lucene.Net.Documents.Field(
    "acttime",
    readerreader1["acttime"].ToString(),
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.UN_TOKENIZED));
writer.AddDocument(doc);
//////////////////////////////////////////////////////////////////
Any ideas?
Thanks,
Wen Gao
Re: duplicate records in index
Posted by Wen Gao <sa...@gmail.com>.
I saw that. So careless of me.
Thanks.
Wen Gao
2011/2/16 Digy <di...@gmail.com>
> You are adding the same doc twice.
> (See how you add "acttime" )
>
> DIGY
RE: duplicate records in index
Posted by Digy <di...@gmail.com>.
You are adding the same doc twice.
(See how you add "acttime" )
DIGY
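
For reference, here is a minimal sketch of the fix described above: build the complete document first, then call AddDocument once. It assumes the Lucene.Net 2.x API used in the original post (Field.Index.TOKENIZED / UN_TOKENIZED). The class name BookIndexer, the method IndexRows, the parameter names, and the loop over an IDataReader are hypothetical and chosen only for illustration; the original post does not show how rows are read or how the writer is created.

```csharp
using System.Data;
using Lucene.Net.Documents;
using Lucene.Net.Index;

public static class BookIndexer
{
    // Hypothetical helper: "reader" is assumed to be an open data reader
    // positioned before the first row, and "writer" an open IndexWriter.
    // Field names are taken from the original post.
    public static void IndexRows(IDataReader reader, IndexWriter writer)
    {
        while (reader.Read())
        {
            // Create a fresh Document per row so no fields carry over.
            Document doc = new Document();

            doc.Add(new Field("lmname", reader["lmname"].ToString(),
                Field.Store.YES, Field.Index.TOKENIZED));
            doc.Add(new Field("lmid", reader["lmid"].ToString(),
                Field.Store.YES, Field.Index.UN_TOKENIZED));
            doc.Add(new Field("nickName", reader["nickName"].ToString(),
                Field.Store.YES, Field.Index.UN_TOKENIZED));
            doc.Add(new Field("uid", reader["uid"].ToString(),
                Field.Store.YES, Field.Index.UN_TOKENIZED));
            doc.Add(new Field("acttime", reader["acttime"].ToString(),
                Field.Store.YES, Field.Index.UN_TOKENIZED));

            // Add the finished document exactly once, after all five fields
            // are set. In the original code the first AddDocument call ran
            // before "acttime" was added, so each row was indexed twice:
            // once with four fields and once with five.
            writer.AddDocument(doc);
        }
    }
}
```

This matches the duplicates shown in the question, where each record appears once without the date and once with it.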