Posted to java-user@lucene.apache.org by spinergywmy <sp...@gmail.com> on 2006/12/14 02:21:11 UTC

Index Excel File

Hi,

   Has anyone indexed an Excel file before? I took a look at the API classes
provided by POI HSSF; however, I did not find any method to extract the text
from an Excel file and index it.

   Please assist and let me know where I can find an example to refer to.
Thanks


regards,
Wooi Meng
-- 
View this message in context: http://www.nabble.com/Index-Excel-File-tf2817920.html#a7865192
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Index Excel File

Posted by rajan <ra...@usindia.com>.
I think there is a problem with the following line:

row = sheet.getRow(i); -> row = sheet.getRow(j);

Also, the following code will give you the contents:
===================================================
   // Needs jxl.Workbook, jxl.Sheet, jxl.Cell and java.io.FileInputStream.
   Workbook excelDoc = Workbook.getWorkbook(new FileInputStream(
     file));
   String content = "";
   // i indexes the sheets, j the rows of a sheet, k the cells of a row.
   for (int i = 0; i < excelDoc.getNumberOfSheets(); i++) {
    Sheet sheet = excelDoc.getSheet(i);
    Cell[] row = null;
    for (int j = 0; j < sheet.getRows(); j++) {
     row = sheet.getRow(j);
     for (int k = 0; k < row.length; k++) {
      // content holds one cell at a time; append it if the whole text is needed.
      content = row[k].getContents();
      System.err.println("content inside loop is ::: "
        + content);
     }
    }
   }
===================================================
Regards
Rajan
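
The index fix above, together with the need to collect every cell before indexing, can be sketched without any Excel library. This is only an illustration under stated assumptions: the String[][][] array is a hypothetical stand-in for a workbook (sheets, then rows, then cell text), and SheetTextSketch and extractText are made-up names, not part of jexcelapi or Lucene. With jexcelapi the cell text would come from Cell.getContents() instead.

```java
import java.util.StringJoiner;

class SheetTextSketch {

    // Hypothetical stand-in for a workbook: sheets -> rows -> cell text.
    public static String extractText(String[][][] workbook) {
        StringJoiner text = new StringJoiner(" ");
        for (int i = 0; i < workbook.length; i++) {        // i walks the sheets
            String[][] sheet = workbook[i];
            for (int j = 0; j < sheet.length; j++) {       // j walks the rows
                String[] row = sheet[j];                   // index rows with j, not i
                for (int k = 0; k < row.length; k++) {     // k walks the cells
                    if (!row[k].isEmpty()) {
                        text.add(row[k]);                  // append instead of overwriting
                    }
                }
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String[][][] workbook = {
            { {"Name", "Qty"}, {"Widget", "3"} },
            { {"Notes"}, {}, {"checked"} }
        };
        System.out.println(extractText(workbook));
    }
}
```

The returned string is the kind of value that could be passed as the content of the tokenized field, so that every cell becomes searchable rather than only the last one read.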

-----Original Message-----
From: spinergywmy <sp...@gmail.com>
To: java-user@lucene.apache.org
Date: Wed, 13 Dec 2006 19:10:30 -0800 (PST)
Subject: Re: Index Excel File






Re: Index Excel File

Posted by spinergywmy <sp...@gmail.com>.
Hi,

   I did use jexcelapi to extract the contents of the Excel file; however, I
couldn't get the content when I printed it out. Below is the code that I wrote;
perhaps you can point out what I have done wrong. Thanks.


   Workbook excelDoc = Workbook.getWorkbook(new FileInputStream(file));
   String content = "";

   for(int i = 0; i < excelDoc.getNumberOfSheets(); i++)
   {
      Sheet sheet = excelDoc.getSheet(i);

      Cell[] row = null;

      for(int j = 0; j < sheet.getRows(); j++)
      {
         row = sheet.getRow(i);

         System.err.println("row is ::: " +row.length);

         if(row.length > 0)
         {
            content = row[i].getContents();
            System.err.println("content inside loop is ::: " +content);
         }
      }
   }

   System.err.println("content is ::: " +content);

   doc.add(new Field(DsConstant.idxFileContent, content,
         Field.Store.YES, Field.Index.TOKENIZED));
   doc.add(new Field(DsConstant.idxFileName, file.getName(),
         Field.Store.YES, Field.Index.UN_TOKENIZED));
   doc.add(new Field(DsConstant.idxPath, file.getPath(),
         Field.Store.YES, Field.Index.UN_TOKENIZED));

   excelDoc.close();

regards,
Wooi Meng
-- 
View this message in context: http://www.nabble.com/Index-Excel-File-tf2817920.html#a7866165


Re: Index Excel File

Posted by rajan <ra...@usindia.com>.
Hello,

I used jexcelapi. It includes a class called CSV.java in its demo
package.
Using that, I extracted the text from the Excel file and added that text to the
index.

I hope this helps you.
Regards
Rajan.
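
As a rough picture of the CSV route described above, the sketch below turns rows of cell text into CSV lines that could then be added to the index as plain text. It is not the CSV.java demo class from jexcelapi; CsvRowSketch, escape, toCsvLine and toCsv are hypothetical names, and the String[][] input is an assumed stand-in for one sheet whose cell text has already been read.

```java
import java.util.ArrayList;
import java.util.List;

class CsvRowSketch {

    // Quote a cell only when it contains a comma, a double quote, or a line break.
    public static String escape(String cell) {
        if (cell.contains(",") || cell.contains("\"") || cell.contains("\n")) {
            return "\"" + cell.replace("\"", "\"\"") + "\"";
        }
        return cell;
    }

    // Join the escaped cells of one row with commas.
    public static String toCsvLine(String[] row) {
        List<String> cells = new ArrayList<>();
        for (String cell : row) {
            cells.add(escape(cell));
        }
        return String.join(",", cells);
    }

    // One CSV line per row, each terminated by a newline.
    public static String toCsv(String[][] sheet) {
        StringBuilder out = new StringBuilder();
        for (String[] row : sheet) {
            out.append(toCsvLine(row)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[][] sheet = { {"Name", "Qty"}, {"Widget", "3, large"} };
        System.out.print(toCsv(sheet));
    }
}
```

Keeping one line per row preserves the row boundaries in the extracted text, which may or may not matter once an analyzer tokenizes it.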


-----Original Message-----
From: spinergywmy <sp...@gmail.com>
To: java-user@lucene.apache.org
Date: Wed, 13 Dec 2006 18:05:29 -0800 (PST)
Subject: Re: Index Excel File






Re: Index Excel File

Posted by spinergywmy <sp...@gmail.com>.
Hi,

   Can you show me an example of how to extract the text from an Excel file
and index it?

   Thanks

regards,
Wooi Meng
-- 
View this message in context: http://www.nabble.com/Index-Excel-File-tf2817920.html#a7865632


Re: Index Excel File

Posted by rajan <ra...@usindia.com>.
Hello,

Please try using jexcelapi.
I have done it successfully.
When I used POI, it threw an exception whenever an image was present in the
Excel file.

Regards
Rajan.

-----Original Message-----
From: spinergywmy <sp...@gmail.com>
To: java-user@lucene.apache.org
Date: Wed, 13 Dec 2006 17:21:11 -0800 (PST)
Subject: Index Excel File



