You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@poi.apache.org by BoBoMonkey <ma...@yahoo.com> on 2009/10/02 22:06:20 UTC

How to perform Find and Findnext on an excel sheet using POI

Hi,

I need to search an excel sheet for specific text and then on finding it I
need to note the cell address (example a1:b2 etc) of the cell that has the
text. The I need to find the next occurrence of the same text within that
sheet and then again note the cell address. In VBA this would be

Cells.Find(What:="Table1", After:=ActiveCell, LookIn:=xlFormulas, LookAt _
        :=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext,
MatchCase:= _
        False, SearchFormat:=False).Activate

&

    Cells.FindNext(After:=ActiveCell).Activate

Is this possible through POI?

Kindly note that iterating through the entire sheet is not an option for me
as the data in an sheet may be very large.

Also If this is not possible using POI is there a way to interact with VBA
interface using java somehow?

Thanks for reading



-- 
View this message in context: http://www.nabble.com/How-to-perform-Find-and-Findnext-on-an-excel-sheet-using-POI-tp25721755p25721755.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: How to perform Find and Findnext on and excel sheet using POI

Posted by MSB <ma...@tiscali.co.uk>.
You're welcome. As I said, I do not have any idea of the way JExcel
approaches that task as it is a long time since I used the API - in fact I
think that feature did not even exist at the time. However, I know that
JExcel parses the Excel file to work with it and so suspect it will do just
that here; if so, it may have the same memory issues with large files. The
question is what information JExcel extracts from the file as it parses it;
many of the memory problems we see are caused by POI's extracting LOTS of
information from the file and making this available through the API in my
opinion.

All the best and if you need any more help, just post a question to the list
and I am confident someone will respond.

Yours

Mark B


BoBoMonkey wrote:
> 
> Thanks your point is noted, I have no clue as what approach is used by the
> jexcelapi. I would try to delve into the eventmodel api during free time.
> Thanks again for you insightful and helpful post.
> 
> 
> MSB wrote:
>> 
>> That may well look easy using JExcel but remember that, as David
>> intimated with regard to VBA, the API is actually hiding the
>> implementation from you. How do you know that JExcel is not reading in
>> the entire workbook and creating an in-memory copy that it can search -
>> think the DOM that can be used to manipulate in-memory version of xml
>> documents. You need to check very carefully to ensure that JExcel will
>> not gobble up memory if you are working with very large worksheets. Bear
>> in mind that I do not know what JExcel is doing so could very well be
>> criticising the API unfairly; my intention here is only to emphasise that
>> you need to test JExcel and place it under the same scrutiny.
>> 
>> Before choosing to abandon POI completely, may I suggest that you
>> consider the alternative methodology that the API offers - to quote from
>> the website;
>> 
>> "If you're merely reading spreadsheet data, then use the eventmodel api
>> in either the org.apache.poi.hssf.eventusermodel package, or the
>> org.apache.poi.xssf.eventusermodel package, depending on your file
>> format."
>> 
>> 
>> Have a look here;
>> 
>> http://poi.apache.org/spreadsheet/how-to.html#event_api
>> 
>> and it may well be the case that you can use a similar approach to parse
>> the Excel file searching for just the terms you want and outputting the
>> cell addresses for each. Being completely honest with you, I have never
>> used the event driven approach myself but do understand that several list
>> members have. Further, they have had considerable success working with
>> very large files in this manner and I am confident that they would be
>> only too happy to advise and assist if you chose this approach and needed
>> to post questions to the list.
>> 
>> To answer your final question, as far as I am aware, there is no
>> interoperability between VBA and Java. I would hazard a guess that the
>> only way to accomplish something like this would be either to use
>> something like Python to glue components together (and you could very
>> well still be limited to Windows platforms only) or to move to the .NET
>> platform and use Microsoft's unsupported (I believe) 'version' of Java.
>> That platform should allow you to create components using VBA and MS Java
>> and glue them together but even it could limit the platforms your code
>> will run on.
>> 
>> Yours
>> 
>> Mark B
>> 
>> 
>> BoBoMonkey wrote:
>>> 
>>> 
>>> 
>>> Thank I found the answer, it's easily possible through JExcelApi
>>> (http://jexcelapi.sourceforge.net/).
>>> 
>>> the code is as below.
>>> 
>>> 
>>> Cell tableStart=sheet.findCell(textToBeFound);
>>> 		startRow=tableStart.getRow();
>>> 		startCol=tableStart.getColumn();
>>> 		
>>> 		Cell tableEnd= sheet.findCell(textToBeFound,startRow, startCol, 64000,
>>> 100, false);
>>> 
>>>                 endRow=tableEnd.getRow();
>>> 		endCol=tableEnd.getColumn();
>>> 
>>> 
>> 
>> 
> 
> 

-- 
View this message in context: http://www.nabble.com/How-to-perform-Find-and-Findnext-on-an-excel-sheet-using-POI-tp25721755p25745972.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: How to perform Find and Findnext on and excel sheet using POI

Posted by BoBoMonkey <ma...@yahoo.com>.
Thanks your point is noted, I have no clue as what approach is used by the
jexcelapi. I would try to delve into the eventmodel api during free time.
Thanks again for you insightful and helpful post.


MSB wrote:
> 
> That may well look easy using JExcel but remember that, as David intimated
> with regard to VBA, the API is actually hiding the implementation from
> you. How do you know that JExcel is not reading in the entire workbook and
> creating an in-memory copy that it can search - think the DOM that can be
> used to manipulate in-memory version of xml documents. You need to check
> very carefully to ensure that JExcel will not gobble up memory if you are
> working with very large worksheets. Bear in mind that I do not know what
> JExcel is doing so could very well be criticising the API unfairly; my
> intention here is only to emphasise that you need to test JExcel and place
> it under the same scrutiny.
> 
> Before choosing to abandon POI completely, may I suggest that you consider
> the alternative methodology that the API offers - to quote from the
> website;
> 
> "If you're merely reading spreadsheet data, then use the eventmodel api in
> either the org.apache.poi.hssf.eventusermodel package, or the
> org.apache.poi.xssf.eventusermodel package, depending on your file
> format."
> 
> 
> Have a look here;
> 
> http://poi.apache.org/spreadsheet/how-to.html#event_api
> 
> and it may well be the case that you can use a similar approach to parse
> the Excel file searching for just the terms you want and outputting the
> cell addresses for each. Being completely honest with you, I have never
> used the event driven approach myself but do understand that several list
> members have. Further, they have had considerable success working with
> very large files in this manner and I am confident that they would be only
> too happy to advise and assist if you chose this approach and needed to
> post questions to the list.
> 
> To answer your final question, as far as I am aware, there is no
> interoperability between VBA and Java. I would hazard a guess that the
> only way to accomplish something like this would be either to use
> something like Python to glue components together (and you could very well
> still be limited to Windows platforms only) or to move to the .NET
> platform and use Microsoft's unsupported (I believe) 'version' of Java.
> That platform should allow you to create components using VBA and MS Java
> and glue them together but even it could limit the platforms your code
> will run on.
> 
> Yours
> 
> Mark B
> 
> 
> BoBoMonkey wrote:
>> 
>> 
>> 
>> Thank I found the answer, it's easily possible through JExcelApi
>> (http://jexcelapi.sourceforge.net/).
>> 
>> the code is as below.
>> 
>> 
>> Cell tableStart=sheet.findCell(textToBeFound);
>> 		startRow=tableStart.getRow();
>> 		startCol=tableStart.getColumn();
>> 		
>> 		Cell tableEnd= sheet.findCell(textToBeFound,startRow, startCol, 64000,
>> 100, false);
>> 
>>                 endRow=tableEnd.getRow();
>> 		endCol=tableEnd.getColumn();
>> 
>> 
> 
> 

-- 
View this message in context: http://www.nabble.com/How-to-perform-Find-and-Findnext-on-an-excel-sheet-using-POI-tp25721755p25736257.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: How to perform Find and Findnext on and excel sheet using POI

Posted by MSB <ma...@tiscali.co.uk>.
There are commercial products available but I do not have any idea what they
cost. The one I have heard most about is produced by Aspose and I am sure
you can find that by googling their name. With regard to other open source
options, then I have to admit that I do not know of another one, though I
have seen - on that reliable oracle the 'net - talk of PHP based solutions
but I cannot imagine how such would work. There are other solutions but it
very much depends upon what you want to do, what your requirements are re
performance and what platforms you want to deploy the resulting application
on to.

If you are ONLY going to run on Windows platforms then OLE\COM may well be
worth looking into. Using this technique through a library such as SWT, you
will actually be controlling an instance of the application. As a result,
you can do anything that the application allows, but there are problems. The
technique does not scale onto client server platforms easily (or indeed at
all), performance is slow as the application is executing in the background,
there is no way to programmatically capture and handle errors and, perhaps
most importantly, Microsoft themselves warn against trying to control an
application such as Word in this manner. In addition, you will need to get a
handle on VBA because you will actually be using - and forgive the crude
description please - a Java API to execute VBA macros.

OpenOffice is another alternative and they do have rather good filters that
can work with all but the more complex of Word files. The OpenOffice
application exposes an interface that can be controlled directly from Java
code and, unlike OLE\COM, the application is designed to function properly
in this configuration. Obviously, the concerns with regards to platforms do
not apply here but your application will still execute quite slowly (as
again the OpenOffice application will be running in the background), you
will need to deploy the OpenOffice libraries on order for the application to
work, the UNO interface's documentation is quite abstruse and can be
difficult to understand as it adopts a philosophy of exposing 'services'
that you interrogate for objects that can then be used to perform specific
tasks. To develop what are called UNO client applications, you will need to
download the OpenOffice SDK and the plug-in for your IDE (and I would most
definately recommend doing that). It would also help to register with the
oooforum.org as there are a great many code samples to be had from the
members of that forum along with good help and advice. Do not however
underestimate the steepness of the learning curve if you take this route.

If you want to look at the sort of challenge facing you, I have examples of
both the OLE\COM and OpenOffice UNO approaches that you are more than
welcome to. Just let me know.

Being VERY selfish, I would encourage you to consider helping develop HWPF
if you have the time. Ryan has done a great deal of the work already but it
still requires quite an effort to understand the OLE2CDF file format and how
to the code that does exist currently works; were I a braver (or should that
be more talented) man, I would certainly think about trying to contribute to
the development of the API.

Yours

Mark B


Zachary Mitchell, BCIS wrote:
> 
> I'm not planning to leave POI,
> just wondering,
> is there any other
> Java MSWORD/MSEXCEL
> READ/WRITE
> lib out there?
> 
> ----- Original Message ----- 
> From: "MSB" <ma...@tiscali.co.uk>
> To: <us...@poi.apache.org>
> Sent: Saturday, October 03, 2009 4:50 PM
> Subject: Re: How to perform Find and Findnext on and excel sheet using POI
> 
> 
>>
>> That may well look easy using JExcel but remember that, as David
>> intimated
>> with regard to VBA, the API is actually hiding the implementation from 
>> you.
>> How do you know that JExcel is not reading in the entire workbook and
>> creating an in-memory copy that it can search - think the DOM that can be
>> used to manipulate in-memory version of xml documents. You need to check
>> very carefully to ensure that JExcel will not gobble up memory if you are
>> working with very large worksheets. Bear in mind that I do not know what
>> JExcel is doing so could very well be criticising the API unfairly; my
>> intention here is only to emphasise that you need to test JExcel and
>> place
>> it under the same scrutiny.
>>
>> Before choosing to abandon POI completely, may I suggest that you
>> consider
>> the alternative methodology that the API offers - to quote from the 
>> website;
>>
>> "If you're merely reading spreadsheet data, then use the eventmodel api
>> in
>> either the org.apache.poi.hssf.eventusermodel package, or the
>> org.apache.poi.xssf.eventusermodel package, depending on your file 
>> format."
>>
>>
>> Have a look here;
>>
>> http://poi.apache.org/spreadsheet/how-to.html#event_api
>>
>> and it may well be the case that you can use a similar approach to parse 
>> the
>> Excel file searching for just the terms you want and outputting the cell
>> addresses for each. Being completely honest with you, I have never used 
>> the
>> event driven approach myself but do understand that several list members
>> have. Further, they have had considerable success working with very large
>> files in this manner and I am confident that they would be only too happy 
>> to
>> advise and assist if you chose this approach and needed to post questions 
>> to
>> the list.
>>
>> To answer your final question, as far as I am aware, there is no
>> interoperability between VBA and Java. I would hazard a guess that the 
>> only
>> way to accomplish something like this would be either to use something 
>> like
>> Python to glue components together (and you could very well still be 
>> limited
>> to Windows platforms only) or to move to the .NET platform and use
>> Microsoft's unsupported (I believe) 'version' of Java. That platform 
>> should
>> allow you to create components using VBA and MS Java and glue them 
>> together
>> but even it could limit the platforms your code will run on.
>>
>> Yours
>>
>> Mark B
>>
>>
>> BoBoMonkey wrote:
>>>
>>>
>>>
>>> Thank I found the answer, it's easily possible through JExcelApi
>>> (http://jexcelapi.sourceforge.net/).
>>>
>>> the code is as below.
>>>
>>>
>>> Cell tableStart=sheet.findCell(textToBeFound);
>>> startRow=tableStart.getRow();
>>> startCol=tableStart.getColumn();
>>>
>>> Cell tableEnd= sheet.findCell(textToBeFound,startRow, startCol, 64000,
>>> 100, false);
>>>
>>>                 endRow=tableEnd.getRow();
>>> endCol=tableEnd.getColumn();
>>>
>>>
>>
>> -- 
>> View this message in context: 
>> http://www.nabble.com/How-to-perform-Find-and-Findnext-on-an-excel-sheet-using-POI-tp25721755p25726325.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/How-to-perform-Find-and-Findnext-on-an-excel-sheet-using-POI-tp25721755p25727115.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: How to perform Find and Findnext on and excel sheet using POI

Posted by "Zachary Mitchell, BCIS" <za...@internode.on.net>.
I'm not planning to leave POI,
just wondering,
is there any other
Java MSWORD/MSEXCEL
READ/WRITE
lib out there?

----- Original Message ----- 
From: "MSB" <ma...@tiscali.co.uk>
To: <us...@poi.apache.org>
Sent: Saturday, October 03, 2009 4:50 PM
Subject: Re: How to perform Find and Findnext on and excel sheet using POI


>
> That may well look easy using JExcel but remember that, as David intimated
> with regard to VBA, the API is actually hiding the implementation from 
> you.
> How do you know that JExcel is not reading in the entire workbook and
> creating an in-memory copy that it can search - think the DOM that can be
> used to manipulate in-memory version of xml documents. You need to check
> very carefully to ensure that JExcel will not gobble up memory if you are
> working with very large worksheets. Bear in mind that I do not know what
> JExcel is doing so could very well be criticising the API unfairly; my
> intention here is only to emphasise that you need to test JExcel and place
> it under the same scrutiny.
>
> Before choosing to abandon POI completely, may I suggest that you consider
> the alternative methodology that the API offers - to quote from the 
> website;
>
> "If you're merely reading spreadsheet data, then use the eventmodel api in
> either the org.apache.poi.hssf.eventusermodel package, or the
> org.apache.poi.xssf.eventusermodel package, depending on your file 
> format."
>
>
> Have a look here;
>
> http://poi.apache.org/spreadsheet/how-to.html#event_api
>
> and it may well be the case that you can use a similar approach to parse 
> the
> Excel file searching for just the terms you want and outputting the cell
> addresses for each. Being completely honest with you, I have never used 
> the
> event driven approach myself but do understand that several list members
> have. Further, they have had considerable success working with very large
> files in this manner and I am confident that they would be only too happy 
> to
> advise and assist if you chose this approach and needed to post questions 
> to
> the list.
>
> To answer your final question, as far as I am aware, there is no
> interoperability between VBA and Java. I would hazard a guess that the 
> only
> way to accomplish something like this would be either to use something 
> like
> Python to glue components together (and you could very well still be 
> limited
> to Windows platforms only) or to move to the .NET platform and use
> Microsoft's unsupported (I believe) 'version' of Java. That platform 
> should
> allow you to create components using VBA and MS Java and glue them 
> together
> but even it could limit the platforms your code will run on.
>
> Yours
>
> Mark B
>
>
> BoBoMonkey wrote:
>>
>>
>>
>> Thank I found the answer, it's easily possible through JExcelApi
>> (http://jexcelapi.sourceforge.net/).
>>
>> the code is as below.
>>
>>
>> Cell tableStart=sheet.findCell(textToBeFound);
>> startRow=tableStart.getRow();
>> startCol=tableStart.getColumn();
>>
>> Cell tableEnd= sheet.findCell(textToBeFound,startRow, startCol, 64000,
>> 100, false);
>>
>>                 endRow=tableEnd.getRow();
>> endCol=tableEnd.getColumn();
>>
>>
>
> -- 
> View this message in context: 
> http://www.nabble.com/How-to-perform-Find-and-Findnext-on-an-excel-sheet-using-POI-tp25721755p25726325.html
> Sent from the POI - User mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: How to perform Find and Findnext on and excel sheet using POI

Posted by MSB <ma...@tiscali.co.uk>.
That may well look easy using JExcel but remember that, as David intimated
with regard to VBA, the API is actually hiding the implementation from you.
How do you know that JExcel is not reading in the entire workbook and
creating an in-memory copy that it can search - think the DOM that can be
used to manipulate in-memory version of xml documents. You need to check
very carefully to ensure that JExcel will not gobble up memory if you are
working with very large worksheets. Bear in mind that I do not know what
JExcel is doing so could very well be criticising the API unfairly; my
intention here is only to emphasise that you need to test JExcel and place
it under the same scrutiny.

Before choosing to abandon POI completely, may I suggest that you consider
the alternative methodology that the API offers - to quote from the website;

"If you're merely reading spreadsheet data, then use the eventmodel api in
either the org.apache.poi.hssf.eventusermodel package, or the
org.apache.poi.xssf.eventusermodel package, depending on your file format."


Have a look here;

http://poi.apache.org/spreadsheet/how-to.html#event_api

and it may well be the case that you can use a similar approach to parse the
Excel file searching for just the terms you want and outputting the cell
addresses for each. Being completely honest with you, I have never used the
event driven approach myself but do understand that several list members
have. Further, they have had considerable success working with very large
files in this manner and I am confident that they would be only too happy to
advise and assist if you chose this approach and needed to post questions to
the list.

To answer your final question, as far as I am aware, there is no
interoperability between VBA and Java. I would hazard a guess that the only
way to accomplish something like this would be either to use something like
Python to glue components together (and you could very well still be limited
to Windows platforms only) or to move to the .NET platform and use
Microsoft's unsupported (I believe) 'version' of Java. That platform should
allow you to create components using VBA and MS Java and glue them together
but even it could limit the platforms your code will run on.

Yours

Mark B


BoBoMonkey wrote:
> 
> 
> 
> Thank I found the answer, it's easily possible through JExcelApi
> (http://jexcelapi.sourceforge.net/).
> 
> the code is as below.
> 
> 
> Cell tableStart=sheet.findCell(textToBeFound);
> 		startRow=tableStart.getRow();
> 		startCol=tableStart.getColumn();
> 		
> 		Cell tableEnd= sheet.findCell(textToBeFound,startRow, startCol, 64000,
> 100, false);
> 
>                 endRow=tableEnd.getRow();
> 		endCol=tableEnd.getColumn();
> 
> 

-- 
View this message in context: http://www.nabble.com/How-to-perform-Find-and-Findnext-on-an-excel-sheet-using-POI-tp25721755p25726325.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: How to perform Find and Findnext on and excel sheet using POI

Posted by BoBoMonkey <ma...@yahoo.com>.


Thank I found the answer, it's easily possible through JExcelApi
(http://jexcelapi.sourceforge.net/).

the code is as below.


Cell tableStart=sheet.findCell(textToBeFound);
		startRow=tableStart.getRow();
		startCol=tableStart.getColumn();
		
		Cell tableEnd= sheet.findCell(textToBeFound,startRow, startCol, 64000,
100, false);

                endRow=tableEnd.getRow();
		endCol=tableEnd.getColumn();

-- 
View this message in context: http://www.nabble.com/How-to-perform-Find-and-Findnext-on-an-excel-sheet-using-POI-tp25721755p25722080.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: How to perform Find and Findnext on and excel sheet using POI

Posted by David Fisher <df...@jmlafferty.com>.
Hi,

> I need to search an excel sheet for specific text and then on  
> finding it I
> need to note the cell address (example a1:b2 etc) of the cell that  
> has the
> text. The I need to find the next occurrence of the same text within  
> that
> sheet and then again note the cell address. In VBA this would be
>
> Cells.Find(What:="Table1", After:=ActiveCell, LookIn:=xlFormulas,  
> LookAt _
>        :=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext,
> MatchCase:= _
>        False, SearchFormat:=False).Activate
>
> &
>
>    Cells.FindNext(After:=ActiveCell).Activate

VBA is hiding the details of its iteration.

> Is this possible through POI?

Yes, by iteration.

http://poi.apache.org/spreadsheet/quick-guide.html#Iterator

>
> Kindly note that iterating through the entire sheet is not an option  
> for me
> as the data in an sheet may be very large.

So it goes.

http://poi.apache.org/text-extraction.html

>
> Also If this is not possible using POI is there a way to interact  
> with VBA
> interface using java somehow?

Then you are on Windows and would need some type of Java / .net bridge.

Good luck.

Regards,
Dave

>
> Thanks for reading
>
>
>
> -- 
> View this message in context: http://www.nabble.com/How-to-perform-Find-and-Findnext-on-and-excel-sheet-using-POI-tp25721755p25721755.html
> Sent from the POI - User mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org