You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@poi.apache.org by Nick Burch <ni...@torchbox.com> on 2008/05/01 12:24:10 UTC

Re: Blank Cells

On Wed, 30 Apr 2008, Tom Holmes Jr. wrote:
> However, this only gets back cells with data in them ... so I don't know 
> why there is even a CELL_TYPE_BLANK.

If you put some data into a cell, then delete it again, excel will 
normally still store the cell in the file, but with type blank. If you 
have never put anything in the cell, then it won't appear in the file

With OOXML, excel is better at removing cells with no contents from the 
file, but you do still sometimes get cells of type blank.

> The API docs don't offer much information on this. I presumed that 
> getting the cell iterator would internally handle these null cells and 
> they would register as CELL_TYPE_BLANK.

There aren't "null cells" in the file. For example, there will be a record 
for the cell in row 0 column 0, then the next record is for row 0 column 
3. There is nothing in the file for row 0 columns 1 and 2. So, there's 
nothing for the iterator to return. If you ask the row for column 1, it'll 
give you back a null, as there's nothing there in the file

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Blank Cells

Posted by Sebastian Frehmel <uh...@stud.uni-karlsruhe.de>.
I agree with you. I did not think of threads though using them nearly 
all the time :)
Assigning the getters/setters to the workbook looks very plausible.

Thanks Nick!


Zitat von Nick Burch <ni...@torchbox.com>:

> On Wed, 21 May 2008, Daniel Noll wrote:
>>> But in my eyes having a (static) method called 
>>> HSSFCell.setMissingCellPolicy(MissingCellPolicy policy) in addition 
>>> to the normal HSSFCell getCell(short cellNum) would do a better job 
>>> than the MissingCellPolicy being set everytime getCell is called. 
>>> The standard MissingCellPolicy would be set to 
>>> RETURN_NULL_AND_BLANK. Existing code could be edited with minimal 
>>> effort using my suggestion.
>>
>> Personally I dislike static methods.  You never know when some other 
>> code running in the same JVM will set a value which you thought was 
>> untouchable.
>
> I agree. I could well see the case where you have two servlets on a 
> machine, one wanting one policy, one wanting another. POI needs to be 
> thread safe, and I have previously rejected patches that broke this.
>
>> Alternatively there could be some way of setting it via the 
>> top-level workbook object (that way your workbook is isolated from 
>> one opened by someone else's code which has different 
>> requirements...) but I'm not sure how plausible that is.
>
> HSSFRow has a reference to its parent HSSFWorkbook, so it should be 
> fairly easy. We'd then have:
>  row.getCell(num) - existing method, would now use policy off workbook
>  row.getCell(num,Policy) - new method, uses specified policy
>  workbook.getPolicy() - default workbook policy is as now (null & blanks)
>  workbook.setPolicy(Policy) - sets a new policy, used by all calls after that
>
> Does that look sensible to everyone?
>
> Nick
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>
>




---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Blank Cells

Posted by Nick Burch <ni...@torchbox.com>.
On Wed, 21 May 2008, David Fisher wrote:
> Though you might call the workbook routines:
>
> getMissingCellPolicy and setMissingCellPolicy.

Yes, good call. I didn't think too much about the name before writing the 
email :/

Support for this is now in trunk, and the ooxml branch


Hmm, we're now up to 11 entries in the changelog since 3.1 beta 2, and 
that isn't even out yet. Sorry Yegor...

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Blank Cells

Posted by David Fisher <df...@jmlafferty.com>.
On May 21, 2008, at 4:27 AM, Nick Burch wrote:

> On Wed, 21 May 2008, Daniel Noll wrote:
>>> But in my eyes having a (static) method called  
>>> HSSFCell.setMissingCellPolicy(MissingCellPolicy policy) in  
>>> addition to the normal HSSFCell getCell(short cellNum) would do a  
>>> better job than the MissingCellPolicy being set everytime getCell  
>>> is called. The standard MissingCellPolicy would be set to  
>>> RETURN_NULL_AND_BLANK. Existing code could be edited with minimal  
>>> effort using my suggestion.
>>
>> Personally I dislike static methods.  You never know when some  
>> other code running in the same JVM will set a value which you  
>> thought was untouchable.
>
> I agree. I could well see the case where you have two servlets on a  
> machine, one wanting one policy, one wanting another. POI needs to  
> be thread safe, and I have previously rejected patches that broke  
> this.
>
>> Alternatively there could be some way of setting it via the top- 
>> level workbook object (that way your workbook is isolated from one  
>> opened by someone else's code which has different requirements...)  
>> but I'm not sure how plausible that is.
>
> HSSFRow has a reference to its parent HSSFWorkbook, so it should be  
> fairly easy. We'd then have:
> row.getCell(num) - existing method, would now use policy off workbook
> row.getCell(num,Policy) - new method, uses specified policy
> workbook.getPolicy() - default workbook policy is as now (null &  
> blanks)
> workbook.setPolicy(Policy) - sets a new policy, used by all calls  
> after that
>
> Does that look sensible to everyone?

+1

Though you might call the workbook routines:

getMissingCellPolicy and setMissingCellPolicy.

Regards,
Dave

>
>
> Nick
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Blank Cells

Posted by Nick Burch <ni...@torchbox.com>.
On Wed, 21 May 2008, Daniel Noll wrote:
>> But in my eyes having a (static) method called 
>> HSSFCell.setMissingCellPolicy(MissingCellPolicy policy) in addition to 
>> the normal HSSFCell getCell(short cellNum) would do a better job than 
>> the MissingCellPolicy being set everytime getCell is called. The 
>> standard MissingCellPolicy would be set to RETURN_NULL_AND_BLANK. 
>> Existing code could be edited with minimal effort using my suggestion.
>
> Personally I dislike static methods.  You never know when some other 
> code running in the same JVM will set a value which you thought was 
> untouchable.

I agree. I could well see the case where you have two servlets on a 
machine, one wanting one policy, one wanting another. POI needs to be 
thread safe, and I have previously rejected patches that broke this.

> Alternatively there could be some way of setting it via the top-level 
> workbook object (that way your workbook is isolated from one opened by 
> someone else's code which has different requirements...) but I'm not 
> sure how plausible that is.

HSSFRow has a reference to its parent HSSFWorkbook, so it should be fairly 
easy. We'd then have:
  row.getCell(num) - existing method, would now use policy off workbook
  row.getCell(num,Policy) - new method, uses specified policy
  workbook.getPolicy() - default workbook policy is as now (null & blanks)
  workbook.setPolicy(Policy) - sets a new policy, used by all calls after that

Does that look sensible to everyone?

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Blank Cells

Posted by Daniel Noll <da...@nuix.com>.
On Wednesday 21 May 2008 03:03:00 Sebastian Frehmel wrote:
> Hi Nick
>
> That's really great!
> But in my eyes having a (static) method called
> HSSFCell.setMissingCellPolicy(MissingCellPolicy policy) in addition to
> the normal HSSFCell getCell(short cellNum) would do a better job than
> the MissingCellPolicy being set everytime getCell is called.
> The standard MissingCellPolicy would be set to RETURN_NULL_AND_BLANK.
> Existing code could be edited with minimal effort using my suggestion.
>
> What do you think?

Personally I dislike static methods.  You never know when some other code 
running in the same JVM will set a value which you thought was untouchable.  
If you need to use the same value for all calls you can always put a constant 
in your own code.

Alternatively there could be some way of setting it via the top-level workbook 
object (that way your workbook is isolated from one opened by someone else's 
code which has different requirements...) but I'm not sure how plausible that 
is.

Daniel

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Blank Cells

Posted by Sebastian Frehmel <uh...@stud.uni-karlsruhe.de>.
Hi Nick

That's really great!
But in my eyes having a (static) method called 
HSSFCell.setMissingCellPolicy(MissingCellPolicy policy) in addition to 
the normal HSSFCell getCell(short cellNum) would do a better job than 
the MissingCellPolicy being set everytime getCell is called.
The standard MissingCellPolicy would be set to RETURN_NULL_AND_BLANK. 
Existing code could be edited with minimal effort using my suggestion.

What do you think?

Thanks


Zitat von Nick Burch <ni...@torchbox.com>:

> On Tue, 6 May 2008, Anthony Andrews wrote:
>> To keep things clean, how about an overloaded version of the
>> getCell(int) method that takes a second parameter so getCell(int, int)
>> for example; manifest constants could be declared as well so that the
>> user could choose how to handle the cell if a null value was returned.
>
> OK, I've added something like this to svn trunk. There's now a new method:
> 	public HSSFCell getCell(int cellnum, MissingCellPolicy policy)
>
> The policies are below:
>
> /** Missing cells are returned as null, Blank cells are returned as normal */
> public static final MissingCellPolicy RETURN_NULL_AND_BLANK
> /** Missing cells are returned as null, as are blank cells */
> public static final MissingCellPolicy RETURN_BLANK_AS_NULL
> /** A new, blank cell is created for missing cells. Blank cells are
> returned as normal */
> public static final MissingCellPolicy CREATE_NULL_AS_BLANK
>
>
> Hopefully that covers everything you'd need?
>
> Nick
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>
>




---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Blank Cells

Posted by Nick Burch <ni...@torchbox.com>.
On Tue, 6 May 2008, Anthony Andrews wrote:
> To keep things clean, how about an overloaded version of the
> getCell(int) method that takes a second parameter so getCell(int, int)
> for example; manifest constants could be declared as well so that the
> user could choose how to handle the cell if a null value was returned.

OK, I've added something like this to svn trunk. There's now a new method:
	public HSSFCell getCell(int cellnum, MissingCellPolicy policy)

The policies are below:

/** Missing cells are returned as null, Blank cells are returned as normal */
public static final MissingCellPolicy RETURN_NULL_AND_BLANK
/** Missing cells are returned as null, as are blank cells */
public static final MissingCellPolicy RETURN_BLANK_AS_NULL
/** A new, blank cell is created for missing cells. Blank cells are
returned as normal */
public static final MissingCellPolicy CREATE_NULL_AS_BLANK


Hopefully that covers everything you'd need?

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Blank Cells

Posted by Anthony Andrews <py...@yahoo.com>.
Nick

To keep things clean, how about an overloaded version of the getCell(int) method that takes a second parameter so getCell(int, int) for example; manifest constants could be declared as well so that the user could choose how to handle the cell if a null value was returned.

Nick Burch <ni...@torchbox.com> wrote: On Mon, 5 May 2008, Daniel Noll wrote:
> > So a blank cell really is different from a null cell, as you can just
> > write straight into a blank cell, while a null cell needs to be created
> > first
>
> This is still exposing the underlying details if you ask me.  In an
> ideal world, getCell(int) would, for a null cell, either create the cell
> for us, or return something with the same interface which creates the
> real thing if you set something on it.

Would a "getOrCreateCell(int)" be sufficient for you? That'd check to see
if the get returned null, and create+return if so

Some people will wnat to know the difference between null, blank or normal
cells, so I'm not keen on breaking things for them by default

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org



       
---------------------------------
Be a better friend, newshound, and know-it-all with Yahoo! Mobile.  Try it now.

Re: Blank Cells

Posted by Nick Burch <ni...@torchbox.com>.
On Mon, 5 May 2008, Daniel Noll wrote:
> > So a blank cell really is different from a null cell, as you can just
> > write straight into a blank cell, while a null cell needs to be created
> > first
>
> This is still exposing the underlying details if you ask me.  In an
> ideal world, getCell(int) would, for a null cell, either create the cell
> for us, or return something with the same interface which creates the
> real thing if you set something on it.

Would a "getOrCreateCell(int)" be sufficient for you? That'd check to see
if the get returned null, and create+return if so

Some people will wnat to know the difference between null, blank or normal
cells, so I'm not keen on breaking things for them by default

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Blank Cells

Posted by Anthony Andrews <py...@yahoo.com>.
OK, thats a perfectly valid point of view and I can understand where you are coming from. However I do still believe that asking the usermodel to handle null cells in such a way is quite specific to a particular application.

Daniel Noll <da...@nuix.com> wrote: On Monday 05 May 2008 17:52:54 Anthony Andrews wrote:
> I would argue that you could create a layer to sit over the API that makes
> these types of decisions

Indeed, it's just that for other purposes, usermodel pretends to be such a 
layer.  If you want your "direct exposure of the XLS file with no convenience 
to the developer whatsoever", then use the underlying model, not usermodel.

Daniel

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org



       
---------------------------------
Be a better friend, newshound, and know-it-all with Yahoo! Mobile.  Try it now.

Re: Blank Cells

Posted by Daniel Noll <da...@nuix.com>.
On Monday 05 May 2008 17:52:54 Anthony Andrews wrote:
> I would argue that you could create a layer to sit over the API that makes
> these types of decisions

Indeed, it's just that for other purposes, usermodel pretends to be such a 
layer.  If you want your "direct exposure of the XLS file with no convenience 
to the developer whatsoever", then use the underlying model, not usermodel.

Daniel

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Blank Cells

Posted by Anthony Andrews <py...@yahoo.com>.
Sorry Daniel, I have to disagree.

My view of POI is that it simply (???) exposes the .xls file to an application. If it were to function as you suggest then it would be making assumptions on the part of the user - the application in this case - and your requirements may differ from mine. Surely it is our role as developers to decide how to handle the information POI exposes. I would argue that you could create a layer to sit over the API that makes these types of decisions - after all that is what we would do if coding an application that uses a large relational database for example.

Daniel Noll <da...@nuix.com> wrote: On Friday 02 May 2008 20:56:02 Nick Burch wrote:
> If you wanted to set all of those cells to hold the value 42.0, you'd do:
>
> row.getCell(0).setCellValue(42.0)
> row.getCell(1).setCellValue(42.0)
> row.getCell(2).setCellValue(42.0)
> row.createCell(3).setCellValue(42.0)
> row.getCell(4).setCellValue(42.0)
>
> So a blank cell really is different from a null cell, as you can just
> write straight into a blank cell, while a null cell needs to be created
> first

This is still exposing the underlying details if you ask me.  In an ideal 
world, getCell(int) would, for a null cell, either create the cell for us, or 
return something with the same interface which creates the real thing if you 
set something on it.

Daniel

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org



       
---------------------------------
Be a better friend, newshound, and know-it-all with Yahoo! Mobile.  Try it now.

Re: Blank Cells

Posted by Daniel Noll <da...@nuix.com>.
On Friday 02 May 2008 20:56:02 Nick Burch wrote:
> If you wanted to set all of those cells to hold the value 42.0, you'd do:
>
> row.getCell(0).setCellValue(42.0)
> row.getCell(1).setCellValue(42.0)
> row.getCell(2).setCellValue(42.0)
> row.createCell(3).setCellValue(42.0)
> row.getCell(4).setCellValue(42.0)
>
> So a blank cell really is different from a null cell, as you can just
> write straight into a blank cell, while a null cell needs to be created
> first

This is still exposing the underlying details if you ask me.  In an ideal 
world, getCell(int) would, for a null cell, either create the cell for us, or 
return something with the same interface which creates the real thing if you 
set something on it.

Daniel

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Blank Cells

Posted by Nick Burch <ni...@torchbox.com>.
On Fri, 2 May 2008, Daniel Noll wrote:
> This is fair enough at the model level, but having it in usermodel is 
> another story entirely.  It seems to me BLANK and null mean the same 
> thing for all intents and purposes, and yet the specifics of the file 
> format are being pushed all the way to the top level. :-)

Alas not quite. Consider the case:
   1.2   "String"  (blank)   (null)   4.2

If you wanted to set all of those cells to hold the value 42.0, you'd do:

row.getCell(0).setCellValue(42.0)
row.getCell(1).setCellValue(42.0)
row.getCell(2).setCellValue(42.0)
row.createCell(3).setCellValue(42.0)
row.getCell(4).setCellValue(42.0)

So a blank cell really is different from a null cell, as you can just 
write straight into a blank cell, while a null cell needs to be created 
first


I'll tweak the javadocs of the iterators to make it clearer about null 
cells, just as soon as svn comes back up...

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Blank Cells

Posted by Daniel Noll <da...@nuix.com>.
On Thursday 01 May 2008 20:24:10 Nick Burch wrote:
> On Wed, 30 Apr 2008, Tom Holmes Jr. wrote:
> > However, this only gets back cells with data in them ... so I don't know
> > why there is even a CELL_TYPE_BLANK.
>
> If you put some data into a cell, then delete it again, excel will
> normally still store the cell in the file, but with type blank. If you
> have never put anything in the cell, then it won't appear in the file

This is fair enough at the model level, but having it in usermodel is another 
story entirely.  It seems to me BLANK and null mean the same thing for all 
intents and purposes, and yet the specifics of the file format are being 
pushed all the way to the top level. :-)

> > The API docs don't offer much information on this. I presumed that
> > getting the cell iterator would internally handle these null cells and
> > they would register as CELL_TYPE_BLANK.

+1, it would be nice if usermodel would either (a) replace missing cells with 
blank cells, or (b) treat all missing/blank cells as null.

For the user of the API, the way it is now means checking for null and then 
checking for blank (and then if it's text, checking for empty text...)

Daniel

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org