You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@poi.apache.org by David Law <Da...@apconsult.de> on 2010/02/19 10:15:30 UTC

Bug in HSSFCellStyle.getDataFormatString( ?? )

Hi,

well no one responded to my original posting, so I've researched some more.

It would seem there is a problem in HSSFCellStyle.getDataFormatString().
Delving even deeper, HSSFCellStyle relies on a static list from 
BuiltinFormats.
The concept of fixed standard Builtin formats used in poi does not seem 
to hold water.

I have created a test Excel under german Windows & Excel & documented
which format strings were used & what results are displayed in poi & Excel.

I used exclusively the Excel standard formats. No custom formats.

See the list below.

Of course poi should be returning a language-neutral format, which it does,
(DD.MMM.YYYY instead of, for example, the german TT.MMM.JJJJ)
but the lexical content should match, so it is incorrect to return m/d/yy,
when the format TT.MM.JJJJ was selected.

(By the way, perhaps not surprisingly, Open Office exhibits the same 
behaviour)

Opening my Test Excel on an english w2k with Office 97 (yes you have
just found the one person still using it/them!!), there are also some 
issues,
but the field formatted as 14.01.1900, with poi format m/d/yy, is displayed
as 14/01/1900 & not as poi would have done (1/14/00). The format is also
shown as being a custom format: dd/mm/yyyy.

Somehow Excel english seems to half-recognize things from a foreign Excel,
so maybe we can detect the ethnic origins of an Excel file too?

Does anyone have any thoughts on this?

Best regards,
DaveLaw

The following is best viewed in a fixed-width font:

Val Excel Displays As Poi Format (Number: >String<) >Excel (de_DE) 
format as<
--- ----------------- 
----------------------------------------------------- 
-------------------------
1 1 1: >0< >0<
2 2,00 2: >0.00< >0,00<
3 3 3: >#,##0< >#.##0<
4 4,00 4: >#,##0.00< >#.##0,00<
5 5 € 5: >$#,##0_);($#,##0)< >#.##0 €;-#.##0 €<
6 6 € 6: >$#,##0_);[Red]($#,##0)< >#.##0 €;[Rot]-#.##0 €<
7 7,00 € 7: >$#,##0.00_);($#,##0.00)< >#.##0,00 €;-#.##0,00 €<
8 8,00 € 8: >$#,##0.00_);[Red]($#,##0.00)< >#.##0,00 €;[Rot]-#.##0,00 €<
9 900% 9: >0%< >0%<
10 1000,00% 10: >0.00%< >0,00%<
11 1,10E+01 11: >0.00E+00< >0,00E+00<
12 12 12: ># ?/?< ># ?/?<
13 13 13: ># ??/??< ># ??/??<
14 14.01.1900 14: >m/d/yy< >TT.MM.JJJJ<
15 15. Jan 00 15: >d-mmm-yy< >TT. MMM JJ<
16 16. Jan 16: >d-mmm< >TT. MMM<
17 Jan 00 17: >mmm-yy< >MMM JJ<
18 12:00 AM 18: >h:mm AM/PM< >h:mm AM/PM<
19 12:00:00 AM 19: >h:mm:ss AM/PM< >h:mm:ss AM/PM<
20 00:00 20: >h:mm< >hh:mm<
21 00:00:00 21: >h:mm:ss< >hh:mm:ss<
22 22.01.1900 00:00 22: >m/d/yy h:mm< >TT.MM.JJJJ hh:mm<
23 23 37: >#,##0_);(#,##0)< >#.##0 _€;-#.##0 _€<
24 24 38: >#,##0_);[Red](#,##0)< >#.##0 _€;[Rot]-#.##0 _€<
25 25,00 39: >#,##0.00_);(#,##0.00)< >#.##0,00 _€;-#.##0,00 _€<
26 26,00 40: >#,##0.00_);[Red](#,##0.00)< >#.##0,00 _€;[Rot]-#.##0,00 _€<
27 27 41: >_(*#,##0_);_(*(#,##0);_(* "-"_);_(@_)< >_-* #.##0 _€_-;-* 
#.##0 _€_-;_-* "-" _€_-;_-@_-<
28 28 € 42: >_($*#,##0_);_($*(#,##0);_($* "-"_);_(@_)< >_-* #.##0 €_-;-* 
#.##0 €_-;_-* "-" €_-;_-@_-<
29 29,00 43: >_(*#,##0.00_);_(*(#,##0.00);_(*"-"??_);_(@_)< >_-* 
#.##0,00 _€_-;-* #.##0,00 _€_-;_-* "-"?? _€_-;_-@_-<
30 30,00 € 44: >_($*#,##0.00_);_($*(#,##0.00);_($*"-"??_);_(@_)< >_-* 
#.##0,00 €_-;-* #.##0,00 €_-;_-* "-"?? €_-;_-@_-<
31 00:00 45: >mm:ss< >mm:ss<
32 768:00:00 46: >[h]:mm:ss< >[h]:mm:ss<
33 00:00,0 47: >mm:ss.0< >mm:ss,0<
34 34,0E+0 48: >##0.0E+0< >##0,0E+0<
35 35 49: >@< >@<

David Law wrote DataFormatter.formatCellValue :
> Hi,
>
> I am using...
> DataFormatter.formatCellValue(Cell cell, FormulaEvaluator evaluator)
> ...to format a Cell containing a Date.
>
> My (german) Excel shows the Date as follows: "14. Jan"
> (with a space between the dot and the "J" & no quotes)
>
> cell.getCellStyle().getDataFormatString() returns "d-mmm"
> whereas Excel has a user-defined format of "TT. MMM" (german for "DD. 
> MMM").
>
> DataFormatter.formatCellValue(Cell cell, FormulaEvaluator evaluator) 
> returns "14-Jan"
>
> (Likewise, poi formatted Excels "14. Mrz" as "14-Mrz", so the language 
> formatting is ok)
>
> Is this a bug? Can anyone help me get the correct results?
>
> Best Regards,
> DaveLaw
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Bug in HSSFCellStyle.getDataFormatString( ?? )

Posted by Nick Burch <ni...@alfresco.com>.
On Fri, 19 Feb 2010, David Law wrote:
> It would seem there is a problem in HSSFCellStyle.getDataFormatString().
> Delving even deeper, HSSFCellStyle relies on a static list from 
> BuiltinFormats.
> The concept of fixed standard Builtin formats used in poi does not seem to 
> hold water.

It's not our concept. It's a core excel concept. The list of formats from 
BuiltinFormats comes straight out of the microsoft documentation. If you 
grab the latest microsoft docs (june 2008), and turn to page 372 you'll 
see that same list


> I have created a test Excel under german Windows & Excel & documented
> which format strings were used & what results are displayed in poi & Excel.

It's possible that newer versions of excel translate some of the built in 
formats into local equivalents. However, what's generally supposed to 
happen is that if you want the german format, you create a custom format 
of that kind.

POI's only a file format library, not a replacement for excel. We give you 
what's in the file.

If you fancy producing a patch which allows per-locale versions of the 
built in formats, I think we'd be happy to accept it, but it'll be a lot 
of work for you switching your machine between each one to discover the 
magic that excel does and which isn't in the documentation...

> (By the way, perhaps not surprisingly, Open Office exhibits the same 
> behaviour)

They almost certainly read the same public documentation that we did!

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Bug in HSSFCellStyle.getDataFormatString( ?? )

Posted by David Law <Da...@apconsult.de>.
oh, I just love systems that reformat stuff...
...computers (in particular microsoft) should leave the thinking to 
human beings!

Pls find formatted list in attachment. :-)

DaveLaw

David Law wrote:
> Hi,
>
> well no one responded to my original posting, so I've researched some 
> more.
>
> It would seem there is a problem in HSSFCellStyle.getDataFormatString().
> Delving even deeper, HSSFCellStyle relies on a static list from 
> BuiltinFormats.
> The concept of fixed standard Builtin formats used in poi does not 
> seem to hold water.
>
> I have created a test Excel under german Windows & Excel & documented
> which format strings were used & what results are displayed in poi & 
> Excel.
>
> I used exclusively the Excel standard formats. No custom formats.
>
> See the list below.
>
> Of course poi should be returning a language-neutral format, which it 
> does,
> (DD.MMM.YYYY instead of, for example, the german TT.MMM.JJJJ)
> but the lexical content should match, so it is incorrect to return 
> m/d/yy,
> when the format TT.MM.JJJJ was selected.
>
> (By the way, perhaps not surprisingly, Open Office exhibits the same 
> behaviour)
>
> Opening my Test Excel on an english w2k with Office 97 (yes you have
> just found the one person still using it/them!!), there are also some 
> issues,
> but the field formatted as 14.01.1900, with poi format m/d/yy, is 
> displayed
> as 14/01/1900 & not as poi would have done (1/14/00). The format is also
> shown as being a custom format: dd/mm/yyyy.
>
> Somehow Excel english seems to half-recognize things from a foreign 
> Excel,
> so maybe we can detect the ethnic origins of an Excel file too?
>
> Does anyone have any thoughts on this?
>
> Best regards,
> DaveLaw
>
> The following is best viewed in a fixed-width font:
>
> Val Excel Displays As Poi Format (Number: >String<) >Excel (de_DE) 
> format as<
> --- ----------------- 
> ----------------------------------------------------- 
> -------------------------
> 1 1 1: >0< >0<
> 2 2,00 2: >0.00< >0,00<
> 3 3 3: >#,##0< >#.##0<
> 4 4,00 4: >#,##0.00< >#.##0,00<
> 5 5 € 5: >$#,##0_);($#,##0)< >#.##0 €;-#.##0 €<
> 6 6 € 6: >$#,##0_);[Red]($#,##0)< >#.##0 €;[Rot]-#.##0 €<
> 7 7,00 € 7: >$#,##0.00_);($#,##0.00)< >#.##0,00 €;-#.##0,00 €<
> 8 8,00 € 8: >$#,##0.00_);[Red]($#,##0.00)< >#.##0,00 €;[Rot]-#.##0,00 €<
> 9 900% 9: >0%< >0%<
> 10 1000,00% 10: >0.00%< >0,00%<
> 11 1,10E+01 11: >0.00E+00< >0,00E+00<
> 12 12 12: ># ?/?< ># ?/?<
> 13 13 13: ># ??/??< ># ??/??<
> 14 14.01.1900 14: >m/d/yy< >TT.MM.JJJJ<
> 15 15. Jan 00 15: >d-mmm-yy< >TT. MMM JJ<
> 16 16. Jan 16: >d-mmm< >TT. MMM<
> 17 Jan 00 17: >mmm-yy< >MMM JJ<
> 18 12:00 AM 18: >h:mm AM/PM< >h:mm AM/PM<
> 19 12:00:00 AM 19: >h:mm:ss AM/PM< >h:mm:ss AM/PM<
> 20 00:00 20: >h:mm< >hh:mm<
> 21 00:00:00 21: >h:mm:ss< >hh:mm:ss<
> 22 22.01.1900 00:00 22: >m/d/yy h:mm< >TT.MM.JJJJ hh:mm<
> 23 23 37: >#,##0_);(#,##0)< >#.##0 _€;-#.##0 _€<
> 24 24 38: >#,##0_);[Red](#,##0)< >#.##0 _€;[Rot]-#.##0 _€<
> 25 25,00 39: >#,##0.00_);(#,##0.00)< >#.##0,00 _€;-#.##0,00 _€<
> 26 26,00 40: >#,##0.00_);[Red](#,##0.00)< >#.##0,00 _€;[Rot]-#.##0,00 _€<
> 27 27 41: >_(*#,##0_);_(*(#,##0);_(* "-"_);_(@_)< >_-* #.##0 _€_-;-* 
> #.##0 _€_-;_-* "-" _€_-;_-@_-<
> 28 28 € 42: >_($*#,##0_);_($*(#,##0);_($* "-"_);_(@_)< >_-* #.##0 
> €_-;-* #.##0 €_-;_-* "-" €_-;_-@_-<
> 29 29,00 43: >_(*#,##0.00_);_(*(#,##0.00);_(*"-"??_);_(@_)< >_-* 
> #.##0,00 _€_-;-* #.##0,00 _€_-;_-* "-"?? _€_-;_-@_-<
> 30 30,00 € 44: >_($*#,##0.00_);_($*(#,##0.00);_($*"-"??_);_(@_)< >_-* 
> #.##0,00 €_-;-* #.##0,00 €_-;_-* "-"?? €_-;_-@_-<
> 31 00:00 45: >mm:ss< >mm:ss<
> 32 768:00:00 46: >[h]:mm:ss< >[h]:mm:ss<
> 33 00:00,0 47: >mm:ss.0< >mm:ss,0<
> 34 34,0E+0 48: >##0.0E+0< >##0,0E+0<
> 35 35 49: >@< >@<
>
> David Law wrote DataFormatter.formatCellValue :
>> Hi,
>>
>> I am using...
>> DataFormatter.formatCellValue(Cell cell, FormulaEvaluator evaluator)
>> ...to format a Cell containing a Date.
>>
>> My (german) Excel shows the Date as follows: "14. Jan"
>> (with a space between the dot and the "J" & no quotes)
>>
>> cell.getCellStyle().getDataFormatString() returns "d-mmm"
>> whereas Excel has a user-defined format of "TT. MMM" (german for "DD. 
>> MMM").
>>
>> DataFormatter.formatCellValue(Cell cell, FormulaEvaluator evaluator) 
>> returns "14-Jan"
>>
>> (Likewise, poi formatted Excels "14. Mrz" as "14-Mrz", so the 
>> language formatting is ok)
>>
>> Is this a bug? Can anyone help me get the correct results?
>>
>> Best Regards,
>> DaveLaw
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>
>