You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@poi.apache.org by bu...@apache.org on 2016/11/27 09:57:26 UTC

[Bug 60422] New: DataFormatter.formatCellValue retunrs incorrect value for german 'Buchhaltung' format

https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

            Bug ID: 60422
           Summary: DataFormatter.formatCellValue retunrs incorrect value
                    for german 'Buchhaltung' format
           Product: POI
           Version: 3.15-FINAL
          Hardware: PC
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HSSF
          Assignee: dev@poi.apache.org
          Reporter: rbri@rbri.de
  Target Milestone: ---

Created attachment 34481
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34481&action=edit
the input file

I have some code that tries to reproduce the visible values of a spreadsheet.
So i made a small test sheet with cells using different formats.
Works fine so far.

But starting with version 3.14 i got strange results for one cell.

The cell is formatted as 'Buchhaltung' (see screenshot).
The (visible) value of the cell is '4,33 €' in excel and '0,004,,33.00 €' when
using poi.

My code is this / the locale used is Locale.GERMAN:

  /**
   * Reads the content of an excel cell and converts it into the string
   * visible in the excel sheet.
   *
   * @param aRow the row
   * @param aColumnsNo the column
   * @param aFormulaEvaluator the formula Evaluator
   * @param aLocale used for parsing and formating
   * @return the display string
   */
  public static String readCellContentAsString(final Row aRow, final int
aColumnsNo,
      final FormulaEvaluator aFormulaEvaluator, final Locale aLocale) {
    final Cell tmpCell = aRow.getCell(aColumnsNo);
    if (null == tmpCell) {
      return null;
    }

    final DataFormatter tmpDataFormatter = new DataFormatter(aLocale);
    try {
      final String tmpResult = tmpDataFormatter.formatCellValue(tmpCell,
aFormulaEvaluator);
      return tmpResult;
    } catch (final NotImplementedException e) {
      final StringBuilder tmpMsg = new StringBuilder(e.getMessage());
      if (null != e.getCause()) {
        tmpMsg.append(" (");
        tmpMsg.append(e.getCause().toString());
        tmpMsg.append(')');
      }
      LOG.error(tmpMsg.toString());
      final String tmpResult = tmpDataFormatter.formatCellValue(tmpCell, null);
      return tmpResult;
    }
  }

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #18 from PJ Fanning <fa...@yahoo.com> ---
I've published a new attempt at fixing this at
https://github.com/apache/poi/pull/62

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


Re: [Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by Greg Woolsey <gr...@gmail.com>.
I'm suspicious Excel is doing extra work to translate formats for display
to match the OS locale display settings even though the underlying storage
always uses English standards.

On Wed, Jun 28, 2017, 11:24 <bu...@apache.org> wrote:

> https://bz.apache.org/bugzilla/show_bug.cgi?id=60422
>
> --- Comment #16 from PJ Fanning <fa...@yahoo.com> ---
> Looking at wet_test.xls, the 4.33 cell has this format.
>
>  CELL_RECORD_TYPE
>     .formatindex     = aa
>
> This cell format is added as a custom format in FormatTrackingHSSFListener.
> The Format String is:
> _-* #,##0.00\ "€"_-;\-* #,##0.00\ "€"_-;_-* "-"??\ "€"_-;_-@_-
>
> This is not what Excel run in Germany locale on My Mac shows - it shows
> this
> for Accounting/Buchhaltung:
> _-* #.##0,00 €_-;-* #.##0,00 €_-;_-* "-"?? €_-;_-@_-
>
> Note that the decimal and grouping separators do match up.
>
> I don't know much about xls data format so I saved the file as an xlsx and
> checked the xl/styles.xml and the format string seems to match up to the
> #,##0.00 format as opposed to #.##0,00.
>
> Does anyone have any insight on this?
>
> --
> You are receiving this mail because:
> You are the assignee for the bug.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
> For additional commands, e-mail: dev-help@poi.apache.org
>
>

[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #16 from PJ Fanning <fa...@yahoo.com> ---
Looking at wet_test.xls, the 4.33 cell has this format.

 CELL_RECORD_TYPE
    .formatindex     = aa

This cell format is added as a custom format in FormatTrackingHSSFListener.
The Format String is:
_-* #,##0.00\ "€"_-;\-* #,##0.00\ "€"_-;_-* "-"??\ "€"_-;_-@_-

This is not what Excel run in Germany locale on My Mac shows - it shows this
for Accounting/Buchhaltung:
_-* #.##0,00 €_-;-* #.##0,00 €_-;_-* "-"?? €_-;_-@_-

Note that the decimal and grouping separators do match up.

I don't know much about xls data format so I saved the file as an xlsx and
checked the xl/styles.xml and the format string seems to match up to the
#,##0.00 format as opposed to #.##0,00.

Does anyone have any insight on this?

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #22 from RBRi <rb...@rbri.de> ---
(In reply to PJ Fanning from comment #21)
> RBRi - would you be able to try the proposed fix that I have posted to
> https://github.com/apache/poi/pull/62 ?

Looks good, the new test passing.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

Tobias Wörenkämper <to...@gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|DataFormatter.formatCellVal |DataFormatter.formatCellVal
                   |ue returns incorrect value  |ue returns incorrect value
                   |for german 'Buchhaltung'    |for German 'Buchhaltung'
                   |format                      |format

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue retunrs incorrect value for german 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #1 from RBRi <rb...@rbri.de> ---
Created attachment 34482
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34482&action=edit
Screenshot of the format dialog

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #20 from RBRi <rb...@rbri.de> ---
Created attachment 35092
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35092&action=edit
patch adding tests for german locale to TestDataFormatter

patch adding tests for german locale to TestDataFormatter

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for german 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

RBRi <rb...@rbri.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|                            |All
            Summary|DataFormatter.formatCellVal |DataFormatter.formatCellVal
                   |ue retunrs incorrect value  |ue returns incorrect value
                   |for german 'Buchhaltung'    |for german 'Buchhaltung'
                   |format                      |format

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #6 from Andreas Beeker <ki...@apache.org> ---
Although I understand German ... I don't understand the potential fix :)

How about not using Locale.US directly, but setting it via
LocaleUtil.setUserLocale() before calling the formatter.

 ... so it looks like, the format strings are always parsed in a
Locale.US-manner?

Btw. it's ok to remind us, if a bug entry is important to you ... but we are
simply too few people to catch up with the 488 bugs.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #8 from Andreas Beeker <ki...@apache.org> ---
First of all, I know that DataFormatter is not trivial and we also had some
discussions about it [1]

When using an unspecific default locale, we usually use Locale.ROOT - but I'm
not sure, if using .ROOT or .US as the default for LocaleUtil.getUserLocale()
(instead of Locale.getDefault()) would be less pain, especially since there a
quite a few LocaleUtil invocations.

So how to proceed from here? ...
- check if Locale.US is working for other localized excel formats
- or what's wrong with the "Buchhaltung" format "_-* #.##0,00 "€"_-;-* #.##0,00
"€"_-;_-* "-"?? "€"_-;_-@_-" ... it looks like custom formats need to be
processed with the Locale of the file - where is it defined? (HPSF has
sometimes a Locale entry)



[1]
http://apache-poi.1045710.n5.nabble.com/DataFormatter-vs-org-apache-poi-ss-format-classes-td5721100.html

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #19 from RBRi <rb...@rbri.de> ---
Made a first patch for adding tests for german locale to the test. Please give
me a sign, if you like more of this.

Looks like the expectations for testPaddingSpaces() for us are also not
correct.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #25 from WI <wi...@yopmail.com> ---
And in writeFractional as well.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #13 from PJ Fanning <fa...@yahoo.com> ---
I like Javen's idea of adding something like a LocaleUtil set/getDefaultLocale.

I'm wondering if it would be useful to add extra tests like this sample I just
produced:

    @Test
    public void testDateFormattingWithLocales() {
        // 2017-12-01 09:54:33 which is 42747.412892397523 as double
        DataFormatter dfDE = new DataFormatter(Locale.GERMANY);
        DataFormatter dfZH = new DataFormatter(Locale.PRC);
        DataFormatter dfIE = new DataFormatter(new Locale("GA", "IE"));
        double date = 42747.412892397523;
        String format = "dd MMMM yyyy HH:mm:ss";
        assertEquals("12 Januar 2017 09:54:33",
dfDE.formatRawCellContents(date, -1, format));
        assertEquals("12 \u4E00\u6708 2017 09:54:33",
dfZH.formatRawCellContents(date, -1, format));
        assertEquals("12 Ean\u00E1ir 2017 09:54:33",
dfIE.formatRawCellContents(date, -1, format));
    }

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #2 from WI <wi...@yopmail.com> ---
C'mon guys, this library is useless if it only works for US/English. 8 months
passed and not a SINGLE response? Is this project dead and we should look for
alternatives?

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #7 from PJ Fanning <fa...@yahoo.com> ---
Andreas - it seems like the existing code works if the user sets the POI
UserLocale to US.

import org.apache.poi.util.LocaleUtil;
LocaleUtil.setUserLocale(Locale.US);

Going forward, I think we should not require the user to make this LocaleUtil
call.

Would it make sense to have LocaleUtil user locale default to Locale.US?

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #9 from Javen O'Neal <on...@apache.org> ---
The changes in https://github.com/apache/poi/pull/60 look fine to me. Should we
continue the discussion til after the 3.17 beta 1 release?

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #24 from WI <wi...@yopmail.com> ---
FWIW:I think this bug/issue was introduced with the below commit 

https://bz.apache.org/bugzilla/show_bug.cgi?id=58326 (Commit 1701688 in the
SVN)

Index: CellNumberFormatter.java
===================================================================
--- CellNumberFormatter.java    (revision 1701687)
+++ CellNumberFormatter.java    (revision 1701688)
@@ -17,6 +17,7 @@
 package org.apache.poi.ss.format;

 import java.text.DecimalFormat;
+import java.text.DecimalFormatSymbols;
 import java.text.FieldPosition;
 import java.util.BitSet;
 import java.util.Collections;
@@ -30,6 +31,7 @@
 import java.util.regex.Matcher;

 import org.apache.poi.ss.format.CellFormatPart.PartHandler;
+import org.apache.poi.util.LocaleUtil;

 /**
  * This class implements printing out a value using a number format.
@@ -182,8 +184,8 @@
         private char insertSignForExponent;

         public String handlePart(Matcher m, String part, CellFormatType type,
-                StringBuffer desc) {
-            int pos = desc.length();
+                StringBuffer descBuf) {
+            int pos = descBuf.length();
             char firstCh = part.charAt(0);
             switch (firstCh) {
             case 'e':
@@ -203,7 +205,7 @@
             case '#':
                 if (insertSignForExponent != '\0') {
                     specials.add(new Special(insertSignForExponent, pos));
-                    desc.append(insertSignForExponent);
+                    descBuf.append(insertSignForExponent);
                     insertSignForExponent = '\0';
                     pos++;
                 }
@@ -354,7 +356,8 @@
             fmtBuf.append('E');
             placeZeros(fmtBuf, exponentSpecials.subList(2,
                     exponentSpecials.size()));
-            decimalFmt = new DecimalFormat(fmtBuf.toString());
+            DecimalFormatSymbols dfs =
DecimalFormatSymbols.getInstance(LocaleUtil.getUserLocale());
+            decimalFmt = new DecimalFormat(fmtBuf.toString(), dfs);
         }

         if (exponent != null)
@@ -594,7 +597,7 @@
             writeFraction(value, null, fractional, output, mods);
         } else {
             StringBuffer result = new StringBuffer();
-            Formatter f = new Formatter(result);
+            Formatter f = new Formatter(result, LOCALE);
             try {
                 f.format(LOCALE, printfFmt, value);
             } finally {
@@ -767,6 +770,7 @@
         writeInteger(exponentNum, output, exponentDigitSpecials, mods, false);
     }

+    @SuppressWarnings("unchecked")
     private void writeFraction(double value, StringBuffer result,
             double fractional, StringBuffer output, Set<StringMod> mods) {

@@ -869,7 +873,7 @@
             List<Special> numSpecials, Set<StringMod> mods) {

         StringBuffer sb = new StringBuffer();
-        Formatter formatter = new Formatter(sb);
+        Formatter formatter = new Formatter(sb, LOCALE);
         try {
             formatter.format(LOCALE, fmt, num);
         } finally {


Which change the way the locale is obtained. This I think brought the
"123.456,00" into the format, which then has brings new issues in
CellNumberFormatter.java inside the "writeInteger" method, because the "." and
"," are hardcoded there. 

As far as I get it, this method removes leading and trailing zeros from format
(i.e. 0000123,567.00" and adds/removes as well as replace the "," and "." with
locale specific symbols

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #15 from PJ Fanning <fa...@yahoo.com> ---
Actally, it seems that I'm wrong about the '.' in the format string - it does
seem that this should be a '.' in the result and that if you want ',' in the
output then your format must be something like '#.##0,00'.
I've changed my Mac's locale to Germany and the Accounting/Buchhaltung format
does appear to be '#.##0,00'. I'll need to double check the parsing code that I
used to extract the formatString from attached xls file because that seems to
give me a US locale formatString instead of the Germany equivalent.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #14 from PJ Fanning <fa...@yahoo.com> ---
I've been doing a bit more experimentation.
It seems that simpler formats are handled correctly, locale wise.
I think some of the main issues are:
* DataFormatter should set the LocaleUtils user locale to match the
DataFormatter locale if set (and if LocaleUtils does not have an explicit
locale set already)
* org.apache.poi.ss.format.CellNumberFormatter has a lot of hardcoding about
commas for grouping separators and dots for decimal separators - should use
DecimalFormatSymbols class
* org.apache.poi.ss.format.CellNumberFormatter also applies the separator from
the format string but my understanding is that "#,##0.00" means to apply locale
specific separators, that the '.' is assumed to mean apply locale specific
decimal separator as oppposed to specifically apply '.' as the decimal
separator.
* need to check if org.apache.poi.ss.format.CellNumberFormatter assumes 3
digits in each group - in India, they have lakhs and write 1 million (10 lakh)
as 10,00,000

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #21 from PJ Fanning <fa...@yahoo.com> ---
RBRi - would you be able to try the proposed fix that I have posted to
https://github.com/apache/poi/pull/62 ?

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

PJ Fanning <fa...@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #23 from PJ Fanning <fa...@yahoo.com> ---
https://svn.apache.org/viewvc?view=revision&revision=1800713

I will follow up with new issues to cover extra cases, eg Indian locale number
formatting, trying to get some of the commented out test cases to work in
TestDataFormatter and taking a look at reducing the amount the formatting code.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #4 from PJ Fanning <fa...@yahoo.com> ---
I added a test case and potential fix in https://github.com/apache/poi/pull/60

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

Javen O'Neal <on...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|HSSF                        |SS Common

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #11 from Javen O'Neal <on...@apache.org> ---
Calling LocaleUtil.setUserLocale inside the CellNumberFormatter.formatValue
means that the function has a side effect, which may be surprising to a user.
If we restore the UserLocale at the exit of formatValue, then there are still
side effects from parallel execution.

Are there parts of the Microsoft Office file format spec that use the user's OS
language and locale, some that use language and locale that are set in
Microsoft Office by the user or by the installation medium, and some that are
hard-coded to a particular value?

If these are distinct concepts, perhaps we should have different getters and
setters in LocaleUtil, and we can use the appropriate kind of locale (user, os,
hard-coded) throughout POI according to how Microsoft Office interprets the
files.

Rather than picking between ROOT and US as the default Locale, we could alias
the "Microsoft Excel default locale", using that locale throughout our code,
which would mean the only place we'd have to change it is where that alias is
assigned. We could also let the user change that alias if that's needed for any
reason.
setUserLocale and setDefaultLocale...

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #10 from Javen O'Neal <on...@apache.org> ---
Calling LocaleUtil.setUserLocale inside the CellNumberFormatter.formatValue
means that the function has a side effect, which may be surprising to a user.
If we restore the UserLocale at the exit of formatValue, then there are still
side effects from parallel execution.

Rather than picking between ROOT and US as the default Locale, we could alias
the "Microsoft Excel default locale", using that locale throughout our code,
which would mean the only place we'd have to change it is where that alias is
assigned. We could also let the user change that alias if that's needed for any
reason.
setUserLocale and setDefaultLocale...

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #17 from RBRi <rb...@rbri.de> ---
In general Excel respects the locale settings (at least on Win it is possible
to configure this independently from the locale to whatever you like). But i
have not seen any problems when exchanging documents between different
settings. So my good-feeling is:
The format string stored inside the file uses chars with a fixed semantic
(means there are some special chars like -,#,0, the dot, the comma and so on).

I would expect that the format string parser has a list of constants (chars)
acting as tokens for the parser.

> This cell format is added as a custom format in FormatTrackingHSSFListener.
> The Format String is:
> _-* #,##0.00\ "€"_-;\-* #,##0.00\ "€"_-;_-* "-"??\ "€"_-;_-@_-

The next step is the display. MS seems to have made the decision to always use
a localized display. If this is clever or not (at least when showing the format
string might be a different story).

> This is not what Excel run in Germany locale on My Mac shows - it shows this for > Accounting/Buchhaltung:
> _-* #.##0,00 €_-;-* #.##0,00 €_-;_-* "-"?? €_-;_-@_-

So i can confirm the effect here. If you like to proof this you can change the
decimal delimiter to e.g. 'y' and the display of the number in excel itself and
the display of the format string changes in my case it looks like this
_-* #.##0y00 €_-;-* #.##0y00 €_-;_-* "-"?? €_-;_-@_-

From the POI API point of view i like to see this behavior:
   final DataFormatter tmpDataFormatter = new DataFormatter(aLocale);
The locale provided here represents the windows system settings, if i change
eg.g the decimal separator here the following call of 
    final String tmpResult = tmpDataFormatter.formatCellValue(tmpCell,
aFormulaEvaluator);
should respect my separator.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #3 from Mark Murphy <jm...@apache.org> ---
(In reply to WI from comment #2)
> C'mon guys, this library is useless if it only works for US/English. 8
> months passed and not a SINGLE response? Is this project dead and we should
> look for alternatives?

I'm sorry, I don't know German, so it is difficult for me to help you. 

This project is certainly not dead. The devs here all have jobs other than POI,
and work on this in their free time, or provide patches that pertain directly
to their jobs. So our priorities may not be the same as yours. However, we are
happy to review patches submitted from the community. 

If you decide to contribute to the project, be sure to read the contribution
guidelines here https://poi.apache.org/guidelines.html, and include unit tests,
examples, and updates to the 'How To' document with your patch.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for german 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

RBRi <rb...@rbri.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |regression

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #12 from RBRi <rb...@rbri.de> ---
>- check if Locale.US is working for other localized excel formats
I think that is the base line of this problem. We have to understand if the
format description is locale dependent or not. If you need someone to provide
more german excel files you can count on me.

And regarding the locale. My program processes the excel file on a machine
different from the one in was created on (i guess this is the usual use case
for POI). This implies from my point of view, that i have to know the locale
setting of the creating machine if i like to reproduce the same presentations
of the cell values. That is the reason, for the locale i hand over to the
method and in the end to the DataFormatter constructor. From the user point of
view this is all i can do and all i like to do.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 60422] DataFormatter.formatCellValue returns incorrect value for German 'Buchhaltung' format

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60422

--- Comment #5 from RBRi <rb...@rbri.de> ---
> I'm sorry, I don't know German, so it is difficult for me to help you.
Maybe yes but that was the reason why i spend some time to make a reproducible
case for you. I have attached the spreadsheet, have added the complete source
code for reproducing the problem and switching the default locale of the JVM is
no rocket science.
I think for open source software we have to take special care of different user
locales - that is part of the story.
And maybe this is a hint for your ci test suite also; maybe it is a good idea
to do the regression test with different locales.
And yes i know that your (we - because i spend also a lot of time doing open
source software development) are short of time. But maybe a little hint or
response will be great or even better that you have tried to reproduce my case. 

Thanks for working on the this great project - it is a real help.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org