Posted to user@poi.apache.org by Henry Martin <hh...@mac.com> on 2007/07/11 19:36:14 UTC

Re: HSSFCell.getRichStringCellValue() doesnt read properly Greek Polytonic characters

On Jul 11, 2007, at 11:21 AM, Filippos Papadopoulos wrote:

> Hi,
> I am using POI 3.0.1-FINAL-20070705 and am trying to read polytonic
> Greek words written inside Excel cells. I use the following code to
> print each cell. The problem is that although getRichStringCellValue()
> reads modern Greek perfectly, it renders the polytonic Greek
> characters (which carry accents) as '?'. Am I doing something wrong,
> or is this a known problem?
>
> HSSFCell cell = row.getCell((short) j);
> int cellType = cell.getCellType();
> if (cellType == HSSFCell.CELL_TYPE_BOOLEAN) {
>     ...
> } else if (cellType == HSSFCell.CELL_TYPE_NUMERIC) {
>     ...
> } else if (cellType == HSSFCell.CELL_TYPE_STRING) {
>     System.out.print(" " + cell.getRichStringCellValue());
> }
>

I am using the event API, so my code is different, and I can't say off
the top of my head whether it will be useful to you. In case it is
helpful, here is a segment of the code I am using to extract Arabic
text from a spreadsheet and get it into a text file.

The last couple of lines, which get the bytes in the UTF-8 encoding and 
then create a new String, seemed to be necessary at some point although 
I can't tell you why at the moment. I remember getting the question 
marks and my recollection is that these lines were added to fix that 
problem. They could also be there because in the full code there is an 
option to convert the string to a hexadecimal representation and doing 
that requires pulling the bytes out of the string.

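     // Fragment of an event API listener. It assumes that sstrec (an
     // SSTRecord) and arabicText (a String) are fields set elsewhere
     // in the full code.
     public void processRecord(Record record)
     {
         String outStr;
         String compositeID = "1234";   // just to have a value below

         switch (record.getSid())
         {
             case LabelSSTRecord.sid:
                 LabelSSTRecord lrec = (LabelSSTRecord) record;
                 String cellVal = sstrec.getString(lrec.getSSTIndex());

                 if (lrec.getColumn() == 1) {
                     arabicText = cellVal.trim();

                     //System.out.println(" output from arabic text parse");
                     // fix Farsi keyboard KAF and YEH
                     arabicText = arabicText.replace('\u06a9', '\u0643');
                     arabicText = arabicText.replace('\u06cc', '\u064a');
                     try {
                         // Encode as UTF-8, then decode with the same charset.
                         // new String(bytes) alone uses the platform default.
                         byte[] utf8Bytes =
                             (arabicText + compositeID).getBytes("UTF-8");
                         outStr = new String(utf8Bytes, "UTF-8");
                     } catch (java.io.UnsupportedEncodingException e) {
                         // every Java platform is required to support UTF-8
                         throw new RuntimeException(e);
                     }
                 }
                 break;
         }
     }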
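
One possible explanation for the question marks, offered as a guess rather than a diagnosis, is that the text passes through a charset that cannot represent the characters, either when bytes are produced or when they are printed with the platform default encoding. The sketch below is not part of my original code: the class and method names are hypothetical, and it uses java.nio.charset.StandardCharsets (Java 7 and later) instead of the "UTF-8" string name. It shows that a UTF-8 round trip preserves polytonic Greek, while encoding to ISO-8859-1 turns every unmappable character into '?'.

```java
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class EncodingRoundTrip {

    // Encode to UTF-8 and decode with the same charset: lossless.
    static String roundTripUtf8(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encode and decode with an arbitrary charset. String.getBytes
    // substitutes the charset's replacement byte for characters it
    // cannot map, so information can be lost here.
    static String roundTripVia(String s, Charset cs) {
        byte[] bytes = s.getBytes(cs);
        return new String(bytes, cs);
    }

    public static void main(String[] args) throws Exception {
        // A polytonic Greek word, written with escapes so the example
        // does not depend on the source file encoding.
        String greek = "\u1F04\u03BD\u03B8\u03C1\u03C9\u03C0\u03BF\u03C2";

        // Print through a stream with an explicit encoding instead of
        // relying on the platform default used by System.out.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(roundTripUtf8(greek));
        out.println(roundTripVia(greek, StandardCharsets.ISO_8859_1));
    }
}
```

If the strings coming out of POI are already correct, wrapping System.out in a PrintStream with an explicit encoding, as in main above, may be all that is needed. Whether the output then displays correctly still depends on the console or file viewer.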


