Posted to user@poi.apache.org by Henry Martin <hh...@mac.com> on 2007/07/11 19:36:14 UTC
Re: HSSFCell.getRichStringCellValue() doesnt read properly Greek Polytonic characters
On Jul 11, 2007, at 11:21 AM, Filippos Papadopoulos wrote:
> Hi,
> I am using POI 3.0.1-FINAL-20070705 and I am trying to read polytonic
> Greek words written inside Excel cells. I use the following code to
> print each cell. The problem is that although getRichStringCellValue()
> reads modern Greek perfectly, it renders the Greek polytonic
> characters (which have accents) as '?'. Am I doing something wrong, or
> is this a known problem?
>
> HSSFCell cell = row.getCell((short) j);
> int cellType = cell.getCellType();
> if (cellType == HSSFCell.CELL_TYPE_BOOLEAN) {
>     ...
> } else if (cellType == HSSFCell.CELL_TYPE_NUMERIC) {
>     ...
> } else if (cellType == HSSFCell.CELL_TYPE_STRING) {
>     System.out.print(" " + cell.getRichStringCellValue());
> }
>
I am using the event API, so my code is different, and I can't say off
the top of my head whether it will be useful to you. In case it helps,
here is a segment of the code I use to extract Arabic text from a
spreadsheet and write it to a text file.

The last couple of lines, which get the bytes in the UTF-8 encoding and
then create a new String, seemed to be necessary at some point, although
I can't tell you why at the moment. I remember getting the question
marks, and my recollection is that these lines were added to fix that
problem. They could also be there because the full code has an option
to convert the string to a hexadecimal representation, and doing that
requires pulling the bytes out of the string.

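// Fragment of an HSSFListener implementation. It assumes the enclosing class
// declares the fields sstrec (the workbook's SSTRecord, captured elsewhere)
// and arabicText, and imports java.nio.charset.StandardCharsets.
public void processRecord(Record record)
{
    String outStr;
    String compositeID = "1234"; // just to have a value below
    switch (record.getSid())
    {
        case LabelSSTRecord.sid:
            LabelSSTRecord lrec = (LabelSSTRecord) record;
            // look up the cell text in the shared string table
            String cellVal = sstrec.getString(lrec.getSSTIndex());
            if (lrec.getColumn() == 1) {
                arabicText = cellVal.trim();
                //System.out.println(" output from arabic text parse");
                // fix Farsi keyboard KAF and YEH
                arabicText = arabicText.replace('\u06a9', '\u0643');
                arabicText = arabicText.replace('\u06cc', '\u064a');
                // Encode and decode with the same explicit charset. Calling
                // new String(bytes) without one uses the platform default
                // charset and can garble the text.
                byte[] utf8Bytes =
                    (arabicText + compositeID).getBytes(StandardCharsets.UTF_8);
                outStr = new String(utf8Bytes, StandardCharsets.UTF_8);
            }
            break;
    }
}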
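
For what it is worth, here is a small standalone sketch (no POI involved)
of where question marks can come from and of the hexadecimal option
mentioned above. The class and method names (EncodingSketch, roundTrip,
toUtf8Hex) are made up for illustration and are not part of my real code
or of POI. It is only a guess that this is what is happening in your
case: the idea is that the string read from the cell may be fine, and the
'?' appears when it is encoded to a charset that lacks the character.
A Greek code page such as ISO-8859-7 covers monotonic Greek but not the
polytonic characters in the Greek Extended block (U+1F00 to U+1FFF). The
sketch uses US-ASCII to show the same effect because every JVM supports
it.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {

    // Encode a string to bytes and decode it again with the same charset.
    // Characters the charset cannot represent come back as '?'.
    public static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    // Hexadecimal representation of the UTF-8 bytes of a string.
    public static String toUtf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA (polytonic)
        String polytonic = "\u1f04";

        // Lossy: US-ASCII has no mapping for the character.
        System.out.println(roundTrip(polytonic, StandardCharsets.US_ASCII));

        // Lossless: UTF-8 can represent every Unicode character.
        System.out.println(
            roundTrip(polytonic, StandardCharsets.UTF_8).equals(polytonic));

        // Hex of the UTF-8 bytes.
        System.out.println(toUtf8Hex(polytonic));

        // Printing through a stream with an explicit encoding avoids
        // depending on the platform default used by System.out.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(polytonic);
    }
}
```

If the UTF-8 PrintStream prints the character correctly on a
UTF-8-capable terminal while plain System.out prints '?', then the
problem is in the output encoding rather than in what POI returned.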