Posted to embperl@perl.apache.org by Jean-Christophe Boggio <em...@thefreecat.org> on 2008/10/01 01:59:30 UTC

Re: Problem with file upload

Ben Hiebert wrote:
> Perl usually 
> tries to guess at the best encoding when it takes in the data and then 
> encodes it internally as best it can.  You may have a problem where the 
> data comes in as ISO88591 but perl thinks it is UTF8 data, encodes it 
> internally as UTF8 and then prints out the UTF8-as-ISO88591 to give you 
> the bad results.  

Yes, that is my guess too.

> It may be worth checking to see what format Perl thinks your incoming 
> data is by using
> $flag = utf8::is_utf8(STRING);

Good idea. I modified the code to this:

while (read($fdat{efilename},$buffer,32768)) {
	if (utf8::is_utf8($buffer)) {
		print OUT "u";
	}
	print FILE $buffer;
}

...but in both cases (working and not) I never get the "uuuuu" lines,
so Perl never flags $buffer as UTF-8. BUT when $buffer is written to
disk it is transformed! I tried calling binmode FILE just after opening
the file for output, but the same thing happens.
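
For reference, here is a minimal sketch of a byte-for-byte copy with no
I/O layers on either handle. It is an illustration under assumptions,
not the handler from this thread: the temporary files stand in for the
upload handle in $fdat{efilename} and for the destination file, and
copy_raw is a hypothetical helper name. A handle read without a
decoding layer returns plain byte strings, which would be consistent
with utf8::is_utf8 never returning true above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for the uploaded data: bytes that are valid
# ISO-8859-1 but not valid UTF-8.
my $payload = "caf\xE9 \xE0 la cr\xE8me\n";

# Write the sample bytes to a temporary "upload" file.
my ($src_fh, $src_name) = tempfile(UNLINK => 1);
binmode $src_fh;
print {$src_fh} $payload;
close $src_fh or die "close failed: $!";

# Copy everything from an open input handle to $out_path as raw bytes.
sub copy_raw {
    my ($in, $out_path) = @_;
    binmode $in;    # no decoding layer on the read side
    open my $out, '>:raw', $out_path or die "Cannot open $out_path: $!";
    my $buffer;
    while (1) {
        my $n = read($in, $buffer, 32768);
        die "read failed: $!" unless defined $n;
        last if $n == 0;    # end of file
        print {$out} $buffer or die "print failed: $!";
    }
    close $out or die "close failed: $!";
    return;
}

open my $in, '<', $src_name or die "Cannot open $src_name: $!";
my (undef, $dst_name) = tempfile(UNLINK => 1);
copy_raw($in, $dst_name);
close $in;

# Read the copy back as raw bytes and compare it with the original.
open my $check, '<:raw', $dst_name or die "Cannot open $dst_name: $!";
my $copied = do { local $/; <$check> };
close $check;

print $copied eq $payload ? "identical\n" : "different\n";
```

If a copy like this comes out identical but the real upload does not,
the transformation is probably happening before the data reaches the
read loop, or through a layer applied to one of the real handles.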

> If perl thinks UTF8 then it is misinterpreting your incoming data and 
> you'll need to either decode it with decode or with one of the other 
> UTF8 utilities.  This may work:
> 
> $GoodInternalString = decode("iso-8859-1", $IncomingData);

That's what I use when the file *is* iso-8859-1.
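
As a small illustration of what that decode call does (a sketch with a
hypothetical sample string, not data from the failing upload): the
incoming value is a byte string, decode turns it into a character
string, and encode produces the bytes to write out in a chosen
encoding.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical incoming data: "café" as ISO-8859-1 bytes.
my $incoming = "caf\xE9";

# Bytes -> Perl character string.
my $text = decode('iso-8859-1', $incoming);

# Character string -> UTF-8 bytes, ready to be written to a raw handle.
my $utf8_bytes = encode('UTF-8', $text);

printf "flag before decode: %s\n", utf8::is_utf8($incoming) ? 'on' : 'off';
printf "flag after decode:  %s\n", utf8::is_utf8($text)     ? 'on' : 'off';
printf "bytes in: %d, characters: %d, UTF-8 bytes out: %d\n",
    length($incoming), length($text), length($utf8_bytes);
```

The e-acute takes one byte in ISO-8859-1 and two bytes in UTF-8, so the
encoded output is one byte longer than the input.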

> These are the pages I read over and over and over again until my pages 
> magically work:

:-) I see *exactly* what you mean. I've read these pages over and over too.

I don't understand the reason for this random behaviour.

Thanks,

JC
