Posted to server-dev@james.apache.org by robert burrell donkin <ro...@gmail.com> on 2007/05/07 08:42:39 UTC
[jSieve] Script Encoding [WAS Re: i am not getting subject content in utf-8 format]
On 5/7/07, robert burrell donkin <ro...@gmail.com> wrote:
> On 5/7/07, ketanbparekh <ta...@yahoo.com> wrote:
> >
> > I am running on Windows XP Professional.
>
> windows has a difficult default platform encoding so this may well be
> the problem
i've taken a look at the code in SieveToMultiMailbox and SieveFactory.
i think that we have encoding issues. the current code will use the
default platform encoding. when using windows, this will result in
UTF-8 and UTF-16 encoded files being decoded incorrectly when (some)
non-ASCII characters are present.
to fix these issues, an encoding charset needs to be specified
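A minimal sketch of the failure mode, assuming Java 7 or later for StandardCharsets. ISO-8859-1 stands in here for a single-byte platform default, and the class and method names are hypothetical (they are not taken from jSieve or JAMES):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of jSieve.
class DecodeDemo {

    /** Decodes raw script bytes with an explicitly chosen charset. */
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // \u00e9 is e-acute; the escape avoids depending on source file encoding.
        String script = "fileinto \"Caf\u00e9\";";
        byte[] utf8Bytes = script.getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the file was written in round-trips cleanly.
        String right = decode(utf8Bytes, StandardCharsets.UTF_8);

        // Decoding the same bytes as a single-byte charset turns the two-byte
        // UTF-8 sequence for e-acute into two separate characters.
        String wrong = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals(script)); // true
        System.out.println(wrong.equals(script)); // false
    }
}
```

Pure ASCII scripts decode identically either way, which is why the problem only shows up once non-ASCII characters are present.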
i can think of a couple of options (hopefully people will jump in with
any i've missed):
1 JAMES should support a single, hard coded charset (probably UTF-8)
2 we allow charset to be injected through configuration; defaulting to:
2a UTF-8
2b platform
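Option 2 could look roughly like the following sketch. The resolver class, its method, and the idea of passing the configured value as a plain string are assumptions for illustration only; no such configuration hook is described in this thread:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an existing JAMES or jSieve class.
class ScriptCharsetResolver {

    /**
     * Returns the charset named in configuration, or the given fallback
     * when nothing is configured. Charset.forName throws an
     * IllegalArgumentException subclass for unknown or malformed names,
     * so a bad configuration value fails fast.
     */
    static Charset resolve(String configured, Charset fallback) {
        if (configured == null || configured.trim().isEmpty()) {
            return fallback;
        }
        return Charset.forName(configured.trim());
    }

    public static void main(String[] args) {
        // Option 2a: default to UTF-8 when nothing is configured.
        System.out.println(resolve(null, StandardCharsets.UTF_8));
        // Option 2b: default to the platform charset instead.
        System.out.println(resolve(null, Charset.defaultCharset()));
        // An explicit configuration value wins over either default.
        System.out.println(resolve("ISO-8859-1", StandardCharsets.UTF_8));
    }
}
```

Option 1 is the same code with the configuration lookup removed and StandardCharsets.UTF_8 used directly.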
opinions?
- robert
---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org
Re: [jSieve] Script Encoding [WAS Re: i am not getting subject content in utf-8 format]
Posted by robert burrell donkin <ro...@gmail.com>.
On 5/8/07, sbrewin@synergy.demon.co.uk <sb...@synergy.demon.co.uk> wrote:
> norman@apache.org wrote:
> > robert burrell donkin schrieb:
> > > On 5/7/07, robert burrell donkin <ro...@gmail.com> wrote:
> > >> On 5/7/07, ketanbparekh <ta...@yahoo.com> wrote:
> > >> >
> > >> > I am running on Windows XP Professional.
> > >>
> > >> windows has a difficult default platform encoding so this may well be
> > >> the problem
> > >
> > > i've taken a look at the code in SieveToMultiMailbox and SieveFactory.
> > > i think that we have encoding issues. the current code will use the
> > > default platform encoding. when using windows, this will result in
> > > UTF-8 and UTF-16 encoded files being decoded incorrectly when (some)
> > > non-ASCII characters are present.
>
> Yes, I rather suspected this :(
>
> > > to fix these issues, an encoding charset needs to be specified
> > >
> > > i can think of a couple of options (hopefully people will jump in with
> > > any i've missed):
> > >
> > > 1 JAMES should support a single, hard coded charset (probably UTF-8)
> > >
> > > 2 we allow charset to be injected through configuration; defaulting to:
> > > 2a UTF-8
> > > 2b platform
> > >
> > > opinions?
> > >
> > > - robert
> >
> > Hi Robert,
> >
> > I think UTF-8 as default is not a good choice because some OSes do not
> > support UTF-8 by default. Maybe
> > ISO-8859-1 is a better choice. A configuration option would be cool
> > too for sure ;-)
> >
> > bye
> > Norman
>
>
> As the spec. prescribes UTF-8, I rather think that this should be the default.
+1
(i should have checked the specification before posting)
> As we are running in a VM, that the underlying OS does or doesn't support UTF-8 isn't an
> issue. The VM will. We just need to make sure this is what we specify when dealing with
> character set sensitive code.
unless anyone objects or beats me to it, i'll patch SieveFactory so
that UTF-8 is forced by default and add a FAQ about this
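As a rough sketch of what forcing UTF-8 means when reading a script stream (assuming Java 7 or later; the helper below is hypothetical and is not the actual SieveFactory code or API):

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class Utf8ScriptReader {

    /** Reads the whole stream as text, always decoding it as UTF-8. */
    static String readScript(InputStream in) {
        StringBuilder text = new StringBuilder();
        // Passing the charset explicitly is the whole point: the
        // one-argument InputStreamReader constructor would silently
        // use the platform default instead.
        try (Reader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            char[] buffer = new char[4096];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                text.append(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String script = "fileinto \"Caf\u00e9\";";
        InputStream in = new ByteArrayInputStream(
                script.getBytes(StandardCharsets.UTF_8));
        System.out.println(readScript(in).equals(script)); // true
    }
}
```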
- robert
Re: [jSieve] Script Encoding [WAS Re: i am not getting subject content in utf-8 format]
Posted by sb...@synergy.demon.co.uk.
norman@apache.org wrote:
> robert burrell donkin schrieb:
> > On 5/7/07, robert burrell donkin <ro...@gmail.com> wrote:
> >> On 5/7/07, ketanbparekh <ta...@yahoo.com> wrote:
> >> >
> >> > I am running on Windows XP Professional.
> >>
> >> windows has a difficult default platform encoding so this may well be
> >> the problem
> >
> > i've taken a look at the code in SieveToMultiMailbox and SieveFactory.
> > i think that we have encoding issues. the current code will use the
> > default platform encoding. when using windows, this will result in
> > UTF-8 and UTF-16 encoded files being decoded incorrectly when (some)
> > non-ASCII characters are present.
Yes, I rather suspected this :(
> > to fix these issues, an encoding charset needs to be specified
> >
> > i can think of a couple of options (hopefully people will jump in with
> > any i've missed):
> >
> > 1 JAMES should support a single, hard coded charset (probably UTF-8)
> >
> > 2 we allow charset to be injected through configuration; defaulting to:
> > 2a UTF-8
> > 2b platform
> >
> > opinions?
> >
> > - robert
>
> Hi Robert,
>
> I think UTF-8 as default is not a good choice because some OSes do not
> support UTF-8 by default. Maybe
> ISO-8859-1 is a better choice. A configuration option would be cool
> too for sure ;-)
>
> bye
> Norman
As the spec. prescribes UTF-8, I rather think that this should be the
default. As we are running in a VM, that the underlying OS does or
doesn't support UTF-8 isn't an issue. The VM will. We just need to make
sure this is what we specify when dealing with character set sensitive
code.
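The point about the VM can be checked directly. The Java platform documentation for java.nio.charset.Charset lists UTF-8 among the charsets every implementation must support, independent of the operating system. A small sketch (the class name is hypothetical):

```java
import java.nio.charset.Charset;

// Hypothetical check class for illustration only.
class CharsetSupportCheck {

    /** True when this VM can encode and decode UTF-8. */
    static boolean utf8Available() {
        return Charset.isSupported("UTF-8");
    }

    public static void main(String[] args) {
        System.out.println("UTF-8 supported: " + utf8Available());
        // The platform default is what code gets when it does not name a
        // charset explicitly; it varies with OS, locale and JVM version.
        System.out.println("Platform default: " + Charset.defaultCharset());
    }
}
```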
Cheers
Steve
Re: [jSieve] Script Encoding [WAS Re: i am not getting subject content in utf-8 format]
Posted by Norman Maurer <no...@apache.org>.
robert burrell donkin schrieb:
> On 5/7/07, robert burrell donkin <ro...@gmail.com> wrote:
>> On 5/7/07, ketanbparekh <ta...@yahoo.com> wrote:
>> >
>> > I am running on Windows XP Professional.
>>
>> windows has a difficult default platform encoding so this may well be
>> the problem
>
> i've taken a look at the code in SieveToMultiMailbox and SieveFactory.
> i think that we have encoding issues. the current code will use the
> default platform encoding. when using windows, this will result in
> UTF-8 and UTF-16 encoded files being decoded incorrectly when (some)
> non-ASCII characters are present.
>
> to fix these issues, an encoding charset needs to be specified
>
> i can think of a couple of options (hopefully people will jump in with
> any i've missed):
>
> 1 JAMES should support a single, hard coded charset (probably UTF-8)
>
> 2 we allow charset to be injected through configuration; defaulting to:
> 2a UTF-8
> 2b platform
>
> opinions?
>
> - robert
Hi Robert,
I think UTF-8 as default is not a good choice because some OSes do not
support UTF-8 by default. Maybe
ISO-8859-1 is a better choice. A configuration option would be cool
too for sure ;-)
bye
Norman