Posted to users@camel.apache.org by राकेश रवळेकर <ra...@gmail.com> on 2020/10/16 06:10:44 UTC

Character encoding lost when using Camel AWS2S3Endpoint

Hi All,

We have been using Camel to read files from an AWS S3 bucket, but the
accented characters in the file body are not readable despite using
convertBodyTo in our routes. A further check in AWS2S3Endpoint.java shows
that UTF-8 is always used as the charset:

Reader reader = new BufferedReader(new InputStreamReader(s3Object,
Charset.forName(StandardCharsets.UTF_8.name())));

Is there another way to read a file body containing such
accented characters? We have even tried changing the default charset using

System.setProperty("org.apache.camel.default.charset", "Cp1252");

We are using the versions below:

<java.version>11</java.version>
<aws.java.sdk.version>2.13.56</aws.java.sdk.version>
<camel-version>3.5.0</camel-version>

Please let us know if we are missing anything.
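
For readers following the thread, here is a minimal sketch of why a later
conversion cannot help once the bytes have been decoded as UTF-8. It uses only
the JDK, not Camel; the class and method names (LossyDecodeDemo, decode,
decodeThenConvert) are made up for illustration, and the sample text assumes
the file was written as ISO-8859-1 (Cp1252 encodes this accented character
with the same byte).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class LossyDecodeDemo {

    // Decode raw bytes with the given charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    // Decode with a forced charset first, then try to "convert" the
    // resulting String to the charset that was actually wanted.
    static String decodeThenConvert(byte[] raw, Charset forced, Charset wanted) {
        String damaged = new String(raw, forced);
        return new String(damaged.getBytes(wanted), wanted);
    }

    public static void main(String[] args) {
        // Sample content with one accented character, stored as ISO-8859-1.
        byte[] raw = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        String direct = decode(raw, StandardCharsets.ISO_8859_1);
        String converted = decodeThenConvert(
                raw, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println("direct decode matches original:   "
                + direct.equals("caf\u00e9"));
        System.out.println("late conversion matches original: "
                + converted.equals("caf\u00e9"));
    }
}
```

The accented byte is not valid UTF-8 on its own, so the forced decode replaces
it, and nothing done to the resulting String afterwards can bring the original
byte back.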

Re: Character encoding lost when using Camel AWS2S3Endpoint

Posted by Andrea Cosentino <an...@gmail.com>.
Yes, it is likely related to the EC2 instance environment.
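
One way to check the environment is to print what the JVM on each host uses as
its default charset and compare the local machine with the EC2 instance. The
snippet below is a rough, JDK-only sketch; the class and method names are
hypothetical, and it assumes that some step in the route falls back to the
platform default, which has not been confirmed in this thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DefaultCharsetCheck {

    // Uses the JVM default charset, which depends on the host
    // (locale settings, or -Dfile.encoding given at JVM startup).
    static String decodeWithPlatformDefault(byte[] raw) {
        return new String(raw);
    }

    // Names the charset explicitly, so the result is the same on every host.
    static String decodeExplicit(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Values to compare between the local machine and the EC2 instance.
        System.out.println("default charset = " + Charset.defaultCharset());
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("LANG            = " + System.getenv("LANG"));

        // "caf" followed by the ISO-8859-1 byte for e-acute.
        byte[] raw = {(byte) 'c', (byte) 'a', (byte) 'f', (byte) 0xE9};

        String explicit = decodeExplicit(raw, StandardCharsets.ISO_8859_1);
        String platform = decodeWithPlatformDefault(raw);
        System.out.println("platform default agrees with ISO-8859-1: "
                + platform.equals(explicit));
    }
}
```

If the last line prints true locally and false on EC2, the difference is in
the host's default charset. On Java 11 a Linux host with no locale configured
may report an ASCII default, and starting the JVM with -Dfile.encoding set to
the desired charset is one way to test that theory.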

On Fri, 16 Oct 2020 at 15:04, राकेश रवळेकर <rakesh.ravlekar@gmail.com> wrote:

> Hi Team,
>
> It works with 3.6.0-SNAPSHOT on our local Windows/Mac machines with
> convertBodyTo(String.class, "ISO-8859-1"). However, when we run the same
> route on an AWS EC2 instance, it is unable to read the characters. I assume
> this is related to the encoding charset, or do you have any suggestions?
>
> Thanks

Re: Character encoding lost when using Camel AWS2S3Endpoint

Posted by राकेश रवळेकर <ra...@gmail.com>.
Hi Team,

It works with 3.6.0-SNAPSHOT on our local Windows/Mac machines with
convertBodyTo(String.class, "ISO-8859-1"). However, when we run the same
route on an AWS EC2 instance, it is unable to read the characters. I assume
this is related to the encoding charset, or do you have any suggestions?

Thanks

On Fri, Oct 16, 2020 at 12:00 PM राकेश रवळेकर <ra...@gmail.com>
wrote:

> Thanks for the prompt response. I will certainly check with the snapshot
> version and get back to you.

Re: Character encoding lost when using Camel AWS2S3Endpoint

Posted by राकेश रवळेकर <ra...@gmail.com>.
Thanks for the prompt response. I will certainly check with the snapshot
version and get back to you.

On Fri, Oct 16, 2020 at 11:49 AM Andrea Cosentino <an...@gmail.com> wrote:

> Hello,
>
> This should be fixed in 3.6.0. We changed the way we populate the body, so
> the charset is not enforced anymore.
>
> You can try with 3.6.0-SNAPSHOT.
>
> Let us know if it works. The 3.6.0 release should be up for a vote this
> weekend.
>
> Cheers.

Re: Character encoding lost when using Camel AWS2S3Endpoint

Posted by Andrea Cosentino <an...@gmail.com>.
Hello,

This should be fixed in 3.6.0. We changed the way we populate the body, so
the charset is not enforced anymore.
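
To illustrate the difference in general terms, the sketch below contrasts
reading a stream through a reader fixed to UTF-8 (as in the snippet quoted from
AWS2S3Endpoint.java) with handing over the raw bytes and letting the caller
pick the charset later. This is not the actual Camel 3.6.0 code; the class and
method names are invented for the example, and a ByteArrayInputStream stands in
for the S3 object stream.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class BodyPopulationSketch {

    // Reads the stream as text through a reader fixed to UTF-8.
    static String readForcedUtf8(InputStream in) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    // Returns the bytes untouched so the caller can choose the charset.
    static byte[] readRaw(InputStream in) {
        try (InputStream stream = in) {
            return stream.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for an S3 object that was written as ISO-8859-1.
        byte[] stored = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        String forced = readForcedUtf8(new ByteArrayInputStream(stored));
        String chosen = new String(
                readRaw(new ByteArrayInputStream(stored)),
                StandardCharsets.ISO_8859_1);

        System.out.println("forced UTF-8 contains U+FFFD:  "
                + forced.contains("\uFFFD"));
        System.out.println("raw bytes decoded later match: "
                + chosen.equals("caf\u00e9"));
    }
}
```

With the raw bytes passed along, a later step such as
convertBodyTo(String.class, "ISO-8859-1") has the original data to work with.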

You can try with 3.6.0-SNAPSHOT.

Let us know if it works. The 3.6.0 release should be up for a vote this
weekend.

Cheers.

On Fri, 16 Oct 2020 at 08:11, राकेश रवळेकर <rakesh.ravlekar@gmail.com> wrote:

> Hi All,
>
> We have been using Camel to read files from AWS S3 bucket, however we could
> see the accented characters in the file body are not readable despite using
> convertBodyTo in our routes. Further check in AWS2S3Endpoint.java shows that
> the default charset is used as UTF-8 only?
>
> Reader reader = new BufferedReader(new InputStreamReader(s3Object,
> Charset.forName(StandardCharsets.UTF_8.name())));
>
> Is there any other way if we need to read the file body containing such
> accented characters? Have even tried to change the default charset using
>
> System.setProperty("org.apache.camel.default.charset", "Cp1252");
>
> We are using below versions
>
> <java.version>11</java.version>
> <aws.java.sdk.version>2.13.56</aws.java.sdk.version>
> <camel-version>3.5.0</camel-version>
>
> Please suggest if we are missing anything?
>