Posted to user@flink.apache.org by Averell <lv...@gmail.com> on 2020/08/27 11:37:50 UTC

SAX2 driver class org.apache.xerces.parsers.SAXParser not found

Hello,

I have a Flink 1.10 job running on AWS EMR that checkpoints to S3 (s3a) and
also writes its output to S3 (s3a) using StreamingFileSink. The job ran well
until I added the following Hadoop property as a Java option:
-Dfs.s3a.acl.default=BucketOwnerFullControl
Since then, the checkpoint process fails to complete.

Caused by: org.xml.sax.SAXException: SAX2 driver class org.apache.xerces.parsers.SAXParser not found

I tried adding a jar file containing that class
(https://mvnrepository.com/artifact/xerces/xercesImpl/2.12.0) to my
flink/lib/ directory, but got the same error with a different stacktrace:

Caused by: org.apache.flink.util.SerializedThrowable: SAX2 driver class org.apache.xerces.parsers.SAXParser not found

This seems to be a dependency conflict, but I couldn't track down its root
cause. In my IDE I don't see any dependency issues, and I couldn't find
SAXParser in the dependency tree.
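
A small diagnostic can make the IDE-versus-cluster difference visible. The following is only a sketch: the class name SaxDriverCheck and its helper methods are made up for illustration. It relies on XMLReaderFactory.createXMLReader(), the call that appears in the stacktraces in this thread, and on the org.xml.sax.driver system property that the factory consults. Run in the environment under suspicion, it prints which XMLReader implementation is handed out and whether the Xerces class is visible to the thread context classloader.

```java
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

// Hypothetical diagnostic helper; not part of Flink, Hadoop, or the AWS SDK.
class SaxDriverCheck {

    /** Returns true if the named class can be loaded through the given loader. */
    static boolean isVisible(String className, ClassLoader loader) {
        try {
            // initialize=false: only check visibility, do not run static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Names the XMLReader implementation that XMLReaderFactory returns in this JVM. */
    @SuppressWarnings("deprecation") // XMLReaderFactory is deprecated since Java 9
    static String describeXmlReader() {
        try {
            XMLReader reader = XMLReaderFactory.createXMLReader();
            return reader.getClass().getName();
        } catch (SAXException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        System.out.println("org.xml.sax.driver = "
                + System.getProperty("org.xml.sax.driver"));
        System.out.println("Xerces visible to context loader: "
                + isVisible("org.apache.xerces.parsers.SAXParser", context));
        System.out.println("XMLReader implementation: " + describeXmlReader());
    }
}
```

Comparing the output from the IDE with the output from a task manager would show whether the two environments resolve the SAX driver differently.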

*Here is the stacktrace when the jar file is not there:*
Caused by: org.apache.hadoop.fs.s3a.AWSClientIOException: getFileStatus on s3a://mybucket/checkpoint/a9502b1c81ced10dfcbb21ac43f03e61/chk-2/41f51c24-60fd-474b-9f89-3d65d87037c7: com.amazonaws.SdkClientException: Couldn't initialize a SAX driver to create an XMLReader: Couldn't initialize a SAX driver to create an XMLReader
        at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:177)
        at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:145)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2251)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:749)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1169)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1149)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1038)
        at org.apache.flink.fs.s3hadoop.common.HadoopFileSystem.create(HadoopFileSystem.java:141)
        at org.apache.flink.fs.s3hadoop.common.HadoopFileSystem.create(HadoopFileSystem.java:37)
        at org.apache.flink.core.fs.PluginFileSystemFactory$ClassLoaderFixingFileSystem.create(PluginFileSystemFactory.java:164)
        at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.create(SafetyNetWrapperFileSystem.java:126)
        at org.apache.flink.core.fs.EntropyInjector.createEntropyAware(EntropyInjector.java:61)
        at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.createStream(FsCheckpointStreamFactory.java:356)
        ... 17 more
Caused by: com.amazonaws.SdkClientException: Couldn't initialize a SAX driver to create an XMLReader
        at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.<init>(XmlResponsesSaxParser.java:118)
        at com.amazonaws.services.s3.model.transform.Unmarshallers$ListObjectsV2Unmarshaller.unmarshall(Unmarshallers.java:87)
        at com.amazonaws.services.s3.model.transform.Unmarshallers$ListObjectsV2Unmarshaller.unmarshall(Unmarshallers.java:77)
        at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:62)
        at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:31)
        at com.amazonaws.http.response.AwsResponseHandlerAdapter.handle(AwsResponseHandlerAdapter.java:70)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleResponse(AmazonHttpClient.java:1554)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1272)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4266)
        at com.amazonaws.services.s3.AmazonS3Client.listObjectsV2(AmazonS3Client.java:876)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listObjects$5(S3AFileSystem.java:1262)
        at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:317)
        at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:280)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.listObjects(S3AFileSystem.java:1255)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2223)
        ... 29 more
Caused by: org.xml.sax.SAXException: SAX2 driver class org.apache.xerces.parsers.SAXParser not found
java.lang.ClassNotFoundException: org.apache.xerces.parsers.SAXParser
        at org.xml.sax.helpers.XMLReaderFactory.loadClass(XMLReaderFactory.java:230)
        at org.xml.sax.helpers.XMLReaderFactory.createXMLReader(XMLReaderFactory.java:191)
        at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.<init>(XmlResponsesSaxParser.java:115)
        ... 52 more

*And here is the stacktrace when that jar file is added to the /lib/ folder:*

Could not materialize checkpoint 1 for operator Source: <my_operators_chain> (1/2).
	at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.handleExecutionException(StreamTask.java:1238)
	at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:1180)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.util.SerializedThrowable: java.io.IOException: Could not open output stream for state backend
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at org.apache.flink.runtime.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:461)
	at org.apache.flink.streaming.api.operators.OperatorSnapshotFinalizer.<init>(OperatorSnapshotFinalizer.java:53)
	at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:1143)
	... 3 common frames omitted
Caused by: org.apache.flink.util.SerializedThrowable: Could not open output stream for state backend
	at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.createStream(FsCheckpointStreamFactory.java:367)
	at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.flush(FsCheckpointStreamFactory.java:234)
	at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.write(FsCheckpointStreamFactory.java:209)
	at java.io.DataOutputStream.write(DataOutputStream.java:107)
	at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
	at org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.serialize(BytePrimitiveArraySerializer.java:78)
	at org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.serialize(BytePrimitiveArraySerializer.java:33)
	at org.apache.flink.runtime.state.PartitionableListState.write(PartitionableListState.java:116)
	at org.apache.flink.runtime.state.DefaultOperatorStateBackendSnapshotStrategy$1.callInternal(DefaultOperatorStateBackendSnapshotStrategy.java:155)
	at org.apache.flink.runtime.state.DefaultOperatorStateBackendSnapshotStrategy$1.callInternal(DefaultOperatorStateBackendSnapshotStrategy.java:108)
	at org.apache.flink.runtime.state.AsyncSnapshotCallable.call(AsyncSnapshotCallable.java:75)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at org.apache.flink.runtime.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:458)
	... 5 common frames omitted
Caused by: org.apache.flink.util.SerializedThrowable: getFileStatus on s3a://mybucket/checkpoint/d8ed6d1524169c942bbc455d2c519a39/chk-1/7f2d8fd6-4f3f-4da7-9ffd-5a7e3ea8e7e3: com.amazonaws.SdkClientException: Couldn't initialize a SAX driver to create an XMLReader: Couldn't initialize a SAX driver to create an XMLReader
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:177)
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:145)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2251)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:749)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1169)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1149)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1038)
	at org.apache.flink.fs.s3hadoop.common.HadoopFileSystem.create(HadoopFileSystem.java:141)
	at org.apache.flink.fs.s3hadoop.common.HadoopFileSystem.create(HadoopFileSystem.java:37)
	at org.apache.flink.core.fs.PluginFileSystemFactory$ClassLoaderFixingFileSystem.create(PluginFileSystemFactory.java:164)
	at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.create(SafetyNetWrapperFileSystem.java:126)
	at org.apache.flink.core.fs.EntropyInjector.createEntropyAware(EntropyInjector.java:61)
	at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.createStream(FsCheckpointStreamFactory.java:356)
	... 17 common frames omitted
Caused by: org.apache.flink.util.SerializedThrowable: Couldn't initialize a SAX driver to create an XMLReader
	at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.<init>(XmlResponsesSaxParser.java:118)
	at com.amazonaws.services.s3.model.transform.Unmarshallers$ListObjectsV2Unmarshaller.unmarshall(Unmarshallers.java:87)
	at com.amazonaws.services.s3.model.transform.Unmarshallers$ListObjectsV2Unmarshaller.unmarshall(Unmarshallers.java:77)
	at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:62)
	at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:31)
	at com.amazonaws.http.response.AwsResponseHandlerAdapter.handle(AwsResponseHandlerAdapter.java:70)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleResponse(AmazonHttpClient.java:1554)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1272)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4266)
	at com.amazonaws.services.s3.AmazonS3Client.listObjectsV2(AmazonS3Client.java:876)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listObjects$5(S3AFileSystem.java:1262)
	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:317)
	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:280)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.listObjects(S3AFileSystem.java:1255)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2223)
	... 29 common frames omitted
Caused by: org.apache.flink.util.SerializedThrowable: SAX2 driver class org.apache.xerces.parsers.SAXParser not found
	at org.xml.sax.helpers.XMLReaderFactory.loadClass(XMLReaderFactory.java:230)
	at org.xml.sax.helpers.XMLReaderFactory.createXMLReader(XMLReaderFactory.java:191)
	at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.<init>(XmlResponsesSaxParser.java:115)
	... 52 common frames omitted
Caused by: org.apache.flink.util.SerializedThrowable: org.apache.xerces.parsers.SAXParser
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
	at org.apache.flink.core.plugin.PluginLoader$PluginClassLoader.loadClass(PluginLoader.java:149)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
	at org.xml.sax.helpers.NewInstance.newInstance(NewInstance.java:82)
	at org.xml.sax.helpers.XMLReaderFactory.loadClass(XMLReaderFactory.java:228)
	... 54 common frames omitted



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: SAX2 driver class org.apache.xerces.parsers.SAXParser not found

Posted by Averell <lv...@gmail.com>.
Hello Robert,

I'm not sure why the screenshot I attached to my previous post was not
shown, so I'm re-attaching it to this post.
As the screenshot shows, part-1-33, part-1-34, and part-1-35 have
already been closed, but the temp file for part-1-33 is still there.

Thanks and regards
Averell 
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t1586/FlinkFileSink.png> 


Re: SAX2 driver class org.apache.xerces.parsers.SAXParser not found

Posted by Robert Metzger <rm...@apache.org>.
Hi Averell,
As far as I know, these tmp files should be removed when the Flink job
recovers. So you should only have these files around for the latest
incomplete checkpoint, while recovery has not yet completed.

On Tue, Sep 1, 2020 at 2:56 AM Averell <lv...@gmail.com> wrote:

> Hello Robert, Arvid,
>
> As I am running on EMR, and currently AWS only supports version 1.10.
> I tried both solutions that you suggested ((i) copying a SAXParser
> implementation to the plugins folder and (ii) using the S3FS Plugin from
> 1.10.1), and both worked - I could have successful checkpoints.
>
> However, intermittently my checkpoints still fail (about 10%). And whenever
> it fails, there are non-completed files left in S3 (attached screenshot
> below).
> I'm not sure whether those uncompleted files are expected, or is that a
> bug?
>
> Thanks and regards,
> Averell
> <
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t1586/Screen_Shot_2020-08-28_at_11.png>
>
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>

Re: SAX2 driver class org.apache.xerces.parsers.SAXParser not found

Posted by Averell <lv...@gmail.com>.
Hello Robert, Arvid,

I am running on EMR, and AWS currently only supports Flink version 1.10.
I tried both solutions that you suggested ((i) copying a SAXParser
implementation to the plugins folder and (ii) using the S3FS plugin from
1.10.1), and both worked: I now get successful checkpoints.

However, my checkpoints still fail intermittently (about 10% of them). Whenever
one fails, incomplete files are left behind in S3 (see the attached screenshot
below).
I'm not sure whether those incomplete files are expected or whether this is a bug.

Thanks and regards,
Averell
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t1586/Screen_Shot_2020-08-28_at_11.png> 


Re: SAX2 driver class org.apache.xerces.parsers.SAXParser not found

Posted by Arvid Heise <ar...@ververica.com>.
Hi Averell,

This is a known bug [1], caused by the AWS S3 library in use not respecting
the classloader [2].

The best solution is to upgrade to Flink 1.10.1 (or to take the s3-hadoop jar
from 1.10.1). Don't try to put Xerces anywhere manually.

[1] https://issues.apache.org/jira/browse/FLINK-16014
[2] https://github.com/aws/aws-sdk-java/issues/2242
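
To illustrate why a jar in flink/lib/ may be invisible to code running inside a plugin, here is a minimal sketch of classloader isolation. It is not Flink's PluginClassLoader; the class IsolatedLoaderDemo and its methods are hypothetical, and the sketch assumes only that an isolated loader does not delegate to the application classpath for the class in question. A URLClassLoader whose parent is the bootstrap loader can see JDK classes but not classes that live on the application classpath.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical illustration of classloader isolation; not Flink code.
class IsolatedLoaderDemo {

    /** Returns true if the loader (or one of its parents) can load the class. */
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /**
     * Checks visibility from a loader that has no jars of its own and whose
     * parent is the bootstrap loader (parent == null), so it never consults
     * the application classpath.
     */
    static boolean visibleFromIsolatedLoader(String className) throws IOException {
        try (URLClassLoader isolated = new URLClassLoader(new URL[0], null)) {
            return canLoad(isolated, className);
        }
    }

    public static void main(String[] args) throws IOException {
        // A JDK class is reachable through the bootstrap loader.
        System.out.println("java.lang.String visible: "
                + visibleFromIsolatedLoader("java.lang.String"));
        // This class lives on the application classpath, which the isolated loader ignores.
        System.out.println("IsolatedLoaderDemo visible: "
                + visibleFromIsolatedLoader(IsolatedLoaderDemo.class.getName()));
    }
}
```

The last frames of the second stacktrace in the original post (PluginClassLoader.loadClass falling through to URLClassLoader.findClass) are consistent with this kind of isolation, although the exact delegation rules of the Flink plugin loader are not shown here.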

On Thu, Aug 27, 2020 at 4:34 PM Robert Metzger <rm...@apache.org> wrote:

> Hi,
> I guess you've loaded the S3 filesystem using the s3 FS plugin.
>
> You need to put the right jar file containing the SAX2 driver class into
> the plugin directory where you've also put the S3 filesystem plugin.
> You can probably find out the name of the right sax2 jar file from your
> local setup where everything is working.
>
> I hope that helps!
>
> Best,
> Robert
>

-- 

Arvid Heise | Senior Java Developer


Re: SAX2 driver class org.apache.xerces.parsers.SAXParser not found

Posted by Robert Metzger <rm...@apache.org>.
Hi,
I guess you've loaded the S3 filesystem using the s3 FS plugin.

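You need to put the jar file containing the SAX2 driver class into the
plugin directory where you've also put the S3 filesystem plugin. The
stack trace shows the class lookup going through the plugin's own class
loader (PluginLoader$PluginClassLoader), which would explain why the jar
you added to flink/lib/ was not picked up.
You can probably find out the name of the right SAX2 jar file from your
local setup, where everything is working.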
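
One way to find that jar name on the local setup is to ask the JVM where it
loads the driver class from. The following is only a minimal sketch, not
something taken from this thread: the class name SaxDriverCheck and the
helper locate() are hypothetical, and the second candidate class is the
JDK-internal Xerces copy, whose name may differ between JDK versions. It
also prints the org.xml.sax.driver system property, which
XMLReaderFactory.createXMLReader() consults when choosing a driver.

```java
import java.security.CodeSource;

// Hypothetical helper: reports which jar (if any) provides a SAX driver class.
public class SaxDriverCheck {

    /** Returns the location the class is loaded from, or null if the loader cannot see it. */
    public static String locate(String className, ClassLoader loader) {
        try {
            // initialize=false: we only want to know whether the class is visible
            Class<?> cls = Class.forName(className, false, loader);
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes shipped with the JDK itself usually have no code source
            return source == null
                    ? "JDK runtime (no code source)"
                    : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();

        // Property read by XMLReaderFactory.createXMLReader(); may be unset
        System.out.println("org.xml.sax.driver = "
                + System.getProperty("org.xml.sax.driver"));

        String[] candidates = {
            // Class named in the error message
            "org.apache.xerces.parsers.SAXParser",
            // Assumption: JDK-internal copy of Xerces; name may vary by JDK
            "com.sun.org.apache.xerces.internal.parsers.SAXParser"
        };
        for (String name : candidates) {
            String where = locate(name, loader);
            System.out.println(name + " -> "
                    + (where == null ? "not visible to this class loader" : where));
        }
    }
}
```

Run in the environment where the job works (for example from the IDE with
the job's classpath), the printed location for
org.apache.xerces.parsers.SAXParser, if any, names the jar to copy next to
the S3 filesystem plugin. Running the same check from flink/lib/ would not
say anything about what the plugin sees, since the plugin uses a separate
class loader.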

I hope that helps!

Best,
Robert
