Posted to dev@tika.apache.org by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2021/01/23 13:09:00 UTC

[jira] [Comment Edited] (TIKA-3244) General upgrades for 1.26

    [ https://issues.apache.org/jira/browse/TIKA-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270655#comment-17270655 ] 

Tilman Hausherr edited comment on TIKA-3244 at 1/23/21, 1:08 PM:
-----------------------------------------------------------------

I want to upgrade jackcess to 4.0.0 but can't. Consider this small test:
{code:java}
    @Test
    public void testTilman() throws IOException, URISyntaxException
    {
        System.out.println(DocumentBuilderFactory.newInstance());
        new DatabaseBuilder(new File(JackcessParserTest.class.getResource("/test-documents/testAccess2_encrypted.accdb").toURI()))
                        .setCodecProvider(new CryptCodecProvider("tika"))
                        .setReadOnly(true).open();
    }
 {code}
It prints "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl@10e41621" and then fails with:
{noformat}
com.healthmarketscience.jackcess.crypt.InvalidCryptoConfigurationException: Failed parsing encryption descriptor
	at org.apache.tika.parser.microsoft.JackcessParserTest.testTilman(JackcessParserTest.java:103)
Caused by: java.lang.IllegalArgumentException: Property 'http://javax.xml.XMLConstants/property/accessExternalDTD' is not recognized.
	at org.apache.tika.parser.microsoft.JackcessParserTest.testTilman(JackcessParserTest.java:103)
 {noformat}
The same code works outside of Tika, where the output is "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl@1540e19d".

The reason is explained [here|https://stackoverflow.com/questions/53299280/]: the Apache Xerces implementation that gets picked up inside Tika does not support {{javax.xml.XMLConstants.ACCESS_EXTERNAL_DTD}}, while the JDK's built-in implementation does.
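
To check which behaviour a given classpath produces, a small standalone probe along these lines might help. It is only a sketch: the class name {{ExternalDtdProbe}} and the method {{restrictExternalDtd}} are made up for illustration, and it assumes the property is set through {{DocumentBuilderFactory.setAttribute}}, which would be consistent with the {{IllegalArgumentException}} in the trace above but is not confirmed from the jackcess source.
{code:java}
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper, not part of Tika or jackcess.
public class ExternalDtdProbe {

    /**
     * Tries to set the JAXP 1.5 accessExternalDTD property on the factory.
     * Returns true if the factory accepted it, false if it rejected it
     * with an IllegalArgumentException.
     */
    public static boolean restrictExternalDtd(DocumentBuilderFactory factory) {
        try {
            // An empty string means "no external DTD access allowed".
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            return true;
        } catch (IllegalArgumentException e) {
            // The factory does not recognize the property.
            return false;
        }
    }

    public static void main(String[] args) {
        // Which implementation is returned depends on the classpath and JAXP lookup.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        System.out.println(factory.getClass().getName()
                + " accepts accessExternalDTD: " + restrictExternalDtd(factory));
    }
}
{code}
Running it once with and once without the Xerces jar on the classpath should show whether the factory selection is what makes the difference.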


> General upgrades for 1.26
> -------------------------
>
>                 Key: TIKA-3244
>                 URL: https://issues.apache.org/jira/browse/TIKA-3244
>             Project: Tika
>          Issue Type: Task
>    Affects Versions: 1.25
>            Reporter: Tilman Hausherr
>            Priority: Major
>             Fix For: 2.0.0, 1.26
>
>



