Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2021/03/03 14:02:00 UTC

[jira] [Comment Edited] (TIKA-3244) General upgrades for 1.26

    [ https://issues.apache.org/jira/browse/TIKA-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270655#comment-17270655 ] 

Tim Allison edited comment on TIKA-3244 at 3/3/21, 2:01 PM:
------------------------------------------------------------

I want to upgrade jackcess to 4.0.0 but can't, because some tests in {{JackcessParserTest}} fail. Consider this small test:
{code:java}
    @Test
    public void testTilman() throws IOException, URISyntaxException
    {
        // Show which JAXP DocumentBuilderFactory implementation is picked up
        System.out.println(DocumentBuilderFactory.newInstance());
        // Opening the encrypted database is the call that fails
        new DatabaseBuilder(new File(JackcessParserTest.class.getResource("/test-documents/testAccess2_encrypted.accdb").toURI()))
                        .setCodecProvider(new CryptCodecProvider("tika"))
                        .setReadOnly(true).open();
    }
 {code}
It prints "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl@10e41621" and then fails with:
{noformat}
com.healthmarketscience.jackcess.crypt.InvalidCryptoConfigurationException: Failed parsing encryption descriptor
	at org.apache.tika.parser.microsoft.JackcessParserTest.testTilman(JackcessParserTest.java:103)
Caused by: java.lang.IllegalArgumentException: Property 'http://javax.xml.XMLConstants/property/accessExternalDTD' is not recognized.
	at org.apache.tika.parser.microsoft.JackcessParserTest.testTilman(JackcessParserTest.java:103)
 {noformat}
The same code works outside of Tika, where the output is "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl@1540e19d".

The reason is explained [here|https://stackoverflow.com/questions/53299280/]: Apache Xerces does not support {{javax.xml.XMLConstants.ACCESS_EXTERNAL_DTD}}.
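
To check this independently of jackcess, a minimal sketch along the following lines might help. It only probes whether the {{DocumentBuilderFactory}} found on the classpath accepts the property; the class name {{JaxpProbe}} and the method {{supportsAccessExternalDtd}} are made up for illustration and are not part of Tika or jackcess. It assumes that an implementation which does not know the property rejects it with an {{IllegalArgumentException}}, as in the stack trace above.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;

public class JaxpProbe {

    /**
     * Returns true if the given factory accepts the JAXP 1.5
     * accessExternalDTD property, false if it rejects it.
     */
    public static boolean supportsAccessExternalDtd(DocumentBuilderFactory factory) {
        try {
            // An empty value denies access to all external DTDs.
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown when the implementation does not recognize the property.
            return false;
        }
    }

    public static void main(String[] args) {
        // Which implementation is returned depends on the classpath.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        System.out.println(factory.getClass().getName());
        System.out.println("accessExternalDTD supported: "
                + supportsAccessExternalDtd(factory));
    }
}
```

Running it once with and once without the Apache Xerces jar on the classpath should show whether the factory implementation alone accounts for the difference.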


was (Author: tilman):
I want to set jackcess to 4.0.0 but can't because some tests in {{JackcessParserTest}} fail. Consider this small test code
{code:java}
    @Test
    public void testTilman() throws IOException, URISyntaxException
    {
        System.out.println(DocumentBuilderFactory.newInstance());
        new DatabaseBuilder(new File(JackcessParserTest.class.getResource("/test-documents/testAccess2_encrypted.accdb").toURI()))
                        .setCodecProvider(new CryptCodecProvider("tika"))
                        .setReadOnly(true).open();
    }
 {code}
It prints out "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl@10e41621" and then
{noformat}
com.healthmarketscience.jackcess.crypt.InvalidCryptoConfigurationException: Failed parsing encryption descriptor
	at org.apache.tika.parser.microsoft.JackcessParserTest.testTilman(JackcessParserTest.java:103)
Caused by: java.lang.IllegalArgumentException: Property 'http://javax.xml.XMLConstants/property/accessExternalDTD' is not recognized.
	at org.apache.tika.parser.microsoft.JackcessParserTest.testTilman(JackcessParserTest.java:103)
 {noformat}
The same code works outside of tika. Then the output is "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl@1540e19d".

The reason is explained [here|http://example.com|https://stackoverflow.com/questions/53299280/], Apache Xerces does not support {{javax.xml.XMLConstants.ACCESS_EXTERNAL_DTD}}.

> General upgrades for 1.26
> -------------------------
>
>                 Key: TIKA-3244
>                 URL: https://issues.apache.org/jira/browse/TIKA-3244
>             Project: Tika
>          Issue Type: Task
>    Affects Versions: 1.25
>            Reporter: Tilman Hausherr
>            Priority: Major
>             Fix For: 2.0.0, 1.26
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)