Posted to j-users@xerces.apache.org by Matthew Harrison <ma...@equinox.co.nz> on 2018/07/16 19:56:24 UTC

Parsing a file inside a zip file locking zip file on windows

I've hit a bit of an odd case, and I'm hoping someone can tell me what I'm doing wrong or how to fix it.
As part of a transform (using Saxon), we're processing files that are inside a zip file.  To access the files we're using a 'jar' URI; however, after accessing/parsing a file, the zip file seems to be locked and cannot be deleted until the Java program terminates.
I believe I've created a reproducing case:

import java.io.File;
import java.io.IOException;
import java.net.URISyntaxException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.xml.sax.SAXException;

import com.google.common.io.Files;

public class CleanupInvestigation {
    public static void main(String[] args)
            throws IOException, SAXException, ParserConfigurationException, URISyntaxException, InterruptedException {
        File input = new File("src/test/resources/books.zip");
        File inputCopy = new File("src/test/resources/booksCopy.zip");
        Files.copy(input, inputCopy);

        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse("jar:file:src/test/resources/booksCopy.zip!/app.xml");

        System.out.println(doc.getLastChild().getNodeName());
        System.out.println("File delete (booksCopy.zip): " + inputCopy.delete());
        System.out.println("complete");
    }
}


I'm running this with Xerces 2.12.0 on a Windows 10 machine. From talking with the Saxon guys, it looks like it works OK on a Mac (i.e. the booksCopy.zip file can be deleted).

If anyone has any ideas on what the issue might be, that would be great.

Thanks,

Matt


Re: Parsing a file inside a zip file locking zip file on windows

Posted by Matthew Harrison <ma...@equinox.co.nz>.
Hi,


Thanks for getting back to me.  It makes sense that other operating systems still have file handles kicking around, and it's just that the problem shows up more obviously on Windows.  I did try looking into the (Xerces?) code that opens the file specified by the URI, but didn't manage to dig into it, so thank you for those pointers, Bernd.


We've ended up working around the issue by using different Saxon XSLT functions to read the file inside the zip, so this isn't a blocker for us.  I just thought I should note that this kind of URI passed into the 'parse' function seems to be problematic. Or did I misread some of the suggestions here?


Thanks,


Matt

________________________________
From: sebb <se...@gmail.com>
Sent: 17 July 2018 12:57:51
To: j-users@xerces.apache.org
Subject: Re: Parsing a file inside a zip file locking zip file on windows

IIRC when Windows opens files for read (or write) it locks them
against deletion.

You need to ensure that you close the file before you try to delete it.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-users-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-users-help@xerces.apache.org


Re: Parsing a file inside a zip file locking zip file on windows

Posted by sebb <se...@gmail.com>.
IIRC when Windows opens files for read (or write) it locks them
against deletion.

You need to ensure that you close the file before you try to delete it.
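
One way to follow this advice is to bypass the 'jar' URI and open the zip explicitly, so that the point at which the file is closed is under your control. The following is only a minimal sketch, not code from this thread: the class name and both helper methods are hypothetical, the sample zip is generated on the fly so the example is self-contained, and it uses the JDK's built-in JAXP parser rather than a specific Xerces release.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;

public class CloseBeforeDelete {

    // Hypothetical helper: writes a one-entry zip so the sketch needs no test resources.
    static void writeSampleZip(File zip) throws Exception {
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip.toPath()))) {
            out.putNextEntry(new ZipEntry("app.xml"));
            out.write("<app/>".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
    }

    // Hypothetical helper: parses one entry and closes the zip before returning.
    static Document parseEntry(File zip, String entryName) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // try-with-resources closes the ZipFile (and its file handle) on exit.
        try (ZipFile zipFile = new ZipFile(zip)) {
            ZipEntry entry = zipFile.getEntry(entryName);
            if (entry == null) {
                throw new FileNotFoundException(entryName + " not found in " + zip);
            }
            try (InputStream in = zipFile.getInputStream(entry)) {
                // The DOM is built fully in memory, so it stays usable after the close.
                return builder.parse(in);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        File zip = File.createTempFile("booksCopy", ".zip");
        writeSampleZip(zip);

        Document doc = parseEntry(zip, "app.xml");
        System.out.println(doc.getDocumentElement().getNodeName());
        System.out.println("File delete: " + zip.delete());
    }
}
```

Because the parser only sees an InputStream here, it has no system ID, so relative references inside the XML (external entities, for example) would not resolve without extra work.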


On 16 July 2018 at 21:37, Bernd Eckenfels <ec...@zusammenkunft.net> wrote:
> Matt,
>
> I think this is a general problem with JarURLConnection, especially in
> caching mode. If you do some of the steps to open a stream yourself,
> you can control that better. I might be able to dig up a more detailed
> example later if you still need it. (It might not be a good idea to rely on
> changing the global default with URLConnection.setDefaultUseCaches(false).)
>
> You only see this on Windows because other operating systems allow you to
> delete open files (however, the file handle will linger on those systems, too).


Re: Parsing a file inside a zip file locking zip file on windows

Posted by Bernd Eckenfels <ec...@zusammenkunft.net>.
Matt,

I think this is a general problem with JarURLConnection, especially in caching mode. If you do some of the steps to open a stream yourself, you can control that better. I might be able to dig up a more detailed example later if you still need it. (It might not be a good idea to rely on changing the global default with URLConnection.setDefaultUseCaches(false).)

You only see this on Windows because other operating systems allow you to delete open files (however, the file handle will linger on those systems, too).

Regards
Bernd
--
http://bernd.eckenfels.net
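
One possible reading of "open a stream yourself" is sketched below; it is not code posted in this thread. It opens the 'jar' URL connection by hand, disables caching for that one connection with URLConnection.setUseCaches(false) rather than touching any process-wide default, and hands the stream to the parser. The class and helper names are hypothetical, the zip is generated on the fly, and the JDK's built-in JAXP parser stands in for Xerces 2.12.0. The assumption that closing the stream also releases the zip holds for the JDK's own jar protocol handler when caching is off, but is worth verifying on your platform.

```java
import java.io.File;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;

public class UncachedJarParse {

    // Hypothetical helper: writes a one-entry zip so the sketch needs no test resources.
    static void writeSampleZip(File zip) throws Exception {
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip.toPath()))) {
            out.putNextEntry(new ZipEntry("app.xml"));
            out.write("<app/>".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
    }

    // Hypothetical helper: parses a jar: URL through an uncached connection.
    static Document parseUncached(URL jarUrl) throws Exception {
        URLConnection connection = jarUrl.openConnection();
        // Per-connection switch; other URL connections keep their caching behaviour.
        connection.setUseCaches(false);

        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Assumption: with caching off, closing this stream also closes the underlying zip.
        try (InputStream in = connection.getInputStream()) {
            // Passing the URL as system ID keeps relative references resolvable.
            return builder.parse(in, jarUrl.toExternalForm());
        }
    }

    public static void main(String[] args) throws Exception {
        File zip = File.createTempFile("booksCopy", ".zip");
        writeSampleZip(zip);

        URL url = new URL("jar:" + zip.toURI().toURL().toExternalForm() + "!/app.xml");
        Document doc = parseUncached(url);
        System.out.println(doc.getDocumentElement().getNodeName());
        System.out.println("File delete: " + zip.delete());
    }
}
```

If the document pulls in further resources through the same 'jar' URL (an external DTD, say), the parser would open those connections itself and they would be cached as usual, so this sketch only covers the single-entry case from the reproducer.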
