Posted to user@commons.apache.org by Ken Tanaka <Ke...@noaa.gov> on 2007/10/30 21:17:13 UTC

VFS how to read gzipped content from tar file

I would like to create an uncompressed file from a compressed file 
inside of a tar archive.

Can VFS allow me to do this in one step? I can get the compressed .gz 
file out of archive.tar as a file on disk, then decompress it and 
delete the .gz version. If there is an example, tutorial or book, 
online or in print, that covers this, that would be great; I haven't 
found anything like it yet.

Conceptually there is a tar file:

archive.tar
 +- tardir/
     +- content.txt.gz

I'd like to end up with an uncompressed file "content.txt".

I tried something like:

    FileObject gzTarFile =
            fsManager.resolveFile("tar:gz:/archive.tar!/tardir/content.txt.gz");

    LocalFile newFile = (LocalFile)
            fsManager.resolveFile("file:///destination/content.txt");
    newFile.copyFrom(gzTarFile, new AllFileSelector());

Thanks in advance for any advice
-Ken

The test program I'm working with follows:
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
package gov.noaa.eds.tryVfs;

import org.apache.commons.vfs.FileName;
import org.apache.commons.vfs.FileObject;
import org.apache.commons.vfs.FileSystemException;
import org.apache.commons.vfs.FileSystemManager;
import org.apache.commons.vfs.VFS;

/**
 * Try using VFS to read the content of a compressed (gz) file inside of
 * a tar file.
 */
public class App
{
    static FileSystemManager fsManager = null;
   
    public static void main( String[] args )
    {
        try {
            fsManager = VFS.getManager();
        } catch (FileSystemException ex) {
            ex.printStackTrace();
        }
       
        try {
            /* resolveFile OK */
            System.out.println("Resolve tar file:");
            FileObject tarFile = fsManager.resolveFile(
                    "tar:/extra/data/tryVfs/archive.tar");
           
            FileName tarFileName = tarFile.getName();
            System.out.println("  Path     : " + tarFileName.getPath());
            System.out.println("  URI      : " + tarFileName.getURI());
           
           
            /* resolveFile OK */
            System.out.println("Resolve gzip file inside tar file:");
            FileObject gzTarFile = fsManager.resolveFile(
                    "tar:file:///extra/data/tryVfs/archive.tar!/tarDir/content.txt.gz");
           
            FileName gzTarFileName = gzTarFile.getName();
            System.out.println("  Path     : " + gzTarFileName.getPath());
            System.out.println("  URI      : " + gzTarFileName.getURI());
           
           
            /* resolveFile has an error
             * uncomment one of the // "file string" arguments for resolveFile below
             * each of the strings I've tried has an /* error message * /
             */
            System.out.println("Resolve content of gzip file inside tar file:");
            FileObject contentFile = fsManager.resolveFile(
//                "tar:gz:/extra/data/tryVfs/archive.tar!/tarDir/content.txt.gz"
                /* Unknown message with code "Unknown message with code "vfs.provider.tar/open-tar-file.error".". */

//                "tar:gz:///extra/data/tryVfs/archive.tar!/tarDir/content.txt.gz"
                /* Unknown message with code "Unknown message with code "vfs.provider.tar/open-tar-file.error".". */

//                "tar:file:gz:/extra/data/tryVfs/archive.tar!/tarDir/content.txt.gz"
                /* URI "file:gz:///extra/data/tryVfs/archive.tar!/tarDir/content.txt.gz" is not an absolute file name. */

//                "tar:file:gz:/extra/data/tryVfs/archive.tar!///tarDir/content.txt.gz!/content.txt"
                /* URI "file:gz:///extra/data/tryVfs/archive.tar!/tarDir/content.txt.gz" is not an absolute file name. */

                "tar:gz:///extra/data/tryVfs/archive.tar!/tarDir/content.txt"
                /* Unknown message with code "Unknown message with code "vfs.provider.tar/open-tar-file.error".". */
                );
           
            FileName contentFileName = contentFile.getName();
            System.out.println("  Path     : " + contentFileName.getPath());
            System.out.println("  URI      : " + contentFileName.getURI());
           
            /* copy uncompressed content to a new file */
//            LocalFile newFile = (LocalFile) fsManager.resolveFile(
//                    "file:///extra/data/tryVfs/content.txt");           
//            newFile.copyFrom(contentFile, new AllFileSelector());
        } catch (FileSystemException ex) {
            ex.printStackTrace();
        }
    } // main( String[] args )
}

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org


Re: VFS how to read gzipped content from tar file

Posted by Ken Tanaka <Ke...@noaa.gov>.
Actually, the tar file is not compressed; the files inside the tar file 
are gzipped. For example:

tar tvf archive.tar
drwxrwsr-x ktanaka/ktanaka   0 2007-10-30 12:45:26 tardir/
-rw-rw-r-- ktanaka/ktanaka  56 2007-10-30 12:44:37 tardir/content.txt.gz

I'd like to create a content.txt file directly from the above archive.tar.

Thanks for the post, though; it gave me an idea to try, and that led to 
a solution:

    
    gz:tar:file:///extra/data/tryVfs/archive.tar!/tardir/content.txt.gz!content.txt

It was unclear to me from the Javadoc how to build up this name 
parameter for FileSystemManager.resolveFile(name). In hindsight, if I 
had studied
http://commons.apache.org/vfs/filesystems.html
more closely, I would have seen that the "Zip, Jar and Tar" section has 
a 5th example:
"tar:gz:http://anyhost/dir/mytar.tar.gz!/mytar.tar!/path/in/tar/README.txt"
From this maybe I could have deduced that multiple paths can be chained 
together with a "!" as a separator, while file system designators 
("file:", "tar:" and "gz:") should be prepended onto the front in 
reverse order.
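
The chaining rule described above can be sketched as plain string 
building. This is only an illustration, not part of VFS: the class name 
LayeredUriSketch and the method layeredUri are hypothetical, and the 
sketch does nothing more than assemble the name that would then be 
passed to FileSystemManager.resolveFile(name). Layers are listed in the 
order they are opened (file on disk first); the schemes come out in 
reverse order and the paths are joined with "!".

```java
public class LayeredUriSketch {

    /**
     * Builds a layered name from {scheme, path} pairs listed in the order
     * the layers are opened: the physical file first, the innermost entry last.
     * Schemes are emitted in reverse order; paths keep their order, joined by "!".
     */
    public static String layeredUri(String[][] layers) {
        StringBuilder uri = new StringBuilder();
        // Schemes: last layer first, e.g. gz:tar:file:
        for (int i = layers.length - 1; i >= 0; i--) {
            uri.append(layers[i][0]).append(':');
        }
        // Paths: first layer first, separated by "!"
        for (int i = 0; i < layers.length; i++) {
            if (i > 0) {
                uri.append('!');
            }
            uri.append(layers[i][1]);
        }
        return uri.toString();
    }

    public static void main(String[] args) {
        String[][] layers = {
            {"file", "///extra/data/tryVfs/archive.tar"}, // the tar file on disk
            {"tar", "/tardir/content.txt.gz"},            // gzip entry inside the tar
            {"gz", "content.txt"}                         // content of the gzip entry
        };
        // Prints the name reported as working in this message:
        // gz:tar:file:///extra/data/tryVfs/archive.tar!/tardir/content.txt.gz!content.txt
        System.out.println(layeredUri(layers));
    }
}
```

Feeding it the three layers of the filesystems.html example (http, gz, 
tar) gives back the documented string 
"tar:gz:http://anyhost/dir/mytar.tar.gz!/mytar.tar!/path/in/tar/README.txt".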

I'll update the example I started in the VFS wiki to reflect the much 
simpler name:

http://wiki.apache.org/jakarta-commons/ExtractAndDecompressGzipFiles

-Ken


Mark Fortner wrote:
> You mentioned that you wanted to look into a tarball (gzipped tar file),
> but the URL you gave was only for a tar file. Something like this should
> work:
>
> gz:tar:file:///extra/data/tryVfs/archive.tar.gz!/myfile.txt
>
> Hope this helps,
>
> Mark
>
> On 10/31/07, Ken Tanaka < Ken.Tanaka@noaa.gov> wrote:
>> Thanks for the suggestion, but I'm getting a different error when I try
>> that:
>> org.apache.commons.vfs.FileSystemException: Could not resolve file
>> "gz:tar:file:///extra/data/tryVfs/archive.tar!/!/".
...
>>
>>
>>
>> Here is the exact code corresponding to the above error:
>> FileObject contentFile = fsManager.resolveFile(
>>
>> "gz:tar:///extra/data/tryVfs/archive.tar!/tardir/content.txt.gz"
>> );
>>
>> Philippe Poulard wrote:
>>> Hi Ken,
>>>
>>> Ken Tanaka wrote:
>>>> FileObject gzTarFile =
>>>> fsManager.resolveFile("tar:gz:/archive.tar!/tardir/content.txt.gz");
>>> Try this:
>>>
>>> fsManager.resolveFile("gz:tar:/archive.tar!/tardir/content.txt.gz");
>>>
>> --
>> = Enterprise Data Services Division ===============
>> | CIRES, National Geophysical Data Center / NOAA |
>> | 303-497-6221 |
>> = Ken.Tanaka@noaa.gov =============================


Re: VFS how to read gzipped content from tar file

Posted by Mark Fortner <ph...@gmail.com>.
You mentioned that you wanted to look into a tarball (gzipped tar file), but
the URL you gave was only for a tar file.  Something like this should work:

gz:tar:file:///extra/data/tryVfs/archive.tar.gz!/myfile.txt

Hope this helps,

Mark

On 10/31/07, Ken Tanaka < Ken.Tanaka@noaa.gov> wrote:
>
> Thanks for the suggestion, but I'm getting a different error when I try
> that:
> org.apache.commons.vfs.FileSystemException: Could not resolve file
> "gz:tar:file:///extra/data/tryVfs/archive.tar!/!/".
>         at org.apache.commons.vfs.provider.AbstractFileSystem.resolveFile(AbstractFileSystem.java:301)
>         at org.apache.commons.vfs.provider.AbstractFileSystem.resolveFile(AbstractFileSystem.java:267)
>         at org.apache.commons.vfs.provider.AbstractFileSystem.getRoot(AbstractFileSystem.java:242)
>         at org.apache.commons.vfs.provider.AbstractLayeredFileProvider.createFileSystem(AbstractLayeredFileProvider.java:82)
>         at org.apache.commons.vfs.provider.AbstractLayeredFileProvider.findFile(AbstractLayeredFileProvider.java:59)
>         at org.apache.commons.vfs.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:641)
>         at org.apache.commons.vfs.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:602)
>         at org.apache.commons.vfs.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:570)
>         at gov.noaa.eds.tryVfs.App.main(App.java:51)
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
>         at java.lang.String.substring(String.java:1768)
>         at org.apache.commons.vfs.provider.compressed.CompressedFileFileObject.<init>(CompressedFileFileObject.java:48)
>         at org.apache.commons.vfs.provider.gzip.GzipFileObject.<init>(GzipFileObject.java:39)
>         at org.apache.commons.vfs.provider.gzip.GzipFileSystem.createFile(GzipFileSystem.java:42)
>         at org.apache.commons.vfs.provider.AbstractFileSystem.resolveFile(AbstractFileSystem.java:296)
>         ... 8 more
>
>
> Here is the exact code corresponding to the above error:
>             FileObject contentFile = fsManager.resolveFile(
>
> "gz:tar:///extra/data/tryVfs/archive.tar!/tardir/content.txt.gz"
>                 );
>
> Philippe Poulard wrote:
> > Hi Ken,
> >
> > Ken Tanaka wrote:
> >>
> >>    FileObject gzTarFile =
> >> fsManager.resolveFile("tar:gz:/archive.tar!/tardir/content.txt.gz");
> >
> > Try this:
> >
> > fsManager.resolveFile("gz:tar:/archive.tar!/tardir/content.txt.gz");
> >
>
> --
> = Enterprise Data Services Division ===============
> | CIRES, National Geophysical Data Center / NOAA  |
> | 303-497-6221                                    |
> = Ken.Tanaka@noaa.gov =============================

Re: VFS how to read gzipped content from tar file

Posted by Ken Tanaka <Ke...@noaa.gov>.
To follow up: I never did get a direct extract of the gzipped content 
from inside of a tar file, but took a multistep approach to get the 
files I want.

I've documented what I've come up with so far:

http://wiki.apache.org/jakarta-commons/ExtractAndDecompressGzipFiles

I started a VfsCookbook page in the wiki for people to contribute 
examples to (hint, hint). I think that working examples of VFS are lacking.
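
For readers who want the decompress-and-delete part of such a multistep 
approach without VFS, here is a minimal sketch using only the standard 
java.util.zip classes. It assumes the .gz entry has already been 
extracted from the tar file to disk by some other means; the class name 
GunzipStepSketch and the method gunzipAndDelete are hypothetical, and 
this is not the code from the wiki page.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;

public class GunzipStepSketch {

    /**
     * Decompresses gzFile into target (overwriting it if present),
     * then deletes gzFile.
     */
    public static void gunzipAndDelete(Path gzFile, Path target) throws IOException {
        try (InputStream raw = Files.newInputStream(gzFile);
             InputStream in = new GZIPInputStream(raw)) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        // Only reached when decompression succeeded.
        Files.delete(gzFile);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: GunzipStepSketch <file.gz> <target>");
            return;
        }
        gunzipAndDelete(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

The .gz file is deleted only after the copy completes, so a failed 
decompression leaves the extracted archive member in place.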

-Ken



Re: VFS how to read gzipped content from tar file

Posted by Ken Tanaka <Ke...@noaa.gov>.
Thanks for the suggestion, but I'm getting a different error when I try 
that:
org.apache.commons.vfs.FileSystemException: Could not resolve file "gz:tar:file:///extra/data/tryVfs/archive.tar!/!/".
        at org.apache.commons.vfs.provider.AbstractFileSystem.resolveFile(AbstractFileSystem.java:301)
        at org.apache.commons.vfs.provider.AbstractFileSystem.resolveFile(AbstractFileSystem.java:267)
        at org.apache.commons.vfs.provider.AbstractFileSystem.getRoot(AbstractFileSystem.java:242)
        at org.apache.commons.vfs.provider.AbstractLayeredFileProvider.createFileSystem(AbstractLayeredFileProvider.java:82)
        at org.apache.commons.vfs.provider.AbstractLayeredFileProvider.findFile(AbstractLayeredFileProvider.java:59)
        at org.apache.commons.vfs.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:641)
        at org.apache.commons.vfs.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:602)
        at org.apache.commons.vfs.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:570)
        at gov.noaa.eds.tryVfs.App.main(App.java:51)
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
        at java.lang.String.substring(String.java:1768)
        at org.apache.commons.vfs.provider.compressed.CompressedFileFileObject.<init>(CompressedFileFileObject.java:48)
        at org.apache.commons.vfs.provider.gzip.GzipFileObject.<init>(GzipFileObject.java:39)
        at org.apache.commons.vfs.provider.gzip.GzipFileSystem.createFile(GzipFileSystem.java:42)
        at org.apache.commons.vfs.provider.AbstractFileSystem.resolveFile(AbstractFileSystem.java:296)
        ... 8 more


Here is the exact code corresponding to the above error:
            FileObject contentFile = fsManager.resolveFile(
                    "gz:tar:///extra/data/tryVfs/archive.tar!/tardir/content.txt.gz"
                );
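
A note on reading the trace above: the name in the exception ends in 
"!/!/", and the root cause is String.substring being called with -1 
from the CompressedFileFileObject constructor. One pattern that 
produces this kind of exception is stripping an extension with 
lastIndexOf('.') from a name that contains no dot (for example an empty 
root name). Whether VFS does exactly this is an assumption; the sketch 
below is not taken from the VFS source, and the class and method names 
are hypothetical. It only shows how the pattern fails and how a guard 
avoids it.

```java
public class BaseNameSketch {

    /** Strips the last extension; assumes the name contains a '.'. */
    public static String stripExtensionNaive(String baseName) {
        int dot = baseName.lastIndexOf('.');  // -1 when there is no '.'
        return baseName.substring(0, dot);    // throws when dot is -1
    }

    /** Same idea, but returns the name unchanged when there is no '.'. */
    public static String stripExtensionGuarded(String baseName) {
        int dot = baseName.lastIndexOf('.');
        return dot < 0 ? baseName : baseName.substring(0, dot);
    }

    public static void main(String[] args) {
        // "content.txt.gz" -> "content.txt"
        System.out.println(stripExtensionNaive("content.txt.gz"));

        // An empty name has no '.', so the naive version fails.
        try {
            stripExtensionNaive("");
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("naive version failed: " + e.getMessage());
        }

        // The guarded version returns the empty name unchanged.
        System.out.println("[" + stripExtensionGuarded("") + "]");
    }
}
```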

Philippe Poulard wrote:
> Hi Ken,
>
> Ken Tanaka wrote:
>>
>>    FileObject gzTarFile = 
>> fsManager.resolveFile("tar:gz:/archive.tar!/tardir/content.txt.gz");
>
> Try this:
>
> fsManager.resolveFile("gz:tar:/archive.tar!/tardir/content.txt.gz");
>

-- 
= Enterprise Data Services Division ===============
| CIRES, National Geophysical Data Center / NOAA  |
| 303-497-6221                                    |
= Ken.Tanaka@noaa.gov =============================




Re: VFS how to read gzipped content from tar file

Posted by Philippe Poulard <ph...@sophia.inria.fr>.
Hi Ken,

Ken Tanaka wrote:
> 
>    FileObject gzTarFile = 
> fsManager.resolveFile("tar:gz:/archive.tar!/tardir/content.txt.gz");

Try this:

fsManager.resolveFile("gz:tar:/archive.tar!/tardir/content.txt.gz");

-- 
Regards,

               ///
              (. .)
  --------ooO--(_)--Ooo--------
|      Philippe Poulard       |
  -----------------------------
  http://reflex.gforge.inria.fr/
        Have the RefleX !
