Class TarFile

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable

    public class TarFile
    extends java.lang.Object
    implements java.io.Closeable
    Provides random access to UNIX tar archives.
    Since:
    1.21
    • Constructor Summary

      Constructors 
      Constructor Description
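      TarFile(byte[] content)
      Creates a TarFile that reads the archive from the given byte array.
      TarFile(byte[] content, boolean lenient)
      Creates a TarFile that reads the archive from the given byte array, optionally tolerating illegal header values.
      TarFile(byte[] content, java.lang.String encoding)
      Creates a TarFile that reads the archive from the given byte array using the given encoding.
      TarFile(java.io.File archive)
      Creates a TarFile that reads the archive from the given file.
      TarFile(java.io.File archive, boolean lenient)
      Creates a TarFile that reads the archive from the given file, optionally tolerating illegal header values.
      TarFile(java.io.File archive, java.lang.String encoding)
      Creates a TarFile that reads the archive from the given file using the given encoding.
      TarFile(java.nio.channels.SeekableByteChannel content)
      Creates a TarFile that reads the archive from the given seekable byte channel.
      TarFile(java.nio.channels.SeekableByteChannel archive, int blockSize, int recordSize, java.lang.String encoding, boolean lenient)
      Creates a TarFile that reads the archive from the given seekable byte channel with an explicit block size, record size, encoding, and leniency.
      TarFile(java.nio.file.Path archivePath)
      Creates a TarFile that reads the archive from the given path.
      TarFile(java.nio.file.Path archivePath, boolean lenient)
      Creates a TarFile that reads the archive from the given path, optionally tolerating illegal header values.
      TarFile(java.nio.file.Path archivePath, java.lang.String encoding)
      Creates a TarFile that reads the archive from the given path using the given encoding.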
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      private void applyPaxHeadersToCurrentEntry​(java.util.Map<java.lang.String,​java.lang.String> headers, java.util.List<TarArchiveStructSparse> sparseHeaders)
      Updates the current entry with the pax headers that were read.
      private void buildSparseInputStreams()
      Builds the input streams consisting of all-zero input streams and non-zero input streams.
      void close()  
      private void consumeRemainderOfLastBlock()
      Invoked once the end of the archive is hit; tries to consume the remaining bytes under the assumption that the tool creating this archive has padded the last block.
      java.util.List<TarArchiveEntry> getEntries()
      Gets all TAR archive entries from the TarFile.
      java.io.InputStream getInputStream​(TarArchiveEntry entry)
      Gets the input stream for the provided Tar Archive Entry.
      private byte[] getLongNameData()
      Gets the next entry in this tar archive as long name data.
      private TarArchiveEntry getNextTarEntry()
      Gets the next entry in this tar archive.
      private java.nio.ByteBuffer getRecord()
      Gets the next record in this tar archive.
      protected boolean isAtEOF()  
      private boolean isDirectory()  
      private boolean isEOFRecord​(java.nio.ByteBuffer headerBuf)  
      private void paxHeaders()
      For PAX Format 0.0, the sparse headers (GNU.sparse.offset and GNU.sparse.numbytes) may appear multiple times, and they look like:
      private void readGlobalPaxHeaders()  
      private void readOldGNUSparse()
      Adds the sparse chunks from the current entry to the sparse chunks, including any additional sparse entries following the current entry.
      private java.nio.ByteBuffer readRecord()
      Reads a record from the archive and returns the data.
      private void repositionForwardBy​(long offset)  
      private void repositionForwardTo​(long newPosition)  
      protected void setAtEOF​(boolean b)  
      private void skipRecordPadding()
      The last record block should be written at the full size, so this skips any additional space used to fill a record after an entry.
      private void throwExceptionIfPositionIsNotInArchive()
      Checks if the current position of the SeekableByteChannel is in the archive.
      private void tryToConsumeSecondEOFRecord()
      Tries to read the next record resetting the position in the archive if it is not an EOF record.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • smallBuf

        private final byte[] smallBuf
      • archive

        private final java.nio.channels.SeekableByteChannel archive
      • zipEncoding

        private final ZipEncoding zipEncoding
        The encoding of the tar file
      • blockSize

        private final int blockSize
      • lenient

        private final boolean lenient
      • recordSize

        private final int recordSize
      • recordBuffer

        private final java.nio.ByteBuffer recordBuffer
      • hasHitEOF

        private boolean hasHitEOF
      • currEntry

        private TarArchiveEntry currEntry
        The meta-data about the current entry
      • globalPaxHeaders

        private java.util.Map<java.lang.String,​java.lang.String> globalPaxHeaders
      • sparseInputStreams

        private final java.util.Map<java.lang.String,​java.util.List<java.io.InputStream>> sparseInputStreams
    • Constructor Detail

      • TarFile

        public TarFile​(byte[] content)
                throws java.io.IOException
        Creates a TarFile that reads the archive from the given byte array.
        Parameters:
        content - the content to use
        Throws:
        java.io.IOException - when reading the tar archive fails
      • TarFile

        public TarFile​(byte[] content,
                       boolean lenient)
                throws java.io.IOException
        Creates a TarFile that reads the archive from the given byte array, optionally tolerating illegal header values.
        Parameters:
        content - the content to use
        lenient - when set to true, illegal values for group/user id, mode, device numbers and timestamp are ignored and the fields are set to TarArchiveEntry.UNKNOWN. When set to false, such illegal fields cause an exception instead.
        Throws:
        java.io.IOException - when reading the tar archive fails
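        To illustrate the lenient flag described above, the following minimal sketch parses a single octal header field in strict and in lenient mode. It is not the library's implementation: the class LenientFieldSketch, the method parseOctalField, and the value -1 standing in for TarArchiveEntry.UNKNOWN are assumptions made for this example.

```java
class LenientFieldSketch {
    // Hypothetical stand-in for TarArchiveEntry.UNKNOWN; the actual value is an assumption.
    static final long UNKNOWN = -1L;

    /**
     * Parses an octal header field such as a mode or a user id.
     * In lenient mode an illegal value yields UNKNOWN; otherwise an exception is thrown.
     */
    static long parseOctalField(String field, boolean lenient) {
        try {
            return Long.parseLong(field.trim(), 8);
        } catch (NumberFormatException e) {
            if (lenient) {
                return UNKNOWN;
            }
            throw new IllegalArgumentException("Illegal header field: " + field, e);
        }
    }
}
```

        With this sketch, a well-formed field parses the same way in both modes; only the handling of an illegal field differs.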
      • TarFile

        public TarFile​(byte[] content,
                       java.lang.String encoding)
                throws java.io.IOException
        Creates a TarFile that reads the archive from the given byte array using the given encoding.
        Parameters:
        content - the content to use
        encoding - the encoding to use
        Throws:
        java.io.IOException - when reading the tar archive fails
      • TarFile

        public TarFile​(java.io.File archive)
                throws java.io.IOException
        Creates a TarFile that reads the archive from the given file.
        Parameters:
        archive - the file of the archive to use
        Throws:
        java.io.IOException - when reading the tar archive fails
      • TarFile

        public TarFile​(java.io.File archive,
                       boolean lenient)
                throws java.io.IOException
        Creates a TarFile that reads the archive from the given file, optionally tolerating illegal header values.
        Parameters:
        archive - the file of the archive to use
        lenient - when set to true, illegal values for group/user id, mode, device numbers and timestamp are ignored and the fields are set to TarArchiveEntry.UNKNOWN. When set to false, such illegal fields cause an exception instead.
        Throws:
        java.io.IOException - when reading the tar archive fails
      • TarFile

        public TarFile​(java.io.File archive,
                       java.lang.String encoding)
                throws java.io.IOException
        Creates a TarFile that reads the archive from the given file using the given encoding.
        Parameters:
        archive - the file of the archive to use
        encoding - the encoding to use
        Throws:
        java.io.IOException - when reading the tar archive fails
      • TarFile

        public TarFile​(java.nio.file.Path archivePath)
                throws java.io.IOException
        Creates a TarFile that reads the archive from the given path.
        Parameters:
        archivePath - the path of the archive to use
        Throws:
        java.io.IOException - when reading the tar archive fails
      • TarFile

        public TarFile​(java.nio.file.Path archivePath,
                       boolean lenient)
                throws java.io.IOException
        Creates a TarFile that reads the archive from the given path, optionally tolerating illegal header values.
        Parameters:
        archivePath - the path of the archive to use
        lenient - when set to true, illegal values for group/user id, mode, device numbers and timestamp are ignored and the fields are set to TarArchiveEntry.UNKNOWN. When set to false, such illegal fields cause an exception instead.
        Throws:
        java.io.IOException - when reading the tar archive fails
      • TarFile

        public TarFile​(java.nio.file.Path archivePath,
                       java.lang.String encoding)
                throws java.io.IOException
        Creates a TarFile that reads the archive from the given path using the given encoding.
        Parameters:
        archivePath - the path of the archive to use
        encoding - the encoding to use
        Throws:
        java.io.IOException - when reading the tar archive fails
      • TarFile

        public TarFile​(java.nio.channels.SeekableByteChannel content)
                throws java.io.IOException
        Creates a TarFile that reads the archive from the given seekable byte channel.
        Parameters:
        content - the content to use
        Throws:
        java.io.IOException - when reading the tar archive fails
      • TarFile

        public TarFile​(java.nio.channels.SeekableByteChannel archive,
                       int blockSize,
                       int recordSize,
                       java.lang.String encoding,
                       boolean lenient)
                throws java.io.IOException
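        Creates a TarFile that reads the archive from the given seekable byte channel with an explicit block size, record size, encoding, and leniency.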
        Parameters:
        archive - the seekable byte channel to use
        blockSize - the block size to use
        recordSize - the record size to use
        encoding - the encoding to use
        lenient - when set to true, illegal values for group/user id, mode, device numbers and timestamp are ignored and the fields are set to TarArchiveEntry.UNKNOWN. When set to false, such illegal fields cause an exception instead.
        Throws:
        java.io.IOException - when reading the tar archive fails
    • Method Detail

      • applyPaxHeadersToCurrentEntry

        private void applyPaxHeadersToCurrentEntry​(java.util.Map<java.lang.String,​java.lang.String> headers,
                                                   java.util.List<TarArchiveStructSparse> sparseHeaders)
                                            throws java.io.IOException
        Updates the current entry with the pax headers that were read.
        Parameters:
        headers - Headers read from the pax header
        sparseHeaders - Sparse headers read from pax header
        Throws:
        java.io.IOException
      • buildSparseInputStreams

        private void buildSparseInputStreams()
                                      throws java.io.IOException
        Builds the input streams consisting of all-zero input streams and non-zero input streams. When reading from the non-zero input streams, the data is actually read from the original input stream. The size of each input stream is given by the sparse headers.
        Throws:
        java.io.IOException
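        The idea of interleaving all-zero streams with streams of stored data can be sketched with JDK classes only. This is an illustration under stated assumptions, not the code used by TarFile: the class SparseStreamSketch and its methods are hypothetical, the sparse chunks are passed as {offset, numbytes} pairs sorted by offset, and the stored data is held in a byte array rather than read from the archive channel.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SparseStreamSketch {
    /** Returns a stream that yields exactly size zero bytes (a "hole"). */
    static InputStream zeros(int size) {
        return new ByteArrayInputStream(new byte[size]);
    }

    /**
     * Assembles the logical file content from sparse chunks.
     * chunks holds {offset, numbytes} pairs sorted by offset, data holds the
     * stored (non-zero) bytes back to back, realSize is the logical file size.
     */
    static InputStream assemble(long[][] chunks, byte[] data, long realSize) {
        List<InputStream> parts = new ArrayList<>();
        long logicalPos = 0;
        int dataPos = 0;
        for (long[] chunk : chunks) {
            long offset = chunk[0];
            int numBytes = (int) chunk[1];
            if (offset > logicalPos) {
                // Gap before this chunk: serve zeros instead of stored bytes.
                parts.add(zeros((int) (offset - logicalPos)));
            }
            parts.add(new ByteArrayInputStream(data, dataPos, numBytes));
            dataPos += numBytes;
            logicalPos = offset + numBytes;
        }
        if (realSize > logicalPos) {
            // Trailing hole up to the logical end of the file.
            parts.add(zeros((int) (realSize - logicalPos)));
        }
        return new SequenceInputStream(Collections.enumeration(parts));
    }

    /** Drains a stream into a byte array, wrapping I/O errors as unchecked. */
    static byte[] readFully(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

        The sketch casts sizes to int for brevity, so it only suits small examples; a real reader would need streams that handle long sizes.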
      • close

        public void close()
                   throws java.io.IOException
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface java.io.Closeable
        Throws:
        java.io.IOException
      • consumeRemainderOfLastBlock

        private void consumeRemainderOfLastBlock()
                                          throws java.io.IOException
        Invoked once the end of the archive is hit; tries to consume the remaining bytes under the assumption that the tool creating this archive has padded the last block.
        Throws:
        java.io.IOException
      • getEntries

        public java.util.List<TarArchiveEntry> getEntries()
        Gets all TAR archive entries from the TarFile.
        Returns:
        All entries from the tar file
      • getInputStream

        public java.io.InputStream getInputStream​(TarArchiveEntry entry)
                                           throws java.io.IOException
        Gets the input stream for the provided Tar Archive Entry.
        Parameters:
        entry - Entry to get the input stream from
        Returns:
        Input stream of the provided entry
        Throws:
        java.io.IOException - if the TAR archive is corrupted and the entry cannot be read
      • getLongNameData

        private byte[] getLongNameData()
                                throws java.io.IOException
        Gets the next entry in this tar archive as long name data.
        Returns:
        The next entry in the archive as long name data, or null.
        Throws:
        java.io.IOException - on error
      • getNextTarEntry

        private TarArchiveEntry getNextTarEntry()
                                         throws java.io.IOException
        Gets the next entry in this tar archive. This skips to the end of the current entry, if there is one, positions the channel at the header of the next entry, reads that header, and returns a new TarArchiveEntry instantiated from the header bytes. If there are no more entries in the archive, null is returned to indicate that the end of the archive has been reached.
        Returns:
        The next TarArchiveEntry in the archive, or null if there is no next entry.
        Throws:
        java.io.IOException - when reading the next TarArchiveEntry fails
      • getRecord

        private java.nio.ByteBuffer getRecord()
                                       throws java.io.IOException
        Gets the next record in this tar archive. This skips over any remaining data in the current entry, if there is one, and positions the channel at the header of the next entry.

        If there are no more entries in the archive, null is returned to indicate that the end of the archive has been reached. At the same time the hasHitEOF marker is set to true.

        Returns:
        The next record in the archive, or null if the end of the archive has been reached.
        Throws:
        java.io.IOException - when reading the next record fails
      • isAtEOF

        protected final boolean isAtEOF()
      • isDirectory

        private boolean isDirectory()
      • isEOFRecord

        private boolean isEOFRecord​(java.nio.ByteBuffer headerBuf)
      • paxHeaders

        private void paxHeaders()
                         throws java.io.IOException

        For PAX Format 0.0, the sparse headers (GNU.sparse.offset and GNU.sparse.numbytes) may appear multiple times, and they look like:

         GNU.sparse.size=size
         GNU.sparse.numblocks=numblocks
         repeat numblocks times
           GNU.sparse.offset=offset
           GNU.sparse.numbytes=numbytes
         end repeat
         

        For PAX Format 0.1, the sparse headers are stored in a single variable, GNU.sparse.map:

         GNU.sparse.map
            Map of non-null data chunks. It is a string consisting of comma-separated values "offset,size[,offset-1,size-1...]"
         

        For PAX Format 1.X:
        The sparse map itself is stored in the file data block, preceding the actual file data. It consists of a series of decimal numbers delimited by newlines. The map is padded with nulls to the nearest block boundary. The first number gives the number of entries in the map. Following are map entries, each one consisting of two numbers giving the offset and size of the data block it describes.

        Throws:
        java.io.IOException
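        The 0.1 and 1.X layouts described above can be decoded with a few lines of string handling. The following is a minimal sketch, not the parser used by TarFile: the class SparseMapSketch and its methods are hypothetical, each chunk is returned as an {offset, size} pair, and the 1.X map is assumed to have already been read from the data block into a string.

```java
import java.util.ArrayList;
import java.util.List;

class SparseMapSketch {
    /** Converts two decimal fields into an {offset, size} pair. */
    static long[] toChunk(String offsetField, String sizeField) {
        long offset = Long.parseLong(offsetField);
        long size = Long.parseLong(sizeField);
        if (offset < 0 || size < 0) {
            throw new IllegalArgumentException("Negative value in sparse map");
        }
        return new long[] {offset, size};
    }

    /** PAX 0.1: parses a GNU.sparse.map value of the form "offset,size[,offset,size...]". */
    static List<long[]> parseSparseMap01(String map) {
        String[] fields = map.split(",");
        if (fields.length % 2 != 0) {
            throw new IllegalArgumentException("Sparse map needs offset/size pairs");
        }
        List<long[]> chunks = new ArrayList<>();
        for (int i = 0; i < fields.length; i += 2) {
            chunks.add(toChunk(fields[i], fields[i + 1]));
        }
        return chunks;
    }

    /**
     * PAX 1.X: parses newline-delimited decimal numbers. The first number is the
     * entry count; each entry is an offset followed by a size. Anything after the
     * last entry (for example the null padding) is ignored.
     */
    static List<long[]> parseSparseMap1x(String block) {
        String[] fields = block.split("\n");
        int count = Integer.parseInt(fields[0]);
        if (count < 0 || fields.length < 1 + 2 * count) {
            throw new IllegalArgumentException("Truncated sparse map");
        }
        List<long[]> chunks = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            chunks.add(toChunk(fields[1 + 2 * i], fields[2 + 2 * i]));
        }
        return chunks;
    }
}
```

        Both methods produce the same chunk list for the same map, so code that later builds the sparse input streams does not need to know which format the archive used.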
      • readGlobalPaxHeaders

        private void readGlobalPaxHeaders()
                                   throws java.io.IOException
        Throws:
        java.io.IOException
      • readOldGNUSparse

        private void readOldGNUSparse()
                               throws java.io.IOException
        Adds the sparse chunks from the current entry to the sparse chunks, including any additional sparse entries following the current entry.
        Throws:
        java.io.IOException - when reading the sparse entry fails
      • readRecord

        private java.nio.ByteBuffer readRecord()
                                        throws java.io.IOException
        Reads a record from the archive and returns the data.
        Returns:
        The record data or null if EOF has been hit.
        Throws:
        java.io.IOException - if reading from the archive fails
      • repositionForwardBy

        private void repositionForwardBy​(long offset)
                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • repositionForwardTo

        private void repositionForwardTo​(long newPosition)
                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • setAtEOF

        protected final void setAtEOF​(boolean b)
      • skipRecordPadding

        private void skipRecordPadding()
                                throws java.io.IOException
        The last record block should be written at the full size, so this skips any additional space used to fill a record after an entry.
        Throws:
        java.io.IOException - when skipping the padding of the record fails
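        The amount of padding to skip follows from the entry size and the record size. A minimal sketch of that arithmetic, assuming a hypothetical helper named paddingAfterEntry (the 512-byte record size in the usage note is the conventional tar value, not something this page states):

```java
class RecordPaddingSketch {
    /**
     * Returns how many filler bytes follow an entry of the given size so that
     * the next header starts on a record boundary.
     */
    static long paddingAfterEntry(long entrySize, int recordSize) {
        long remainder = entrySize % recordSize;
        return remainder == 0 ? 0 : recordSize - remainder;
    }
}
```

        For example, with 512-byte records a 1000-byte entry is followed by 24 bytes of padding, while a 1024-byte entry needs none.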
      • throwExceptionIfPositionIsNotInArchive

        private void throwExceptionIfPositionIsNotInArchive()
                                                     throws java.io.IOException
        Checks if the current position of the SeekableByteChannel is in the archive.
        Throws:
        java.io.IOException - If the position is not in the archive
      • tryToConsumeSecondEOFRecord

        private void tryToConsumeSecondEOFRecord()
                                          throws java.io.IOException
        Tries to read the next record resetting the position in the archive if it is not an EOF record.

        This is meant to protect against cases where a tar implementation has written only one EOF record when two are expected. Actually this won't help since a non-conforming implementation likely won't fill full blocks consisting of - by default - ten records either so we probably have already read beyond the archive anyway.

        Throws:
        java.io.IOException - if reading the record or resetting the position in the archive fails
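        The read-then-reset behaviour described for tryToConsumeSecondEOFRecord can be sketched against a plain SeekableByteChannel. This is an illustration only: the class SecondEofSketch and its methods are hypothetical, and it assumes that an EOF record is a full record consisting entirely of zero bytes, as is conventional for tar.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;

class SecondEofSketch {
    /** Returns true if every byte between position and limit is zero. */
    static boolean isAllZero(ByteBuffer record) {
        for (int i = record.position(); i < record.limit(); i++) {
            if (record.get(i) != 0) {
                return false;
            }
        }
        return true;
    }

    /**
     * Reads the next record. If it is a full, all-zero record it stays consumed
     * and true is returned; otherwise the channel is moved back to where it was.
     */
    static boolean consumeIfEofRecord(SeekableByteChannel channel, int recordSize)
            throws IOException {
        long start = channel.position();
        ByteBuffer buf = ByteBuffer.allocate(recordSize);
        while (buf.hasRemaining()) {
            if (channel.read(buf) < 0) {
                break; // end of channel before a full record was read
            }
        }
        buf.flip();
        boolean eofRecord = buf.limit() == recordSize && isAllZero(buf);
        if (!eofRecord) {
            channel.position(start); // not an EOF record: undo the read
        }
        return eofRecord;
    }
}
```

        Resetting the position keeps a record that turns out to be a regular header available to the next read.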