Class TarFile
- java.lang.Object
-
- org.apache.commons.compress.archivers.tar.TarFile
-
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable
public class TarFile extends java.lang.Object implements java.io.Closeable
Provides random access to UNIX tar archives.
- Since:
- 1.21
-
-
Nested Class Summary
Nested Classes
Columns: Modifier and Type, Class, Description
private class
TarFile.BoundedTarEntryInputStream
-
Field Summary
Fields
Columns: Modifier and Type, Field, Description
private java.nio.channels.SeekableByteChannel
archive
private int
blockSize
private TarArchiveEntry
currEntry
The meta-data about the current entry
private java.util.LinkedList<TarArchiveEntry>
entries
private java.util.Map<java.lang.String,java.lang.String>
globalPaxHeaders
private java.util.List<TarArchiveStructSparse>
globalSparseHeaders
private boolean
hasHitEOF
private boolean
lenient
private java.nio.ByteBuffer
recordBuffer
private int
recordSize
private static int
SMALL_BUFFER_SIZE
private byte[]
smallBuf
private java.util.Map<java.lang.String,java.util.List<java.io.InputStream>>
sparseInputStreams
private ZipEncoding
zipEncoding
The encoding of the tar file
-
Constructor Summary
Constructors
Columns: Constructor, Description
TarFile(byte[] content)
Constructor for TarFile.
TarFile(byte[] content, boolean lenient)
Constructor for TarFile.
TarFile(byte[] content, java.lang.String encoding)
Constructor for TarFile.
TarFile(java.io.File archive)
Constructor for TarFile.
TarFile(java.io.File archive, boolean lenient)
Constructor for TarFile.
TarFile(java.io.File archive, java.lang.String encoding)
Constructor for TarFile.
TarFile(java.nio.channels.SeekableByteChannel content)
Constructor for TarFile.
TarFile(java.nio.channels.SeekableByteChannel archive, int blockSize, int recordSize, java.lang.String encoding, boolean lenient)
Constructor for TarFile.
TarFile(java.nio.file.Path archivePath)
Constructor for TarFile.
TarFile(java.nio.file.Path archivePath, boolean lenient)
Constructor for TarFile.
TarFile(java.nio.file.Path archivePath, java.lang.String encoding)
Constructor for TarFile.
-
Method Summary
Columns: Modifier and Type, Method, Description
private void
applyPaxHeadersToCurrentEntry(java.util.Map<java.lang.String,java.lang.String> headers, java.util.List<TarArchiveStructSparse> sparseHeaders)
Updates the current entry with the pax headers that were read.
private void
buildSparseInputStreams()
Builds the input streams consisting of all-zero input streams and non-zero input streams.
void
close()
private void
consumeRemainderOfLastBlock()
Invoked once the end of the archive is hit; tries to consume the remaining bytes under the assumption that the tool creating this archive has padded the last block.
java.util.List<TarArchiveEntry>
getEntries()
Gets all TAR Archive Entries from the TarFile.
java.io.InputStream
getInputStream(TarArchiveEntry entry)
Gets the input stream for the provided Tar Archive Entry.
private byte[]
getLongNameData()
Gets the next entry in this tar archive as long name data.
private TarArchiveEntry
getNextTarEntry()
Gets the next entry in this tar archive.
private java.nio.ByteBuffer
getRecord()
Gets the next record in this tar archive.
protected boolean
isAtEOF()
private boolean
isDirectory()
private boolean
isEOFRecord(java.nio.ByteBuffer headerBuf)
private void
paxHeaders()
For PAX Format 0.0, the sparse headers (GNU.sparse.offset and GNU.sparse.numbytes) may appear multiple times.
private void
readGlobalPaxHeaders()
private void
readOldGNUSparse()
Adds the sparse chunks from the current entry to the sparse chunks, including any additional sparse entries following the current entry.
private java.nio.ByteBuffer
readRecord()
Reads a record from the input stream and returns the data.
private void
repositionForwardBy(long offset)
private void
repositionForwardTo(long newPosition)
protected void
setAtEOF(boolean b)
private void
skipRecordPadding()
The last record block should be written at the full size, so skip any additional space used to fill a record after an entry.
private void
throwExceptionIfPositionIsNotInArchive()
Checks if the current position of the SeekableByteChannel is in the archive.
private void
tryToConsumeSecondEOFRecord()
Tries to read the next record, resetting the position in the archive if it is not an EOF record.
-
-
-
Field Detail
-
SMALL_BUFFER_SIZE
private static final int SMALL_BUFFER_SIZE
- See Also:
- Constant Field Values
-
smallBuf
private final byte[] smallBuf
-
archive
private final java.nio.channels.SeekableByteChannel archive
-
zipEncoding
private final ZipEncoding zipEncoding
The encoding of the tar file
-
entries
private final java.util.LinkedList<TarArchiveEntry> entries
-
blockSize
private final int blockSize
-
lenient
private final boolean lenient
-
recordSize
private final int recordSize
-
recordBuffer
private final java.nio.ByteBuffer recordBuffer
-
globalSparseHeaders
private final java.util.List<TarArchiveStructSparse> globalSparseHeaders
-
hasHitEOF
private boolean hasHitEOF
-
currEntry
private TarArchiveEntry currEntry
The meta-data about the current entry
-
globalPaxHeaders
private java.util.Map<java.lang.String,java.lang.String> globalPaxHeaders
-
sparseInputStreams
private final java.util.Map<java.lang.String,java.util.List<java.io.InputStream>> sparseInputStreams
-
-
Constructor Detail
-
TarFile
public TarFile(byte[] content) throws java.io.IOException
Constructor for TarFile.
- Parameters:
content
- the content to use
- Throws:
java.io.IOException
- when reading the tar archive fails
-
TarFile
public TarFile(byte[] content, boolean lenient) throws java.io.IOException
Constructor for TarFile.
- Parameters:
content
- the content to use
lenient
- when set to true, illegal values for group/userid, mode, device numbers and timestamp will be ignored and the fields set to TarArchiveEntry.UNKNOWN. When set to false, such illegal fields cause an exception instead.
- Throws:
java.io.IOException
- when reading the tar archive fails
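The effect of the lenient flag can be pictured with a small stand-alone sketch. It is not the library's code: LenientFieldSketch, parseOctal, readField and the local UNKNOWN constant (assumed here to be -1) are hypothetical names used only for illustration. The strict path rethrows an unchecked exception for brevity, whereas the constructors documented here throw java.io.IOException.

```java
import java.nio.charset.StandardCharsets;

final class LenientFieldSketch {
    /** Local stand-in for TarArchiveEntry.UNKNOWN; the value -1 is an assumption. */
    static final long UNKNOWN = -1L;

    /** Parses an ASCII octal header field that may be padded with spaces or NULs. */
    static long parseOctal(byte[] field) {
        int start = 0;
        int end = field.length;
        while (start < end && field[start] == ' ') {
            start++;
        }
        while (end > start && (field[end - 1] == 0 || field[end - 1] == ' ')) {
            end--;
        }
        long value = 0;
        for (int i = start; i < end; i++) {
            byte b = field[i];
            if (b < '0' || b > '7') {
                throw new IllegalArgumentException("Invalid octal digit at index " + i);
            }
            value = (value << 3) + (b - '0');
        }
        return value;
    }

    /** Lenient mode maps an illegal field to UNKNOWN; strict mode propagates the failure. */
    static long readField(byte[] field, boolean lenient) {
        try {
            return parseOctal(field);
        } catch (IllegalArgumentException e) {
            if (lenient) {
                return UNKNOWN;
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        byte[] good = "0000644\0".getBytes(StandardCharsets.US_ASCII);
        byte[] bad = "00x0644\0".getBytes(StandardCharsets.US_ASCII);
        System.out.println(readField(good, false)); // 420, i.e. octal 644
        System.out.println(readField(bad, true));   // -1
    }
}
```

In this sketch a damaged mode field does not prevent the rest of the archive from being listed when lenient is true, which is the trade-off the parameter description points at.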
-
TarFile
public TarFile(byte[] content, java.lang.String encoding) throws java.io.IOException
Constructor for TarFile.
- Parameters:
content
- the content to use
encoding
- the encoding to use
- Throws:
java.io.IOException
- when reading the tar archive fails
-
TarFile
public TarFile(java.io.File archive) throws java.io.IOException
Constructor for TarFile.
- Parameters:
archive
- the file of the archive to use
- Throws:
java.io.IOException
- when reading the tar archive fails
-
TarFile
public TarFile(java.io.File archive, boolean lenient) throws java.io.IOException
Constructor for TarFile.
- Parameters:
archive
- the file of the archive to use
lenient
- when set to true, illegal values for group/userid, mode, device numbers and timestamp will be ignored and the fields set to TarArchiveEntry.UNKNOWN. When set to false, such illegal fields cause an exception instead.
- Throws:
java.io.IOException
- when reading the tar archive fails
-
TarFile
public TarFile(java.io.File archive, java.lang.String encoding) throws java.io.IOException
Constructor for TarFile.
- Parameters:
archive
- the file of the archive to use
encoding
- the encoding to use
- Throws:
java.io.IOException
- when reading the tar archive fails
-
TarFile
public TarFile(java.nio.file.Path archivePath) throws java.io.IOException
Constructor for TarFile.
- Parameters:
archivePath
- the path of the archive to use
- Throws:
java.io.IOException
- when reading the tar archive fails
-
TarFile
public TarFile(java.nio.file.Path archivePath, boolean lenient) throws java.io.IOException
Constructor for TarFile.
- Parameters:
archivePath
- the path of the archive to use
lenient
- when set to true, illegal values for group/userid, mode, device numbers and timestamp will be ignored and the fields set to TarArchiveEntry.UNKNOWN. When set to false, such illegal fields cause an exception instead.
- Throws:
java.io.IOException
- when reading the tar archive fails
-
TarFile
public TarFile(java.nio.file.Path archivePath, java.lang.String encoding) throws java.io.IOException
Constructor for TarFile.
- Parameters:
archivePath
- the path of the archive to use
encoding
- the encoding to use
- Throws:
java.io.IOException
- when reading the tar archive fails
-
TarFile
public TarFile(java.nio.channels.SeekableByteChannel content) throws java.io.IOException
Constructor for TarFile.
- Parameters:
content
- the content to use
- Throws:
java.io.IOException
- when reading the tar archive fails
-
TarFile
public TarFile(java.nio.channels.SeekableByteChannel archive, int blockSize, int recordSize, java.lang.String encoding, boolean lenient) throws java.io.IOException
Constructor for TarFile.
- Parameters:
archive
- the seekable byte channel to use
blockSize
- the block size to use
recordSize
- the record size to use
encoding
- the encoding to use
lenient
- when set to true, illegal values for group/userid, mode, device numbers and timestamp will be ignored and the fields set to TarArchiveEntry.UNKNOWN. When set to false, such illegal fields cause an exception instead.
- Throws:
java.io.IOException
- when reading the tar archive fails
-
-
Method Detail
-
applyPaxHeadersToCurrentEntry
private void applyPaxHeadersToCurrentEntry(java.util.Map<java.lang.String,java.lang.String> headers, java.util.List<TarArchiveStructSparse> sparseHeaders) throws java.io.IOException
Updates the current entry with the pax headers that were read.
- Parameters:
headers
- Headers read from the pax header
sparseHeaders
- Sparse headers read from the pax header
- Throws:
java.io.IOException
-
buildSparseInputStreams
private void buildSparseInputStreams() throws java.io.IOException
Builds the input streams consisting of all-zero input streams and non-zero input streams. When reading from the non-zero input streams, the data is actually read from the original input stream. The size of each input stream is given by the sparse headers.
- Throws:
java.io.IOException
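The idea of stitching all-zero and non-zero streams together can be sketched with plain java.io classes. This is an illustration under simplifying assumptions, not the library's implementation: SparseStreamSketch, expand and readAll are hypothetical names, the stored data is held in a byte array, and offsets are ints so that holes can be modelled with zero-filled arrays.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SparseStreamSketch {

    /**
     * @param packed   the non-zero data as stored in the archive, chunk after chunk
     * @param chunks   {offset, numBytes} pairs in ascending offset order
     * @param realSize size of the expanded file
     */
    static InputStream expand(byte[] packed, int[][] chunks, int realSize) {
        List<InputStream> parts = new ArrayList<>();
        int logicalPos = 0; // position in the expanded file
        int packedPos = 0;  // position in the stored data
        for (int[] chunk : chunks) {
            int offset = chunk[0];
            int numBytes = chunk[1];
            if (offset > logicalPos) {
                // Hole before this chunk: an all-zero stream.
                parts.add(new ByteArrayInputStream(new byte[offset - logicalPos]));
            }
            if (numBytes > 0) {
                // Real data: a bounded view of the stored bytes.
                parts.add(new ByteArrayInputStream(packed, packedPos, numBytes));
            }
            logicalPos = offset + numBytes;
            packedPos += numBytes;
        }
        if (realSize > logicalPos) {
            // Trailing hole up to the real file size.
            parts.add(new ByteArrayInputStream(new byte[realSize - logicalPos]));
        }
        return new SequenceInputStream(Collections.enumeration(parts));
    }

    /** Drains a stream into a byte array. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

With stored bytes {1, 2, 3}, chunks {2, 2} and {6, 1}, and a real size of 8, reading the combined stream yields 0 0 1 2 0 0 3 0.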
-
close
public void close() throws java.io.IOException
- Specified by:
close in interface java.lang.AutoCloseable
- Specified by:
close in interface java.io.Closeable
- Throws:
java.io.IOException
-
consumeRemainderOfLastBlock
private void consumeRemainderOfLastBlock() throws java.io.IOException
This method is invoked once the end of the archive is hit. It tries to consume the remaining bytes under the assumption that the tool creating this archive has padded the last block.
- Throws:
java.io.IOException
-
getEntries
public java.util.List<TarArchiveEntry> getEntries()
Gets all TAR Archive Entries from the TarFile.
- Returns:
- All entries from the tar file
-
getInputStream
public java.io.InputStream getInputStream(TarArchiveEntry entry) throws java.io.IOException
Gets the input stream for the provided Tar Archive Entry.
- Parameters:
entry
- Entry to get the input stream from
- Returns:
- Input stream of the provided entry
- Throws:
java.io.IOException
- if the TAR archive is corrupted and the entry cannot be read
-
getLongNameData
private byte[] getLongNameData() throws java.io.IOException
Gets the next entry in this tar archive as long name data.
- Returns:
- The next entry in the archive as long name data, or null.
- Throws:
java.io.IOException
- on error
-
getNextTarEntry
private TarArchiveEntry getNextTarEntry() throws java.io.IOException
Gets the next entry in this tar archive. This will skip to the end of the current entry, if there is one, place the position of the channel at the header of the next entry, read the header, instantiate a new TarEntry from the header bytes, and return that entry. If there are no more entries in the archive, null will be returned to indicate that the end of the archive has been reached.
- Returns:
- The next TarEntry in the archive, or null if there is no next entry.
- Throws:
java.io.IOException
- when reading the next TarEntry fails
-
getRecord
private java.nio.ByteBuffer getRecord() throws java.io.IOException
Gets the next record in this tar archive. This will skip over any remaining data in the current entry, if there is one, and place the input stream at the header of the next entry.
If there are no more entries in the archive, null will be returned to indicate that the end of the archive has been reached. At the same time the hasHitEOF marker will be set to true.
- Returns:
- The next record in the archive, or null if there is no next record.
- Throws:
java.io.IOException
- when reading the next record fails
-
isAtEOF
protected final boolean isAtEOF()
-
isDirectory
private boolean isDirectory()
-
isEOFRecord
private boolean isEOFRecord(java.nio.ByteBuffer headerBuf)
-
paxHeaders
private void paxHeaders() throws java.io.IOException
For PAX Format 0.0, the sparse headers (GNU.sparse.offset and GNU.sparse.numbytes) may appear multiple times, and they look like:
    GNU.sparse.size=size
    GNU.sparse.numblocks=numblocks
    repeat numblocks times
        GNU.sparse.offset=offset
        GNU.sparse.numbytes=numbytes
    end repeat
For PAX Format 0.1, the sparse headers are stored in a single variable, GNU.sparse.map:
    GNU.sparse.map
        Map of non-null data chunks. It is a string consisting of comma-separated values "offset,size[,offset-1,size-1...]"
For PAX Format 1.X:
The sparse map itself is stored in the file data block, preceding the actual file data. It consists of a series of decimal numbers delimited by newlines. The map is padded with nulls to the nearest block boundary. The first number gives the number of entries in the map. Following are map entries, each one consisting of two numbers giving the offset and size of the data block it describes.
- Throws:
java.io.IOException
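The 0.1 and 1.X map layouts described for paxHeaders can be made concrete with a short parsing sketch. It follows only the textual description given here and is not the library's parser: SparseMapSketch, Chunk, parsePax01Map and parsePax1xMap are hypothetical names, and the 1.X map is assumed to have already been read from the data block into a string.

```java
import java.util.ArrayList;
import java.util.List;

final class SparseMapSketch {

    /** One chunk of real (non-hole) data in the expanded file. */
    static final class Chunk {
        final long offset;
        final long numBytes;

        Chunk(long offset, long numBytes) {
            this.offset = offset;
            this.numBytes = numBytes;
        }
    }

    /** PAX 0.1: GNU.sparse.map holds "offset,size[,offset-1,size-1...]". */
    static List<Chunk> parsePax01Map(String map) {
        String[] fields = map.split(",");
        if (fields.length % 2 != 0) {
            throw new IllegalArgumentException("Expected offset,size pairs: " + map);
        }
        List<Chunk> chunks = new ArrayList<>();
        for (int i = 0; i < fields.length; i += 2) {
            chunks.add(new Chunk(Long.parseLong(fields[i]), Long.parseLong(fields[i + 1])));
        }
        return chunks;
    }

    /**
     * PAX 1.X: newline-delimited decimal numbers. The first number is the entry
     * count, followed by one offset line and one size line per entry. Anything
     * after the last entry (the NUL padding) is ignored.
     */
    static List<Chunk> parsePax1xMap(String block) {
        String[] lines = block.split("\n");
        int count = Integer.parseInt(lines[0]);
        if (count < 0 || lines.length < 1 + 2 * count) {
            throw new IllegalArgumentException("Truncated sparse map");
        }
        List<Chunk> chunks = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            long offset = Long.parseLong(lines[1 + 2 * i]);
            long numBytes = Long.parseLong(lines[2 + 2 * i]);
            chunks.add(new Chunk(offset, numBytes));
        }
        return chunks;
    }
}
```

For instance, the 0.1 value "0,512,1024,256" and the 1.X block "2\n0\n512\n1024\n256\n" followed by NUL padding both describe the same two chunks: 512 bytes at offset 0 and 256 bytes at offset 1024.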
-
readGlobalPaxHeaders
private void readGlobalPaxHeaders() throws java.io.IOException
- Throws:
java.io.IOException
-
readOldGNUSparse
private void readOldGNUSparse() throws java.io.IOException
Adds the sparse chunks from the current entry to the sparse chunks, including any additional sparse entries following the current entry.
- Throws:
java.io.IOException
- when reading the sparse entry fails
-
readRecord
private java.nio.ByteBuffer readRecord() throws java.io.IOException
Reads a record from the input stream and returns the data.
- Returns:
- The record data or null if EOF has been hit.
- Throws:
java.io.IOException
- if reading from the archive fails
-
repositionForwardBy
private void repositionForwardBy(long offset) throws java.io.IOException
- Throws:
java.io.IOException
-
repositionForwardTo
private void repositionForwardTo(long newPosition) throws java.io.IOException
- Throws:
java.io.IOException
-
setAtEOF
protected final void setAtEOF(boolean b)
-
skipRecordPadding
private void skipRecordPadding() throws java.io.IOException
The last record block should be written at the full size, so skip any additional space used to fill a record after an entry.
- Throws:
java.io.IOException
- when skipping the padding of the record fails
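The amount of padding to skip follows from simple modular arithmetic. The sketch below assumes that entry data is padded up to the next multiple of the record size. RecordPaddingSketch and its method names are hypothetical, and the 512-byte record size in the example is the conventional tar value rather than something read from this class.

```java
final class RecordPaddingSketch {

    /** Number of filler bytes after entrySize bytes of data; 0 if already aligned. */
    static long paddingAfter(long entrySize, int recordSize) {
        if (recordSize <= 0) {
            throw new IllegalArgumentException("recordSize must be positive");
        }
        long used = entrySize % recordSize;
        return used == 0 ? 0 : recordSize - used;
    }

    /** Position of the next header, given where the entry's data starts. */
    static long nextHeaderPosition(long dataStart, long entrySize, int recordSize) {
        return dataStart + entrySize + paddingAfter(entrySize, recordSize);
    }

    public static void main(String[] args) {
        // A 700-byte entry whose data starts at offset 512, with 512-byte records.
        System.out.println(paddingAfter(700, 512));            // 324
        System.out.println(nextHeaderPosition(512, 700, 512)); // 1536
    }
}
```

An entry whose size is an exact multiple of the record size needs no padding, so the next header follows the data directly.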
-
throwExceptionIfPositionIsNotInArchive
private void throwExceptionIfPositionIsNotInArchive() throws java.io.IOException
Checks if the current position of the SeekableByteChannel is in the archive.
- Throws:
java.io.IOException
- If the position is not in the archive
-
tryToConsumeSecondEOFRecord
private void tryToConsumeSecondEOFRecord() throws java.io.IOException
Tries to read the next record, resetting the position in the archive if it is not an EOF record.
This is meant to protect against cases where a tar implementation has written only one EOF record when two are expected. In practice this is unlikely to help, since a non-conforming implementation likely won't fill full blocks (by default consisting of ten records) either, so we have probably already read beyond the archive anyway.
- Throws:
java.io.IOException
- if reading the record or resetting the position in the archive fails
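The read-then-reset behaviour described for tryToConsumeSecondEOFRecord can be sketched against any SeekableByteChannel. This is a simplified illustration, not the private method itself: SecondEofSketch and its methods are hypothetical names, the record size is fixed at 512 bytes, and an EOF record is assumed to be a record consisting entirely of zero bytes, as in the tar format.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;

final class SecondEofSketch {
    static final int RECORD_SIZE = 512;

    /** True if every byte up to the buffer's limit is zero. */
    static boolean isAllZero(ByteBuffer record) {
        for (int i = 0; i < record.limit(); i++) {
            if (record.get(i) != 0) {
                return false;
            }
        }
        return true;
    }

    /**
     * Reads one record. If it is a complete all-zero record it stays consumed and
     * true is returned; otherwise the channel is moved back to where it was.
     */
    static boolean tryToConsumeEofRecord(SeekableByteChannel channel) throws IOException {
        long start = channel.position();
        ByteBuffer record = ByteBuffer.allocate(RECORD_SIZE);
        while (record.hasRemaining()) {
            if (channel.read(record) < 0) {
                break; // physical end of the channel
            }
        }
        boolean complete = !record.hasRemaining();
        record.flip();
        if (complete && isAllZero(record)) {
            return true;
        }
        channel.position(start); // not an EOF record: undo the read
        return false;
    }
}
```

Resetting the position matters because a record that is not an EOF record may be the header of a further entry, which must still be readable afterwards.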
-
-