4.3. File Compression and Archiving

Sometimes it is useful to store a group of files in one file so that they can be backed up, easily transferred to another directory, or even transferred to a different computer. It is also sometimes useful to compress files into one file so that they use less disk space and download faster via the Internet.

It is important to understand the distinction between an archive file and a compressed file. An archive file is a collection of files and directories stored in one file. The archive file is not compressed; it uses approximately the same amount of disk space as all the individual files and directories combined. A compressed file is a file, or a collection of files and directories stored in one file, that is encoded so that it uses less disk space than the original data. If you are short of disk space, you can compress files that you rarely use or files that you want to keep but no longer use. You can also create an archive file and then compress it to save disk space.

Note

An archive file is not compressed, but a compressed file can be an archive file.
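The difference can be checked at a shell prompt. The following is a minimal sketch that uses the tar and gzip commands described later in this section; the scratch directory and the names sample, a.txt, and b.txt are placeholders chosen for illustration, and the exact sizes depend on your system.

```shell
# Work in a scratch directory so no real files are touched.
work=$(mktemp -d)
cd "$work"

# Create a sample directory holding two highly repetitive text files.
mkdir sample
yes "the quick brown fox" | head -n 5000 > sample/a.txt
yes "jumps over the lazy dog" | head -n 5000 > sample/b.txt

# Archive only: tar bundles the files without compressing them.
tar -cf sample.tar sample

# Compress the archive, writing to a new file so sample.tar is kept.
gzip -c sample.tar > sample.tar.gz

# Compare the sizes in bytes.
files_size=$(cat sample/a.txt sample/b.txt | wc -c)
tar_size=$(wc -c < sample.tar)
gz_size=$(wc -c < sample.tar.gz)
echo "files: $files_size  archive: $tar_size  compressed archive: $gz_size"
```

The archive should be at least as large as the files it contains, while the compressed archive should be much smaller for repetitive data like this.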

4.3.1. Using File Roller

Red Hat Enterprise Linux includes a graphical utility called File Roller that can compress, decompress, and archive files and directories. File Roller supports common UNIX and Linux file compression and archiving formats and has a simple interface and extensive help documentation if you need it. It is also integrated into the desktop environment and graphical file manager to make working with archived files easier.

To start File Roller, click Main Menu => Accessories => File Roller. You can also start File Roller from a shell prompt by typing file-roller. Figure 4-1 shows File Roller in action.

Tip

If you are using a file manager (such as Nautilus), you can double-click the file you wish to unarchive or decompress to start File Roller. The File Roller browser window appears with the decompressed/unarchived file in a folder for you to extract or browse.

Figure 4-1. File Roller in Action

4.3.1.1. Decompressing and Unarchiving with File Roller

To unarchive or decompress a file, click the Open toolbar button. A file menu pops up, allowing you to choose the archive you wish to work with. For example, if you have a file called foo.tar.gz located in your home directory, highlight the file and click OK. The file appears in the main File Roller browser window as a folder, which you can navigate by double-clicking the folder icon. File Roller preserves all directory and subdirectory structures, which is convenient if you are looking for a particular file in the archive. You can extract individual files or entire archives by clicking the Extract button, choosing the directory in which you would like to save the unarchived files, and clicking OK.

4.3.1.2. Creating Archives with File Roller

If you need to free some hard drive space, or send multiple files or a directory of files to another user, File Roller allows you to create archives of your files and directories. To create a new archive, click New on the toolbar. A file browser pops up, allowing you to specify an archive name and the compression technique. For example, you may choose a Tar Compressed with gzip (.tar.gz) format from the drop-down menu and type the name of the archive file you want to create. Click OK and your new archive is now ready to be filled with files and directories. To add files to your new archive, click Add, which opens a browser window (Figure 4-2) that you can navigate to find the file or directory you want to be in the archive. Click OK when you are finished, and click Archive => Close to close the archive.

Figure 4-2. Creating an Archive with File Roller

Tip

There is much more you can do with File Roller than is explained here. Refer to the File Roller manual (available by clicking Help => Manual) for more information.

4.3.2. Compressing Files at the Shell Prompt

Compressed files use less disk space and download faster than large, uncompressed files. In Red Hat Enterprise Linux you can compress files with the compression tools bzip2, gzip, or zip.

The bzip2 compression tool is recommended because it generally provides the most compression of the three and is found on most UNIX-like operating systems. The gzip compression tool can also be found on most UNIX-like operating systems. If you need to transfer files between Linux and other operating systems such as MS Windows, you should use zip because it is more compatible with the compression utilities on Windows.

Compression Tool    File Extension    Decompression Tool
bzip2               .bz2              bunzip2
gzip                .gz               gunzip
zip                 .zip              unzip

Table 4-1. Compression Tools

By convention, files compressed with bzip2 are given the extension .bz2, files compressed with gzip are given the extension .gz, and files compressed with zip are given the extension .zip.

Files compressed with gzip are uncompressed with gunzip, files compressed with bzip2 are uncompressed with bunzip2, and files compressed with zip are uncompressed with unzip.
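To see how the tools compare on your own system, you can compress the same file with each and look at the resulting sizes. The following is a rough sketch, not a benchmark: data.txt is a placeholder file filled with repetitive text, and the results for real files depend heavily on their contents.

```shell
work=$(mktemp -d)
cd "$work"

# Build a repetitive sample file; real files usually compress less well.
yes "sample line of text for the compression test" | head -n 20000 > data.txt

# -c writes the compressed data to standard output, leaving data.txt in place.
gzip  -c data.txt > data.txt.gz
bzip2 -c data.txt > data.txt.bz2

# Print the size of each file in bytes.
wc -c data.txt data.txt.gz data.txt.bz2
```

Because the -c option is used, the original data.txt is kept, which makes the three sizes easy to compare side by side.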

4.3.2.1. Bzip2 and Bunzip2

To use bzip2 to compress a file, type the following command at a shell prompt:

bzip2 filename

The file is compressed and saved as filename.bz2.

To expand the compressed file, type the following command:

bunzip2 filename.bz2

The filename.bz2 compressed file is deleted and replaced with filename.
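The following sketch walks through one compress and expand cycle in a scratch directory. The file original.copy is a placeholder used only to confirm that the expanded file is identical to the original.

```shell
work=$(mktemp -d)
cd "$work"
printf 'hello, bzip2\n' > filename
cp filename original.copy        # reference copy for the comparison below

bzip2 filename                   # creates filename.bz2 and removes filename
ls                               # lists filename.bz2 and original.copy

bunzip2 filename.bz2             # recreates filename and removes filename.bz2
cmp filename original.copy && echo "round trip OK"
```

If you want to keep the original file alongside the compressed one, bzip2 accepts the -k option.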

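You can use bzip2 to compress multiple files at the same time by listing them with a space between each one:

bzip2 file1 file2 file3

The above command compresses each file separately, replacing file1, file2, and file3 with file1.bz2, file2.bz2, and file3.bz2. bzip2 does not combine files into a single file, and it does not compress the contents of directories. To store several files or a directory such as /usr/work/school/ in one compressed file, create a compressed tar file instead (refer to Section 4.3.3, Archiving Files at the Shell Prompt).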
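A quick way to check how bzip2 treats several arguments is to try it in a scratch directory. In the sketch below, file1, file2, file3, and the school directory are placeholders created only for the test.

```shell
work=$(mktemp -d)
cd "$work"
mkdir school
for name in file1 file2 file3 school/notes; do
    printf 'contents of %s\n' "$name" > "$name"
done

# bzip2 compresses each regular file on its own and skips the directory
# (it prints a message about the directory to standard error).
bzip2 file1 file2 file3 school

# Show the result: three separate .bz2 files and an untouched directory.
ls -R
```

Each regular file ends up as its own .bz2 file, and the files inside the directory are left uncompressed.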

Tip

For more information, type man bzip2 and man bunzip2 at a shell prompt to read the man pages for bzip2 and bunzip2.

4.3.2.2. Gzip and Gunzip

To use gzip to compress a file, type the following command at a shell prompt:

gzip filename

The file is compressed and saved as filename.gz.

To expand the compressed file, type the following command:

gunzip filename.gz

The filename.gz compressed file is deleted and replaced with filename.
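The same cycle can be tried with gzip in a scratch directory. This is a small sketch; original.copy is a placeholder used only to confirm that nothing was lost.

```shell
work=$(mktemp -d)
cd "$work"
yes "gzip example line" | head -n 1000 > filename
cp filename original.copy        # reference copy for the comparison below

gzip filename                    # creates filename.gz and removes filename

# -c sends the expanded data to standard output, so the compressed file
# can be read without being replaced.
gunzip -c filename.gz | wc -l

gunzip filename.gz               # restores filename and removes filename.gz
cmp filename original.copy && echo "round trip OK"
```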

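You can use gzip to compress multiple files at the same time by listing them with a space between each one. With the -r option, gzip also descends into any directories you list:

gzip -r file1 file2 file3 /usr/work/school

The above command compresses file1, file2, file3, and every file under the /usr/work/school/ directory (assuming this directory exists). Each file is compressed separately and replaced with a .gz version of itself; gzip does not combine them into a single file. To store them in one compressed file, create a compressed tar file instead (refer to Section 4.3.3, Archiving Files at the Shell Prompt).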
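To see what gzip -r does with a directory, you can run it on a small test tree. The sketch below assumes GNU gzip; the file and directory names are placeholders created only for the test.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p school/notes
for name in file1 file2 school/essay.txt school/notes/week1.txt; do
    printf 'contents of %s\n' "$name" > "$name"
done

# -r makes gzip descend into school/ and compress every file it finds there.
gzip -r file1 file2 school

# Every file is now an individual .gz file; the directory layout is unchanged.
find . -type f | sort
```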

Tip

For more information, type man gzip and man gunzip at a shell prompt to read the man pages for gzip and gunzip.

4.3.2.3. Zip and Unzip

To compress a directory and its contents with zip, type the following command:

zip -r filename.zip filesdir

In this example, filename.zip represents the file you are creating and filesdir represents the directory you want to put in the new zip file. The -r option specifies that you want to include all files contained in the filesdir directory recursively.

To extract the contents of a zip file, type the following command:

unzip filename.zip

You can use zip to compress multiple files and directories at the same time by listing them with a space between each one:

zip -r filename.zip file1 file2 file3 /usr/work/school 

The above command compresses file1, file2, file3, and the contents of the /usr/work/school/ directory (assuming this directory exists) and places them in a file named filename.zip.

Tip

For more information, type man zip and man unzip at a shell prompt to read the man pages for zip and unzip.

4.3.3. Archiving Files at the Shell Prompt

A tar file is a collection of several files and/or directories in one file. This is a good way to create backups and archives.

Some of the options used with the tar command are:

  • -c — create a new archive

  • -f — when used with the -c option, use the filename specified for the creation of the tar file; when used with the -x option, unarchive the specified file

  • -t — show the list of files in the tar file

  • -v — show the progress of the files being archived

  • -x — extract files from an archive

  • -z — compress the tar file with gzip

  • -j — compress the tar file with bzip2

To create a tar file, type:

tar -cvf filename.tar directory/file

In this example, filename.tar represents the file you are creating and directory/file represents the directory or file you want to put in the archived file.

You can tar multiple files and directories at the same time by listing them with a space between each one:

tar -cvf filename.tar /home/mine/work /home/mine/school

The above command places all the files in the work and the school subdirectories of /home/mine in a new file called filename.tar in the current directory.
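The sketch below archives two directories in a scratch area and counts the entries that were stored. The mine/work and mine/school directories here are relative placeholders standing in for the /home/mine paths used above.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p mine/work mine/school
printf 'report\n' > mine/work/report.txt
printf 'essay\n'  > mine/school/essay.txt

# Archive both subdirectories into one file in the current directory.
tar -cvf filename.tar mine/work mine/school

# Count the entries stored in the archive (directories and files).
tar -tf filename.tar | wc -l
```

The original directories are left in place. With GNU tar, absolute paths such as /home/mine/work are by default stored without the leading /, so they are later extracted relative to the current directory.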

To list the contents of a tar file, type:

tar -tvf filename.tar

To extract the contents of a tar file, type:

tar -xvf filename.tar

This command does not remove the tar file, but it places copies of its unarchived contents in the current working directory, preserving any directory structure that the archive file used. For example, if the tar file contains a file called bar.txt within a directory called foo/, then extracting the archive file results in the creation of the directory foo/ in your current working directory with the file bar.txt inside of it.
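The foo/bar.txt example can be reproduced in a scratch directory. In this sketch, the restore directory is a placeholder used to show that the directory structure is recreated wherever the archive is extracted.

```shell
work=$(mktemp -d)
cd "$work"
mkdir foo
printf 'hello\n' > foo/bar.txt
tar -cf filename.tar foo

tar -tvf filename.tar            # list the contents with details

# Extract in a different directory to show that foo/ is recreated there.
mkdir restore
cd restore
tar -xvf ../filename.tar
ls -R
```

After extraction, restore/ contains foo/bar.txt, and filename.tar is still present in the parent directory.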

Remember, the tar command does not compress the files by default. To create a tarred and bzipped compressed file, use the -j option:

tar -cjvf filename.tbz file

The above command creates an archive file and then compresses it as the file filename.tbz. If you uncompress the filename.tbz file with the bunzip2 command, the filename.tbz file is removed and replaced with filename.tar.

tar files compressed with bzip2 are conventionally given the extension .tbz; however, some users prefer the extension .tar.bz2.

You can also expand and unarchive a bzip2-compressed tar file in one command:

tar -xjvf filename.tbz
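The following sketch runs the whole cycle on a test directory: create a bzip2-compressed tar file, list it, extract it, and then remove only the compression. The project and restore directories are placeholders, and the -C option (extract into the named directory) is not covered in the list of options above.

```shell
work=$(mktemp -d)
cd "$work"
mkdir project
yes "project data" | head -n 2000 > project/data.txt

tar -cjf filename.tbz project        # archive and compress in one step
tar -tjf filename.tbz                # list the contents without extracting

mkdir restore
tar -xjf filename.tbz -C restore     # -C extracts into the restore directory
cmp project/data.txt restore/project/data.txt && echo "contents match"

# bunzip2 alone only removes the compression, leaving a plain tar file.
bunzip2 filename.tbz
ls
```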

To create a tarred and gzipped compressed file, use the -z option:

tar -czvf filename.tgz file

tar files compressed with gzip are conventionally given the extension .tgz or .tar.gz.

This command creates the archive file filename.tar and compresses it as the file filename.tgz. (The file filename.tar is not saved.) If you uncompress the filename.tgz file with the gunzip command, the filename.tgz file is removed and replaced with filename.tar.

You can also expand and unarchive a gzip-compressed tar file in one command:

tar -xzvf filename.tgz
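As a sketch of how the -z option relates to running tar and gzip separately, the commands below build the same compressed archive both ways and then remove only the compression from one of them. The names project and twostep are placeholders chosen for the example.

```shell
work=$(mktemp -d)
cd "$work"
mkdir project
yes "project data" | head -n 2000 > project/data.txt

# One step: tar archives the directory and compresses it with gzip.
tar -czf filename.tgz project

# Two steps with an equivalent result: archive first, then compress.
tar -cf twostep.tar project
gzip twostep.tar                     # produces twostep.tar.gz

# Both files list the same contents.
tar -tzf filename.tgz
tar -tzf twostep.tar.gz

# gunzip alone only removes the compression: filename.tgz becomes filename.tar.
gunzip filename.tgz
tar -tf filename.tar
```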

Tip

Type the command man tar for more information about the tar command.