Most versions of UNIX come with a program called split whose purpose is to split large files into smaller files for tasks such as editing them in an editor that cannot handle large files, or mailing them if they are so big that some mailers will refuse to deal with them. For example, let's say you have a really big text file that you want to mail to someone:
% ls -l bigfile
-r--r--r-- 1 jik 139070 Oct 15 21:02 bigfile
Note the default naming scheme, which is to append "aa," "ab," "ac," etc., to the letter "x" for each subsequent filename. It is possible to modify the default behavior. For example, you can make it create files that are 1500 lines long instead of 1000:
% split -1500 bigfile
% ls -l
total 288
-r--r--r-- 1 jik 139070 Oct 15 21:02 bigfile
-rw-rw-r-- 1 jik  74016 Oct 15 21:06 xaa
-rw-rw-r-- 1 jik  65054 Oct 15 21:06 xab
You can also get it to use a name prefix other than "x":
% split -1500 bigfile bigfile.split.
% ls -l
total 288
-r--r--r-- 1 jik 139070 Oct 15 21:02 bigfile
-rw-rw-r-- 1 jik  74016 Oct 15 21:07 bigfile.split.aa
-rw-rw-r-- 1 jik  65054 Oct 15 21:07 bigfile.split.ab
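The options shown above can be tried safely on a throwaway file. The following is a minimal sketch, assuming a POSIX shell and a split that accepts -l (the POSIX spelling of the older -1500 form); the temporary directory and the generated 3500-line stand-in for bigfile are illustrative assumptions, not part of the original example.

```shell
#!/bin/sh
# Sketch: split a generated text file and check that the pieces reassemble.
set -e
workdir=$(mktemp -d)      # scratch directory (assumes mktemp is available)
cd "$workdir"

# Build a 3500-line stand-in for bigfile.
awk 'BEGIN { for (i = 1; i <= 3500; i++) print "line " i }' > bigfile

# 1500 lines per piece, with "bigfile.split." as the name prefix.
split -l 1500 bigfile bigfile.split.

ls -l

# The glob expands to the aa, ab, ac pieces in sorted order,
# so concatenating them should reproduce the original file.
cat bigfile.split.* | cmp - bigfile && echo "pieces match the original"
```

Because the suffixes sort alphabetically, a plain cat of the pieces is enough for the recipient to rebuild the file.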
Although the simple behavior described above tends to be relatively universal, there are differences in the functionality of split on different UNIX systems. There are four basic variants of split as shipped with various implementations of UNIX:
1. A split that understands only how to deal with splitting text files into chunks of n lines or less each.
2. A split, usually called bsplit, that understands only how to deal with splitting non-text files into n-character chunks. A public domain version of bsplit is available on the Power Tools disc.
3. A split that handles both text files and non-text files, and tries to figure out automatically which kind of file it is working on.
4. A split that will do either text files or non-text files, but needs to be told explicitly when it is working on a non-text file.
The only way to tell which version you've got is to read the manual page for it on your system, which will also tell you the exact syntax for using it.
The problem with the third variant is that although it tries to be smart and automatically do the right thing with both text and non-text files, it sometimes guesses wrong and splits a text file as a non-text file or vice versa, with completely unsatisfactory results. Therefore, if the variant on your system is (3), you probably want to get your hands on one of the many split clones out there that is closer to one of the other variants (see below).
Variants (1) and (2) listed above are OK as far as they go, but they aren't adequate if your environment provides only one of them rather than both. If you find yourself needing to split a non-text file when you have only a text split, or needing to split a text file when you have only bsplit, you need to get one of the clones that will perform the function you need.
Variant (4) is the most reliable and versatile of the four listed, and is therefore what you should go with if you find it necessary to get a clone and install it on your system. There are several such clones in the various source archives, including the freely available BSD UNIX version. Alternatively, if you have perl installed, it is quite easy to write a simple split clone in perl, and you don't have to worry about compiling a C program to do it. This is an especially significant advantage if you need to run your split on multiple architectures that would otherwise need separate binaries.
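To give a feel for how little a script-based clone needs, here is a minimal sketch of a line-based split written as a shell function around awk rather than in perl (a deliberate substitution, since awk is present on essentially every UNIX system). The function name linesplit and its argument order are hypothetical, and the sketch handles text files only, up to 676 pieces (aa through zz).

```shell
#!/bin/sh
# Hypothetical helper: a minimal line-based split clone built on awk.
# usage: linesplit LINES FILE [PREFIX]     (PREFIX defaults to "x")
linesplit() {
    awk -v n="$1" -v prefix="${3:-x}" '
        BEGIN { letters = "abcdefghijklmnopqrstuvwxyz" }
        # Every n lines, close the current piece and name the next one.
        (NR - 1) % n == 0 {
            if (out != "") close(out)
            i = int((NR - 1) / n)          # zero-based piece number
            out = prefix substr(letters, int(i / 26) + 1, 1) \
                         substr(letters, i % 26 + 1, 1)
        }
        # Write each input line to the current piece.
        { print > out }
    ' "$2"
}
```

Calling linesplit 1500 bigfile bigfile.split. would then write bigfile.split.aa, bigfile.split.ab, and so on, in the same naming scheme as split itself. Like any script-based clone, it runs unchanged on any architecture that has the interpreter.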
If you need to split a non-text file and don't feel like going to all of the trouble of finding a split clone that handles them, one standard UNIX tool you can use to do the splitting is. For example, if bigfile above were a non-text file and you wanted to split it into 20,000-byte pieces, you could do something like this:
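One way to sketch this kind of byte-wise splitting is with dd, a standard UNIX tool. The choice of dd, the bigfile.part.N naming, and the 50,000-byte generated stand-in for bigfile are all assumptions of this sketch rather than details taken from the text.

```shell
#!/bin/sh
# Sketch: cut a non-text file into 20,000-byte pieces with dd.
set -e
workdir=$(mktemp -d)      # scratch directory (assumes mktemp is available)
cd "$workdir"

# Stand-in for a non-text bigfile: 50,000 bytes of random data.
dd if=/dev/urandom of=bigfile bs=1000 count=50 2>/dev/null

size=$(($(wc -c < bigfile)))   # arithmetic expansion strips any padding from wc
chunk=20000
i=0
# Each pass copies one chunk; skip counts whole input blocks of bs bytes.
while [ $((i * chunk)) -lt "$size" ]; do
    dd if=bigfile of="bigfile.part.$i" bs="$chunk" skip="$i" count=1 2>/dev/null
    i=$((i + 1))
done

# Reassemble in numeric order and compare with the original.
j=0
while [ "$j" -lt "$i" ]; do
    cat "bigfile.part.$j"
    j=$((j + 1))
done | cmp - bigfile && echo "pieces match the original"
```

The last piece is simply shorter than the rest when the file size is not a multiple of the chunk size. Note that numeric suffixes do not sort correctly under a shell glob once there are ten or more pieces, which is why the reassembly loop counts explicitly.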