Split a File at Given Line Number
Introduction
Sometimes, it may be necessary to split a large file into smaller chunks for easier manipulation or for transfer to other systems. In Linux, the split command can be used to split a file into smaller files based on a specified number of lines.
The split command reads an input file and writes consecutive pieces of it to a series of output files, each limited to a given number of lines or bytes. By default, each piece holds 1000 lines, and the output files are named with the prefix x followed by a two-letter suffix: xaa, xab, xac, and so on.
How to Use the Split Command?
To split a file based on the number of lines, use the following syntax -
$ split -l lines file output_prefix
-l lines − Specifies the number of lines for each output file.
file − The input file that you want to split.
output_prefix − The prefix for the output files. The output files will be named output_prefixaa, output_prefixab, output_prefixac, and so on.
For example, to split the file bigfile.txt into chunks of 1000 lines each, with the output files having the prefix splitfile, use the following command
$ split -l 1000 bigfile.txt splitfile
This will create the following files: splitfileaa, splitfileab, splitfileac, and so on.
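The basic usage above can be tried on a throwaway file. The following is a minimal sketch, assuming a scratch directory created with mktemp and a sample bigfile.txt generated with seq (both are stand-ins, not part of the original example):

```shell
# Work in a scratch directory so the chunks do not clutter the current one.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a 2500-line sample file (a stand-in for bigfile.txt).
seq 1 2500 > bigfile.txt

# Split into chunks of at most 1000 lines each.
split -l 1000 bigfile.txt splitfile

# Count the lines in each chunk; the last chunk holds the remainder.
wc -l splitfile*
```

With 2500 input lines, the first two chunks hold 1000 lines each and splitfileac holds the remaining 500. Concatenating the chunks in name order (cat splitfile*) reproduces the original file.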
Examples of splitting files
Here are some examples of using the split command to split a file at specific line numbers
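Split a File into Chunks of 500 Lines Each, Starting at Line 100
The split command has no option for choosing a starting line, and it rejects an extra operand after the prefix. To split the file bigfile.txt into chunks of 500 lines each, starting at line 100, skip the first 99 lines with tail and pipe the rest into split, using - as the input file and the GNU option -d for numeric suffixes
$ tail -n +100 bigfile.txt | split -l 500 -d - splitfile
This will create the following files − splitfile00, splitfile01, splitfile02, and so on.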
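Because split has no starting-line option of its own, one way to start at a given line is to let tail skip the leading lines. The sketch below assumes GNU coreutils (for -d) and uses a seq-generated sample file in a scratch directory, both of which are illustrative choices:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 1200-line sample file; each line contains its own line number.
seq 1 1200 > bigfile.txt

# tail -n +100 prints from line 100 onward; "-" makes split read standard input.
tail -n +100 bigfile.txt | split -l 500 -d - splitfile

# The first chunk should begin with the content of line 100.
head -n 1 splitfile00
wc -l splitfile0*
```

Lines 100 through 1200 are 1101 lines, so the result is two chunks of 500 lines and a final chunk of 101 lines.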
Split a File using Numeric Suffixes
To split the file bigfile.txt into chunks of 100 lines each, starting at line 1000, with the output files having the prefix splitfile and three-digit numeric suffixes, combine tail with the -d and -a 3 options of split
$ tail -n +1000 bigfile.txt | split -l 100 -d -a 3 - splitfile
This will create the following files − splitfile000, splitfile001, splitfile002, and so on.
Split a File into Chunks using a Different Suffix
To split the file bigfile.txt into chunks of 2000 lines each, with the output files having the prefix splitfile, a four-digit numeric suffix, and the extension .txt, use the following command
$ split -l 2000 -d --suffix-length=4 --additional-suffix=.txt bigfile.txt splitfile
This will create the following files − splitfile0000.txt, splitfile0001.txt, splitfile0002.txt, and so on. The -d option selects digits instead of letters, and --additional-suffix appends the extension; both are GNU options.
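To get numbered chunks that end in .txt, the suffix length has to be combined with numeric suffixes and an additional suffix. A small sketch, assuming GNU coreutils 8.16 or later (where --additional-suffix is available) and a seq-generated sample file:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 4500-line sample file standing in for bigfile.txt.
seq 1 4500 > bigfile.txt

# -d: digits instead of letters; --suffix-length=4: four characters;
# --additional-suffix=.txt: extension appended after the counter.
split -l 2000 -d --suffix-length=4 --additional-suffix=.txt bigfile.txt splitfile

ls splitfile*
```

This lists splitfile0000.txt, splitfile0001.txt, and splitfile0002.txt. Dropping -d would give alphabetic names such as splitfileaaaa.txt instead.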
Split a File into Chunks and specify the Output Directory
To split the file bigfile.txt into chunks of 1000 lines each, with the output files having the prefix splitfile and stored in the output directory, include the directory in the prefix. The directory must already exist, because split does not create it
$ mkdir -p output
$ split -l 1000 bigfile.txt output/splitfile
This will create the following files in the output directory − splitfileaa, splitfileab, splitfileac, and so on.
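Since the directory is simply part of the prefix, it has to exist before split runs. A minimal sketch, assuming a scratch directory and a seq-generated sample file:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

seq 1 2500 > bigfile.txt

# split fails if the target directory is missing, so create it first.
mkdir -p output

# The directory is given as part of the output prefix.
split -l 1000 bigfile.txt output/splitfile

ls output
```

The output directory then contains splitfileaa, splitfileab, and splitfileac.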
Split a File into Chunks of 500 Lines Each, with a .txt Extension
To split the file bigfile.txt into chunks of 500 lines each and give every output file a .txt extension, use the following command
$ split -l 500 --additional-suffix=.txt bigfile.txt splitfile
This will create the following files: splitfileaa.txt, splitfileab.txt, splitfileac.txt, and so on. The split command does not write line numbers into the filenames, but the suffixes follow the order of the input: splitfileaa.txt will contain lines 1-500, splitfileab.txt will contain lines 501-1000, and so on.
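The chunk names carry only a sequence suffix, so the line range held by each chunk has to be derived from the chunk sizes. One way to sketch this, assuming a seq-generated sample file and illustrative variable names (start, end, count):

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

seq 1 1200 > bigfile.txt
split -l 500 --additional-suffix=.txt bigfile.txt splitfile

# Walk the chunks in name order and keep a running line counter.
start=1
for f in splitfile*.txt; do
    count=$(wc -l < "$f")
    end=$((start + count - 1))
    echo "$f: lines $start-$end"
    start=$((end + 1))
done
```

For the 1200-line sample this reports lines 1-500, 501-1000, and 1001-1200 for the three chunks.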
Alternative Commands
There are a few other commands that can be used to split a file in Linux, although they may not have all the options and functionality of the split command. Some alternatives to the split command include
csplit − The csplit command is similar to split, but it allows you to specify the point at which to split the file using a pattern or a line number. A single pattern splits only at its first match; to split at every occurrence of the pattern "---", add the GNU repeat argument '{*}': csplit file '/---/' '{*}'
awk − The awk command is a powerful text-processing tool that can be used to split a file based on a given pattern or field. For example, to split a CSV file into separate files named after the value in the first field of each line, use the following command − awk -F, '{print > ($1 ".txt")}' file
sed − The sed command is a text-processing tool that can be used to perform various operations on a file, including splitting. It cannot easily start a new file at every match of a pattern, but it is well suited to splitting a file at a given line number. For example, to write the first 100 lines to first.txt and the remaining lines to second.txt, use the following command − sed -n -e '1,100w first.txt' -e '101,$w second.txt' file
It's worth noting that these alternatives may not be as efficient or as easy to use as the split command, and may require more advanced knowledge of text processing in Linux.
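As a rough illustration of pattern-based splitting, the sketch below splits a small sample file at every line consisting of "---", first with csplit and then with awk. The file name records.txt and the prefixes part_ and chunk_ are made up for this example, and the '{*}' repeat count and the -z option assume GNU csplit:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input: three groups of lines separated by "---".
printf 'a\nb\n---\nc\n---\nd\ne\n' > records.txt

# csplit: start a new piece at each matching line. The separator line
# stays at the top of the piece it starts.
# -s: no size report, -z: drop empty pieces, -f: output prefix.
csplit -s -z -f part_ records.txt '/^---$/' '{*}'
ls part_*

# awk: bump a counter on each separator and drop the separator itself.
awk '/^---$/ { n++; next } { print > ("chunk_" (n + 0) ".txt") }' records.txt
ls chunk_*
```

Both approaches produce three pieces here. The csplit pieces keep the "---" lines, while the awk version discards them, which is often the more convenient behavior for record separators.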
Conclusion
Overall, the split command is a useful utility for splitting a large file into smaller chunks based on a specified number of lines in Linux. It is a convenient tool for cases where you need to manipulate or transfer large files more easily. The split command has several options that allow you to customize the output files, including the prefix, the suffix style and length, and an additional suffix such as a file extension. It has no option for a starting line number, but piping the output of tail into split covers that case.
There are also several other commands that can be used to split a file in Linux, such as csplit, awk, and sed. These alternatives may offer more advanced functionality or the ability to split based on patterns, but they may not be as efficient or easy to use as the split command.