How to join lines of two files on a common field in Linux?
To join lines of two files on a common field in Linux, use the join command. It reads two files that are sorted on the join field and writes one output line for each pair of input lines whose join fields match. The input files are not modified; the result is written to standard output. By default, the join field is the first field, and fields are delimited by whitespace.
Syntax
The general syntax of the join command is as follows −
join [OPTION]... FILE1 FILE2
Note − Both FILE1 and FILE2 are required. If either one (but not both) is given as -, the join command reads that input from standard input. Both files must be sorted on the join field for the command to work correctly.
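Because join expects sorted input, a common pattern is to sort each file on the join field first. The following is a minimal sketch: the file names and sample data are made up for illustration, and LC_ALL=C is set so that sort and join use the same collation order.

```shell
#!/bin/sh
# Sketch: sort both inputs on the join field before joining.
# File names and contents are illustrative assumptions.
export LC_ALL=C                      # same collation for sort and join
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '3 Charlie' '1 Alice' '2 Bob' > names_unsorted.txt
printf '%s\n' '2 30' '3 28' '1 25' > ages_unsorted.txt

# -k 1b,1 sorts on the first field only, ignoring leading blanks
sort -k 1b,1 names_unsorted.txt > names_sorted.txt
sort -k 1b,1 ages_unsorted.txt > ages_sorted.txt

result=$(join names_sorted.txt ages_sorted.txt)
printf '%s\n' "$result"
```

With this sample data the output contains one line per key: 1 Alice 25, 2 Bob 30, and 3 Charlie 28.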
Join Command Options
| Option | Description |
|---|---|
| -a FILENUM | Print unpaired lines from file FILENUM (1 or 2) |
| -e EMPTY | Replace missing input fields with EMPTY string |
| -i, --ignore-case | Ignore case differences when comparing fields |
| -j FIELD | Equivalent to '-1 FIELD -2 FIELD' |
| -o FORMAT | Specify output format for result lines |
| -t CHAR | Use CHAR as field separator for input and output |
| -v FILENUM | Like -a FILENUM, but suppress the joined output lines, so only unpaired lines are printed |
| -1 FIELD | Join on this FIELD number of file 1 |
| -2 FIELD | Join on this FIELD number of file 2 |
| --header | Treat the first line of each file as a header and print it without trying to pair it |
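As an illustration of one of these options, the sketch below uses --header on two small files that each start with a column-name line. The file names and data are hypothetical, and --header is a GNU coreutils extension, so it may not be available in other join implementations.

```shell
#!/bin/sh
# Sketch: join two files that each begin with a header line.
# File names and contents are illustrative assumptions.
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'id name' '1 Alice' '2 Bob' > names_h.txt
printf '%s\n' 'id age' '1 25' '2 30' > ages_h.txt

# --header prints the first line of each file as a combined header
# instead of trying to match it against the data lines
result=$(join --header names_h.txt ages_h.txt)
printf '%s\n' "$result"
```

The first output line is the merged header (id name age), followed by the joined data lines.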
Basic Example
First, create two sorted files with common fields −
$ cat > names.txt
1 Alice
2 Bob
3 Charlie
4 David
$ cat > ages.txt
1 25
2 30
3 28
4 35
Join the files on the common field (first column) −
$ join names.txt ages.txt
1 Alice 25
2 Bob 30
3 Charlie 28
4 David 35
Joining with Different Field Separators
Use the -t option to specify a custom delimiter −
$ join -t ',' file1.csv file2.csv
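To make the command above concrete, here is a small sketch with assumed contents for file1.csv and file2.csv (the article does not show them). Note that join splits on every comma, so this only suits simple CSV files without quoted fields that contain commas.

```shell
#!/bin/sh
# Sketch: join two comma-separated files on their first column.
# The file contents are illustrative assumptions.
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '1,Alice' '2,Bob' '3,Charlie' > file1.csv
printf '%s\n' '1,Engineering' '2,Marketing' '3,Sales' > file2.csv

# -t ',' sets the comma as separator for both input and output
result=$(join -t ',' file1.csv file2.csv)
printf '%s\n' "$result"
```

The output keeps the comma as separator, for example 1,Alice,Engineering on the first line.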
Handling Unpaired Lines
To also print lines that have no match in the other file −
$ join -a 1 -a 2 names.txt ages.txt
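The effect of -a is easier to see when the two files do not share all keys. The sketch below uses hypothetical files in which key 1 appears only in the first file and key 4 only in the second. It also shows -e together with -o to fill in missing fields, and -v to print only the unpaired lines.

```shell
#!/bin/sh
# Sketch: handling unpaired lines with -a, -e/-o and -v.
# File names and contents are illustrative assumptions.
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '1 Alice' '2 Bob' '3 Charlie' > names_partial.txt
printf '%s\n' '2 30' '3 28' '4 35' > ages_partial.txt

# Paired lines plus unpaired lines from both files
all_lines=$(join -a 1 -a 2 names_partial.txt ages_partial.txt)

# Same, but with a fixed output layout (key, name, age)
# and the string NA substituted for any missing field
filled=$(join -a 1 -a 2 -e NA -o 0,1.2,2.2 names_partial.txt ages_partial.txt)

# Only the lines of file 1 that have no partner in file 2
only_left=$(join -v 1 names_partial.txt ages_partial.txt)

printf '%s\n' "$all_lines" '---' "$filled" '---' "$only_left"
```

In the -o format, 0 stands for the join field and M.N for field N of file M.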
Advanced Usage
Save the joined output to a new file −
$ join names.txt ages.txt > combined.txt
Join on different field numbers −
$ join -1 2 -2 1 file1.txt file2.txt
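As a sketch of the command above, assume file1.txt holds the key in its second column and file2.txt holds it in the first (the contents here are invented for illustration). Each file must be sorted on its own join field, so file1.txt is sorted on field 2.

```shell
#!/bin/sh
# Sketch: join field 2 of file1.txt with field 1 of file2.txt.
# The file contents are illustrative assumptions.
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'Charlie 103' 'Alice 101' 'Bob 102' > file1_raw.txt
printf '%s\n' '101 Engineering' '102 Marketing' '103 Sales' > file2.txt

# Sort file 1 on its second field, which is its join field
sort -k 2b,2 file1_raw.txt > file1.txt

result=$(join -1 2 -2 1 file1.txt file2.txt)
printf '%s\n' "$result"
```

The join field is printed first on each output line, followed by the remaining fields of file 1 and then those of file 2.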
To check version information −
$ join --version
Key Points
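- Both input files must be sorted on the join field, using the same collation order that join uses
- Join fields are compared as text, so keys must match exactly in both files (for example, 01 does not match 1)
- By default, only lines with matching keys appear in the output
- Use the -a option to include unpaired lines from either file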
Conclusion
The join command is a powerful utility for merging data from two sorted files based on common fields. It provides flexible options for handling different separators, field positions, and unpaired records, making it essential for data processing tasks in Linux systems.
