How to join lines of two files on a common field in Linux?

The join command merges lines from two sorted files that share a common field. It does not modify either input file. Instead, it writes one output line to standard output for each pair of input lines whose join fields match. By default, the join field is the first field of each line, and fields are delimited by whitespace.

Syntax

The general syntax of the join command is as follows −

join [OPTION]... FILE1 FILE2

Note − If either FILE1 or FILE2 (but not both) is given as -, the join command reads that input from standard input. Both files must be sorted on the join field for the command to work correctly.
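As a minimal sketch of both points in the note, the example below sorts an unsorted file and pipes it to join, which reads it through the - operand. The file names and sample data are hypothetical, and the temporary directory is only there to keep the example self-contained −

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf '3 Charlie\n1 Alice\n2 Bob\n' > "$tmp/unsorted_names.txt"
printf '1 25\n2 30\n3 28\n' > "$tmp/ages_sorted.txt"

# Sort the first file on its first field, then pipe it to join.
# The "-" operand tells join to read FILE1 from standard input.
result=$(sort -k1,1 "$tmp/unsorted_names.txt" | join - "$tmp/ages_sorted.txt")
printf '%s\n' "$result"
# Expected output:
# 1 Alice 25
# 2 Bob 30
# 3 Charlie 28

rm -rf "$tmp"
```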

Join Command Options

Option              Description
-a FILENUM          Also print unpaired lines from file FILENUM (1 or 2)
-e EMPTY            Replace missing input fields with the string EMPTY
-i, --ignore-case   Ignore case differences when comparing fields
-j FIELD            Equivalent to '-1 FIELD -2 FIELD'
-o FORMAT           Specify the output format for result lines
-t CHAR             Use CHAR as the field separator for input and output
-v FILENUM          Like -a FILENUM, but suppress the joined lines (print only unpaired lines)
-1 FIELD            Join on this FIELD number of file 1
-2 FIELD            Join on this FIELD number of file 2
--header            Treat the first line of each file as a header and print it without pairing
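The -a, -e and -o options are often combined to keep unmatched lines while filling the gaps with a placeholder. The sketch below assumes hypothetical files people.txt and known_ages.txt and uses NONE as an arbitrary placeholder string −

```shell
# Hypothetical sample data: key 2 has no entry in the second file.
tmp=$(mktemp -d)
printf '1 Alice\n2 Bob\n3 Charlie\n' > "$tmp/people.txt"
printf '1 25\n3 28\n' > "$tmp/known_ages.txt"

# -a 1          also print lines from file 1 that have no match
# -o 0,1.2,2.2  print the join field, then field 2 of each file
# -e NONE       fill fields that are missing from the output with "NONE"
filled=$(join -a 1 -e NONE -o 0,1.2,2.2 "$tmp/people.txt" "$tmp/known_ages.txt")
printf '%s\n' "$filled"
# Expected output:
# 1 Alice 25
# 2 Bob NONE
# 3 Charlie 28

rm -rf "$tmp"
```

In the -o format, 0 stands for the join field and FILENUM.FIELD selects a field from one of the files.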

Basic Example

First, create two sorted files with common fields −

$ cat > names.txt << 'EOF'
1 Alice
2 Bob
3 Charlie
4 David
EOF

$ cat > ages.txt << 'EOF'
1 25
2 30
3 28
4 35
EOF

Join the files on the common field (first column) −

$ join names.txt ages.txt
1 Alice 25
2 Bob 30
3 Charlie 28
4 David 35

Joining with Different Field Separators

Use the -t option to specify a custom delimiter −

$ join -t ',' file1.csv file2.csv
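For a concrete run, the sketch below builds two small comma-separated files and joins them on the first column. The file names and contents are hypothetical. Note that join splits on every comma, so this only suits simple CSV data without quoted fields that contain commas −

```shell
# Hypothetical comma-separated sample data, already sorted on column 1.
tmp=$(mktemp -d)
printf '101,Alice\n102,Bob\n103,Charlie\n' > "$tmp/employees.csv"
printf '101,Sales\n102,Engineering\n103,Support\n' > "$tmp/departments.csv"

# -t ',' sets the separator for both the input and the output.
csv_result=$(join -t ',' "$tmp/employees.csv" "$tmp/departments.csv")
printf '%s\n' "$csv_result"
# Expected output:
# 101,Alice,Sales
# 102,Bob,Engineering
# 103,Charlie,Support

rm -rf "$tmp"
```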

Handling Unpaired Lines

To also print lines from either file that have no match in the other file −

$ join -a 1 -a 2 names.txt ages.txt
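Because names.txt and ages.txt above match on every key, the effect of -a is easier to see with files that only partly overlap. The sketch below uses hypothetical files in which key 2 exists only in the first file and key 3 only in the second, and contrasts -a with -v −

```shell
# Hypothetical sample data with partly overlapping keys.
tmp=$(mktemp -d)
printf '1 Alice\n2 Bob\n4 David\n' > "$tmp/names_partial.txt"
printf '1 25\n3 28\n4 35\n' > "$tmp/ages_partial.txt"

# -a 1 -a 2: joined lines plus the unpaired lines from both files.
all_lines=$(join -a 1 -a 2 "$tmp/names_partial.txt" "$tmp/ages_partial.txt")
printf '%s\n' "$all_lines"
# Expected output:
# 1 Alice 25
# 2 Bob
# 3 28
# 4 David 35

# -v 1 / -v 2: only the unpaired lines from one file, no joined lines.
only_file1=$(join -v 1 "$tmp/names_partial.txt" "$tmp/ages_partial.txt")
only_file2=$(join -v 2 "$tmp/names_partial.txt" "$tmp/ages_partial.txt")
printf '%s\n' "$only_file1"   # 2 Bob
printf '%s\n' "$only_file2"   # 3 28

rm -rf "$tmp"
```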

Advanced Usage

Save the joined output to a new file −

$ join names.txt ages.txt > combined.txt

Join on field 2 of the first file and field 1 of the second file (each file must be sorted on its own join field) −

$ join -1 2 -2 1 file1.txt file2.txt
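As an illustration with hypothetical data, suppose the first file stores "name id" and the second stores "id department". The sketch below joins them on the id, which is field 2 in one file and field 1 in the other −

```shell
# Hypothetical sample data: the id is field 2 in the first file
# and field 1 in the second; each file is sorted on its id column.
tmp=$(mktemp -d)
printf 'Alice 101\nBob 102\nCharlie 103\n' > "$tmp/name_id.txt"
printf '101 Sales\n102 Engineering\n103 Support\n' > "$tmp/id_dept.txt"

# -1 2: join field of file 1 is field 2; -2 1: join field of file 2 is field 1.
by_id=$(join -1 2 -2 1 "$tmp/name_id.txt" "$tmp/id_dept.txt")
printf '%s\n' "$by_id"
# Expected output:
# 101 Alice Sales
# 102 Bob Engineering
# 103 Charlie Support

rm -rf "$tmp"
```

The join field is always printed first, followed by the remaining fields of the first file and then those of the second.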

To check version information −

$ join --version

Key Points

  • Both input files must be sorted on the join field

  • Join fields are compared as text, not as numbers, so sort both files with a plain sort (not sort -n) in the same locale that join runs in

  • By default, only lines with matching keys appear in output

  • Use -a option to include unpaired lines from either file
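The sorting requirement is the point that most often causes trouble with numeric keys, because join expects text order, in which 10 comes before 2. The sketch below, using hypothetical files and a fixed C locale for both sort and join, prepares two inputs before joining them −

```shell
# Hypothetical sample data: one file is in numeric order, the other unordered.
tmp=$(mktemp -d)
printf '2 Bob\n9 Ivy\n10 Jack\n' > "$tmp/players.txt"
printf '10 88\n2 75\n9 91\n' > "$tmp/scores.txt"

# Sort both files on field 1 as text, in the same locale join will use.
LC_ALL=C sort -k1,1 "$tmp/players.txt" > "$tmp/players.sorted"
LC_ALL=C sort -k1,1 "$tmp/scores.txt" > "$tmp/scores.sorted"

sorted_result=$(LC_ALL=C join "$tmp/players.sorted" "$tmp/scores.sorted")
printf '%s\n' "$sorted_result"
# Expected output (text order, so 10 comes first):
# 10 Jack 88
# 2 Bob 75
# 9 Ivy 91

rm -rf "$tmp"
```

If numeric order is wanted in the final result, the joined output can be piped through sort -n afterwards.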

Conclusion

The join command is a powerful utility for merging data from two sorted files based on common fields. It provides flexible options for handling different separators, field positions, and unpaired records, making it essential for data processing tasks in Linux systems.

Updated on: 2026-03-17T09:01:38+05:30
