How to add a Column of Numbers in Bash?
Bash provides several methods to add up numeric columns in data files. This article explores different approaches including awk, loops, and text processing commands, comparing their performance on column summation tasks.
Using the awk Tool
The awk command is the most straightforward approach for column arithmetic. It reads each line, accumulates values, and displays the final sum.
$ awk '{Total=Total+$1} END{print "Total is: " Total}' numbers.csv
Total is: 49471228
To measure performance, use the time command:
$ time awk '{Total=Total+$1} END{print "Total is: " Total}' numbers.csv
Total is: 49471228
real 0m0.228s
user 0m0.141s
sys 0m0.047s
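The same accumulation can be tried on a small file before running it on real data. The following is a minimal sketch: the five sample values and the temporary file are assumptions made for illustration and are not the article's numbers.csv. It uses `total += $1`, which is shorthand for `Total=Total+$1`.

```shell
# Hypothetical sample data: one number per line in a temporary file.
tmpfile=$(mktemp)
printf '%s\n' 2 44 6 15 23 > "$tmpfile"

# Accumulate the first field of every line. Printing "total + 0" forces a
# numeric result, so an empty file yields 0 instead of an empty string.
sum=$(awk '{ total += $1 } END { print total + 0 }' "$tmpfile")
echo "Total is: $sum"

rm -f "$tmpfile"
```

With these sample values the script prints `Total is: 90`.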
Multiple Columns with Field Separator
For CSV files with multiple columns, use the -F option to specify the delimiter and reference the desired column number:
$ cat prices.csv
Books,40
Bag,70
Dress,80
Box,10
$ awk -F "," '{Total=Total+$2} END{print "Total is: " Total}' prices.csv
Total is: 200
Handling Header Lines
When files contain header rows, use NR!=1 to skip the first line:
$ cat prices.csv
Item,Value
Books,40
Bag,70
Dress,80
Box,10
$ awk -F "," 'NR!=1{Total=Total+$2} END{print "Total is: " Total}' prices.csv
Total is: 200
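The delimiter and header handling shown above can be wrapped in a small function so that the column number becomes a parameter. This is only a sketch: the function name `sum_column` is hypothetical, and it assumes a comma-separated file with exactly one header line.

```shell
# Recreate the article's prices.csv (with header) in a temporary file.
prices=$(mktemp)
printf '%s\n' 'Item,Value' 'Books,40' 'Bag,70' 'Dress,80' 'Box,10' > "$prices"

# sum_column FILE COLUMN (hypothetical helper)
# Skips the header line (NR > 1) and sums the requested comma-separated column.
sum_column() {
    awk -F ',' -v col="$2" 'NR > 1 { total += $col } END { print total + 0 }' "$1"
}

total=$(sum_column "$prices" 2)
echo "Total is: $total"

rm -f "$prices"
```

Passing the column through `-v col=...` avoids splicing shell variables into the awk program text.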
Using Bash Loops
Bash loops provide an alternative that needs no text-processing tool, but they are considerably slower on large files, as the timings in this section show.
With expr Command
The expr command performs the arithmetic, but because it is an external program, every iteration of the loop spawns a new process, which adds enormous overhead:
$ time (sum=0; for number in `cat numbers.csv`; do sum=`expr $sum + $number`; done; echo "Total is: $sum")
Total is: 49471228
real 212m48.418s
user 7m19.375s
sys 145m48.203s
With Arithmetic Expansion
Arithmetic expansion $((...)) is evaluated by the shell itself, without spawning a process, so it performs far better than expr:
$ time (sum=0; for number in `cat numbers.csv`; do sum=$((sum+number)); done; echo "Total is: $sum")
Total is: 49471228
real 0m1.961s
user 0m1.813s
sys 0m0.125s
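A variant of the arithmetic-expansion loop reads the file line by line with `while read` instead of expanding `cat` output into the `for` word list. The sketch below uses hypothetical sample data and assumes every non-empty line holds one plain integer, since Bash arithmetic is integer-only. It was not part of the timings above.

```shell
# Hypothetical sample data: one integer per line.
numbers=$(mktemp)
printf '%s\n' 2 44 6 15 23 > "$numbers"

sum=0
# read -r keeps backslashes literal; the "|| [ -n ... ]" clause also
# processes a final line that lacks a trailing newline.
while IFS= read -r number || [ -n "$number" ]; do
    if [ -n "$number" ]; then          # ignore blank lines
        sum=$((sum + number))
    fi
done < "$numbers"
echo "Total is: $sum"

rm -f "$numbers"
```

Redirecting the file into the loop keeps `sum` in the current shell, whereas piping `cat` into `while` would run the loop in a subshell and lose the result.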
Using bc Command with Text Processing
The bc calculator can handle expressions by converting column data into addition statements.
With paste Command
The paste command joins lines with a specified delimiter:
$ cat numbers.csv | head -10 | paste -sd+ -
2+44+6+15+23+0+15+88+82+1
$ time echo "Total is: $(cat numbers.csv | paste -sd+ - | bc)"
Total is: 49471228
real 0m0.244s
user 0m0.203s
sys 0m0.063s
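The two stages of the paste approach can be separated to make the intermediate expression visible. This is a sketch with hypothetical sample data. The `command -v` check is an added precaution, because bc is not installed on every system.

```shell
# Hypothetical sample data: one number per line.
numbers=$(mktemp)
printf '%s\n' 2 44 6 15 23 > "$numbers"

# Stage 1: join all lines into a single expression separated by "+".
expr_str=$(paste -sd+ "$numbers")
echo "Expression: $expr_str"

# Stage 2: evaluate the expression with bc, if bc is available.
if command -v bc >/dev/null 2>&1; then
    sum=$(echo "$expr_str" | bc)
    echo "Total is: $sum"
fi

rm -f "$numbers"
```

Here paste reads the file directly, so the extra `cat` process is not needed.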
With tr Command
The tr (translate) command replaces each newline with a plus sign. Because the last line also ends with a newline, the result has a trailing +, so a 0 is appended to give bc a valid expression:
$ time ((cat numbers.csv | tr "\n" "+" ; echo "0") | bc)
49471228
real 0m0.217s
user 0m0.203s
sys 0m0.031s
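To see why the trailing zero is needed, the expression that tr builds can be captured before it is passed to bc. This is a sketch with hypothetical sample data. As an added precaution, bc is only called when it is installed.

```shell
# Hypothetical sample data: one number per line.
numbers=$(mktemp)
printf '%s\n' 2 44 6 15 23 > "$numbers"

# tr turns every newline into "+", leaving "2+44+6+15+23+";
# the appended 0 completes the dangling "+" so the expression is valid.
expr_str=$( { tr '\n' '+' < "$numbers"; echo 0; } )
echo "Expression: $expr_str"

if command -v bc >/dev/null 2>&1; then
    sum=$(echo "$expr_str" | bc)
    echo "Total is: $sum"
fi

rm -f "$numbers"
```

Adding 0 does not change the sum, which is why it is a safe way to close the expression.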
With sed Command
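The sed command provides similar functionality through a substitution pattern. The -z option, a GNU sed extension, makes sed read the whole file as a single record so that the newlines can be matched and replaced:
$ time ((cat numbers.csv | sed -z 's#\n#+#g' ; echo "0") | bc)
49471228
real 0m0.343s
user 0m0.281s
sys 0m0.109s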
Performance Comparison
| Method | Real Time | Performance |
|---|---|---|
| tr + bc | 0m0.217s | Fastest |
| awk | 0m0.228s | Very Fast |
| paste + bc | 0m0.244s | Fast |
| sed + bc | 0m0.343s | Moderate |
| Loop with $(()) | 0m1.961s | Slow |
| Loop with expr | 212m48.418s | Very Slow |
Conclusion
For adding a column of numbers in Bash, awk offers the best balance of simplicity and performance. In these measurements the tr + bc combination was marginally faster (0.217s versus 0.228s), a difference small enough that readability should usually decide the choice. Bash loops with expr should be avoided because of their extreme overhead.
