![Trending Articles on Technical and Non Technical topics](/images/trending_categories.jpeg)
Data Structure
Networking
RDBMS
Operating System
Java
MS Excel
iOS
HTML
CSS
Android
Python
C Programming
C++
C#
MongoDB
MySQL
Javascript
PHP
Physics
Chemistry
Biology
Mathematics
English
Economics
Psychology
Social Studies
Fashion Studies
Legal Studies
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
How to match two string vectors if the strings case is different in both the vectors in R?
We know that, R is a case sensitive programming language, hence matching strings of different case is not simple. For example, if a vector contains tutorialspoint and the other contains TUTORIALSPOINT then to check whether the strings match or not, we cannot use match function directly. To do this, we have to convert the lowercase string to uppercase or uppercase to lowercase with the match function.
Examples
> x1<-sample(letters[1:26],100,replace=TRUE) > x1
Output
[1] "z" "v" "r" "y" "z" "l" "v" "t" "f" "p" "p" "z" "e" "b" "a" "o" "m" "d" [19] "e" "l" "y" "y" "u" "u" "w" "b" "a" "j" "n" "v" "b" "q" "b" "d" "l" "a" [37] "g" "g" "g" "o" "k" "r" "q" "e" "x" "i" "r" "l" "b" "r" "j" "k" "b" "f" [55] "r" "f" "r" "n" "y" "y" "l" "k" "y" "s" "b" "a" "s" "f" "a" "l" "j" "i" [73] "q" "o" "t" "v" "t" "r" "i" "x" "s" "q" "h" "t" "y" "k" "a" "h" "e" "m" [91] "u" "d" "q" "i" "h" "x" "k" "j" "p" "h"
Example
> x2<-sample(LETTERS[1:10],100,replace=TRUE) > x2
Output
[1] "E" "C" "F" "F" "A" "H" "E" "F" "D" "F" "J" "G" "G" "D" "E" "G" "G" "F" [19] "A" "C" "C" "H" "E" "G" "H" "A" "B" "A" "H" "G" "D" "J" "G" "C" "D" "I" [37] "F" "B" "D" "D" "C" "D" "E" "D" "B" "E" "E" "H" "D" "D" "I" "B" "I" "J" [55] "C" "C" "H" "D" "B" "D" "F" "F" "D" "F" "E" "B" "F" "J" "D" "B" "G" "J" [73] "G" "C" "E" "A" "I" "B" "D" "A" "G" "G" "F" "D" "E" "E" "G" "I" "D" "D" [91] "I" "E" "J" "D" "E" "B" "C" "A" "I" "C"
> match(x1,tolower(x2))
Output
[1] NA NA NA NA NA NA NA NA 3 NA NA NA 1 27 5 NA NA 9 1 NA NA NA NA NA NA [26] 27 5 11 NA NA 27 NA 27 9 NA 5 12 12 12 NA NA NA NA 1 NA 36 NA NA 27 NA [51] 11 NA 27 3 NA 3 NA NA NA NA NA NA NA NA 27 5 NA 3 5 NA 11 36 NA NA NA [76] NA NA NA 36 NA NA NA 6 NA NA NA 5 6 1 NA NA 9 NA 36 6 NA NA 11 NA 6
> x3<-c("AK", "AL", "AR", "AS", "AZ", "CA", "CO", "CT", "DC", "DE", "FL", "GA", "GU", "HI", "IA", "ID", "IL", "IN", "KS", "KY", "LA", "MA", "MD", "ME", "MI", "MN", "MO", "MP", "MS", "MT", "NC", "ND", "NE", "NH", "NJ", "NM", "NV", "NY", "OH", "OK", "OR", "PA", "PR", "RI", "SC", "SD", "TN", "TX", "UM", "UT", "VA", "VI", "VT", "WA", "WI", "WV", "WY") > x3 [1] "AK" "AL" "AR" "AS" "AZ" "CA" "CO" "CT" "DC" "DE" "FL" "GA" "GU" "HI" "IA" [16] "ID" "IL" "IN" "KS" "KY" "LA" "MA" "MD" "ME" "MI" "MN" "MO" "MP" "MS" "MT" [31] "NC" "ND" "NE" "NH" "NJ" "NM" "NV" "NY" "OH" "OK" "OR" "PA" "PR" "RI" "SC" [46] "SD" "TN" "TX" "UM" "UT" "VA" "VI" "VT" "WA" "WI" "WV" "WY"
> x4<-c("ak", "al", "ar", "as", "az", "ca", "co", "ct", "dc", "de", "fl", "ga", "gu", "hi", "ia", "id", "il", "in", "ks", "ky", "la", "ma", "md", "me", "mi", "mn", "mo", "mp", "ms", "mt", "nc", "nd", "ne", "nh", "nj", "nm", "nv", "ny", "oh", "ok", "or", "pa", "pr", "ri", "sc", "sd", "tn", "tx", "um", "ut", "va", "vi", "vt", "wa", "wi", "wv", "wy") > x4 [1] "ak" "al" "ar" "as" "az" "ca" "co" "ct" "dc" "de" "fl" "ga" "gu" "hi" "ia" [16] "id" "il" "in" "ks" "ky" "la" "ma" "md" "me" "mi" "mn" "mo" "mp" "ms" "mt" [31] "nc" "nd" "ne" "nh" "nj" "nm" "nv" "ny" "oh" "ok" "or" "pa" "pr" "ri" "sc" [46] "sd" "tn" "tx" "um" "ut" "va" "vi" "vt" "wa" "wi" "wv" "wy" > length(x4) [1] 57 > length(x3) [1] 57
> match(x3,toupper(x4))
Output
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 [51] 51 52 53 54 55 56 57
> match(LETTERS[1:20],toupper(c(letters[1:26])))
Output
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
> match(LETTERS[1:20],toupper(c(letters[1:10])))
Output
[1] 1 2 3 4 5 6 7 8 9 10 NA NA NA NA NA NA NA NA NA NA
> match(LETTERS[10:1],toupper(c(letters[1:10])))
Output
[1] 10 9 8 7 6 5 4 3 2 1
> match(sample(LETTERS[10:1],50,replace=TRUE),toupper(c(letters[1:10])))
Output
[1] 2 10 8 7 10 8 4 8 2 7 10 3 5 5 4 6 4 10 7 3 7 1 1 3 10 [26] 3 7 4 3 7 5 4 5 2 7 4 5 1 5 7 2 4 4 2 8 8 9 5 1 2
> match(sample(letters[26:1],50,replace=TRUE),tolower(c(LETTERS[1:20])))
Output
[1] 8 2 18 15 15 14 11 13 18 5 9 13 14 20 18 15 4 14 5 NA NA 5 NA 8 17 [26] 5 16 3 4 9 NA 5 16 17 16 6 12 1 2 NA NA 8 16 9 NA 14 NA 11 16 15
> match(sample(c("india","russia","china","uk"),50,replace=TRUE),tolower(c("INDIA","R USSIA","CHINA")))
Output
[1] NA 3 2 1 3 1 NA NA NA 2 NA 3 3 3 1 NA NA 3 3 3 2 3 2 3 2 [26] 3 3 NA 3 3 2 NA 3 1 NA 3 NA 3 1 NA 3 NA NA NA NA 3 NA 2 NA NA
Advertisements