How to remove partial string after a special character in R?


Sometimes we don’t need the whole string for an analysis, especially when part of it complicates the work or carries no meaning. In such situations, the unnecessary part can be removed from the string. For example, suppose we have the string ID:00001-1 but we don’t want the -1 at the end. We can remove it with the gsub function by using a regular expression that matches the special character and everything after it (.*), and replacing the match with an empty string.

Example

> x1<-c("ID:00001-1","ID:00100-1","ID:00201-4","ID:014700-3","ID:12045-5","ID:00012-2","ID:10078-3")
> gsub("\\-.*","",x1)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x2<-c("ID:00001/1","ID:00100/1","ID:00201/4","ID:014700/3","ID:12045/5","ID:00012/2","ID:10078/3")
> gsub("\\/.*","",x2)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x3<-c("ID:00001_1","ID:00100_1","ID:00201_4","ID:014700_3","ID:12045_5","ID:00012_2","ID:10078_3")
> gsub("\\_.*","",x3)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x4<-c("ID:00001@1","ID:00100@1","ID:00201@4","ID:014700@3","ID:12045@5","ID:00012@2","ID:10078@3")
> gsub("\\@.*","",x4)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x5<-c("ID:00001*1","ID:00100*1","ID:00201*4","ID:014700*3","ID:12045*5","ID:00012*2","ID:10078*3")
> gsub("\\*.*","",x5)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x6<-c("ID:00001#1","ID:00100#1","ID:00201#4","ID:014700#3","ID:12045#5","ID:00012#2","ID:10078#3")
> gsub("\\#.*","",x6)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x7<-c("ID:00001()1","ID:00100()1","ID:00201()4","ID:014700()3","ID:12045()5","ID:00012()2","ID:10078()3")
> gsub("\\(\\).*","",x7)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x8<-c("ID:00001<>1","ID:00100<>1","ID:00201<>4","ID:014700<>3","ID:12045<>5","ID:00012<>2","ID:10078<>3")
> gsub("<>.*","",x8)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x9<-c("ID:00001&1","ID:00100&1","ID:00201&4","ID:014700&3","ID:12045&5","ID:00012&2","ID:10078&3")
> gsub("\\&.*","",x9)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x10<-c("ID:00001;1","ID:00100;1","ID:00201;4","ID:014700;3","ID:12045;5","ID:00012;2","ID:10078;3")
> gsub("\\;.*","",x10)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
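
As a complementary sketch (not part of the examples above), a small helper can treat the special character as literal text, so that no regular-expression escaping is needed. The function name remove_after and the sample vector ids are hypothetical and used only for illustration; the approach assumes base R's regexpr with fixed = TRUE together with substr.

```r
# Hypothetical helper (an illustration, not the article's method):
# drop everything from the first occurrence of `delim` onward,
# treating `delim` as literal text rather than as a regular expression.
remove_after <- function(x, delim) {
  # Position of the first literal match in each string; -1 when there is none.
  # as.integer() strips the attributes that regexpr() attaches to its result.
  pos <- as.integer(regexpr(delim, x, fixed = TRUE))
  # Keep the part before the delimiter; leave unmatched strings unchanged.
  ifelse(pos > 0, substr(x, 1, pos - 1), x)
}

# Sample data made up for this sketch
ids <- c("ID:00001-1", "ID:00100<>1", "ID:00201()4", "ID:014700")

remove_after(ids, "-")
remove_after(ids, "<>")
remove_after(ids, "()")
```

Because the delimiter is matched literally, sequences such as <> or () need no special handling, and strings that do not contain the delimiter are returned unchanged.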

Updated on: 12-Aug-2020
