Found 2038 Articles for R Programming

How to split string values that contain special characters in R?

Nizamuddin Siddiqui
Updated on 09-Oct-2020 14:48:49

827 Views

When a single long string or a vector of strings has its values separated by special characters, splitting those values makes the strings easier to understand. This can happen when the string data was recorded with mistakes or when the separators serve some other purpose. We can do the splitting with the strsplit function. Example: x1
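As a minimal sketch of the idea (the strings and separators here are hypothetical, not the article's data): strsplit treats its split argument as a regular expression by default, so a special character such as | or . should either be matched literally with fixed = TRUE or be escaped.

```r
# Hypothetical example strings; the separators are special characters.
x1 <- "apple|banana|cherry"
x2 <- c("a.b.c", "d.e")

# fixed = TRUE matches the separator literally instead of as a regex.
parts1 <- strsplit(x1, "|", fixed = TRUE)[[1]]
parts2 <- strsplit(x2, ".", fixed = TRUE)

# Equivalent regex form: escape the metacharacter.
parts3 <- strsplit(x1, "\\|")[[1]]

print(parts1)
print(parts2)
```

strsplit always returns a list with one element per input string, which is why the single-string case is indexed with [[1]].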

How to plot all the values of an R data frame?

Nizamuddin Siddiqui
Updated on 09-Oct-2020 14:46:43

355 Views

To plot all the values of an R data frame, we can use the matplot function. This function plots the values column by column and represents each column by its column number. For example, if we have five columns in an R data frame, then matplot will represent the first column by 1, the second column by 2, the third column by 3, and so on. Consider the data frame below. Example: set.seed(555) v1
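A small sketch of this, assuming a made-up three-column data frame (the column names and distributions are illustrative only): with its default settings, matplot draws each column's points using the digit of its column position.

```r
# Hypothetical data frame with three numeric columns.
set.seed(555)
df <- data.frame(
  v1 = rnorm(10),
  v2 = rnorm(10, mean = 2),
  v3 = rnorm(10, mean = 4)
)

# Each column is drawn with the plotting symbols "1", "2", "3".
matplot(df, xlab = "Row index", ylab = "Value")
```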

How to find the proportion of row values in an R data frame?

Nizamuddin Siddiqui
Updated on 09-Oct-2020 14:42:06

416 Views

The proportion of each row value can be calculated by dividing it by the sum of all values in its row, so the proportions in each row sum to 1. This can be done by dividing the data frame by its row sums, using the syntax below. Syntax: data_frame_name/rowSums(data_frame_name). Consider the data frame below. Example: set.seed(111) x1
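The syntax above can be sketched on a small hand-made data frame (the values are hypothetical and chosen so the proportions are easy to check by eye):

```r
# Hypothetical data frame; the row sums are 4, 8 and 24.
df <- data.frame(
  x1 = c(1, 2, 3),
  x2 = c(1, 2, 9),
  x3 = c(2, 4, 12)
)

# rowSums(df) has one entry per row, so the division is applied row by row.
props <- df / rowSums(df)

print(props)
print(rowSums(props))  # every row of proportions sums to 1
```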

How to create a transparent polygon using ggplot2 in R?

Nizamuddin Siddiqui
Updated on 09-Oct-2020 14:30:57

565 Views

A transparent polygon shows only the border lines around a hollow area; thus, we can see the area covered, but it becomes a little difficult to read the scales. Hence, this visualisation technique is not as useful as others that fill the area with a different color. But it could be used if the range of the data is not large. Consider the data frame below. Example: set.seed(123) x
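One way to draw such a polygon, as a sketch with hypothetical vertex coordinates: passing fill = NA to geom_polygon leaves the interior empty, and colour draws the border. This is one possible approach and not necessarily the one used in the full article.

```r
library(ggplot2)

# Hypothetical polygon vertices, listed in drawing order.
poly_df <- data.frame(
  x = c(1, 4, 5, 2),
  y = c(1, 2, 5, 4)
)

# fill = NA leaves the interior hollow; colour sets the border line.
p <- ggplot(poly_df, aes(x = x, y = y)) +
  geom_polygon(fill = NA, colour = "black")

print(p)
```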

How to subset an R data frame based on string values of a column with OR condition?

Nizamuddin Siddiqui
Updated on 09-Oct-2020 14:28:08

3K+ Views

We might want to create a subset of an R data frame using one or more values of a particular column. For example, suppose we have a data frame df that contains columns C1, C2, C3, C4, and C5, and each of these columns contains values from A to Z. If we want to select rows using values A or B in column C1, then it can be done as df[df$C1=="A"|df$C1=="B",]. Consider the data frame below. Example: set.seed(99) x1
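A minimal sketch of the OR condition on a hypothetical two-column data frame; the %in% form is shown as an equivalent shorthand that scales better when there are many values to match.

```r
# Hypothetical data frame with a string column C1.
df <- data.frame(
  C1 = c("A", "B", "C", "A", "D"),
  C2 = 1:5
)

# OR condition: keep rows where C1 is "A" or "B".
sub_or <- df[df$C1 == "A" | df$C1 == "B", ]

# Equivalent shorthand using %in%.
sub_in <- df[df$C1 %in% c("A", "B"), ]

print(sub_or)
```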

How to find contingency table of means from an R data frame using cast function?

Nizamuddin Siddiqui
Updated on 09-Oct-2020 14:20:00

191 Views

A contingency table summarises values across two categorical variables. Often we need a contingency table of counts, especially in non-parametric analysis, but sometimes we want to use means instead. For this we can use the cast function from the reshape package, which makes it easy to create such a table. Consider the data frame below. Example: set.seed(99) x1
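As a rough sketch, assuming the reshape package is installed and using a hypothetical data frame with two grouping columns (g1, g2) and one numeric column (y): the data is first melted into long form and then cast with mean as the aggregation function. A base-R tapply call is included as a cross-check of the same table.

```r
library(reshape)

# Hypothetical data: two categorical variables and one numeric variable.
df <- data.frame(
  g1 = c("a", "a", "b", "b", "a", "b"),
  g2 = c("x", "y", "x", "y", "x", "y"),
  y  = c(1, 2, 3, 4, 5, 6)
)

# Melt to long form, then cast g1 against g2 using the mean of the values.
molten <- melt(df, id.vars = c("g1", "g2"))
means_cast <- cast(molten, g1 ~ g2, mean)

# Base-R cross-check of the same table of means.
means_base <- tapply(df$y, list(df$g1, df$g2), mean)

print(means_cast)
print(means_base)
```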

How to find the number of NA’s in each column of an R data frame?

Nizamuddin Siddiqui
Updated on 09-Oct-2020 14:19:24

693 Views

Sometimes a data frame contains many missing values (NA’s), and each column contains at least one NA. In this case, we might want to find out how many missing values exist in each of the columns. We can use the colSums function along with is.na in the following manner: colSums(is.na(df)), where df is the name of the data frame. Consider the data frame below. Example: set.seed(109) x1
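The expression above can be tried on a small hypothetical data frame: is.na(df) gives a logical matrix, and colSums counts the TRUE values in each column.

```r
# Hypothetical data frame with missing values in every column.
df <- data.frame(
  x1 = c(1, NA, 3, NA),
  x2 = c(NA, 2, 3, 4),
  x3 = c("a", NA, NA, NA)
)

# Count the NA values per column.
na_counts <- colSums(is.na(df))

print(na_counts)
```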

How to simulate normal distribution for a fixed limit in R?

Nizamuddin Siddiqui
Updated on 09-Oct-2020 13:26:24

517 Views

To simulate the normal distribution, we can use the rnorm function in R, but we cannot put a limit on the range of the simulated values. If we want to simulate this distribution within fixed limits, then the rtruncnorm function of the truncnorm package can be used. In this function, we can pass the limits a and b with or without the mean and standard deviation (which default to 0 and 1).

Installing and loading the truncnorm package:

> install.packages("truncnorm")
> library(truncnorm)

Example

rtruncnorm(n=10, a=0, b=10)
[1] 0.76595522 0.33315633 1.29565988 0.67154230 0.04957334 0.38338705
[7] 0.75753005 0.65265304 0.63616552 0.45710877

rtruncnorm(n=50, a=0, b=100)
[1] 0.904997947 0.035692016 0.402963452 1.001102057 1.445190636 0.109245234
[7] 0.205630845 0.312428027 0.465876772 0.424647787 0.309222394 0.442172805
[13] 0.365503292 1.277570451 0.235747661 1.128447123 ...
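For the case where a mean and standard deviation are also supplied, a short sketch (the parameter values are hypothetical, and the truncnorm package is assumed to be installed) that checks every draw stays inside the limits:

```r
library(truncnorm)

# Hypothetical parameters: normal with mean 5 and sd 2, truncated to [0, 10].
set.seed(1)
draws <- rtruncnorm(n = 1000, a = 0, b = 10, mean = 5, sd = 2)

# All simulated values fall between the limits a and b.
print(range(draws))
print(all(draws >= 0 & draws <= 10))
```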

How to create a transparent histogram using ggplot2 in R?

Nizamuddin Siddiqui
Updated on 08-Oct-2020 15:23:20

2K+ Views

When we create a histogram using the ggplot2 package, the bars are filled with grey by default, but we can remove that fill to make the histogram look transparent. This can be done by using the fill="transparent" and color="black" arguments in geom_histogram. We need the color argument because without it the borders of the histogram bars are removed as well; the border color is not restricted to black. Example: Consider the data frame below. set.seed(987) x
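A minimal sketch of these two arguments, using hypothetical simulated data (the column name x and the bin count are assumptions for illustration):

```r
library(ggplot2)

# Hypothetical data: 500 standard normal draws.
set.seed(987)
df <- data.frame(x = rnorm(500))

# fill = "transparent" removes the grey fill; color draws the bar borders.
p <- ggplot(df, aes(x = x)) +
  geom_histogram(bins = 30, fill = "transparent", color = "black")

print(p)
```

Any other border color, for example color = "blue", works in the same way.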

How to select values less than or greater than a specific percentile from an R data frame column?

Nizamuddin Siddiqui
Updated on 08-Oct-2020 15:21:21

544 Views

Percentiles divide a set of numeric values into one hundred groups, or into individual values if there are exactly 100 values. We can find percentiles for a numeric column of an R data frame, so it is also possible to select values of a column based on these percentiles. For this purpose, we can use the quantile function. Example: Consider the data frame below. set.seed(111) x
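As a sketch on a hypothetical column holding the values 1 to 100: quantile returns the percentile cut-off, which can then be used in an ordinary logical subset. The 90th and 10th percentiles are arbitrary choices for illustration.

```r
# Hypothetical data frame with a single numeric column.
df <- data.frame(x = 1:100)

# Percentile cut-offs (quantile uses its default interpolation type).
p90 <- quantile(df$x, 0.90)
p10 <- quantile(df$x, 0.10)

# Rows above the 90th percentile and rows below the 10th percentile.
above <- df[df$x > p90, , drop = FALSE]
below <- df[df$x < p10, , drop = FALSE]

print(p90)
print(nrow(above))
print(nrow(below))
```

drop = FALSE keeps the result as a data frame even though only one column is present.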
