Found 2038 Articles for R Programming

How to remove NA’s from an R data frame that contains them at different places?

Nizamuddin Siddiqui
Updated on 06-Nov-2020 12:23:26

225 Views

If NA values are scattered across different positions in an R data frame, base R has no single function that drops them column by column, so a package is convenient here. dplyr is well suited to this problem: its summarise_each function (deprecated in current dplyr releases in favour of across) can be combined with na.omit to remove all the NAs. If the data frame has more than one column, the number of non-NA values must be the same in every column.

Example

Consider the below data frame:

> x1 x2 df1 df1

Output

  x1 x2
1 NA 15
2 NA 15
3 NA 15
...
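As a package-free sketch of the same idea (base R, not the dplyr route described above), each column can be stripped of its NAs and the columns reassembled. The data frame here is made up for illustration, and the same restriction applies: every column must keep the same number of non-NA values.

```r
# Hypothetical data: NAs sit at different positions in each column,
# but both columns keep two non-NA values.
df1 <- data.frame(x1 = c(NA, NA, 1, 2),
                  x2 = c(15, 16, NA, NA))

# Drop the NAs column by column, then rebuild the data frame.
# as.data.frame() raises an error if the columns end up with different lengths.
df1_clean <- as.data.frame(lapply(df1, function(col) col[!is.na(col)]))

print(df1_clean)
```

Note that the rows of the result no longer correspond to rows of the original data frame; the values are simply stacked within each column.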

How to check if a value exists in an R data frame or not?

Nizamuddin Siddiqui
Updated on 06-Nov-2020 12:13:04

7K+ Views

Many small objectives help us to achieve a larger objective in data analysis. One such small objective is checking whether a value exists in the data set. In R, a data set can be stored in several kinds of objects, such as a data frame, a matrix, or a data.table object. To check whether a value exists in an R data frame, the any function can be used.

Example

Consider the below data frame:

> set.seed(3654)
> x1 x2 x3 x4 df1 df1

Output

  x1 x2 x3 x4
1  4  5 16  2
2  2  5  4 ...
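A minimal sketch of such a check, assuming a small made-up data frame (the values are illustrative, not the ones from the truncated example): comparing the whole data frame with a value gives a logical matrix, and any reports whether at least one cell matched.

```r
# Hypothetical data frame; the values are illustrative only.
df1 <- data.frame(x1 = c(4, 2, 7),
                  x2 = c(5, 5, 1),
                  x3 = c(16, 4, 9))

# df1 == 4 compares every cell with 4 and returns a logical matrix;
# any() is TRUE if at least one cell matched.
has_4  <- any(df1 == 4, na.rm = TRUE)
has_10 <- any(df1 == 10, na.rm = TRUE)

# Restrict the check to one column when only that column matters.
has_5_in_x2 <- 5 %in% df1$x2

print(c(has_4 = has_4, has_10 = has_10, has_5_in_x2 = has_5_in_x2))
```

The na.rm = TRUE argument keeps the result from becoming NA when the data frame contains missing values and no cell matches.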

How to create line chart using ggplot2 in R with 3-sigma limits?

Nizamuddin Siddiqui
Updated on 06-Nov-2020 12:09:17

238 Views

To create a line chart with 3-sigma limits using ggplot2, we first calculate the limits and then create the chart. We can use the geom_ribbon function of ggplot2 for this purpose, passing the lower 3-sigma limit to the ymin argument in aes and the upper 3-sigma limit to the ymax argument in aes. We also need to specify alpha so that the line and the shaded limits can be told apart.

Example

Consider the below data frame:

> set.seed(14)
> x y df df

Output

  x         y
1 1 0.6690751
2 2 1.8594771
3 ...
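The steps can be sketched as follows, assuming the limits are taken as the mean of y plus or minus three standard deviations and using made-up data (the column names lower and upper are illustrative). The plotting part is skipped if ggplot2 is not installed.

```r
# Hypothetical data; the truncated example above uses its own values.
set.seed(14)
df <- data.frame(x = 1:20, y = rnorm(20, mean = 1))

# 3-sigma limits: centre line plus/minus three standard deviations.
centre <- mean(df$y)
sigma  <- sd(df$y)
df$lower <- centre - 3 * sigma
df$upper <- centre + 3 * sigma

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # ymin takes the lower limit and ymax the upper limit;
  # a small alpha keeps the band transparent so the line stays visible.
  p <- ggplot(df, aes(x = x, y = y)) +
    geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2) +
    geom_line()
}
```

Calling print(p) renders the chart.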

How to create a new column in an R data frame based on some condition of another column?

Nizamuddin Siddiqui
Updated on 06-Nov-2020 12:04:50

2K+ Views

Sometimes we want to change a column or create a new one by using other columns of a data frame in R. This is mostly required when we want to create a categorical column, but it can be done for numerical columns as well. For example, we might want to create a column based on salary: if the salary in one column is greater than the salary in another column, add the two salaries; otherwise take the difference between them. This helps us to understand whether the salaries in the two columns are equal, lower, or higher. In R, we can use transform ...
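A minimal sketch of the salary example, assuming transform is paired with ifelse (the teaser is cut off before it shows the exact call) and using made-up column names salary1 and salary2:

```r
# Hypothetical salaries; column names are illustrative.
df <- data.frame(salary1 = c(30, 45, 50),
                 salary2 = c(35, 40, 50))

# Add the salaries where salary1 is greater, otherwise take the difference.
df <- transform(df,
                combined = ifelse(salary1 > salary2,
                                  salary1 + salary2,
                                  salary1 - salary2))

print(df)
```

A negative value in the new column means salary1 is lower, zero means the two are equal, and a sum appears where salary1 is higher.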

How to set the alignment of labels in horizontal bar plot to left in R?

Nizamuddin Siddiqui
Updated on 06-Nov-2020 12:00:15

906 Views

When we create a horizontal bar plot using the ggplot2 package, the labels of the categorical variable are aligned to the right side of the axis, and if the labels differ in length the plot looks a little untidy. Therefore, we might want to left-align the labels, which can be done by using the theme function of the ggplot2 package.

Example

Consider the below data frame:

> df df

Output

  x                        y
1 India                    14
2 UK                       15
3 Russia                   12
4 United States of America 18

Loading ggplot2 package and creating a horizontal ...
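One way this could look, using the data frame printed above: the sketch assumes the bars are made horizontal with coord_flip and that the labels are left-aligned through axis.text.y with hjust = 0. The teaser is cut off before it shows the article's own code, so these choices are assumptions. The plot is only built if ggplot2 is installed.

```r
# Data frame as printed in the example above.
df <- data.frame(
  x = c("India", "UK", "Russia", "United States of America"),
  y = c(14, 15, 12, 18)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = x, y = y)) +
    geom_col() +
    coord_flip() +                                # turn the bars horizontal
    theme(axis.text.y = element_text(hjust = 0))  # hjust = 0 left-aligns the labels
}
```

Setting hjust = 1 instead restores the default right alignment.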

How to draw a circle in R?

Nizamuddin Siddiqui
Updated on 06-Nov-2020 11:56:43

3K+ Views

Base R has no dedicated function for drawing a circle, but we can make use of the plotrix package for this purpose. The plotrix package has a function called draw.circle which can be used to draw a circle, but we first need to draw a plot in base R and then pass the correct arguments to draw.circle. The first and second arguments of draw.circle take the x and y coordinates of the centre, and the third is the radius, so these should be chosen to suit the axes of the base R plot.

Loading plotrix package:

> library(plotrix)

Creating different circles using draw.circle:

Example

> plot(1:10, type="n")
> ...
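A short sketch of the call order, with made-up centres and radii (the circles data frame is purely illustrative): the empty plot is opened first, and the circles are then added to it. The drawing step is skipped if plotrix is not installed.

```r
# Hypothetical circles: centre coordinates and radii chosen to fit a 1..10 plot.
circles <- data.frame(x = c(5, 3), y = c(5, 7), radius = c(2, 1))

# An existing plot is required; type = "n" sets up the axes without drawing points.
plot(1:10, type = "n")

if (requireNamespace("plotrix", quietly = TRUE)) {
  for (i in seq_len(nrow(circles))) {
    # Arguments in order: x centre, y centre, radius.
    plotrix::draw.circle(circles$x[i], circles$y[i], circles$radius[i])
  }
}
```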

How to convert diagonal elements of a matrix in R into missing values?

Nizamuddin Siddiqui
Updated on 06-Nov-2020 11:47:43

707 Views

The first thing to understand is that diagonal elements are meaningful mainly for a square matrix. This is well known to mathematicians, but newcomers might get confused because R also lets us extract a "diagonal" from a non-square matrix, which is not a diagonal in the strict sense. In R, we can set the diagonal elements of a matrix to missing values (NA) by using the diag function.

Example 1

> M1 M1

Output

     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6 ...
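A minimal sketch of the replacement form of diag, assuming a 4 x 4 matrix filled column-wise with 1 to 16 (which matches the visible part of the output above, though the original definition of M1 is not shown):

```r
# Assumed definition: a 4 x 4 matrix filled column by column.
M1 <- matrix(1:16, nrow = 4)

# Assigning to diag() overwrites only the diagonal cells.
diag(M1) <- NA

print(M1)
```

Only the four diagonal cells change; every off-diagonal value is left as it was.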

How to create different Y-axis for group levels using ggplot2 in R?

Nizamuddin Siddiqui
Updated on 06-Nov-2020 11:42:56

745 Views

If we have a categorical or grouping variable, we might want to create a line chart for each of its categories or levels; this helps us to compare the range of multiple levels in a single plot. For this purpose, we can use the facet_grid function of the ggplot2 package, as shown in the below example.

Example

Consider the below data frame:

> x y df df

Output

  x           y
1 C -1.55668689
2 A  2.41399136
3 D -0.78520253
4 A -0.43092594
5 C  1.94379390
6 A ...
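A sketch of one possible layout, using made-up data: it assumes an extra index column for the horizontal axis and scales = "free_y" so that each level gets its own Y-axis range. Neither appears in the truncated example, so both are assumptions. The plot is only built if ggplot2 is installed.

```r
# Hypothetical data: four group levels with very different ranges of y.
set.seed(1)
df <- data.frame(
  x     = rep(c("A", "B", "C", "D"), each = 10),      # group levels
  index = rep(1:10, times = 4),                        # assumed position within each group
  y     = rnorm(40) * rep(c(1, 10, 100, 1000), each = 10)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # One panel per level of x, stacked vertically;
  # scales = "free_y" lets every panel choose its own Y-axis limits.
  p <- ggplot(df, aes(x = index, y = y)) +
    geom_line() +
    facet_grid(x ~ ., scales = "free_y")
}
```

Without scales = "free_y" all panels share one Y-axis, and the levels with a small range appear almost flat.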

What is the use of type = "h" in base R for plotting a graph?

Nizamuddin Siddiqui
Updated on 06-Nov-2020 11:40:18

2K+ Views

type = "h" is a graphical argument in base R that is generally used inside the plot function. It draws vertical lines instead of points. For example, if we plot the values from 1 to 10 with type = "h", each vertical line starts at the X-axis and its upper end marks the actual value.

Example 1

> plot(1:10,type="h")

Example 2

> plot(rnorm(10),type="h")

How to find the less than probability using normal distribution in R?

Nizamuddin Siddiqui
Updated on 06-Nov-2020 11:36:10

1K+ Views

The less-than probability under a normal distribution is the cumulative probability, which is given by the cumulative distribution function of the normal distribution. In R, the pnorm function directly calculates the less-than probability for a normally distributed random variable; it takes the value of interest, the mean and the standard deviation. With the default mean of 0 and standard deviation of 1, the value is a Z score.

Examples

> pnorm(0.95,1,0)
[1] 0
> pnorm(0.95,0,1)
[1] 0.8289439
> pnorm(0.10,0,1)
[1] 0.5398278
> pnorm(0.10,1,5)
[1] 0.4285763
> pnorm(0.10,1,50)
[1] 0.4928194
> pnorm(0.10,25,50)
[1] 0.309242
> pnorm(0.12,25,50)
[1] 0.309383
> pnorm(0.12,2,0.004)
[1] 0
> pnorm(0.12,2,0.5)
[1] 8.495668e-05
> pnorm(1,2,0.5)
[1] 0.02275013
> pnorm(12,20,3)
[1] 0.003830381
> pnorm(12,12,3)
[1] 0.5
> pnorm(12,15,3)
[1] 0.1586553
> pnorm(200,15,3)
[1] 1
> pnorm(200,201,3)
[1] 0.3694413
> pnorm(200,201,5)
[1] 0.4207403
> pnorm(20,25,5)
[1] 0.1586553
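To make the link with the Z score explicit, this short sketch reuses one of the calls above and shows that passing the raw value with its mean and standard deviation gives the same probability as standardising first. The variable names are illustrative.

```r
q     <- 12   # value of interest
mu    <- 15   # mean
sigma <- 3    # standard deviation

# Direct call: P(X < 12) for X ~ N(15, 3^2).
p_direct <- pnorm(q, mean = mu, sd = sigma)

# Same probability via the Z score and the standard normal distribution.
z <- (q - mu) / sigma
p_standard <- pnorm(z)

print(c(direct = p_direct, standardised = p_standard))
```

Both values match the 0.1586553 shown above for pnorm(12,15,3), since the Z score here is -1.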
