Found 2038 Articles for R Programming

How to convert a numerical column into factor column in R?

Nizamuddin Siddiqui
Updated on 08-Oct-2020 14:44:54

5K+ Views

Values that represent factor levels are often recorded as numbers, so we need to convert those numerical columns to factors. Otherwise R treats the levels as numerical values and the analysis output will be incorrect.

Example

data(mtcars)
str(mtcars)

Output

'data.frame': 32 obs. of 11 variables:
 $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
 $ disp: num 160 160 108 258 360 ...
 $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
 $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
 $ qsec: num 16.5 17 18.6 19.4 17 ...
 $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
 $ am : num 1 1 1 0 0 0 0 0 0 0 ...
 $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
 $ carb: num 4 4 1 1 2 1 4 2 2 4 ...

mtcars$cyl
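The conversion itself can be sketched as follows. This is a minimal illustration on a copy of mtcars; the name cars is an assumption introduced here and does not come from the original article.

```r
# Work on a copy so the built-in dataset is left untouched
# ("cars" is a hypothetical name used only for this sketch)
cars <- mtcars

# cyl is stored as numeric, but it really labels three groups
cars$cyl <- as.factor(cars$cyl)

str(cars$cyl)      # now reported as a factor with 3 levels
levels(cars$cyl)   # "4" "6" "8"
table(cars$cyl)    # counts per level instead of a numeric summary
```

To go back from such a factor to the original numbers, as.numeric(as.character(cars$cyl)) is the safe route; calling as.numeric directly on a factor returns the internal level codes (1, 2, 3) rather than 4, 6, 8.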

How to reduce the size of the area covered by legend in R for a plot created by using plot function?

Nizamuddin Siddiqui
Updated on 08-Oct-2020 14:39:15

1K+ Views

By default, the legend of a plot created by using the plot function is drawn at full size, which corresponds to a character expansion (cex) of 1. Values between 0 and 1 shrink the legend, and values above 1 enlarge it. To reduce the area covered by the legend, we can pass a smaller cex value to the legend function as shown in the below example.

Example

Consider the below vectors and the plot created between these two vectors:

x
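A minimal sketch of the effect of cex, assuming two made-up vectors x and y (the article's own data is not shown here). Two legends are drawn on the same plot so the size difference is visible side by side.

```r
# Hypothetical data for illustration only
x <- 1:10
y <- x^2

plot(x, y, type = "b", col = "blue")

# Default size (cex = 1); legend() invisibly returns the box geometry
full <- legend("topleft", legend = "y = x^2", col = "blue",
               lty = 1, pch = 1)

# Same legend at 60% of the default size
small <- legend("bottomright", legend = "y = x^2", col = "blue",
                lty = 1, pch = 1, cex = 0.6)

# Compare the widths of the two legend boxes (in user coordinates)
c(full = full$rect$w, small = small$rect$w)
```

The value 0.6 is arbitrary; any cex below 1 reduces the box and its text together.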

How to display R-squared value on scatterplot with regression model line in R?

Nizamuddin Siddiqui
Updated on 08-Oct-2020 14:20:27

7K+ Views

The R-squared value is the coefficient of determination. It gives the proportion (or percentage) of variation in the dependent variable that is explained by the independent variable. To display this value on a scatterplot with the regression line without help from any package, we can use the plot function together with the abline and legend functions.

Consider the below data frame:

Example

set.seed(1234)
x
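One way this could look in base R is sketched below. The simulated data (50 points, slope 2) is an assumption made for this illustration, not the article's data.

```r
set.seed(1234)

# Hypothetical data: y depends linearly on x plus noise
x <- rnorm(50)
y <- 2 * x + rnorm(50)

# Fit the simple linear regression and extract R-squared
fit <- lm(y ~ x)
r2 <- summary(fit)$r.squared

# Scatterplot, fitted line, and the R-squared value as a borderless legend
plot(x, y)
abline(fit, col = "red")
legend("topleft", legend = paste("R-squared =", round(r2, 3)), bty = "n")
```

For a regression with a single predictor, r2 equals the squared correlation cor(x, y)^2, which is a quick sanity check on the extracted value.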

How to find the absolute pairwise difference among values of a vector in R?

Nizamuddin Siddiqui
Updated on 08-Oct-2020 14:15:54

350 Views

If a vector contains five values then there will be ten pairwise differences. For example, suppose we have the five numbers 1 to 5; the pairwise combinations of these values are (1,2), (1,3), (1,4), (1,5), (2,3), (2,4), (2,5), (3,4), (3,5), (4,5). To find the absolute pairwise differences, we need to take the difference within each of these combinations and then its absolute value, hence the result will be 1, 2, 3, 4, 1, 2, 3, 1, 2, 1.

Example

x1
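The worked example above can be reproduced with combn, as in this minimal sketch. The choice of combn with diff is one possible approach and is not necessarily the one used in the article.

```r
x1 <- 1:5

# combn() enumerates all pairs; diff() gives second minus first for each pair
abs_diff <- abs(combn(x1, 2, FUN = diff))

abs_diff
# 1 2 3 4 1 2 3 1 2 1

# The number of pairs matches choose(5, 2)
length(abs_diff) == choose(length(x1), 2)
```

For numeric vectors, as.vector(dist(x1)) returns the same absolute differences in the same order.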

How to create a sequence of dates by using starting date in R?

Nizamuddin Siddiqui
Updated on 08-Oct-2020 14:11:47

844 Views

The seq function is the standard way to create a sequence in R, and this also applies to sequences of dates. For dates, however, the starting value must be supplied in Date format so that R can recognize the input type and create the appropriate vector. If the starting date is passed as a plain character string instead, seq cannot interpret it and returns an error.

Examples

x1
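A short sketch of seq applied to dates, assuming a made-up starting date of 2020-01-01; the variable names are illustrative only.

```r
# Convert the starting value to Date first
start <- as.Date("2020-01-01")

# Five consecutive days
daily <- seq(start, by = "day", length.out = 5)

# First day of three consecutive months
monthly <- seq(start, by = "month", length.out = 3)

# Weekly steps up to an end date
weekly <- seq(start, as.Date("2020-02-01"), by = "week")

daily     # "2020-01-01" ... "2020-01-05"
monthly   # "2020-01-01" "2020-02-01" "2020-03-01"
weekly    # "2020-01-01" "2020-01-08" "2020-01-15" "2020-01-22" "2020-01-29"
```

Either length.out or an end date has to be given along with by, otherwise seq has no way to know where to stop.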

How to create a contingency table with sum on the margins from an R data frame?

Nizamuddin Siddiqui
Updated on 08-Oct-2020 14:09:35

2K+ Views

The row and column sums on the margins of a contingency table are useful because they are needed for different types of calculations such as odds ratios and probabilities. If an R data frame has factor columns then we can create a contingency table for that data frame and add the margin sums by using the addmargins function.

Example

Consider the below data frame:

x1
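A minimal sketch of table followed by addmargins. The data frame below, with columns x1 and x2, is a small hand-built assumption so that the counts are easy to verify by eye.

```r
# Hypothetical data frame with two factor columns
df <- data.frame(
  x1 = factor(rep(c("A", "B"), times = c(12, 8))),
  x2 = factor(rep(c("Yes", "No"), times = 10))
)

# Plain contingency table of counts
tab <- table(df$x1, df$x2)

# Same table with a "Sum" row and a "Sum" column appended
tab_m <- addmargins(tab)
tab_m

tab_m["A", "Sum"]     # 12 rows belong to group A
tab_m["Sum", "Sum"]   # 20 rows in total
```

addmargins(tab, 1) or addmargins(tab, 2) adds the sums along only one dimension if both margins are not needed.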

How to create a data frame with combinations of values in R?

Nizamuddin Siddiqui
Updated on 07-Oct-2020 16:36:25

465 Views

Suppose we have two values, 0 and 1. How many pairwise combinations of these values are possible? The answer is 4, and these combinations are (0,0), (1,0), (0,1), (1,1). In R, we can use the expand.grid function to create these combinations. Its result is already a data frame, so wrapping it in as.data.frame is optional.

Example

df1
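The counting can be checked directly with expand.grid, as in this small sketch. The column names x, y and z are assumptions chosen for illustration.

```r
# All combinations of 0/1 over two positions: 2^2 = 4 rows
df1 <- expand.grid(x = c(0, 1), y = c(0, 1))
df1
nrow(df1)            # 4
is.data.frame(df1)   # TRUE, no as.data.frame() call required

# Over three positions the count doubles again: 2^3 = 8 rows
df3 <- expand.grid(x = c(0, 1), y = c(0, 1), z = c(0, 1))
nrow(df3)            # 8
```

The first column varies fastest, which is why the rows come out in the order (0,0), (1,0), (0,1), (1,1).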

How to create a standard normal distribution curve with 3-sigma limits in R?

Nizamuddin Siddiqui
Updated on 07-Oct-2020 16:29:52

334 Views

A standard normal distribution has a mean equal to zero and a standard deviation equal to one. Therefore, when we plot it with three-sigma limits, we have six points on the X-axis, three on either side of zero. If the limits are defined explicitly, the curve can be drawn over a wider range, which changes how it is displayed. We can do this by creating a sequence of values for the standard normal variable and computing its density.

Consider the below vectors corresponding to the limits and density:

x
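A base R sketch of such a curve, assuming a grid from -3 to 3 in steps of 0.01 (the step size and axis labels are choices made here, not taken from the article).

```r
# Grid covering the three-sigma range of a standard normal variable
x <- seq(-3, 3, by = 0.01)

# Standard normal density (mean 0, sd 1 are the dnorm defaults)
y <- dnorm(x)

# Draw the curve without the default x-axis, then add ticks at each sigma
plot(x, y, type = "l", xaxt = "n", xlab = "z", ylab = "Density")
axis(1, at = -3:3)

# Mark the three-sigma limits
abline(v = c(-3, 3), lty = 2)
```

Widening the grid, for example to seq(-5, 5, by = 0.01), keeps the same curve but shows more of the flat tails on both sides.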

How to change the size of correlation coefficient value in correlation matrix plot using corrplot in R?

Nizamuddin Siddiqui
Updated on 07-Oct-2020 16:26:27

3K+ Views

The size of the correlation coefficient values in a correlation matrix plot created by using the corrplot function is controlled by a character expansion factor that is 1 by default. Values below 1 make the numbers smaller and values above 1 make them larger. To change this size, we need to use the number.cex argument. For example, if we want to decrease the size to half then we can use number.cex = 0.5.

Example

Consider the below matrix:

set.seed(99)
M
corrplot(cor(M), addCoef.col="black")

Changing the size of correlation coefficient value to 0.75:

corrplot(cor(M), addCoef.col="black", number.cex=0.75)

Changing the size of correlation coefficient value to 0.30:

corrplot(cor(M), addCoef.col="black", number.cex=0.30)
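Because the definition of M is not shown above, the following is a self-contained sketch with an assumed 20 x 5 matrix of random normal values. It requires the corrplot package to be installed and skips the plotting otherwise.

```r
set.seed(99)

# Hypothetical matrix; the article's own definition of M is not available
M <- matrix(rnorm(100), ncol = 5)
colnames(M) <- paste0("V", 1:5)

# Only plot when the corrplot package is available
if (requireNamespace("corrplot", quietly = TRUE)) {
  # Coefficients printed at half the default size
  corrplot::corrplot(cor(M), addCoef.col = "black", number.cex = 0.5)
}
```

If the numbers overlap the cell borders on a large matrix, lowering number.cex further is usually simpler than enlarging the plotting device.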

How to find the groupwise mean and save it in a data frame object in R?

Nizamuddin Siddiqui
Updated on 07-Oct-2020 16:16:25

324 Views

We often need groupwise means in data analysis, especially where analysis of variance techniques are used, because these techniques help us compare different groups based on their measures of central tendency and variation. This can be done by using the aggregate function, so that the output is saved in a data frame object. In the below examples, we can see how it can be done and also check the final object type.

Example

Consider the below data frame:

set.seed(109)
Salary
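A minimal sketch of aggregate with the formula interface. The Group column and the salary figures are made up for this illustration so the means can be checked by hand.

```r
# Hypothetical data: three groups of four salaries each
df <- data.frame(
  Group  = rep(c("A", "B", "C"), each = 4),
  Salary = c(40, 42, 44, 46, 50, 52, 54, 56, 60, 62, 64, 66)
)

# Mean salary per group, returned as a data frame
group_means <- aggregate(Salary ~ Group, data = df, FUN = mean)

group_means          # A 43, B 53, C 63
class(group_means)   # "data.frame"
```

Swapping FUN = mean for FUN = sd or FUN = median gives the corresponding groupwise measure in the same data frame layout.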
