Found 2038 Articles for R Programming

How to select only one column from an R data frame and return it as a data frame instead of vector?

Nizamuddin Siddiqui
Updated on 10-Oct-2020 13:20:40


By default, extracting a single column from an R data frame returns a vector, but we may want the result to stay a data frame so that data frame operations still apply to it. To do that, use single square brackets for the extraction, with T (TRUE) or F (FALSE) values selecting the column, and set drop = FALSE so that the output remains a data frame. Consider the data frame below.
Example: set.seed(999) x1 ...
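The teaser above is cut off before its example, so here is a minimal sketch of the idea. The data frame df and its column names are hypothetical, chosen only for illustration, and are not the article's own data.

```r
# Hypothetical example data; the column names are illustrative only.
set.seed(999)
df <- data.frame(x1 = rnorm(5), x2 = rpois(5, 3), x3 = letters[1:5])

# Default behaviour: a single column comes back as a plain vector.
as_vector <- df[, "x2"]

# drop = FALSE keeps the one-column result as a data frame.
as_frame <- df[, "x2", drop = FALSE]

# The same selection written with a logical (TRUE/FALSE) vector.
by_logical <- df[, c(FALSE, TRUE, FALSE), drop = FALSE]

class(as_vector)
class(as_frame)
```

Selecting by name is usually easier to read than a logical vector, which has to be kept in step with the column order.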

How to find the indexes of minimum values in a vector if there are ties in R?

Nizamuddin Siddiqui
Updated on 10-Oct-2020 13:16:40


Repeated values in a vector mean that the vector has ties, so the minimum may occur at more than one position. We can combine the which function with the min function to find the positions of the minimum; if there is more than one minimum, the output shows all of the matching positions.
Example: x1 ...
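As a small sketch of the approach described above (the vector x is made up for illustration), which combined with min returns every tied position, whereas which.min stops at the first:

```r
# Hypothetical vector whose minimum value (2) appears three times.
x <- c(4, 2, 7, 2, 9, 2)

# All positions where the minimum occurs.
tie_idx <- which(x == min(x))

# For contrast: which.min() reports only the first such position.
first_idx <- which.min(x)

tie_idx
first_idx
```

For floating-point data, exact equality with min(x) can miss near-ties, so a tolerance-based comparison may be more appropriate there.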

How to create train, test and validation samples from an R data frame?

Nizamuddin Siddiqui
Updated on 10-Oct-2020 13:14:30


To build predictive models, it is common to split a data set into three subsets: one for training the model, one for testing it and one for validating it. These subsets are usually called train, test and validation. Several sampling methods can be used for this purpose; the most common is simple random sampling. Consider the mtcars data set that ships with R.
Example: data(mtcars) str(mtcars)
Output: 'data.frame': 32 obs. of 11 variables: $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ... ...
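The article's own splitting code is not visible in this teaser, so the following is only one possible sketch of random sampling into three subsets. The 60/20/20 proportions and the seed are assumptions made for illustration.

```r
data(mtcars)
set.seed(123)  # arbitrary seed so the split is reproducible

n <- nrow(mtcars)
shuffled <- sample(seq_len(n))  # random permutation of the row indices

# Assumed proportions: roughly 60% train, 20% test, remainder validation.
n_train <- floor(0.6 * n)
n_test  <- floor(0.2 * n)

train      <- mtcars[shuffled[1:n_train], ]
test       <- mtcars[shuffled[(n_train + 1):(n_train + n_test)], ]
validation <- mtcars[shuffled[(n_train + n_test + 1):n], ]

c(train = nrow(train), test = nrow(test), validation = nrow(validation))
```

Shuffling the indices once and slicing the permutation guarantees that the three subsets do not overlap and together cover every row.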

How to create a histogram using weights in R?

Nizamuddin Siddiqui
Updated on 10-Oct-2020 13:07:31


A weighted histogram represents the weighted distribution of the values. In R, we can use the weighted.hist function of the plotrix package to create this type of histogram; we only need the values and the weight corresponding to each value. Since plotrix is not part of base R, install it first with install.packages("plotrix") and then load it into the R session.
Loading the plotrix package: library("plotrix")
Consider the vector below and the weights associated with it.
Example: x ...
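Because the example above is truncated, here is a hedged sketch with made-up values and weights. The base R tally is included only to show what "weighted" means (each bin sums weights instead of counting values); its bin edges are an assumption, and plotrix may choose and close its bins differently.

```r
# Hypothetical values and their weights (not the article's data).
x <- c(1, 2, 2, 3, 4, 5, 5, 6)
w <- c(2, 1, 3, 1, 2, 4, 1, 2)

# Base R illustration: sum the weights falling into each assumed bin.
bins <- cut(x, breaks = seq(0, 6, by = 2))
weighted_counts <- tapply(w, bins, sum)
weighted_counts

# Draw the weighted histogram only if plotrix is installed.
if (requireNamespace("plotrix", quietly = TRUE)) {
  plotrix::weighted.hist(x, w)
}
```

Guarding the call with requireNamespace keeps the script runnable on machines where plotrix has not been installed yet.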

How to find critical value for one-sided and two-sided t test in R?

Nizamuddin Siddiqui
Updated on 10-Oct-2020 13:05:17


To find the critical value for a t test in R, we use the qt function. It takes a probability (here the significance level) and the degrees of freedom, and returns the corresponding quantile of the t distribution, which is the tabulated or critical value. Note that the second argument is the degrees of freedom rather than the sample size; for a one-sample t test with n observations it is n - 1. The examples below show the calculation for different situations such as a left-sided, right-sided or two-sided test. Because the t distribution is symmetric, the right-sided value is the absolute value of the left-sided one.
Left-sided critical value with 30 degrees of freedom at the 5% significance level (95% confidence level):
Example: qt(0.05, 30)
Output: [1] -1.697261
Right-sided critical value with 30 degrees of freedom at the 5% significance level (95% confidence level):
Example: abs(qt(0.05, 30))
Output: [1] 1.697261
Example: qt(0.05, 50)
Output: [1] -1.675905
Example: abs(qt(0.05, 50))
Output: [1] 1.675905
Example: qt(0.01, 50)
Output: [1] -2.403272
Example: abs(qt(0.01, 50))
Output: [1] 2.403272
...
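The visible examples cover only the one-sided cases, so the sketch below adds the two-sided case under the usual convention of splitting the significance level equally between the two tails. The variable names alpha and df are introduced here for illustration.

```r
alpha <- 0.05  # significance level
df    <- 30    # degrees of freedom (n - 1 for a one-sample t test)

# Left-sided test: lower-tail quantile at alpha.
left_crit <- qt(alpha, df)

# Right-sided test: upper-tail quantile, equivalent to abs(qt(alpha, df)).
right_crit <- qt(1 - alpha, df)

# Two-sided test: alpha is split across both tails; reject when |t| exceeds this.
two_sided_crit <- qt(1 - alpha / 2, df)

c(left = left_crit, right = right_crit, two_sided = two_sided_crit)
```

Writing the right-sided value as qt(1 - alpha, df) makes the intent explicit, while abs(qt(alpha, df)) relies on the symmetry of the t distribution to give the same number.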

How to find row minimum for an R data frame?

Nizamuddin Siddiqui
Updated on 10-Oct-2020 13:03:37


Data analysis projects break down into many smaller tasks. One of the simplest is finding the minimum value in each row of a data frame. For this purpose, we can use the apply function over rows (MARGIN = 1) and pass min as the FUN argument to get the row minimums. Consider the data frame below.
Example: set.seed(101) x1 ...
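A minimal sketch of the apply approach follows. The data frame is hypothetical, since the original example is cut off, and it assumes that all columns are numeric.

```r
# Hypothetical all-numeric data frame (not the article's data).
set.seed(101)
df <- data.frame(x1 = rnorm(5), x2 = rnorm(5), x3 = rnorm(5))

# MARGIN = 1 applies FUN to each row; FUN = min returns the row minimum.
row_min <- apply(df, 1, min)

# Store the result alongside the original columns.
df_with_min <- cbind(df, row_min = row_min)
df_with_min
```

For data frames that mix numeric and non-numeric columns, apply would first coerce everything to a common type, so it is safer to select the numeric columns before calling it.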

How to extract correlation coefficient value from correlation test in R?

Nizamuddin Siddiqui
Updated on 10-Oct-2020 12:59:49


To perform a correlation test in R, we use the cor.test function with two variables. It returns several values, including the test statistic, the degrees of freedom, the p-value, the confidence interval and the correlation coefficient. To extract the correlation coefficient from the test output, access its estimate component with the $ operator, as shown in the examples below.
Example: x1 ...
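To make the extraction concrete, here is a short sketch on simulated data. The vectors x and y are assumptions for illustration only.

```r
# Hypothetical data: y is loosely related to x.
set.seed(1)
x <- rnorm(20)
y <- 0.5 * x + rnorm(20)

result <- cor.test(x, y)  # Pearson correlation test by default

# The coefficient is stored in the 'estimate' component (a named number).
r <- result$estimate
r

# unname() drops the "cor" label when only the bare number is needed.
r_value <- unname(r)
```

The other pieces of the test output can be reached the same way, for example result$p.value and result$conf.int.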

How to find 95% confidence interval for binomial data in R?

Nizamuddin Siddiqui
Updated on 10-Oct-2020 12:57:27


Binomial data is summarised by two quantities: the sample size and the number of successes. To find the 95% confidence interval for the proportion, we can use the prop.test function in R; set the correct argument to FALSE if the interval should be calculated without continuity correction. The examples below find the 95% confidence interval for different values of sample size and number of successes.
Example: prop.test(x=25, n=100, conf.level=0.95, correct=FALSE)
Output:
1-sample proportions test without continuity correction
data: 25 out of 100, null probability 0.5
X-squared = 25, df = 1, p-value = 5.733e-07
...
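The printed output above is truncated before the interval itself, so this sketch shows how the interval could be pulled out of the result directly, using the same call as the teaser.

```r
# Same call as in the example: 25 successes out of 100 trials.
result <- prop.test(x = 25, n = 100, conf.level = 0.95, correct = FALSE)

# The confidence interval is stored in the 'conf.int' component.
ci <- result$conf.int
ci

lower <- ci[1]
upper <- ci[2]
attr(ci, "conf.level")  # confidence level attached to the interval
```

The interval is for the underlying proportion of successes, so it should bracket the sample proportion 25/100 = 0.25.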

How to find the maximum of factor levels in numerical column and return the output including other columns in the R data frame?

Nizamuddin Siddiqui
Updated on 10-Oct-2020 12:54:58


When a data frame has a factor column that groups the values of a numerical column, we might want to find the maximum value for each factor level. This lets us compare the factor levels in terms of their maximums. If we also want the other columns of the data frame in the output, we can use the aggregate function together with the merge function. Consider the data frame below.
Example: set.seed(78) Group ...
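One way the aggregate-plus-merge combination could look is sketched below. The column names Score and ID, and the data themselves, are hypothetical; only Group appears in the truncated original.

```r
# Hypothetical data frame; Score and ID are illustrative column names.
set.seed(78)
df <- data.frame(
  Group = rep(c("A", "B", "C"), each = 4),
  Score = sample(1:100, 12),  # sampled without replacement, so no ties
  ID    = 1:12
)

# Step 1: maximum Score within each Group.
max_by_group <- aggregate(Score ~ Group, data = df, FUN = max)

# Step 2: merge back on the shared columns (Group and Score) to recover
# the remaining columns of the rows that hold each maximum.
result <- merge(max_by_group, df)
result
```

If a group had several rows tied at its maximum, the merge would return all of them, which may or may not be what is wanted.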

How to take a subset of a matrix in R with finite values only if the matrix contains NA and Inf values?

Nizamuddin Siddiqui
Updated on 10-Oct-2020 12:50:52


If a matrix contains NA or Inf values and we want the subset with finite values only, the output consists of the rows that contain no NA or Inf values. We can do this in R by combining the rowSums and is.finite functions with the negation operator !.
Example: set.seed(999) M1 ...
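The combination of rowSums, is.finite and ! described above can be sketched as follows, on a small made-up matrix rather than the article's own.

```r
# Hypothetical 4 x 3 matrix; rows 2 and 3 contain NA, Inf or -Inf.
M <- matrix(c(1, NA, 3, 4,
              5, Inf, 7, 8,
              9, 10, -Inf, 12), nrow = 4)

# !is.finite(M) flags NA, NaN, Inf and -Inf; rowSums counts the flags per row.
bad_per_row <- rowSums(!is.finite(M))

# Negating the counts gives TRUE only for rows with zero non-finite values.
# drop = FALSE keeps the result a matrix even if a single row survives.
finite_rows <- M[!bad_per_row, , drop = FALSE]
finite_rows
```

Note that is.finite also treats NaN as non-finite, so rows containing NaN are removed as well.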
