Found 2038 Articles for R Programming

Why does t.test return a smallest p-value of 2.2e-16 in R?

Nizamuddin Siddiqui
Updated on 04-Sep-2020 12:37:03

2K+ Views

When we perform a t-test in R and the difference between the two groups is very large, the p-value is printed as < 2.2e-16. This is a printing behaviour of R's hypothesis testing procedures: 2.2e-16 is the machine epsilon (.Machine$double.eps), and any smaller p-value is displayed as "< 2.2e-16". The actual p-value can be extracted from the object returned by the t test function, as in t.test(Var1, Var2, var.equal=FALSE)$p.value. This p-value is usually far smaller than 2.2e-16.
Example1
> t.test(x1, y1, var.equal=FALSE)
Output
Welch Two Sample t-test
data: x1 and y1
t = -3617.2, df = 10098, p-value < 2.2e-16
alternative hypothesis: true difference in means is not ...
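
The difference between the printed and the stored p-value can be sketched as follows. The data here are hypothetical (simulated groups named group_a and group_b, not the article's x1 and y1), so the numbers will differ from the excerpt above.

```r
# Hypothetical data: two clearly separated groups (not the article's x1 and y1)
set.seed(1)
group_a <- rnorm(200, mean = 0)
group_b <- rnorm(200, mean = 2)

# Welch two-sample t-test, as in the excerpt
res <- t.test(group_a, group_b, var.equal = FALSE)

# The printed summary caps the display at "p-value < 2.2e-16"
print(res)

# The numeric p-value stored in the result object is not capped
p_actual <- res$p.value
p_actual

# The display threshold is the machine epsilon for doubles
.Machine$double.eps
p_actual < .Machine$double.eps
```

For extremely large test statistics the stored p-value can underflow to exactly 0, so it is best read as "smaller than R can represent" rather than as a literal zero.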

How to concatenate column values and create a new column in an R data frame?

Nizamuddin Siddiqui
Updated on 04-Sep-2020 12:34:39

798 Views

Sometimes we want to combine the values of two columns to create a new column. This is mostly used when we have a unique column that may be combined with a numerical or any other type of column. We can also separate the combined values with different characters. This can be done with the help of the apply function.
Example
Consider the below data frame:
> df1
Output
  ID Country
1  1 UK
2  2 UK
3  3 India
4  4 USA
5  5 USA
6  6 UK
7  7 Nepal
8  ...
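
A minimal sketch of the column concatenation described above, assuming a small hypothetical data frame with the same ID and Country columns. The new column names and the separators are illustrative choices, not taken from the article.

```r
# Hypothetical data frame modelled on the ID/Country columns in the excerpt
df1 <- data.frame(ID = 1:3,
                  Country = c("UK", "India", "USA"),
                  stringsAsFactors = FALSE)

# apply() over rows (MARGIN = 1): paste each row's values with "_" between them
df1$ID_Country <- apply(df1[, c("ID", "Country")], 1, paste, collapse = "_")

# Alternative without apply(): paste() is vectorised over whole columns
df1$ID_Country_dash <- paste(df1$ID, df1$Country, sep = "-")

df1
```

Because apply() first converts the data frame to a character matrix, numbers of different widths can pick up padding spaces. The vectorised paste() form avoids that conversion.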

How to draw gridlines in a graph with abline function in R?

Nizamuddin Siddiqui
Updated on 04-Sep-2020 12:21:34

144 Views

Gridlines are horizontal and vertical dotted lines that help organize a chart so that the values on the axis labels are easier to read. They are especially helpful when we plot a large number of data points. A graph drawn with the plot function can be given gridlines by defining the horizontal and vertical lines with abline.
Example
Consider the below data and scatterplot:
> plot(x,y)
Adding gridlines using abline function:
> abline(h=seq(0,5,0.5),lty=5)
> abline(v=seq(-2,2,0.5),lty=5)
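
A self-contained version of the gridline calls might look like this. The x and y data are simulated here as an assumption, since the article's own data are not shown in the excerpt. The grid positions reuse the seq() ranges from the example.

```r
# Hypothetical data (the article's own x and y are not shown in the excerpt)
set.seed(42)
x <- rnorm(50)
y <- rexp(50)

# Positions of the gridlines, taken from the abline() calls in the example
h_lines <- seq(0, 5, 0.5)
v_lines <- seq(-2, 2, 0.5)

# Draw the scatterplot, then overlay horizontal and vertical dashed lines
plot(x, y)
abline(h = h_lines, lty = 5)
abline(v = v_lines, lty = 5)
```

Lines that fall outside the plotted range are simply not visible, so the two sequences can safely be wider than the data.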

How to select rows based on range of values of a column in an R data frame?

Nizamuddin Siddiqui
Updated on 04-Sep-2020 12:19:29

2K+ Views

Data can be extracted or selected in many ways, such as by an individual value or by a range of values. This is mostly required when we want to either compare subsets of the data set or use a subset for analysis. Rows may also be selected by a range of values for testing. We can do this with the subset function.
Example
Consider the below data frame:
> df
Output
  x1 x2 x3
1  3  2  6
2  3  4  9
3  4  4 12
4  4  8 12
5  3  5 11
...
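
The range-based selection can be sketched with the five rows visible in the output above. The particular ranges are illustrative assumptions, chosen only to show inclusive and exclusive bounds.

```r
# The first five rows as shown in the excerpt's output
df <- data.frame(x1 = c(3, 3, 4, 4, 3),
                 x2 = c(2, 4, 4, 8, 5),
                 x3 = c(6, 9, 12, 12, 11))

# Inclusive range: rows where x3 lies between 9 and 11
in_range <- subset(df, x3 >= 9 & x3 <= 11)
in_range

# Exclusive range: rows where x2 lies strictly between 4 and 8
strict_range <- subset(df, x2 > 4 & x2 < 8)
strict_range
```

Both conditions are joined with the vectorised `&`, so each row is tested against the lower and upper bound at once.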

How to change the color and size of the axes labels of a plot created by using plot function in R?

Nizamuddin Siddiqui
Updated on 04-Sep-2020 12:16:41

259 Views

The default axis labels created by the plot function can be too small and plain for some displays. We might therefore want to change their size and color, because the appearance of a plot matters. This can be done by setting the color with col.lab and the size with cex.lab (a magnification factor relative to the default).
Example
> plot(x,y)
Changing the color and size of the axes labels:
> plot(x,y,col.lab="blue",cex.lab=2)
> plot(x,y,col.lab="dark blue",cex.lab=3)
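
A runnable sketch of the same idea, assuming simulated x and y because the article's data are not shown. The xlab and ylab strings in the last call are hypothetical labels added for illustration.

```r
# Hypothetical data (the article's own x and y are not shown in the excerpt)
set.seed(7)
x <- rnorm(20)
y <- rnorm(20)

# Default axis labels
plot(x, y)

# Blue axis labels at twice the default size
plot(x, y, col.lab = "blue", cex.lab = 2)

# Custom label text combined with a colour and a milder magnification
plot(x, y, xlab = "Predictor", ylab = "Response",
     col.lab = "darkblue", cex.lab = 1.5)
```

Very large cex.lab values can push the labels off the edge of the figure, in which case the margins may need widening with par(mar = ...).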

How to add a new column to an R data frame with largest value in each row?

Nizamuddin Siddiqui
Updated on 04-Sep-2020 12:14:10

614 Views

When a data frame contains only numerical columns, we might want to find the largest value in each row. For example, if we have a sales data set in which each row represents a customer and the columns hold the quantities of each product bought, the maximum of each row gives the largest quantity that customer bought of any single product. This can be done by using max with the apply function over rows.
Example
Consider the below data frame:
> df1
Output
x1 ...
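
The row-wise maximum can be sketched as below, assuming a small hypothetical data frame. The values and the names of the new columns are illustrative only.

```r
# Hypothetical all-numeric data frame (the article's values are not shown)
df1 <- data.frame(x1 = c(4, 9, 2),
                  x2 = c(7, 1, 2),
                  x3 = c(5, 3, 8))
value_cols <- c("x1", "x2", "x3")

# apply() with MARGIN = 1 calls max() once per row
df1$Max <- apply(df1[value_cols], 1, max)

# Alternative: pmax() takes the element-wise maximum across the columns
df1$Max_pmax <- do.call(pmax, df1[value_cols])

df1
```

Restricting both calls to value_cols keeps the newly added columns out of later maximum calculations.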

How to select rows of a data frame that are not in another data frame in R?

Nizamuddin Siddiqui
Updated on 04-Sep-2020 12:04:21

2K+ Views

Instead of finding the common rows, sometimes we need to find the rows of one data frame that do not appear in another. This is mostly useful when we expect a large number of rows to be uncommon rather than just a few. We can do this by using the negation operator, written as an exclamation mark (!), with the subset function.
Example
Consider the below data frames:
> df1
Output
   x1 y1
1  10  6
2   5  9
3  10 10
4   4 10
5   1  6
6   1  4
7   9  3
8   5 10
9  10  3
10  8  2
11  6 10
12 ...
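
One way to sketch the negated selection, assuming two small hypothetical data frames that share the column names x1 and y1. The helper row_key is an illustrative function introduced here, not part of the article or of base R.

```r
# Hypothetical data frames with matching column names
df1 <- data.frame(x1 = c(10, 5, 4, 1), y1 = c(6, 9, 10, 6))
df2 <- data.frame(x1 = c(5, 1, 7),     y1 = c(9, 4, 3))

# Rows of df1 whose x1 value never occurs in df2$x1
by_column <- subset(df1, !(x1 %in% df2$x1))
by_column

# Hypothetical helper: collapse each row into one string so whole rows can be compared
row_key <- function(d) do.call(paste, c(d, sep = "\r"))

# Rows of df1 that do not appear, as complete rows, anywhere in df2
by_row <- subset(df1, !(row_key(df1) %in% row_key(df2)))
by_row
```

Comparing on a single column removes every row that shares that one value, while the row_key comparison removes only exact row duplicates. Which one is appropriate depends on what "not in the other data frame" should mean for the data at hand.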

How to match two string vectors if the case of the strings differs between the vectors in R?

Nizamuddin Siddiqui
Updated on 04-Sep-2020 13:06:16

109 Views

R is a case-sensitive programming language, so strings that differ only in case do not match. For example, if one vector contains tutorialspoint and the other contains TUTORIALSPOINT, the match function used directly will not report a match. To match them, we have to convert both vectors to the same case (all lowercase or all uppercase) before passing them to match.
Examples
> x1
Output
[1] "z" "v" "r" "y" "z" "l" "v" "t" "f" "p" "p" "z" "e" "b" "a" "o" "m" "d"
[19] "e" "l" "y" "y" "u" "u" "w" "b" "a" "j" "n" "v" ...
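
The case conversion described above can be sketched with two short hypothetical vectors. The contents of x1 and x2 here are illustrative and are not the article's data.

```r
# Hypothetical vectors that differ only in letter case for some elements
x1 <- c("tutorialspoint", "R", "Data")
x2 <- c("r", "TUTORIALSPOINT", "python")

# Direct matching is case sensitive, so nothing is found
direct <- match(x1, x2)
direct   # NA NA NA

# Converting both sides to lowercase makes the comparison case insensitive
idx <- match(tolower(x1), tolower(x2))
idx      # 2 1 NA

# Logical version: which elements of x1 occur in x2, ignoring case
found <- tolower(x1) %in% tolower(x2)
found    # TRUE TRUE FALSE
```

Using toupper() on both vectors gives the same result. What matters is that both sides are converted the same way.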

How to extract strings based on first character from a vector of strings in R?

Nizamuddin Siddiqui
Updated on 04-Sep-2020 11:20:54

247 Views

Sometimes a vector of strings follows a pattern, and sometimes we need to extract strings from a vector based on their characters. For example, we might want to extract the names of the US states that begin with a particular letter from a vector containing all the names. This can be done by using the grepl function.
Example
Consider the below vector containing state names in the USA:
> US_states[grepl("^A", US_states)]
[1] "Alabama" "Alaska" "American Samoa" "Arizona"
[5] "Arkansas"
> US_states[grepl("^B", US_states)]
character(0)
> US_states[grepl("^C", US_states)]
[1] "California" "Colorado" "Connecticut"
> US_states[grepl("^D", US_states)]
[1] "Delaware" "District of Columbia"
> US_states[grepl("^E", US_states)]
character(0)
> US_states[grepl("^F", US_states)]
[1] ...
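
A self-contained sketch of the first-character filter. As an assumption, it uses R's built-in state.name vector (the 50 states only) instead of the article's US_states vector, which also contains territories, so the results differ slightly from the excerpt.

```r
# Assumption: R's built-in state.name stands in for the article's US_states
states <- state.name

# "^A" anchors the pattern at the start of the string
starts_a <- states[grepl("^A", states)]
starts_a   # "Alabama" "Alaska" "Arizona" "Arkansas"

# No state name begins with B, so the result is an empty character vector
starts_b <- states[grepl("^B", states)]
starts_b   # character(0)

# startsWith() is a base R alternative that avoids regular expressions
starts_c <- states[startsWith(states, "C")]
starts_c   # "California" "Colorado" "Connecticut"
```

The pattern is case sensitive, so "^a" would match nothing here unless ignore.case = TRUE is passed to grepl.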

How to find the difference of values of each row from previous by group in an R data frame?

Nizamuddin Siddiqui
Updated on 04-Sep-2020 11:11:23

1K+ Views

In data analysis, we sometimes need to find the difference between the current value and the previous value, and we may need it within groups. It helps us compare how values change from row to row. In R, we can use the dplyr package's group_by and mutate functions with lag.
Example
Consider the below data frame:
> df1
Output
   Group Frequency
1  A      7
2  A      6
3  A      9
4  A     12
5  B     19
6  B     19
7  B      4
8  B      6
9  C     14
10 C      6
...
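
The grouped difference can be sketched with the first eight rows visible in the output above. The column names Diff and Diff_base are illustrative. The dplyr step runs only if the package is installed, and a base R equivalent using ave() is included as a swapped-in alternative for comparison.

```r
# The first eight rows as shown in the excerpt's output
df1 <- data.frame(Group = rep(c("A", "B"), each = 4),
                  Frequency = c(7, 6, 9, 12, 19, 19, 4, 6))

# dplyr approach named in the text: group, then subtract the lagged value
if (requireNamespace("dplyr", quietly = TRUE)) {
  res_dplyr <- dplyr::mutate(dplyr::group_by(df1, Group),
                             Diff = Frequency - dplyr::lag(Frequency))
  print(res_dplyr)
}

# Base R alternative: ave() applies the function within each group
df1$Diff_base <- ave(df1$Frequency, df1$Group,
                     FUN = function(v) c(NA, diff(v)))
df1
```

The first row of every group has no previous value, so its difference is NA in both versions.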
