How to Find Unmatched Records in R » Data Science Tutorials

To retrieve all rows in one data frame that do not have matching values in another data frame, use the anti_join() function from the dplyr package in R.

The basic syntax of this function is as follows:

anti_join(df1, df2, by='col_name')
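
Argument order matters here: the result contains rows from the first data frame only. A minimal sketch of that behaviour, assuming two small hypothetical data frames x and y (these names and values are for illustration and are not part of the examples that follow):

```r
library(dplyr)

# Hypothetical data frames used only to illustrate argument order
x <- data.frame(id=c(1, 2, 3))
y <- data.frame(id=c(2, 3, 4))

# Rows of x whose id has no match in y
only_in_x <- anti_join(x, y, by='id')

# Rows of y whose id has no match in x
only_in_y <- anti_join(y, x, by='id')

print(only_in_x)
print(only_in_y)
```

Swapping the two arguments answers a different question, so it helps to decide first which data frame's unmatched rows you want.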

The examples below show how to use this syntax in practice.

Example 1: Use anti_join() with One Column

Suppose we have the following two data frames in R:

df1 <- data.frame(team=c('A', 'B', 'C', 'D', 'E'),
                  points=c(102, 104, 129, 224, 436))

df2 <- data.frame(team=c('A', 'B', 'C', 'F', 'G'),
                  points=c(412, 514, 519, 233, 117))

To return all rows in the first data frame that do not have a matching team in the second data frame, we can use the anti_join() function.

library(dplyr)

Perform an anti-join on the ‘team’ column:

anti_join(df1, df2, by='team')

  team points
1    D    224
2    E    436

We can see that exactly two teams from the first data frame, D and E, have no matching team name in the second data frame.
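
As a cross-check, the same rows can be recovered without dplyr. The following is a minimal base R sketch, assuming the df1 and df2 from this example; it is an alternative way to confirm the result, not part of anti_join() itself:

```r
# Same data as in Example 1
df1 <- data.frame(team=c('A', 'B', 'C', 'D', 'E'),
                  points=c(102, 104, 129, 224, 436))

df2 <- data.frame(team=c('A', 'B', 'C', 'F', 'G'),
                  points=c(412, 514, 519, 233, 117))

# Keep the rows of df1 whose team does not appear anywhere in df2
unmatched <- df1[!(df1$team %in% df2$team), ]

print(unmatched)
```

This selects the same two teams, D and E. One difference is that base R subsetting keeps the original row names (4 and 5), whereas anti_join() renumbers the result from 1.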

Example 2: Use anti_join() with Multiple Columns

Suppose we have the following two data frames in R:

df1 <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                  position=c('G', 'G', 'F', 'G', 'F', 'C'),
                  points=c(182, 164, 159, 124, 136, 441))

df2 <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                  position=c('G', 'G', 'C', 'G', 'F', 'F'),
                  points=c(152, 154, 159, 322, 217, 522))

We can use the anti_join() function to return all rows in the first data frame whose combination of team and position has no match in the second data frame.

library(dplyr)

Perform an anti-join on the ‘team’ and ‘position’ columns:

anti_join(df1, df2, by=c('team', 'position'))

  team position points
1    A        F    159
2    B        C    441

We can see that exactly two records from the first data frame have no matching combination of team name and position in the second data frame.
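
To see why both columns are needed, it may help to compare this with a join on ‘team’ alone. The sketch below assumes the same df1 and df2 as in this example; the names by_team and by_both are introduced here for illustration:

```r
library(dplyr)

# Same data as in Example 2
df1 <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                  position=c('G', 'G', 'F', 'G', 'F', 'C'),
                  points=c(182, 164, 159, 124, 136, 441))

df2 <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                  position=c('G', 'G', 'C', 'G', 'F', 'F'),
                  points=c(152, 154, 159, 322, 217, 522))

# Matching on team only: teams A and B both appear in df2,
# so no row of df1 is left unmatched
by_team <- anti_join(df1, df2, by='team')
nrow(by_team)  # 0

# Matching on team and position together: the pairs (A, F) and (B, C)
# do not occur in df2, so those two rows are returned
by_both <- anti_join(df1, df2, by=c('team', 'position'))
nrow(by_both)  # 2
```

A row counts as matched only when every column listed in by agrees with some row of the second data frame, so adding columns to by can only keep or increase the number of unmatched rows.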
