
In this post we show several methods for dropping unwanted variables from a data frame in R. Say we have a data frame ‘grade’ in the current R session, and we want to remove the variables ‘Math’ and ‘Physics’. The first method is to index the data frame with square brackets and put a minus sign (-) in front of the column numbers of these two variables inside the c() function.

#show first observations of data frame 'grade'
head(grade)
#output
  StudentID    First    Last Gender Country Age Math Physics
1         1    James   Zhang   Male      US  23   73      70
2         2   Wilson      Li   Male      UK  26   95     999
3         3  Richard Nuan Ye   Male      UK  35   77      83
4         4     Mary    Deng Female      US  21   60      99
5         5    Jason  Wilson   Male      UK  19   77      89
6         6 Jennifer  Hopkin Female      UK  43   79      64
  Chemistry     Date
1        87 10/31/08
2        83 03/16/08
3        92 05/22/08
4        84 01/24/09
5        93 07/30/09
6        83 04/05/09

#to remove variables 'Math' and 'Physics'
newgrade2 <- grade[c(-7,-8)]
#show first observations of the new data frame
head(newgrade2)
#output
  StudentID    First    Last Gender Country Age Chemistry     Date
1         1    James   Zhang   Male      US  23        87 10/31/08
2         2   Wilson      Li   Male      UK  26        83 03/16/08
3         3  Richard Nuan Ye   Male      UK  35        92 05/22/08
4         4     Mary    Deng Female      US  21        84 01/24/09
5         5    Jason  Wilson   Male      UK  19        93 07/30/09
6         6 Jennifer  Hopkin Female      UK  43        83 04/05/09
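
Typing the column numbers by hand works, but they have to be recounted whenever the data frame changes. As a minimal sketch, the positions can be looked up by name with match() and then negated. The small data frame below is only a stand-in for ‘grade’, built from the first three rows of the output above, and the name dropcols is our own choice for illustration.

```r
# Stand-in for 'grade': first three rows of the output shown above
grade <- data.frame(
  StudentID = 1:3,
  First     = c("James", "Wilson", "Richard"),
  Last      = c("Zhang", "Li", "Nuan Ye"),
  Gender    = c("Male", "Male", "Male"),
  Country   = c("US", "UK", "UK"),
  Age       = c(23, 26, 35),
  Math      = c(73, 95, 77),
  Physics   = c(70, 999, 83),
  Chemistry = c(87, 83, 92),
  Date      = c("10/31/08", "03/16/08", "05/22/08")
)

# look up the column positions by name instead of typing 7 and 8
dropcols <- match(c("Math", "Physics"), names(grade))

# negative positions drop those columns, same as grade[c(-7,-8)]
newgrade2 <- grade[-dropcols]
names(newgrade2)
```

Note that match() returns NA for a name it cannot find, so it is worth checking dropcols before using it as an index.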

The second method builds a Boolean (logical) vector with a conditional test and then uses the negated vector to index the data frame.


#to remove variables "Math" and "Physics"
#create a Boolean vector using %in% operator
exvar1 <- names(grade) %in% c("Math","Physics")
#then use exclamation symbol ! to exclude these variables
newgrade1 <- grade[!exvar1]
#show first observations of the new data frame
head(newgrade1)
#output
  StudentID    First    Last Gender Country Age Chemistry     Date
1         1    James   Zhang   Male      US  23        87 10/31/08
2         2   Wilson      Li   Male      UK  26        83 03/16/08
3         3  Richard Nuan Ye   Male      UK  35        92 05/22/08
4         4     Mary    Deng Female      US  21        84 01/24/09
5         5    Jason  Wilson   Male      UK  19        93 07/30/09
6         6 Jennifer  Hopkin Female      UK  43        83 04/05/09
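
To see what the Boolean vector actually contains, here is a small sketch on a reduced stand-in for ‘grade’. Only five of the original columns and the first three rows are kept, which is our simplification so that the vector fits on one line.

```r
# Reduced stand-in for 'grade' (a subset of its columns, first three rows)
grade <- data.frame(
  StudentID = 1:3,
  Age       = c(23, 26, 35),
  Math      = c(73, 95, 77),
  Physics   = c(70, 999, 83),
  Chemistry = c(87, 83, 92)
)

# TRUE marks the columns we want to drop
exvar1 <- names(grade) %in% c("Math", "Physics")
print(exvar1)
# [1] FALSE FALSE  TRUE  TRUE FALSE

# negating the vector keeps every column that is not marked
newgrade1 <- grade[!exvar1]
names(newgrade1)
# columns kept: StudentID, Age, Chemistry
```

A name that does not exist in the data frame simply never matches in %in%, so in that case no column is dropped.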

The third method of excluding variables is to assign the value NULL to the unwanted variables.

#create a new data frame
newgrade3 <- grade
#set NULL value to unwanted variables
newgrade3$Math <- newgrade3$Physics <- NULL
#show first observations of the new data frame
head(newgrade3)
#output 
  StudentID    First    Last Gender Country Age Chemistry     Date
1         1    James   Zhang   Male      US  23        87 10/31/08
2         2   Wilson      Li   Male      UK  26        83 03/16/08
3         3  Richard Nuan Ye   Male      UK  35        92 05/22/08
4         4     Mary    Deng Female      US  21        84 01/24/09
5         5    Jason  Wilson   Male      UK  19        93 07/30/09
6         6 Jennifer  Hopkin Female      UK  43        83 04/05/09
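
When there are several variables to remove, the NULL assignment can also be written once for a vector of column names. The sketch below uses a reduced stand-in for ‘grade’ (five columns, three rows, our simplification) and also shows that the original data frame is left unchanged, because the assignment is made on the copy.

```r
# Reduced stand-in for 'grade' (a subset of its columns, first three rows)
grade <- data.frame(
  StudentID = 1:3,
  Age       = c(23, 26, 35),
  Math      = c(73, 95, 77),
  Physics   = c(70, 999, 83),
  Chemistry = c(87, 83, 92)
)

# work on a copy so that 'grade' itself is not modified
newgrade3 <- grade

# assign NULL to both columns in one statement
newgrade3[c("Math", "Physics")] <- NULL

names(newgrade3)  # Math and Physics are gone from the copy
names(grade)      # the original still has all five columns
```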


 


