
The data frame is the most widely used data object in R because it can store columns of different types (numeric, character, etc.) in tabular form. Creating a data frame is simple: you combine several vectors, here all of the same length, with data.frame(), as shown in the following code example.

#to create four vectors
idx <- c(13, 14, 15, 16) 
nbr <- c(25, 34, 20, 52)
Type <- c("Type5", "Type6", "Type5", "Type6")
Condition <- c("good", "normal", "bad", "good")
#create a data frame from the four vectors
hdata <- data.frame(idx, nbr, Type, Condition)
#show the first three observations of the data frame
head(hdata, 3)
  idx nbr  Type Condition
1  13  25 Type5      good
2  14  34 Type6    normal
3  15  20 Type5       bad
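
As a quick check on the claim that one data frame can hold columns of different types, the following minimal sketch rebuilds the same data and inspects it with base R only. The names `hdata_check`, `col_classes` and `bad` are made up for this illustration, and `stringsAsFactors = FALSE` is set explicitly on the assumption that the code may also be run on R versions older than 4.0.0, where strings became factors by default.

```r
# Rebuild the example data frame (hdata_check is an illustrative name)
hdata_check <- data.frame(
  idx = c(13, 14, 15, 16),
  nbr = c(25, 34, 20, 52),
  Type = c("Type5", "Type6", "Type5", "Type6"),
  Condition = c("good", "normal", "bad", "good"),
  stringsAsFactors = FALSE  # already the default since R 4.0.0
)

# Class of each column: numeric and character columns sit side by side
col_classes <- sapply(hdata_check, class)
print(col_classes)

# Number of rows (observations) and columns (variables)
print(dim(hdata_check))

# Vectors whose lengths cannot be recycled to a common number of rows
# make data.frame() fail; try() captures the error instead of stopping
bad <- try(data.frame(a = 1:4, b = 1:3), silent = TRUE)
print(inherits(bad, "try-error"))
```

Checking column classes right after construction is a cheap way to catch a vector that was accidentally stored as text.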

After a data frame is created, its elements can be retrieved by indexing. You can index by row and column together, by row only, by column only, or by variable name. The following examples show each of these cases.

#return the first and second variables of the data frame
hdata[1:2]
  idx nbr
1  13  25
2  14  34
3  15  20
4  16  52
#return a column as a vector by variable name
hdata$nbr
[1] 25 34 20 52
#return two variables of data frame using c()
#the result is a data frame
hdata[c("Type", "Condition")] 
   Type Condition
1 Type5      good
2 Type6    normal
3 Type5       bad
4 Type6      good
#show the structure of the data frame
str(hdata)
'data.frame':	4 obs. of  4 variables:
 $ idx      : num  13 14 15 16
 $ nbr      : num  25 34 20 52
 $ Type     : chr  "Type5" "Type6" "Type5" "Type6"
 $ Condition: chr  "good" "normal" "bad" "good"
#select the 2nd variable for all observations (rows)
hdata[, 2]
[1] 25 34 20 52
#select all columns of the first observation (row) of the data frame
hdata[1, ]
  idx nbr  Type Condition
1  13  25 Type5      good
#return the first 3 rows of the second column; the result is a vector
hdata[1:3, 2]
[1] 25 34 20
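
The examples above return either a data frame or a plain vector depending on how the index is written. The sketch below puts the two results side by side and adds row selection by a logical condition, which the text does not cover. It uses base R only; `hdata_idx` and the result names are illustrative, not part of the original example.

```r
# Same data as the running example (hdata_idx is an illustrative name)
hdata_idx <- data.frame(
  idx = c(13, 14, 15, 16),
  nbr = c(25, 34, 20, 52),
  Type = c("Type5", "Type6", "Type5", "Type6"),
  Condition = c("good", "normal", "bad", "good"),
  stringsAsFactors = FALSE
)

# Single brackets with one index keep the data frame structure
one_col_df <- hdata_idx["nbr"]

# Double brackets (like $) extract the column itself as a vector
one_col_vec <- hdata_idx[["nbr"]]

# [row, column] indexing simplifies a single column to a vector;
# drop = FALSE keeps the result as a one-column data frame
kept_df <- hdata_idx[1:3, "nbr", drop = FALSE]

# Rows can also be selected with a logical condition on a column
good_rows <- hdata_idx[hdata_idx$Condition == "good", ]
print(good_rows)
```

Using `drop = FALSE` is worth considering in functions that must always return a data frame, whatever number of columns is selected.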

Modifying a data frame uses the same indexing mechanism. Rows, columns, and individual elements can be modified, added, or removed. The following code examples show each of these cases.

#assign a new value to the third row of column 'nbr'
hdata[3,"nbr"] <- 99
#show first 3 observations of data frame
head(hdata, 3)
  idx nbr  Type Condition
1  13  25 Type5      good
2  14  34 Type6    normal
3  15  99 Type5       bad
#add a new column to the data frame using cbind()
ndata <- cbind(hdata, age = c(32, 8, 99, NA))
#the new data frame
ndata
  idx nbr  Type Condition age
1  13  25 Type5      good  32
2  14  34 Type6    normal   8
3  15  99 Type5       bad  99
4  16  52 Type6      good  NA
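#add a new row to the data frame with rbind()
#note: c() coerces the mixed values to character, so every column of ndata_2 becomes character (hence <NA> below)
ndata_2 <- rbind(ndata, c(15, 26, "Type5", "bad", 19))
#new data frame
ndata_2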
  idx nbr  Type Condition  age
1  13  25 Type5      good   32
2  14  34 Type6    normal    8
3  15  99 Type5       bad   99
4  16  52 Type6      good <NA>
5  15  26 Type5       bad   19
#remove the first row of the data frame using -1
ndata_2 <- ndata_2[-1, ]
#show the new data frame
ndata_2
  idx nbr  Type Condition  age
2  14  34 Type6    normal    8
3  15  99 Type5       bad   99
4  16  52 Type6      good <NA>
5  15  26 Type5       bad   19
#remove a column by assigning NULL
ndata_2$Type <- NULL
#new data frame
ndata_2
  idx nbr Condition  age
2  14  34    normal    8
3  15  99       bad   99
4  16  52      good <NA>
5  15  26       bad   19
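
The `<NA>` entries in the output above indicate that the columns became character after a plain vector was appended with rbind(). One possible way to keep the column types, sketched below as an alternative rather than as the method used in this post, is to append a one-row data frame with matching column names. The names `base_df`, `new_row`, `as_vector`, `as_frame` and the column `checked` are hypothetical.

```r
# A small data frame with numeric and character columns (illustrative data)
base_df <- data.frame(
  idx = c(13, 14),
  nbr = c(25, 34),
  Type = c("Type5", "Type6"),
  age = c(32, 8),
  stringsAsFactors = FALSE
)

# Appending a vector that mixes numbers and strings: c() turns every value
# into a character string, so the numeric columns are coerced to character
as_vector <- rbind(base_df, c(15, 26, "Type5", 19))
print(sapply(as_vector, class))

# Appending a one-row data frame with the same column names
# keeps each column's original type
new_row <- data.frame(idx = 15, nbr = 26, Type = "Type5", age = 19,
                      stringsAsFactors = FALSE)
as_frame <- rbind(base_df, new_row)
print(sapply(as_frame, class))

# A column can also be added by direct assignment instead of cbind()
as_frame$checked <- c(TRUE, FALSE, TRUE)
print(as_frame)
```

Keeping numeric columns numeric matters as soon as they are used in calculations such as mean() or sum(), which reject character input.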
