The data frame is one of the most widely used object types in R data analysis because it stores data of different modes in tabular form. Each row of a data frame represents one observation, and each column represents a variable recorded for every observation. When data is collected and read into a data frame, the order of the observations may not suit the analyst's needs, so the data frame often has to be sorted by the value(s) of one or several variables. In R, the order() function makes this straightforward. In the following code example, the data frame ‘grade’ is sorted by student age, from youngest to oldest.
#set working directory
setwd("d:\\RStatistics-Tutorial")
#read csv file into a data frame
vartype<-c("character", "character", "character", "character", "character", "numeric","numeric", "numeric","numeric","character")
grade <- read.table("University-NA.csv", colClasses=vartype, header=TRUE, sep=",")
#show data frame
grade
#output
StudentID First Last Gender Country Age Math Physics
1 1 James Zhang Male US 23 73 70
2 2 Wilson Li Male UK 26 95 999
3 3 Richard Nuan Ye Male UK 35 77 83
4 4 Mary Deng Female US 21 60 99
5 5 Jason Wilson Male UK 19 77 89
6 6 Jennifer Hopkin Female UK 43 79 64
7 7 Kari Gjendem Female US 37 87 99
8 8 Wenche Dale Female US 28 95 87
9 9 Jane Larsen Female US 19 73 92
10 10 Steinar Hansen Male UK 25 66 93
11 11 Michael Chen Male UK 42 83 90
12 12 Josef Curton Male US 32 71 63
13 13 Jennifer Jones Male US 27 79 76
14 14 Gary Grant Female UK 35 90 78
15 15 Phil Yao Male UK 21 69 69
16 16 Nora Spears Female US 29 79 83
17 17 Goril Nordmann Female UK 36 91 79
18 18 Lisa Bondvik Female US 39 65 73
19 19 Guri Olsen Female US 24 87 72
20 20 Martin Jones Male US 25 82 73
Chemistry Date
1 87 10/31/08
2 83 03/16/08
3 92 05/22/08
4 84 01/24/09
5 93 07/30/09
6 83 04/05/09
7 67 11/24/08
8 93 10/02/08
9 84 06/05/09
10 999 08/01/08
11 77 10/24/08
12 96 11/08/09
13 82 10/29/08
14 92 10/24/08
15 83 10/15/08
16 76 03/11/09
17 69 05/24/08
18 87 07/09/09
19 89 08/12/09
20 62 999
#sort grade from youngest student to oldest student
test <- grade[order(grade$Age),]
#show data frame
test
#output
StudentID First Last Gender Country Age Math Physics
5 5 Jason Wilson Male UK 19 77 89
9 9 Jane Larsen Female US 19 73 92
4 4 Mary Deng Female US 21 60 99
15 15 Phil Yao Male UK 21 69 69
1 1 James Zhang Male US 23 73 70
19 19 Guri Olsen Female US 24 87 72
10 10 Steinar Hansen Male UK 25 66 93
20 20 Martin Jones Male US 25 82 73
2 2 Wilson Li Male UK 26 95 999
13 13 Jennifer Jones Male US 27 79 76
8 8 Wenche Dale Female US 28 95 87
16 16 Nora Spears Female US 29 79 83
12 12 Josef Curton Male US 32 71 63
3 3 Richard Nuan Ye Male UK 35 77 83
14 14 Gary Grant Female UK 35 90 78
17 17 Goril Nordmann Female UK 36 91 79
7 7 Kari Gjendem Female US 37 87 99
18 18 Lisa Bondvik Female US 39 65 73
11 11 Michael Chen Male UK 42 83 90
6 6 Jennifer Hopkin Female UK 43 79 64
Chemistry Date
5 93 07/30/09
9 84 06/05/09
4 84 01/24/09
15 83 10/15/08
1 87 10/31/08
19 89 08/12/09
10 999 08/01/08
20 62 999
2 83 03/16/08
13 82 10/29/08
8 93 10/02/08
16 76 03/11/09
12 96 11/08/09
3 92 05/22/08
14 92 10/24/08
17 69 05/24/08
7 67 11/24/08
18 87 07/09/09
11 77 10/24/08
6 83 04/05/09
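To see why indexing with order() sorts the whole data frame, it can help to look at what order() returns on its own. The sketch below uses a small made-up data frame (`students` and its values are hypothetical, not taken from University-NA.csv). order() does not return the sorted ages; it returns the row positions that would put them in ascending order, and those positions are then used as the row index.

```r
# Hypothetical toy data, not the tutorial's CSV file
students <- data.frame(
  Name = c("Ann", "Bob", "Cid", "Dee"),
  Age  = c(23, 19, 35, 21),
  stringsAsFactors = FALSE
)

# order() returns row positions, not the sorted values themselves
idx <- order(students$Age)
idx
# [1] 2 4 1 3

# Using those positions as the row index rearranges every column together
sorted <- students[idx, ]
sorted$Name
# [1] "Bob" "Dee" "Ann" "Cid"

# The original row names travel with the rows
rownames(sorted)
# [1] "2" "4" "1" "3"
```

This is also why the first column of the sorted output above shows the original row numbers (5, 9, 4, ...) rather than 1 to 20.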
Sorting can be based on more than one variable. In the following example, the observations in the data frame are sorted by gender first and then, within each gender, by age.
#sorts by Gender first, then by Age within each gender
test <- grade[order(grade$Gender, grade$Age),]
#show data frame
test
#output
StudentID First Last Gender Country Age Math Physics
9 9 Jane Larsen Female US 19 73 92
4 4 Mary Deng Female US 21 60 99
19 19 Guri Olsen Female US 24 87 72
8 8 Wenche Dale Female US 28 95 87
16 16 Nora Spears Female US 29 79 83
14 14 Gary Grant Female UK 35 90 78
17 17 Goril Nordmann Female UK 36 91 79
7 7 Kari Gjendem Female US 37 87 99
18 18 Lisa Bondvik Female US 39 65 73
6 6 Jennifer Hopkin Female UK 43 79 64
5 5 Jason Wilson Male UK 19 77 89
15 15 Phil Yao Male UK 21 69 69
1 1 James Zhang Male US 23 73 70
10 10 Steinar Hansen Male UK 25 66 93
20 20 Martin Jones Male US 25 82 73
2 2 Wilson Li Male UK 26 95 999
13 13 Jennifer Jones Male US 27 79 76
12 12 Josef Curton Male US 32 71 63
3 3 Richard Nuan Ye Male UK 35 77 83
11 11 Michael Chen Male UK 42 83 90
Chemistry Date
9 84 06/05/09
4 84 01/24/09
19 89 08/12/09
8 93 10/02/08
16 76 03/11/09
14 92 10/24/08
17 69 05/24/08
7 67 11/24/08
18 87 07/09/09
6 83 04/05/09
5 93 07/30/09
15 83 10/15/08
1 87 10/31/08
10 999 08/01/08
20 62 999
2 83 03/16/08
13 82 10/29/08
12 96 11/08/09
3 92 05/22/08
11 77 10/24/08
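The same idea can be tried on a small made-up data frame (the `students` data here is hypothetical). Each extra argument to order() is only consulted to break ties left by the arguments before it, and rows that tie on every key keep their original relative order. The with() form at the end is an optional shorthand that avoids repeating the data frame name.

```r
# Hypothetical toy data, not the tutorial's CSV file
students <- data.frame(
  Name   = c("Ann", "Bob", "Cid", "Dee", "Eve"),
  Gender = c("Female", "Male", "Male", "Female", "Female"),
  Age    = c(23, 19, 35, 21, 23),
  stringsAsFactors = FALSE
)

# Gender is the primary key; Age only breaks ties within a gender
by_gender_age <- students[order(students$Gender, students$Age), ]
by_gender_age$Name
# [1] "Dee" "Ann" "Eve" "Bob" "Cid"

# Same sort, written with with() so the columns can be named directly
by_gender_age2 <- students[with(students, order(Gender, Age)), ]
identical(by_gender_age, by_gender_age2)
# [1] TRUE
```

Ann and Eve share both gender and age, so they stay in the order in which they appear in the original data.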
Numerical variables can also be sorted from the largest value to the smallest by placing a minus sign in front of the variable. The next example sorts the data frame by gender first, then from the oldest student to the youngest within each gender.
#sorts by Gender first, then by Age within each gender
#from oldest to youngest student
test <- grade[order(grade$Gender, -grade$Age),]
#show data frame
test
#output
StudentID First Last Gender Country Age Math Physics
6 6 Jennifer Hopkin Female UK 43 79 64
18 18 Lisa Bondvik Female US 39 65 73
7 7 Kari Gjendem Female US 37 87 99
17 17 Goril Nordmann Female UK 36 91 79
14 14 Gary Grant Female UK 35 90 78
16 16 Nora Spears Female US 29 79 83
8 8 Wenche Dale Female US 28 95 87
19 19 Guri Olsen Female US 24 87 72
4 4 Mary Deng Female US 21 60 99
9 9 Jane Larsen Female US 19 73 92
11 11 Michael Chen Male UK 42 83 90
3 3 Richard Nuan Ye Male UK 35 77 83
12 12 Josef Curton Male US 32 71 63
13 13 Jennifer Jones Male US 27 79 76
2 2 Wilson Li Male UK 26 95 999
10 10 Steinar Hansen Male UK 25 66 93
20 20 Martin Jones Male US 25 82 73
1 1 James Zhang Male US 23 73 70
15 15 Phil Yao Male UK 21 69 69
5 5 Jason Wilson Male UK 19 77 89
Chemistry Date
6 83 04/05/09
18 87 07/09/09
7 67 11/24/08
17 69 05/24/08
14 92 10/24/08
16 76 03/11/09
8 93 10/02/08
19 89 08/12/09
4 84 01/24/09
9 84 06/05/09
11 77 10/24/08
3 92 05/22/08
12 96 11/08/09
13 82 10/29/08
2 83 03/16/08
10 999 08/01/08
20 62 999
1 87 10/31/08
15 83 10/15/08
5 93 07/30/09
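The minus sign only works on numeric columns; negating a character column such as Gender raises an error. The sketch below, again on hypothetical toy data, shows the numeric case and two possible ways to reverse a character key: wrapping it in xtfrm(), which converts it to numeric ranks that can be negated, or passing a vector to the decreasing argument together with method = "radix". Note that the radix method compares strings in the C locale, which may differ from the default ordering for some text.

```r
# Hypothetical toy data, not the tutorial's CSV file
students <- data.frame(
  Name   = c("Ann", "Bob", "Cid", "Dee", "Eve"),
  Gender = c("Female", "Male", "Male", "Female", "Female"),
  Age    = c(23, 19, 35, 21, 23),
  stringsAsFactors = FALSE
)

# Numeric key reversed with a minus sign: gender A-Z, oldest first
oldest_first <- students[order(students$Gender, -students$Age), ]
oldest_first$Name
# [1] "Ann" "Eve" "Dee" "Cid" "Bob"

# Character key reversed via xtfrm(): gender Z-A, youngest first
gender_desc <- students[order(-xtfrm(students$Gender), students$Age), ]
gender_desc$Name
# [1] "Bob" "Cid" "Dee" "Ann" "Eve"

# Alternative: one decreasing flag per key (requires method = "radix")
gender_desc2 <- students[order(students$Gender, students$Age,
                               decreasing = c(TRUE, FALSE),
                               method = "radix"), ]
```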