
How to use the arrange() function in R to sort a data set


The arrange() function from the dplyr package sorts a data frame by one or more variables, in either ascending or descending order. It is an alternative to the base R approach of indexing rows with order(); note that base R's sort() works on vectors, not on data frames. dplyr is part of a larger collection of packages called the ‘tidyverse’, and its functions can be chained with the pipe operator (%>%). Say we have a data frame ‘grade’.

#to show first observations of the grade data frame
head(grade)
#output
 StudentID        Fullname Race Gender Country Age Math Physics
1         1     James Zhang    A   Male      US  23   73      70
2         2       Wilson Li    E Female      UK  26   95      76
3         3 Richard Nuan Ye    A   Male      UK  35   77      83
4         4       Mary Deng    E Female      US  21   60      99
5         5    Jason Wilson    A   Male      UK  19   77      89
6         6 Jennifer Hopkin    A Female      UK  43   79      64
  Chemistry       Date
1        87 10/31/2008
2        83  3/16/2008
3        92  5/22/2008
4        84  1/24/2009
5        93  7/30/2009
6        83   4/5/2009
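
The full grade data set is not included in this post. To follow along, a minimal stand-in can be built from the six rows printed above. This is only a sketch: the column types (for example, Date stored as plain text) are assumptions, and the sorted outputs later in the post come from the full data set, so they will contain rows this stand-in does not have.

```r
# Stand-in for the 'grade' data frame: only the six rows printed above.
# Column types are assumed; the original data source is not shown.
grade <- data.frame(
  StudentID = 1:6,
  Fullname  = c("James Zhang", "Wilson Li", "Richard Nuan Ye",
                "Mary Deng", "Jason Wilson", "Jennifer Hopkin"),
  Race      = c("A", "E", "A", "E", "A", "A"),
  Gender    = c("Male", "Female", "Male", "Female", "Male", "Female"),
  Country   = c("US", "UK", "UK", "US", "UK", "UK"),
  Age       = c(23, 26, 35, 21, 19, 43),
  Math      = c(73, 95, 77, 60, 77, 79),
  Physics   = c(70, 76, 83, 99, 89, 64),
  Chemistry = c(87, 83, 92, 84, 93, 83),
  Date      = c("10/31/2008", "3/16/2008", "5/22/2008",
                "1/24/2009", "7/30/2009", "4/5/2009"),
  stringsAsFactors = FALSE
)
# inspect the column types and the first rows
str(grade)
head(grade)
```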

We can then sort the grade data frame by Math score and store the result in a new data frame. arrange() sorts in ascending order by default.

#load the library
library(tidyverse)
#sort grade data frame by Math score, in ascending order
#and store in a new data frame
grade_math_asc<-arrange(grade,Math)
#to show first observations
head(grade_math_asc)
#output
  StudentID       Fullname Race Gender Country Age Math Physics
1         4      Mary Deng    E Female      US  21   60      99
2        18   Lisa Bondvik    E Female      US  39   65      73
3        10 Steinar Hansen    A   Male      US  25   66      93
4        15       Phil Yao    A   Male      UK  21   69      69
5        12   Josef Curton    E   Male      US  32   71      63
6         1    James Zhang    A   Male      US  23   73      70
  Chemistry       Date
1        84  1/24/2009
2        87   7/9/2009
3        65   8/1/2008
4        83 10/15/2008
5        96  11/8/2009
6        87 10/31/2008
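
For comparison, here is roughly how the same ascending sort looks in base R, using order() to index the rows. The small data frame 'scores' is a hypothetical stand-in made of three rows from the outputs above, not the full grade data set.

```r
# Hypothetical three-row stand-in for the grade data frame
scores <- data.frame(
  Fullname = c("James Zhang", "Wilson Li", "Mary Deng"),
  Math     = c(73, 95, 60),
  stringsAsFactors = FALSE
)
# order() returns the row positions that put Math in ascending order
idx <- order(scores$Math)
# index the rows with those positions to get the sorted data frame
scores_math_asc <- scores[idx, ]
scores_math_asc
```

With base indexing the rows keep their original row labels (3, 1, 2 here), whereas the arrange() output above is renumbered from 1.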

Similarly, arrange() can sort by a variable in descending order when the variable is wrapped in desc().

#create a new data frame, by sorting data frame 'grade'
#with respect to 'Math' score, in descending order
grade_math_dsc<-arrange(grade,desc(Math))
#to show first observations
head(grade_math_dsc)
#output
  StudentID       Fullname Race Gender Country Age Math Physics
1         2      Wilson Li    E Female      UK  26   95      76
2         8    Wenche Dale    E Female      US  28   95      87
3        17 Goril Nordmann    A Female      UK  36   91      79
4        14     Gary Grant    E Female      UK  35   90      78
5         7   Kari Gjendem    E Female      US  37   87      99
6        19     Guri Olsen    E Female      US  24   87      72
  Chemistry       Date
1        83  3/16/2008
2        93  10/2/2008
3        69  5/24/2008
4        92 10/24/2008
5        67 11/24/2008
6        89  8/12/2009
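
desc() is not limited to numeric columns. The sketch below applies it to both a numeric and a character column. The data frame 'scores' is a hypothetical stand-in built from three rows of the output above.

```r
library(dplyr)  # arrange() and desc() come from dplyr

# Hypothetical three-row stand-in for the grade data frame
scores <- data.frame(
  Fullname = c("Gary Grant", "Wilson Li", "Wenche Dale"),
  Math     = c(90, 95, 95),
  stringsAsFactors = FALSE
)
# numeric column: highest Math score first
by_math_desc <- arrange(scores, desc(Math))
# character column: reverse alphabetical order
by_name_desc <- arrange(scores, desc(Fullname))
by_math_desc
by_name_desc
```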

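Sorting by more than one variable is just as straightforward: list the variables in arrange() in order of priority. In the following example, we sort by Math in ascending order and then, among rows with the same Math score, by Physics in descending order. Compare row 6 with the first example: of the two students with a Math score of 73, Jane Larsen (Physics 92) now comes before James Zhang (Physics 70).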

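#sort by Math in ascending order, then by Physics in descending order
grade_math_phy<-arrange(grade,Math, desc(Physics))
#to show first observations
head(grade_math_phy)
#output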
  StudentID       Fullname Race Gender Country Age Math Physics
1         4      Mary Deng    E Female      US  21   60      99
2        18   Lisa Bondvik    E Female      US  39   65      73
3        10 Steinar Hansen    A   Male      US  25   66      93
4        15       Phil Yao    A   Male      UK  21   69      69
5        12   Josef Curton    E   Male      US  32   71      63
6         9    Jane Larsen    A Female      US  19   73      92
  Chemistry       Date
1        84  1/24/2009
2        87   7/9/2009
3        65   8/1/2008
4        83 10/15/2008
5        96  11/8/2009
6        84   6/5/2009
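
The order of the sorting variables matters, because later variables only break ties in earlier ones. The sketch below shows this on a hypothetical four-row stand-in ('scores') taken from the outputs above.

```r
library(dplyr)

# Hypothetical four-row stand-in for the grade data frame
scores <- data.frame(
  Fullname = c("James Zhang", "Jane Larsen", "Mary Deng", "Josef Curton"),
  Math     = c(73, 73, 60, 71),
  Physics  = c(70, 92, 99, 63),
  stringsAsFactors = FALSE
)
# Math decides the order; Physics only separates the two rows with Math == 73
math_then_physics <- arrange(scores, Math, desc(Physics))
# swapping the variables makes Physics the primary sort key instead
physics_then_math <- arrange(scores, desc(Physics), Math)
math_then_physics
physics_then_math
```

In the first result Josef Curton is second because his Math score is 71; in the second he is last because his Physics score is the lowest.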

The same operation can be written with the pipe operator (%>%). The object on the left-hand side of the pipe is passed as the first argument to the function on its right-hand side, which here is arrange().

#we pass grade to the arrange() function using pipeline structure
#sort data by Math and Physics
grade_math_phy<- grade %>% arrange(Math, desc(Physics))
#to show first observations
head(grade_math_phy)
#output
  StudentID       Fullname Race Gender Country Age Math Physics
1         4      Mary Deng    E Female      US  21   60      99
2        18   Lisa Bondvik    E Female      US  39   65      73
3        10 Steinar Hansen    A   Male      US  25   66      93
4        15       Phil Yao    A   Male      UK  21   69      69
5        12   Josef Curton    E   Male      US  32   71      63
6         9    Jane Larsen    A Female      US  19   73      92
  Chemistry       Date
1        84  1/24/2009
2        87   7/9/2009
3        65   8/1/2008
4        83 10/15/2008
5        96  11/8/2009
6        84   6/5/2009
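
The pipe becomes more useful when several steps are chained. As a rough illustration, the sketch below keeps only the UK students and then sorts them by Math. filter() is another dplyr function that this post does not otherwise cover, and 'scores' is a hypothetical four-row stand-in for the grade data frame.

```r
library(dplyr)

# Hypothetical four-row stand-in for the grade data frame
scores <- data.frame(
  Fullname = c("James Zhang", "Wilson Li", "Richard Nuan Ye", "Mary Deng"),
  Country  = c("US", "UK", "UK", "US"),
  Math     = c(73, 95, 77, 60),
  stringsAsFactors = FALSE
)
# each step receives the result of the previous step as its first argument
uk_by_math <- scores %>%
  filter(Country == "UK") %>%
  arrange(desc(Math))
# the same operation written as nested calls, read from the inside out
uk_nested <- arrange(filter(scores, Country == "UK"), desc(Math))
uk_by_math
```

Both forms give the same result; the piped version reads top to bottom in the order the steps are applied.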


wilsonzhang746
