In R, a variable's type can be numeric or character, among others. Categorical variables are usually stored as factors. R provides a family of functions for testing a variable's type (such as is.numeric and is.character) and for converting between types (such as as.character and as.factor). In the following code example, a data frame is loaded from a CSV file, then specific variables are checked to see whether they are numeric or character, or are converted to factors.

# to set working directory
setwd("d:\\RStatistics-Tutorial")    
vartype<-c("character", "character", "character", "character", "character", "numeric","numeric", "numeric","numeric","character")
#read csv file from working directory and create a data frame
grade <- read.table("University-NA.csv", colClasses=vartype, header=TRUE, sep=",")                                      
#to show first 6 observations
head(grade)
#output
  StudentID    First    Last Gender Country Age Math Physics
1         1    James   Zhang   Male      US  23   73      70
2         2   Wilson      Li   Male      UK  26   95     999
3         3  Richard Nuan Ye   Male      UK  35   77      83
4         4     Mary    Deng Female      US  21   60      99
5         5    Jason  Wilson   Male      UK  19   77      89
6         6 Jennifer  Hopkin Female      UK  43   79      64
  Chemistry     Date
1        87 10/31/08
2        83 03/16/08
3        92 05/22/08
4        84 01/24/09
5        93 07/30/09
6        83 04/05/09
#show data frame structure
str(grade)
#output
'data.frame':	20 obs. of  10 variables:
 $ StudentID: chr  "1" "2" "3" "4" ...
 $ First    : chr  "James" "Wilson" "Richard" "Mary" ...
 $ Last     : chr  "Zhang" "Li" "Nuan Ye" "Deng" ...
 $ Gender   : chr  "Male" "Male" "Male" "Female" ...
 $ Country  : chr  "US" "UK" "UK" "US" ...
 $ Age      : num  23 26 35 21 19 43 37 28 19 25 ...
 $ Math     : num  73 95 77 60 77 79 87 95 73 66 ...
 $ Physics  : num  70 999 83 99 89 64 99 87 92 93 ...
 $ Chemistry: num  87 83 92 84 93 83 67 93 84 999 ...
 $ Date     : chr  "10/31/08" "03/16/08" "05/22/08" "01/24/09" ...
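
The effect of colClasses is easiest to see by reading the same data with and without it. The sketch below is a minimal illustration, assuming a hypothetical three-row sample passed through the text argument of read.table instead of the real University-NA.csv file; the names csv_text, guessed and typed are made up for this example.

```r
# Hypothetical sample standing in for the CSV file (not the real data)
csv_text <- "StudentID,First,Gender,Age,Math
1,James,Male,23,73
2,Wilson,Male,26,95
4,Mary,Female,21,60"

# Without colClasses, read.table guesses the types: StudentID becomes integer
guessed <- read.table(text = csv_text, header = TRUE, sep = ",")

# With colClasses, every column gets the type that was requested
vartype <- c("character", "character", "character", "numeric", "numeric")
typed <- read.table(text = csv_text, colClasses = vartype, header = TRUE, sep = ",")

# Compare the class of every column in both data frames
sapply(guessed, class)
sapply(typed, class)
```

Reading StudentID as character keeps it from being treated as a quantity, which is probably why the example above declares it that way.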

#to check if variable Age is numeric
is.numeric(grade$Age)
#output
[1] TRUE
#to check if variable First is a vector
is.vector(grade$First)
#output
[1] TRUE
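
The type tests can give surprising answers for integers and factors. The following sketch uses small made-up vectors (age, id, first and gender are illustrative names, not columns read from the file) to show how the is.* functions behave in those cases.

```r
age    <- c(23, 26, 35)                        # double vector
id     <- c(1L, 2L, 3L)                        # integer vector
first  <- c("James", "Wilson", "Richard")      # character vector
gender <- factor(c("Male", "Male", "Female"))  # factor

is.numeric(age)      # TRUE
is.numeric(id)       # TRUE: integers also count as numeric
is.character(first)  # TRUE
is.factor(gender)    # TRUE
is.numeric(gender)   # FALSE, although a factor is stored as integer codes
is.vector(gender)    # FALSE: a factor carries attributes (levels, class)

class(gender)        # "factor"
typeof(gender)       # "integer"
```

class reports how R treats the object, while typeof reports how it is stored, so the two can differ for factors.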

#convert to character vector
test <- as.character(grade$Age)    
test
#output
[1] "23" "26" "35" "21" "19" "43" "37" "28" "19" "25" "42" "32"
[13] "27" "35" "21" "29" "36" "39" "24" "25"
#to check if test is numeric
is.numeric(test)
#output
[1] FALSE
#to test if test is a vector
is.vector(test)
#output
[1] TRUE
#to test if test is character type
is.character(test)
#output
[1] TRUE
#convert Gender to a factor
grade$Gender <- as.factor(grade$Gender)
#to show data frame structure again
str(grade)
#output
'data.frame':	20 obs. of  10 variables:
 $ StudentID: chr  "1" "2" "3" "4" ...
 $ First    : chr  "James" "Wilson" "Richard" "Mary" ...
 $ Last     : chr  "Zhang" "Li" "Nuan Ye" "Deng" ...
 $ Gender   : Factor w/ 2 levels "Female","Male": 2 2 2 1 2 1 1 1 1 2 ...
 $ Country  : chr  "US" "UK" "UK" "US" ...
 $ Age      : num  23 26 35 21 19 43 37 28 19 25 ...
 $ Math     : num  73 95 77 60 77 79 87 95 73 66 ...
 $ Physics  : num  70 999 83 99 89 64 99 87 92 93 ...
 $ Chemistry: num  87 83 92 84 93 83 67 93 84 999 ...
 $ Date     : chr  "10/31/08" "03/16/08" "05/22/08" "01/24/09" ...
#to show first 6 observations
head(grade)
#output
  StudentID    First    Last Gender Country Age Math Physics
1         1    James   Zhang   Male      US  23   73      70
2         2   Wilson      Li   Male      UK  26   95     999
3         3  Richard Nuan Ye   Male      UK  35   77      83
4         4     Mary    Deng Female      US  21   60      99
5         5    Jason  Wilson   Male      UK  19   77      89
6         6 Jennifer  Hopkin Female      UK  43   79      64
  Chemistry     Date
1        87 10/31/08
2        83 03/16/08
3        92 05/22/08
4        84 01/24/09
5        93 07/30/09
6        83 04/05/09
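
Conversions in the other direction have a few common pitfalls. The sketch below is one way to illustrate them on made-up values (scores, f and d are hypothetical names); the Date step assumes the month/day/two-digit-year layout seen in the Date column above.

```r
# Character to numeric: text that is not a number becomes NA (with a warning)
scores <- c("73", "95", "n/a")
num_scores <- suppressWarnings(as.numeric(scores))
num_scores                    # 73 95 NA

# Factor to numeric: convert through character, otherwise you get level codes
f <- factor(c("23", "35", "21"))
codes  <- as.numeric(f)                # 2 3 1 (positions in the sorted levels)
values <- as.numeric(as.character(f))  # 23 35 21

# Character to Date: state the input format explicitly
d <- as.Date(c("10/31/08", "03/16/08"), format = "%m/%d/%y")
class(d)                      # "Date"
format(d, "%Y-%m-%d")         # "2008-10-31" "2008-03-16"
```

Checking the result with is.numeric or class right after a conversion, as the tutorial does, is a cheap way to catch these cases early.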


