In R, a variable's type can be numeric or character, among others. Categorical variables are usually stored as factors. R provides a family of functions for testing a variable's type, such as is.numeric(), is.character() and is.vector(), and for converting between types, such as as.character() and as.factor(). In the following example, a data frame is loaded from a CSV file, then specific variables are checked to see whether they are numeric or character, or are converted to factors.
# to set working directory
setwd("d:\\RStatistics-Tutorial")
vartype<-c("character", "character", "character", "character", "character", "numeric","numeric", "numeric","numeric","character")
#read the csv file from the working directory and create a data frame
grade <- read.table("University-NA.csv", colClasses=vartype, header=TRUE, sep=",")
#to show first 6 observations
head(grade)
#output
  StudentID    First    Last Gender Country Age Math Physics
1         1    James   Zhang   Male      US  23   73      70
2         2   Wilson      Li   Male      UK  26   95     999
3         3  Richard Nuan Ye   Male      UK  35   77      83
4         4     Mary    Deng Female      US  21   60      99
5         5    Jason  Wilson   Male      UK  19   77      89
6         6 Jennifer  Hopkin Female      UK  43   79      64
  Chemistry     Date
1        87 10/31/08
2        83 03/16/08
3        92 05/22/08
4        84 01/24/09
5        93 07/30/09
6        83 04/05/09
#show data frame structure
str(grade)
#output
'data.frame': 20 obs. of 10 variables:
 $ StudentID: chr  "1" "2" "3" "4" ...
 $ First    : chr  "James" "Wilson" "Richard" "Mary" ...
 $ Last     : chr  "Zhang" "Li" "Nuan Ye" "Deng" ...
 $ Gender   : chr  "Male" "Male" "Male" "Female" ...
 $ Country  : chr  "US" "UK" "UK" "US" ...
 $ Age      : num  23 26 35 21 19 43 37 28 19 25 ...
 $ Math     : num  73 95 77 60 77 79 87 95 73 66 ...
 $ Physics  : num  70 999 83 99 89 64 99 87 92 93 ...
 $ Chemistry: num  87 83 92 84 93 83 67 93 84 999 ...
 $ Date     : chr  "10/31/08" "03/16/08" "05/22/08" "01/24/09" ...
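The effect of colClasses can be tried without the CSV file. The following is a minimal sketch that reads a few rows from an inline string through the text argument of read.table(). The names csv_text, demo and guessed, and the reduced set of four columns, are assumptions made for illustration and are not part of the original data file.

```r
# Hypothetical inline data standing in for a few columns of University-NA.csv
csv_text <- "StudentID,First,Age,Math
1,James,23,73
2,Wilson,26,95
3,Richard,35,77"

# With colClasses, StudentID is forced to character and Age/Math to numeric
demo <- read.table(text = csv_text,
                   colClasses = c("character", "character", "numeric", "numeric"),
                   header = TRUE, sep = ",")

# Without colClasses, read.table() guesses each column type from its contents,
# so StudentID, Age and Math are read as integers
guessed <- read.table(text = csv_text, header = TRUE, sep = ",")

# Compare the class of each column in the two data frames
sapply(demo, class)
sapply(guessed, class)
```

Setting the types explicitly is useful for identifier columns such as StudentID, which look like numbers but are not meant for arithmetic.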
#to check if variable Age is numeric
is.numeric(grade$Age)
#output
[1] TRUE
#to check if variable First is a vector
is.vector(grade$First)
#output
[1] TRUE
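The same kind of test can be applied to every column at once. The sketch below assumes a small hypothetical data frame named df with three of the columns shown above, and uses sapply() to run is.numeric() on each column.

```r
# Hypothetical two-row data frame with one character and two numeric columns
df <- data.frame(First = c("James", "Wilson"),
                 Age   = c(23, 26),
                 Math  = c(73, 95),
                 stringsAsFactors = FALSE)

# Apply is.numeric() to each column; the result is a named logical vector
numeric_cols <- sapply(df, is.numeric)
numeric_cols

# Keep only the names of the numeric columns
names(df)[numeric_cols]
```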
#convert Age to a character vector
test <- as.character(grade$Age)
test
#output
[1] "23" "26" "35" "21" "19" "43" "37" "28" "19" "25" "42" "32"
[13] "27" "35" "21" "29" "36" "39" "24" "25"
#to check if test is numeric
is.numeric(test)
#output
[1] FALSE
#to test if test is a vector
is.vector(test)
#output
[1] TRUE
#to test if test is character type
is.character(test)
#output
[1] TRUE
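The conversion also works in the other direction with as.numeric(). The following sketch uses a short hypothetical vector (the first three ages from the output above) rather than the full data frame, and also shows what can go wrong when a string does not represent a number. The name mixed and its "n/a" entry are invented for illustration.

```r
# Hypothetical values mirroring the first three ages
age <- c(23, 26, 35)

# Numeric to character, then back to numeric
age_chr <- as.character(age)
age_num <- as.numeric(age_chr)

is.character(age_chr)
is.numeric(age_num)
identical(age, age_num)   # the round trip recovers the original values here

# A string that cannot be parsed as a number becomes NA (R also issues a
# warning, which is suppressed here to keep the output clean)
mixed <- c("23", "n/a", "35")
converted <- suppressWarnings(as.numeric(mixed))
converted
is.na(converted)
```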
#convert Gender to a factor
grade$Gender <- as.factor(grade$Gender)
#to show data frame structure again
str(grade)
#output
'data.frame': 20 obs. of 10 variables:
 $ StudentID: chr  "1" "2" "3" "4" ...
 $ First    : chr  "James" "Wilson" "Richard" "Mary" ...
 $ Last     : chr  "Zhang" "Li" "Nuan Ye" "Deng" ...
 $ Gender   : Factor w/ 2 levels "Female","Male": 2 2 2 1 2 1 1 1 1 2 ...
 $ Country  : chr  "US" "UK" "UK" "US" ...
 $ Age      : num  23 26 35 21 19 43 37 28 19 25 ...
 $ Math     : num  73 95 77 60 77 79 87 95 73 66 ...
 $ Physics  : num  70 999 83 99 89 64 99 87 92 93 ...
 $ Chemistry: num  87 83 92 84 93 83 67 93 84 999 ...
 $ Date     : chr  "10/31/08" "03/16/08" "05/22/08" "01/24/09" ...
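The Gender line in the str() output shows how a factor is stored: a set of levels plus one integer code per observation. The sketch below reproduces this with a hypothetical vector holding the six Gender values from the head() output, so it can be run without the CSV file.

```r
# Hypothetical vector: Gender values of the first six observations
gender <- c("Male", "Male", "Male", "Female", "Male", "Female")
gender_f <- as.factor(gender)

# Levels are the distinct values, sorted alphabetically by default
levels(gender_f)

# Integer codes index into the levels: 1 = "Female", 2 = "Male"
as.integer(gender_f)

# Count of observations per level
table(gender_f)

# A factor is neither a character vector nor a plain vector,
# because it carries levels and class attributes
is.factor(gender_f)
is.character(gender_f)
is.vector(gender_f)
```

The integer codes match the leading 2 2 2 1 2 1 in the str() output, since "Female" sorts before "Male".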
#to show first 6 observations
head(grade)
#output
  StudentID    First    Last Gender Country Age Math Physics
1         1    James   Zhang   Male      US  23   73      70
2         2   Wilson      Li   Male      UK  26   95     999
3         3  Richard Nuan Ye   Male      UK  35   77      83
4         4     Mary    Deng Female      US  21   60      99
5         5    Jason  Wilson   Male      UK  19   77      89
6         6 Jennifer  Hopkin Female      UK  43   79      64
  Chemistry     Date
1        87 10/31/08
2        83 03/16/08
3        92 05/22/08
4        84 01/24/09
5        93 07/30/09
6        83 04/05/09