Create a data frame in R using the read.table() function


The read.table() function in R is commonly used to import a delimited ASCII file (e.g. a text file or csv file) into a data frame. The basic syntax is

df <- read.table(input, options)

where input is usually a csv or txt file in the working directory, and commonly used options are:

header: specifies whether the first line of the file is used as variable names;

sep: specifies the delimiter used in the file; the default is sep="", which treats any run of white space (spaces, tabs, newlines) as a delimiter;

row.names: specifies the row identifiers, either as a vector of names or as the name or number of the column that holds them;

col.names: a vector specifying the column (variable) names;

colClasses: a vector specifying the class of each column, e.g. "character", "numeric", etc.
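As a minimal sketch of how these options combine, the snippet below reads a small white-space-delimited table that has no header line. The data and the names txt, scores, ID, Name and Score are made up for illustration; the text argument of read.table() is used so that the example runs without a file on disk.

```r
# Hypothetical inline data standing in for a small text file (no header line)
txt <- "1 Ann 90.5
2 Bob 78
3 Cy 85"

# text= makes read.table() parse a string instead of a file
scores <- read.table(text = txt,
                     header = FALSE,                        # no variable names in the data
                     col.names = c("ID", "Name", "Score"),  # so supply them here
                     colClasses = c("integer", "character", "numeric"))
# sep is left at its default, so white space separates the fields

str(scores)
```

Because sep is not given, any run of spaces or tabs is treated as one delimiter, which is why this layout suits the default but a csv file needs sep=",".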

Consider an example csv file, University-Fullname-full.csv, located in the current working directory. It contains the test scores of 20 students in mathematics, physics, and chemistry, as well as their demographic information.

Full contents of the data file can be found here:

https://rdatacode.com/wp-admin/post.php?post=192&action=edit

We can use read.table() to import this file into a new data frame in R.

# set the working directory
> setwd("d:\\RStatistics-Tutorial")

# use read.table() to read a csv file and create a data frame:
# comma as delimiter, first line of the file as variable names,
# and the StudentID column as row identifier
> grade <- read.table("University-Fullname-full.csv", header=TRUE, row.names="StudentID", sep=",")
# show the new data frame
> grade
          Fullname Race Gender Country Age Math Physics Chemistry       Date
1      James Zhang    A   Male      US  23   73      70        87 10/31/2008
2        Wilson Li    E Female      UK  26   95      76        83  3/16/2008
3  Richard Nuan Ye    A   Male      UK  35   77      83        92  5/22/2008
4        Mary Deng    E Female      US  21   60      99        84  1/24/2009
5     Jason Wilson    A   Male      UK  19   77      89        93  7/30/2009
6  Jennifer Hopkin    A Female      UK  43   79      64        83   4/5/2009
7     Kari Gjendem    E Female      US  37   87      99        67 11/24/2008
8      Wenche Dale    E Female      US  28   95      87        93  10/2/2008
9      Jane Larsen    A Female      US  19   73      92        84   6/5/2009
10  Steinar Hansen    A   Male      US  25   66      93        65   8/1/2008
11    Michael Chen    A   Male      UK  42   83      90        77 10/24/2008
12    Josef Curton    E   Male      US  32   71      63        96  11/8/2009
13  Jennifer Jones    E   Male      US  27   79      76        82 10/29/2008
14      Gary Grant    E Female      UK  35   90      78        92 10/24/2008
15        Phil Yao    A   Male      UK  21   69      69        83 10/15/2008
16     Nora Spears    E Female      US  29   79      83        76  3/11/2009
17  Goril Nordmann    A Female      UK  36   91      79        69  5/24/2008
18    Lisa Bondvik    E Female      US  39   65      73        87   7/9/2009
19      Guri Olsen    E Female      US  24   87      72        89  8/12/2009
20    Martin Jones    A   Male      US  25   82      73        62  3/27/2008

# show the structure of the data frame
> str(grade)
'data.frame':	20 obs. of  9 variables:
 $ Fullname : chr  "James Zhang" "Wilson Li" "Richard Nuan Ye" "Mary Deng" ...
 $ Race     : chr  "A" "E" "A" "E" ...
 $ Gender   : chr  "Male" "Female" "Male" "Female" ...
 $ Country  : chr  "US" "UK" "UK" "US" ...
 $ Age      : int  23 26 35 21 19 43 37 28 19 25 ...
 $ Math     : int  73 95 77 60 77 79 87 95 73 66 ...
 $ Physics  : int  70 76 83 99 89 64 99 87 92 93 ...
 $ Chemistry: int  87 83 92 84 93 83 67 93 84 65 ...
 $ Date     : chr  "10/31/2008" "3/16/2008" "5/22/2008" "1/24/2009" ...
>
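Note that StudentID does not appear among the nine variables above, because row.names="StudentID" turned that column into the row labels. A small sketch of what this means in practice, using the first three rows and two of the columns of the example data typed inline (grade_small and csv_text are names chosen only for this illustration):

```r
# First three rows of the example data, reduced to two variables
csv_text <- "StudentID,Fullname,Math
1,James Zhang,73
2,Wilson Li,95
3,Richard Nuan Ye,77"

grade_small <- read.table(text = csv_text, header = TRUE,
                          sep = ",", row.names = "StudentID")

names(grade_small)         # StudentID is no longer one of the variables
rownames(grade_small)      # it now supplies the row labels
grade_small["2", "Math"]   # rows can be selected by label, given as a string
```

Row labels are stored as character strings, so "2" here selects the row labelled 2 rather than the second row by position (the two happen to coincide in this data).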

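The colClasses option can be used to set the type of each variable explicitly while importing the csv file. In the example below row.names is not given, so StudentID is read as an ordinary column and the type vector needs ten entries, one per column in the file.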

# set the column types in read.table()
> vartype <- c("character", "character", "character", "character", "character", "numeric", "numeric", "numeric", "numeric", "character")
> grade <- read.table("University-Fullname-full.csv", colClasses=vartype, header=TRUE, sep=",")
> #grade
> str(grade)
'data.frame':	20 obs. of  10 variables:
 $ StudentID: chr  "1" "2" "3" "4" ...
 $ Fullname : chr  "James Zhang" "Wilson Li" "Richard Nuan Ye" "Mary Deng" ...
 $ Race     : chr  "A" "E" "A" "E" ...
 $ Gender   : chr  "Male" "Female" "Male" "Female" ...
 $ Country  : chr  "US" "UK" "UK" "US" ...
 $ Age      : num  23 26 35 21 19 43 37 28 19 25 ...
 $ Math     : num  73 95 77 60 77 79 87 95 73 66 ...
 $ Physics  : num  70 76 83 99 89 64 99 87 92 93 ...
 $ Chemistry: num  87 83 92 84 93 83 67 93 84 65 ...
 $ Date     : chr  "10/31/2008" "3/16/2008" "5/22/2008" "1/24/2009" ...
>
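Besides class names, colClasses also accepts two special values: NA leaves the type of a column to be detected automatically, and "NULL" (as a string) skips the column entirely. The following is a minimal sketch on a reduced, inline version of the example data; the names csv_text, types and g are chosen only for this illustration.

```r
# Two rows of the example data, reduced to four columns
csv_text <- "StudentID,Fullname,Age,Date
1,James Zhang,23,10/31/2008
2,Wilson Li,26,3/16/2008"

# one entry per column in the file:
#   "character" -> force character
#   NA          -> let read.table() detect the type
#   "NULL"      -> do not import this column
types <- c("character", "character", NA, "NULL")

g <- read.table(text = csv_text, header = TRUE, sep = ",", colClasses = types)
str(g)
```

Here StudentID is kept as character, Age is detected as integer, and the Date column is dropped from the resulting data frame.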


Published by wilsonzhang746 on December 8, 2023
