A histogram is a type of graph that shows the distribution of a continuous variable. The range of the variable's values is divided into intervals (bins), and the frequency of observations falling in each bin is plotted. In R, the geom_histogram() function from the ggplot2 package plots histograms. A histogram can be drawn with the default settings, with a manually defined number of bins, or with percentages in place of frequencies.
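To make the idea of binning concrete, here is a minimal sketch in base R that counts how many values fall into each bin. The scores and the bin edges are made up for illustration and are not taken from the tutorial's data set.

```r
# Hypothetical scores, made up for illustration (not the tutorial's data)
scores <- c(52, 58, 61, 67, 70, 73, 75, 78, 84, 91)

# Divide the range 50-100 into 5 bins of width 10
breaks <- seq(50, 100, by = 10)

# right = FALSE makes each bin include its lower edge: [50,60), [60,70), ...
bins <- cut(scores, breaks = breaks, right = FALSE)

# Frequency per bin: these counts are the bar heights of a histogram
freq <- table(bins)
print(freq)
```

Each count in the table corresponds to the height of one bar, and the counts add up to the number of observations.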

1. In the following example, students' Physics scores are plotted in a simple histogram using the default settings.
#load library
library(ggplot2)
library(scales)

#setting working directory
setwd("d:\\RStatistics-Tutorial")  
#create a grade data frame
vartype<-c("character", "character", "character", "character", "character", "numeric","numeric", "numeric","numeric","character")
grade <- read.table("University-Fullname-full.csv", colClasses=vartype, header=TRUE, sep=",")                                      
grade$Gender<-as.factor(grade$Gender)
head(grade)

##output
 StudentID       Fullname Race Gender Country Age Math Physics
7          7   Kari Gjendem    E Female      US  37   87      99
8          8    Wenche Dale    E Female      US  28   95      87
9          9    Jane Larsen    A Female      US  19   73      92
10        10 Steinar Hansen    A   Male      US  25   66      93
   Chemistry       Date
7         67 11/24/2008
8         93  10/2/2008
9         84   6/5/2009
10        65   8/1/2008

#simple histogram
ggplot(grade, aes(x=Physics)) + 
  geom_histogram() +
  labs(title="Simple Histogram for Physics score")

A Simple Histogram

2. In the following example, a colored histogram of Physics scores is plotted with 5 bins.

ggplot(grade, aes(x=Physics)) + 
  geom_histogram(bins=5, color="white", fill="steelblue") +
  labs(title="Colored histogram of Physics with 5 bins",
       x="Physics score",
       y="Frequency")

A Colored Histogram with Defined Bins
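Besides the number of bins, geom_histogram() also accepts a binwidth argument, which is often easier to interpret because it is expressed in the units of the variable. The sketch below uses a small made-up data frame (the name demo and its values are hypothetical) so that it runs without the tutorial's CSV file. The boundary argument is used here to align the bin edges to multiples of 10.

```r
library(ggplot2)

# Hypothetical data, made up for illustration (not the tutorial's grade data)
demo <- data.frame(Physics = c(52, 58, 61, 67, 70, 73, 75, 78, 84, 91))

# binwidth sets the width of each bin in score units;
# boundary = 50 places a bin edge at 50, so edges fall on 50, 60, 70, ...
p <- ggplot(demo, aes(x = Physics)) +
  geom_histogram(binwidth = 10, boundary = 50, color = "white", fill = "steelblue") +
  labs(title = "Histogram of Physics with a bin width of 10",
       x = "Physics score",
       y = "Frequency")
print(p)
```

Whether bins or binwidth is the better choice depends on the data. A fixed width tends to be easier to explain when the scale has natural steps, such as 10-point score bands.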

3. In the following example, each bin in the histogram shows a percentage in place of a frequency.

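#Histogram with percentages
#after_stat(count/sum(count)) is each bin's share of all observations (needs ggplot2 >= 3.3.0);
#density equals that share only when the bin width is 1, so it is not used here
ggplot(grade, aes(x=Physics, y=after_stat(count/sum(count)))) + 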
  geom_histogram(bins=30, color="white", fill="steelblue") +
  scale_y_continuous(labels=scales::percent) +
  labs(title="Histogram of Physics with percentages",
       y= "Percent",
       x="Physics score")

Histogram with Percentage
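Bar heights can be read as percentages of observations only if they sum to 1. The sketch below, again on a made-up data frame (demo is a hypothetical name), extracts the computed bar values with layer_data() and compares two quantities: the proportion count / sum(count), and the density, which sums to 1 only after multiplying by the bin width.

```r
library(ggplot2)

# Hypothetical data, made up for illustration (not the tutorial's grade data)
demo <- data.frame(Physics = c(52, 58, 61, 67, 70, 73, 75, 78, 84, 91))

# Map bar height to each bin's share of all observations
p <- ggplot(demo, aes(x = Physics, y = after_stat(count / sum(count)))) +
  geom_histogram(bins = 5, color = "white", fill = "steelblue") +
  scale_y_continuous(labels = scales::percent)

# layer_data() returns the values ggplot2 computed for each bar
bars <- layer_data(p)

# Proportions: the bar heights themselves add up to 1
prop_total <- sum(bars$y)

# Density: heights add up to 1 only when weighted by bin width
density_total <- sum(bars$density * (bars$xmax - bars$xmin))

print(c(prop_total, density_total))
```

This is why a density axis labeled with percent formatting can be misleading when the bins are wider or narrower than one unit.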
