
Plot histograms using ggplot2 in R


A histogram is a graph that shows the distribution of a continuous variable. The range of the variable is divided into intervals (bins), and the frequency of observations in each bin is plotted. In R, the geom_histogram() function from the ggplot2 package plots histograms. A histogram can be drawn with the default settings, with a manually defined number of bins, or with percentages in place of frequencies.
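The binning step described above can be sketched by hand in base R. This is only an illustration: the `scores` vector and the bin edges are made-up values, not taken from the tutorial's data set.

```r
# Hypothetical scores, used only for illustration (not the tutorial's data)
scores <- c(62, 67, 71, 73, 78, 80, 84, 85, 88, 91, 93, 99)

# Split the range 60-100 into 4 bins of width 10
breaks <- seq(60, 100, by = 10)

# cut() assigns each score to a bin; table() counts the scores per bin
bins <- cut(scores, breaks = breaks, include.lowest = TRUE)
freq <- table(bins)
print(freq)   # counts per bin: 2, 4, 3, 3

# The same counts expressed as proportions of all observations
prop <- prop.table(freq)
print(prop)
```

A histogram draws one bar per bin with height equal to `freq` (or `prop` when percentages are wanted).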

1. In the following example, students' Physics scores are plotted in a simple histogram using the default settings.
#load library
library(ggplot2)
library(scales)

#set the working directory (adjust the path to where the CSV file is stored)
setwd("d:\\RStatistics-Tutorial")

#create a grade data frame; vartype lists the class of each of the 10 columns
vartype <- c("character", "character", "character", "character", "character",
             "numeric", "numeric", "numeric", "numeric", "character")
grade <- read.table("University-Fullname-full.csv", colClasses=vartype, header=TRUE, sep=",")
grade$Gender <- as.factor(grade$Gender)
head(grade)

##output
   StudentID       Fullname Race Gender Country Age Math Physics Chemistry       Date
7          7   Kari Gjendem    E Female      US  37   87      99        67 11/24/2008
8          8    Wenche Dale    E Female      US  28   95      87        93  10/2/2008
9          9    Jane Larsen    A Female      US  19   73      92        84   6/5/2009
10        10 Steinar Hansen    A   Male      US  25   66      93        65   8/1/2008

#simple histogram
ggplot(grade, aes(x=Physics)) + 
  geom_histogram() +
  labs(title="Simple Histogram for Physics score")

A Simple Histogram
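When neither bins nor binwidth is given, geom_histogram() falls back to 30 bins and prints a message suggesting that a better value be chosen. The sketch below sets the bin width explicitly. It uses a hypothetical `demo` data frame of random scores in place of `grade`, so it runs without the CSV file; the width of 5 and the boundary of 60 are arbitrary choices for illustration.

```r
library(ggplot2)

# Hypothetical data standing in for grade$Physics
set.seed(1)
demo <- data.frame(Physics = round(runif(200, min = 60, max = 100)))

# binwidth = 5 makes every bar cover 5 score points;
# boundary = 60 aligns the bin edges with 60, 65, 70, ...
p <- ggplot(demo, aes(x = Physics)) +
  geom_histogram(binwidth = 5, boundary = 60) +
  labs(title = "Physics scores, bin width of 5")
print(p)
```

Setting binwidth rather than bins keeps the bar width meaningful in the units of the data, which can make the plot easier to read.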

2. In the following example, a colored histogram of Physics scores is plotted with 5 bins.

ggplot(grade, aes(x=Physics)) + 
  geom_histogram(bins=5, color="white", fill="steelblue") +
  labs(title="Colored histogram of Physics with 5 bins",
       x="Physics score",
       y="Frequency")

A Colored Histogram with Defined Bins

3. In the following example, each bin shows the percentage of students in place of the frequency.

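#Histogram with percentages
#after_stat(count / sum(count)) is each bin's share of all observations (needs ggplot2 >= 3.3.0);
#density is the share divided by the bin width, so it is a percentage only when the bin width is 1
ggplot(grade, aes(x=Physics, y=after_stat(count / sum(count)))) +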
  geom_histogram(bins=30, color="white", fill="steelblue") +
  scale_y_continuous(labels=scales::percent) +
  labs(title="Histogram of Physics with percentages",
       y= "Percent",
       x="Physics score")

Histogram with Percentage
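Bar heights read as percentages only when they are proportions of the total count. A quick way to check is to pull the computed bin values out of the plot with layer_data(). The sketch below assumes ggplot2 3.3.0 or later (for after_stat()) and again uses a hypothetical `demo` data frame instead of `grade`.

```r
library(ggplot2)

# Hypothetical data standing in for grade$Physics
set.seed(2)
demo <- data.frame(Physics = round(runif(200, min = 60, max = 100)))

# Map bar height to each bin's share of all observations
p <- ggplot(demo, aes(x = Physics, y = after_stat(count / sum(count)))) +
  geom_histogram(bins = 10, color = "white", fill = "steelblue") +
  scale_y_continuous(labels = scales::percent)

# layer_data() returns the values computed by the stat for each bin
d <- layer_data(p)

# The proportions add up to 1 (100%)
print(sum(d$y))

# density is the proportion divided by the bin width,
# so multiplying it by the width recovers the proportion
print(all.equal(d$density * (d$xmax - d$xmin), d$y))
```

If the bin width is not 1, the density values do not sum to 1, which is why density is a poor fit for a percent-labelled axis.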


wilsonzhang746
