November 26, 2019

Get data from the web

The best way is to download the data file once and save it in a local folder

Then you can read it as many times as you like
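
A minimal sketch of this "download once, read many times" idea. The helper name read_cached, the URL and the file name are hypothetical, and the sketch assumes the file is in CSV format, which the slides do not state.

```r
# Hypothetical helper: download the file only if there is no local copy yet,
# then always read the local copy.
read_cached <- function(url, destfile) {
  if (!file.exists(destfile)) {
    download.file(url, destfile = destfile)
  }
  read.csv(destfile)  # assumption: the data file is a CSV
}

# Usage (placeholder URL and file name, replace with the real ones):
# survey <- read_cached("https://example.org/survey.csv", "survey.csv")
```

The second and later calls never touch the network, because the local file already exists.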

Telling beautiful stories

Choosing color, size and symbol

The commands below produce six plots; each plot's title (main) shows its number

plot(survey$height_cm, main="1")
plot(survey$height_cm, main="2", col="red")
plot(survey$height_cm, main="3", cex=2)
plot(survey$height_cm, main="4", cex=0.5)
plot(survey$height_cm, main="5", pch=16)
plot(survey$height_cm, main="6", pch=".")

Choosing the type of plot

The commands below produce six plots; each plot's title (main) shows its number

plot(survey$height_cm, main="1", type = "l")
plot(survey$height_cm, main="2", type = "o")
plot(survey$height_cm, main="3", type = "b")
plot(survey$height_cm, main="4", type = "p")
plot(survey$height_cm, main="5", xlim=c(1,20))
plot(survey$height_cm, main="6", xlim=c(30,51))

Two plots in parallel

plot(survey$height_cm, ylim=c(0,200))
points(survey$weight_kg, pch=2)
plot(survey$height_cm, type="l", ylim=c(0,200))
lines(survey$weight_kg, col="red")
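
points() and lines() draw on the existing plot without changing its axes, so the first plot() call has to leave room for both variables. The commands above do this with a fixed ylim=c(0,200). As a sketch of an alternative, the range can be computed from the data. The data frame toy is an invented stand-in for survey.

```r
# Toy stand-in for the survey data (values invented for illustration)
toy <- data.frame(height_cm = c(160, 172, 181, 155, 168),
                  weight_kg = c(55, 70, 82, 48, 63))

# Smallest and largest value over both variables, ignoring missing values
both <- range(c(toy$height_cm, toy$weight_kg), na.rm = TRUE)

plot(toy$height_cm, ylim = both, ylab = "height (cm) / weight (kg)")
points(toy$weight_kg, pch = 2)  # added on top of the first plot
```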

Adding a legend

plot(survey$height_cm, col=survey$Gender)
legend("topleft", legend=c("Female", "Male"), fill=c(1,2))
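
Coloring by Gender works because a factor is stored as integer codes (1 for the first level, 2 for the second), and those codes are used as color numbers. The sketch below makes this explicit with invented data. It assumes Gender is a factor; since R 4.0, read.csv() and read.table() keep text columns as character by default, so a conversion with as.factor() may be needed first.

```r
# Toy stand-in for the survey data (values invented for illustration)
toy <- data.frame(height_cm = c(160, 172, 181, 155, 168),
                  Gender = factor(c("Female", "Male", "Male", "Female", "Female")))

lv <- levels(toy$Gender)          # level names, in alphabetical order
codes <- as.integer(toy$Gender)   # 1 for the first level, 2 for the second

plot(toy$height_cm, col = codes)
# Build the legend from the levels so names and colors cannot get out of step
legend("topleft", legend = lv, fill = seq_along(lv))
```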

Adding straight lines

abline(h=mean(survey$height_cm), col="red", lwd=5)

AB line

This command adds a straight line at a specific position

  • abline(h=1) adds a horizontal line at \(y=1\)
  • abline(v=2) adds a vertical line at \(x=2\)
  • abline(a=3, b=4) adds the line \(y=a+b\cdot x\)
    • a is the intercept (the value of \(y\) when \(x=0\))
    • b is the slope


abline(v=20, col="blue")
abline(a=160, b=0.5)
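
To see what a and b mean, the following sketch computes a few points of \(y=a+b\cdot x\) by hand and checks that abline() passes through them. The x values are chosen only for illustration.

```r
a <- 160   # intercept: value of y when x = 0
b <- 0.5   # slope: y grows by 0.5 for each unit of x

x <- c(0, 20, 40)
y <- a + b * x        # 160, 170, 180

plot(x, y, ylim = c(150, 190))
abline(a = a, b = b)          # passes through the three points
abline(v = 20, col = "blue")  # vertical line at x = 20
```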

Scatter plots

Comparing two variables

plot(survey$height_cm, survey$weight_kg)

Another example

plot(survey$height_cm, survey$hand_span_cm)

Formulas in R

A formula such as y ~ x summarizes a relationship: y depending on x

Instead of

plot(survey$height_cm, survey$weight_kg)

we can write

plot(survey$weight_kg ~ survey$height_cm)

or even

plot(weight_kg ~ height_cm, data = survey)

Using formulas makes life easier

plot(height_cm ~ hand_span_cm, data = survey)

plot(height_cm ~ hand_span_cm, data = survey,
     subset = Gender=="Female")

plot(height_cm ~ hand_span_cm, data = survey,
     subset = Gender=="Male")

With a formula it is easy to specify the data frame and which rows to plot
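
For comparison, here is a sketch of the same subset plot written with and without a formula. The data frame toy and its values are invented for illustration.

```r
# Toy stand-in for the survey data (values invented for illustration)
toy <- data.frame(height_cm    = c(160, 172, 181, 155, 168),
                  hand_span_cm = c(18, 21, 23, 17, 19),
                  Gender = factor(c("Female", "Male", "Male", "Female", "Female")))

# With a formula: data frame and rows are named once
plot(height_cm ~ hand_span_cm, data = toy, subset = Gender == "Female")

# Without a formula: filter first, then repeat the data frame name for each axis
females <- toy[toy$Gender == "Female", ]
plot(females$hand_span_cm, females$height_cm)
```

Both calls draw the same points; only the default axis labels differ.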

Graphics depend on the type of data

Numeric vs. Numeric

plot(height_cm ~ weight_kg, data=survey)

Factor vs. Factor

survey$handness <- as.factor(survey$handness)
plot(Gender ~ handness, data=survey)

Factor vs. Numeric

plot(Gender ~ weight_kg, data=survey)

Numeric vs. Factor

plot(weight_kg ~ Gender, data=survey)

This is called a “boxplot”

Plotting a numeric value depending on a factor results in a boxplot

It is a graphical version of summary().

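  • The line in the center is the median
  • The box goes from the first to the third quartile (the middle 50% of cases)
  • The whiskers extend to the most extreme values that are at most 1.5 times the box height away from the box
  • Points beyond the whiskers are drawn individually as outliers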
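
The numbers behind a boxplot can be inspected with boxplot.stats(). The sketch below uses an invented vector of weights with one extreme value.

```r
# Invented weights (kg), one of them far from the rest
w <- c(48, 52, 55, 58, 60, 63, 65, 70, 74, 110)

s <- boxplot.stats(w)
s$stats  # lower whisker, box bottom, median, box top, upper whisker
         # 48.0 55.0 61.5 70.0 74.0
s$out    # values drawn as individual points: 110

boxplot(w)
```

Here the box height is 70 - 55 = 15, so the whiskers may reach at most 1.5 * 15 = 22.5 beyond the box. The value 110 is farther than 70 + 22.5 = 92.5 and is therefore drawn as an outlier.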

Nicer boxplot

plot(weight_kg ~ Gender, data=survey, boxwex=0.3,
    notch=TRUE, col="grey")

Exploring all data: plot data frame
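
The command for this slide did not survive in these notes; it was presumably plot(survey). As a sketch with an invented data frame of three numeric columns:

```r
# Toy stand-in for the survey data (values invented for illustration)
toy <- data.frame(height_cm    = c(160, 172, 181, 155, 168),
                  weight_kg    = c(55, 70, 82, 48, 63),
                  hand_span_cm = c(18, 21, 23, 17, 19))

# One scatter plot for every pair of columns, same figure as pairs(toy)
plot(toy)
```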



Plot function

  • plot() can be used with one or two vectors, or with a formula
  • plot(y ~ x) draws the same figure as plot(x, y): the variable on the left of ~ goes on the vertical axis
  • Formulas are nice: plot(y~x, data=dframe) is better than plot(dframe$x, dframe$y)
  • In general the defaults are good
    • axis labels are the names of the variables being plotted
    • ranges are automatic
  • You can use numbers to choose colors, symbols and sizes of points
  • You can choose the ranges, labels and title

Plot is a generic function

The figure type depends on the data type of the vector

  • numeric: similar to points() or lines()
  • factor: counts frequencies and draws a barplot()
  • numeric vs. factor: same as boxplot()
  • complete data frame: same as pairs()
  • factor vs. factor: like a histogram in 2D
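
A short sketch of this behavior with invented vectors: the same plot() call draws a different figure depending on the class of its argument.

```r
# Invented data for illustration
g <- factor(c("Female", "Male", "Male", "Female", "Female"))
w <- c(55, 70, 82, 48, 63)

class(w)     # "numeric"
class(g)     # "factor"

plot(w)      # numeric: one point per value, in order
plot(g)      # factor: bars with the count of each level
plot(w ~ g)  # numeric depending on factor: boxplot

counts <- table(g)  # the counts that plot(g) shows as bars
```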

Adding details to a plot

  • The plot() command defines the ranges, labels and title
  • You can add more elements over a pre-existing plot:
    • points(), lines()
    • text()
    • segments(), arrows()
    • rect(), polygon(), xspline()
    • legend()
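
As a sketch of how these commands combine, the following code draws a plot first and then adds a label, a dashed line and an arrow on top of it. All values and positions are invented for illustration.

```r
x <- 1:5
y <- c(160, 172, 181, 155, 168)

# plot() sets up the ranges, labels and title
plot(x, y, ylim = c(150, 190), main = "Adding details")

i <- which.max(y)   # position of the largest value
m <- mean(y)

text(x[i], y[i], labels = "tallest", pos = 3)    # label above the point
segments(x0 = 1, y0 = m, x1 = 5, y1 = m, lty = 2) # dashed line at the mean
arrows(x0 = 4.5, y0 = 188, x1 = x[i] + 0.1, y1 = y[i], length = 0.1)
```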

Learn more on the help page of each command


Colors can be specified in several ways:

  • A numeric value is an index into a palette
  • A character string with a color name in English
    • such as “red” or “steelblue”
  • A character string with a hexadecimal code
    • such as “#A11F1F”
    • Google “hexadecimal colors” to learn more
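
The three ways can be related to each other with a few base R functions, as in this sketch. The color numbers refer to the current palette, which may differ between R versions.

```r
palette()[2]         # the color that col = 2 stands for

# A hexadecimal code is three numbers (red, green, blue) from 0 to 255
hex <- rgb(161, 31, 31, maxColorValue = 255)   # "#A11F1F"
parts <- col2rgb("#A11F1F")                    # back to 161, 31, 31

# Names and hexadecimal codes can be mixed in one call
plot(1:3, col = c("red", "steelblue", hex), pch = 16, cex = 2)
```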