Class 7.1: Writing Structured Documents

Methodology of Scientific Research

Andrés Aravena, PhD

March 31, 2022

Separation of concerns

The key idea is to describe what things are, not how they look

Describe the role of text, not the “looks”

Separate style from structure

This part is based on the ideas discussed in “LaTeX: A Document Preparation System” by Leslie Lamport (1986).

It is like a house

Structure makes the house solid and comfortable

If you only do decoration, the house looks nice but it is not solid

The structure of the walls comes first

Painting the walls in a nice color is secondary

Style is not Structure

You can follow the same philosophy:

  • Separate style from structure

  • Focus on content

Structure without style

Historical note

The first commercially successful mechanical typewriter went on sale in 1874

They had only one font

We still use the same keyboard

Using UPPERCASE and underline for emphasis

Early computers had only text, no graphics

Giving style to plain text

Since there was only one type of letter, people used some symbols as “magic”

For example \ or @

If you write a “magic” symbol, you tell the computer that the next symbol shows a change of format

This is called a markup language

TeX

An important system for preparing documents on a computer was invented in the 70’s by Donald Knuth, who is probably the most important computer scientist of the last 70 years.

Knuth invented TeX to typeset his books, “The Art of Computer Programming”

LaTeX

TeX has styles but not structure. In the 80’s Leslie Lamport created LaTeX as an extension of TeX

Example: writing in LaTeX

A LaTeX document looks like this

\documentclass[a4paper]{article}
\title{On computable numbers, with an application to the Entscheidungsproblem}
\author{Alan M. Turing}
\date{28 May, 1936}
\begin{document}
\maketitle
\section{Introduction}
The ``computable'' numbers may be described as the real numbers whose
expressions as a decimal are calculable by finite means.
\end{document}

LaTeX files are text files. They will never be obsolete.

Changing the document class changes the look of the document

Advantages of LaTeX

  • Write first, compile later

  • Do not waste time playing with fonts

  • Good journals accept LaTeX submissions
    (they also accept Microsoft Word format)

LaTeX files are text files

  • Independent of any provider

  • Use your favorite text editor (VS Code?)

  • Version control friendly (GitHub?)

  • Can probably be read 20 years from now

We cannot say the same about Microsoft Word

According to the author of LaTeX

LaTeX is “easy to use, if you’re one of the 2% of the population who thinks logically and can read an instruction manual. The other 98% of the population would find it very hard or impossible to use.”

So maybe the main advantage is that it forces you to think logically and organize your ideas

“How (La)TeX changed the face of Mathematics”. An E-interview with Leslie Lamport. http://lamport.azurewebsites.net/pubs/lamport-latex-interview.pdf

3 mistakes that people should stop making

  1. Worrying too much about formatting and not enough about content.
  2. Worrying too much about formatting and not enough about content.
  3. Worrying too much about formatting and not enough about content.

“How (La)TeX changed the face of Mathematics”. An E-interview with Leslie Lamport. http://lamport.azurewebsites.net/pubs/lamport-latex-interview.pdf

Good ideas in LaTeX

  • Chapters, sections, subsections
  • Automatic creation of Table of Contents
  • Automatic numbering of sections, figures, tables
  • Cross referencing sections, figures, tables
  • Floating figures
  • Math formulas
  • Bibliographic references
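
As a minimal sketch of several of these ideas together, the file below uses automatic numbering, a table of contents, cross references and a floating figure. The label names (sec:intro, sec:method, fig:sketch) and the placeholder figure are invented for illustration. It usually has to be compiled twice so that the references are resolved.

```latex
\documentclass[a4paper]{article}
\begin{document}
\tableofcontents          % built automatically from the \section commands

\section{Introduction}
\label{sec:intro}         % a name for this section; the label text is our choice
The method is described in Section~\ref{sec:method}.

\section{Method}
\label{sec:method}
Figure~\ref{fig:sketch} is numbered automatically, as is Section~\ref{sec:intro}.

\begin{figure}            % a floating figure: LaTeX decides where it fits best
  \centering
  \fbox{placeholder for an image}
  \caption{A placeholder figure.}
  \label{fig:sketch}      % must come after \caption to pick up the figure number
\end{figure}
\end{document}
```

If a section is added or moved, the numbers in the text and in the table of contents update by themselves on the next compilation.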

Writing Math Expressions

LaTeX is favored by people who write mathematical formulas

$$(a+b)^n=\sum_{k=0}^n \frac{n!}{k!(n-k)!} a^k b^{n-k}$$

\[(a+b)^n=\sum_{k=0}^n \frac{n!}{k!(n-k)!} a^k b^{n-k}\]

You can use this syntax in Microsoft Word’s Equation Editor, and in web pages

Learning how to write math is a good investment
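
A small sketch of the two usual ways to place math in a document: inline, inside a sentence, and displayed with an automatic number. It assumes the amsmath package is installed (it is part of the standard distributions), and the label name eq:binomial is invented for illustration.

```latex
\documentclass{article}
\usepackage{amsmath}      % extra math environments; \eqref comes from here
\begin{document}
Inline math goes between dollar signs: the binomial coefficient
$\binom{n}{k} = \frac{n!}{k!(n-k)!}$ counts the subsets of size $k$.

A displayed, numbered formula uses the equation environment:
\begin{equation}
  (a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}
  \label{eq:binomial}     % hypothetical label name
\end{equation}
Equation~\eqref{eq:binomial} is the binomial theorem.
\end{document}
```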

Bibliographic References

There are hundreds of citation styles

Life is too short to sort references manually

LaTeX also provides a convenient way to handle references

References are stored in a separate text file, in BibTeX format

Many tools can create BibTeX files for you

  • Zotero
  • Mendeley
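
A minimal sketch of how this works, using the Lamport book mentioned earlier as the only reference. The key lamport1986 and the file name refs.bib are arbitrary choices for this example. Normally refs.bib is a separate file exported from Zotero or Mendeley; here the filecontents environment writes it out so the example fits in one file. A typical build runs latex, then bibtex, then latex twice.

```latex
% Writes refs.bib next to this file the first time it is compiled.
\begin{filecontents}{refs.bib}
@book{lamport1986,
  author    = {Leslie Lamport},
  title     = {{LaTeX}: A Document Preparation System},
  publisher = {Addison-Wesley},
  year      = {1986}
}
\end{filecontents}

\documentclass{article}
\begin{document}
The separation of style and structure is discussed by Lamport~\cite{lamport1986}.

\bibliographystyle{plain}  % change this one line to switch citation style
\bibliography{refs}        % reads refs.bib
\end{document}
```

The text only contains the key. Sorting and formatting of the reference list are left to the chosen style.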

Collaborating with other people

Since LaTeX files are text files, they can be put under version control

In practice this means git, and maybe GitHub or GitLab
(or something on your own server)

Several people can edit the same file at the same time
Git merges their changes automatically when they touch different parts of the file

It does not need permanent Internet access
(i.e. you can write while traveling)

Real time collaboration

Overleaf is an online collaborative writing and publishing tool

Overleaf provides … an easy-to-use LaTeX editor with real-time collaboration and the compiled output produced automatically … as you type

You do not need to install anything on your computer

https://www.overleaf.com/

LaTeX disadvantages

  • LaTeX is hard to learn
    • This discourages many people
    • Your collaborators may not use it
    • You need to have the Reference Manual at hand
  • It is oriented to producing printed material
    • It produces PDF files or equivalents
    • Not suitable for Web or eBook
  • Writing tables is hard
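
To give an idea of what table markup looks like, here is a small sketch built from facts mentioned in this class. Every column separator (&) and row end (\\) has to be typed by hand, which is where most of the difficulty comes from.

```latex
\documentclass{article}
\begin{document}
\begin{table}
  \centering
  \begin{tabular}{lll}    % three left-aligned columns
    \hline
    System & Decade & Author \\
    \hline
    TeX    & 70's & Donald Knuth \\
    LaTeX  & 80's & Leslie Lamport \\
    \hline
  \end{tabular}
  \caption{Document preparation systems mentioned in this class.}
\end{table}
\end{document}
```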

Web Pages

In the 90’s many computers had graphics capabilities and Internet access

Researchers at CERN invented the web, using “hyper-text”

Web pages are written in Hyper Text Markup Language

HTML

These are also text files. They look like this:

<!DOCTYPE html>
<html>
<head>
<title>On computable numbers, with an application to the Entscheidungsproblem</title>
</head>
<body>
<h1>Introduction</h1>
<p>The "computable" numbers may be described as the real numbers whose
expressions as a decimal are calculable by finite means.</p>
</body>
</html>

Good ideas from HTML

  • Works well on the screen: adapts to screen size

  • Links to other pages

  • Structural elements

    • <h1>…</h1> marks a heading of level 1
    • There are also <h2> to <h6>
  • Comments: <!-- this part is not shown -->

  • Structure separated from Style

    • Style is defined in CSS files

Disadvantages of HTML

  • It does not work well for paper

  • It is hard to write manually

  • There are editors, but they often focus on style, not structure

Alternative: Markdown

It is a lightweight markup language that can be easily converted into nice presentations

---
title: On computable numbers, with an application to the Entscheidungsproblem
author: Alan M. Turing
date: 28 May, 1936
---

# Introduction
The "computable" numbers may be described as the real numbers whose
expressions as a decimal are calculable by finite means.

Writing a paper

  • You can use Visual Studio Code to write Markdown

    • please install the latest version on your computer
  • You can do bibliographies and cross references

  • Watch https://youtu.be/hpAJMSS8pvs and other YouTube videos by Nicholas Cifuentes-Goodbody