Class 7.2: The ancient art

Methodology of Scientific Research

Andrés Aravena, PhD

March 31, 2022

This is an Ancient Art

     

It is not a Lost Art

     

My book of Ancient Arts

In my view, the “old ways” that are powerful magic include:

  • Command line

  • Text files

  • Editing text in a terminal

  • Combining all of them

Text documents are good

Text files are for humans and computers

  • Binary files are hard to read
    • unless you have the correct program
  • Text files can be read by humans
    • Each byte is a letter
  • Text files can be read by computers
    • Data must be recyclable
    • The output of one program may be the input of another program
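
As a small illustration of the last point, here is a sketch in Python where one function writes plain text and another reads it back. The function names and the two-column layout are assumptions made for this example only.

```python
import csv
import io

def write_samples(rows):
    """First 'program': produce tab-separated text from a list of dicts."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["sample", "dose"],
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

def count_by_dose(text):
    """Second 'program': read the same text and count samples per dose."""
    counts = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        counts[row["dose"]] = counts.get(row["dose"], 0) + 1
    return counts

text = write_samples([
    {"sample": "GSM91440", "dose": "low"},
    {"sample": "GSM91893", "dose": "low"},
])
print(text)                 # a human can read this directly
print(count_by_dose(text))  # and so can another program
```

The text in the middle is the only thing the two functions share, so either one could be replaced by a different tool.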

Text editors instead of word processors

The easiest way to handle text files is to use a text editor

These are programs to view and edit text files

They use a monospaced font, like Courier

Each letter has the same width

Text editors have syntax coloring

Since each letter has the same size, text editors use color to tell things apart

The color depends on the role of each piece of text

For example, headings can be shown in red

The color is not in the file. The editor adds the colors
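
A rough sketch of this idea, not how any particular editor is implemented: the role of a line is computed from its text, and the color is added only when the line is displayed. The roles, the colors, and the function names are assumptions for this example.

```python
import re

# ANSI escape codes understood by most terminals
RED = "\033[31m"
BLUE = "\033[34m"
RESET = "\033[0m"

def role_of(line):
    """Guess the role of a Markdown line from its first characters."""
    stripped = line.lstrip()
    if re.match(r"#{1,6}\s", stripped):
        return "heading"
    if stripped.startswith(("+ ", "- ", "* ")):
        return "list item"
    return "text"

def colorize(line):
    """Return the line wrapped in color codes; the input is never modified."""
    colors = {"heading": RED, "list item": BLUE}
    color = colors.get(role_of(line))
    return f"{color}{line}{RESET}" if color else line

for line in ["# Header 1", "+ Item 1", "Plain text"]:
    print(colorize(line))
```

The original lines stay plain text. Only the copy that is printed on the screen carries color codes.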

Text editors handling Markdown

These work with Markdown and other formats

All are good. We use VSCode

Markdown Text editors

Online Markdown editors

Text files are forever

Free

  • nothing to pay

  • you can do whatever you want

They never become obsolete

But they do not have structure

Structured Documents

We want to identify the meaning, not the shapes

  • Title
  • Sections
    • Subsections
      • Lists
      • Figures
      • Tables
  • References to other works

Separation of concerns

The key idea is to describe what things are, not how they look

Describe the role of text, not the “looks”

Separate style from structure
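
One way to picture this separation, as a sketch under assumed names: the document stores only roles and text, and each output format decides how every role should look.

```python
import html

# Structure: what each piece of text is, with no styling information
document = [
    ("title", "The ancient art"),
    ("section", "Text documents are good"),
    ("paragraph", "Text files are for humans & computers."),
]

def render_plain(doc):
    """One style: upper-case title, underlined sections."""
    styles = {
        "title": str.upper,
        "section": lambda t: t + "\n" + "-" * len(t),
        "paragraph": lambda t: t,
    }
    return "\n\n".join(styles[role](text) for role, text in doc)

def render_html(doc):
    """Another style for the same structure: HTML tags."""
    tags = {"title": "h1", "section": "h2", "paragraph": "p"}
    return "\n".join(
        f"<{tags[role]}>{html.escape(text)}</{tags[role]}>"
        for role, text in doc
    )

print(render_plain(document))
print(render_html(document))
```

Changing the look means changing a renderer. The document itself does not change.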

Text files with structure

There are several markup languages that encode the structure of a text document

  • LaTeX
  • ReStructured Text
  • MediaWiki
  • HTML
  • Markdown
  • Textile
  • AsciiDoc

Markdown

Markdown is a widely used markup language

  • Same philosophy as LaTeX, but simpler

  • The text file can be read and understood easily

  • It can be transformed into other formats

    • PDF, Word, Webpage (HTML)
  • Used in R, Python, Julia (Jupyter), in GitHub, and many other modern platforms

Markdown’s author says

“The overriding design goal for Markdown’s formatting syntax is to make it as readable as possible.

“The idea is that a Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions.”

Flavors of Markdown

Compiling is transforming from Markdown to another format

There are many different Markdown compilers

Many people make their own compiler, and they expand the original idea

Unfortunately, they are not always 100% compatible

There is not yet an official standard

Recommendation: pandoc

(if you have RStudio, you have Pandoc)
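
To make "compiling" concrete, here is a toy compiler written for this example. It is not pandoc and it understands only headers and paragraphs; the function name is an assumption.

```python
import html
import re

def compile_markdown(text):
    """Turn a tiny subset of Markdown (headers, paragraphs) into HTML."""
    out = []
    # Blocks are separated by one or more empty lines
    for block in re.split(r"\n\s*\n", text.strip()):
        header = re.fullmatch(r"(#{1,6})[ \t]+(.+)", block)
        if header:
            level = len(header.group(1))
            title = html.escape(header.group(2).strip())
            out.append(f"<h{level}>{title}</h{level}>")
        else:
            # Consecutive lines are joined into one paragraph
            paragraph = html.escape(" ".join(block.split()))
            out.append(f"<p>{paragraph}</p>")
    return "\n".join(out)

source = "# Header 1\n\nThe first\nparagraph.\n\nAnother paragraph"
print(compile_markdown(source))
```

Every home-made compiler makes small choices like these, which is one reason why different flavors do not always agree.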

Pandoc

If you need to convert files from one markup format into another, pandoc is your swiss-army knife

Pandoc can convert between many formats, including

  • Markdown
  • Microsoft Word/Powerpoint
  • LaTeX
  • Jupyter notebook

Pandoc advantages

  • Text files

  • It is easy to write tables in Markdown

  • It is easy to write lists

  • Can be used for slides

    • Several web platforms (like this document)
    • Microsoft Powerpoint
  • Handles BibTeX references

Markdown format

Paragraphs

  • Consecutive lines of text are one paragraph.
  • They are separated by an empty line
The first paragraph.

Another paragraph

The first paragraph.

Another paragraph

Headers

# Header 1
## Header 2
### Header 3
#### Header 4

Header 1

Header 2

Header 3

Header 4
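
Because the header level is written in the text itself, a program can recover the outline of a document. The sketch below uses assumed function names and ignores special cases, such as a # character inside a code block.

```python
import re

HEADER = re.compile(r"^(#{1,6})[ \t]+(.+?)\s*$")

def outline(markdown):
    """Return a list of (level, title) pairs, one per header line."""
    entries = []
    for line in markdown.splitlines():
        match = HEADER.match(line)
        if match:
            entries.append((len(match.group(1)), match.group(2)))
    return entries

def format_outline(entries):
    """Indent each title by two spaces per level below the first."""
    return "\n".join("  " * (level - 1) + title for level, title in entries)

text = "# Header 1\n## Header 2\nSome text\n### Header 3"
print(format_outline(outline(text)))
```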

Unordered Lists

+ Item 1
+ Item 2
    + Item 2a
    + Item 2b
  • Item 1
  • Item 2
    • Item 2a
    • Item 2b

Sub-lists are indented by 4 spaces

Ordered Lists

1. Item 1
1. Item 2
1. Item 3
    1. Item 3a
    1. Item 3b
  1. Item 1
  2. Item 2
  3. Item 3
    1. Item 3a
    2. Item 3b
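
In the example above every item is written as 1. and the numbers are assigned when the document is compiled. A minimal sketch of that numbering step, assuming each nesting level is identified by its indentation (this is not pandoc's actual algorithm):

```python
import re

ITEM = re.compile(r"^(\s*)\d+\.\s+(.*)$")

def renumber(lines):
    """Number ordered-list items in sequence, separately at each indent."""
    counters = {}  # indentation width -> last number used at that level
    result = []
    for line in lines:
        match = ITEM.match(line)
        if not match:
            result.append(line)
            continue
        indent, text = match.groups()
        width = len(indent)
        # Going back to a shallower level restarts the deeper counters
        for deeper in [w for w in counters if w > width]:
            del counters[deeper]
        counters[width] = counters.get(width, 0) + 1
        result.append(f"{indent}{counters[width]}. {text}")
    return result

source = ["1. Item 1", "1. Item 2", "1. Item 3",
          "    1. Item 3a", "    1. Item 3b"]
print("\n".join(renumber(source)))
```

Writing 1. everywhere means that items can be inserted or reordered without renumbering by hand.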

Images

You have to indicate the web address of the image

![optional text](http://example.com/logo.png)

or the path of a local file, relative to the directory of the document

![optional text](images/logo.png)

optional text

Optional text is shown when the image is not found

![optional text](images/logo.pn)

optional text
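
A mistyped file name like the one above is easy to miss. This sketch lists the local images that a Markdown text refers to but that do not exist on disk. The function names are assumptions, and the pattern covers only the simple image syntax shown in these slides.

```python
import re
from pathlib import Path

# ![optional text](target)
IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")

def find_images(markdown):
    """Return a list of (optional text, target) pairs."""
    return IMAGE.findall(markdown)

def missing_local_images(markdown, base_dir="."):
    """Return the local image paths that cannot be found under base_dir."""
    missing = []
    for _alt, target in find_images(markdown):
        if target.startswith(("http://", "https://")):
            continue  # web addresses are not checked here
        if not (Path(base_dir) / target).exists():
            missing.append(target)
    return missing

print(find_images("![optional text](images/logo.png)"))
```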

Figures with Captions

This is a pandoc extension, not standard Markdown

If the figure is a paragraph by itself (it has empty lines before and after), then the optional text becomes the caption.

![This is the caption of the figure.](images/logo.png)

This is the caption of the figure.

Tables

There are several formats. The easiest one is this

|        | sample   | dose | time   | agent            |
|--------|----------|------|--------|------------------|
| 1      | GSM91440 | low  | 5 min  | caffeine         |
| 2      | GSM91893 | low  | 5 min  | caffeine         |
| 3      | GSM91428 | low  | 5 min  | calcofluor white |
| 4      | GSM91881 | low  | 5 min  | calcofluor white |

     sample     dose   time    agent
1    GSM91440   low    5 min   caffeine
2    GSM91893   low    5 min   caffeine
3    GSM91428   low    5 min   calcofluor white
4    GSM91881   low    5 min   calcofluor white

Tables with captions (pandoc extension)

Write "Table:" followed by the caption, in a separate paragraph just after the table

|        | sample   | dose | time   | agent            |
|--------|----------|------|--------|------------------|
| 1      | GSM91440 | low  | 5 min  | caffeine         |
| 2      | GSM91893 | low  | 5 min  | caffeine         |
| 3      | GSM91428 | low  | 5 min  | calcofluor white |
| 4      | GSM91881 | low  | 5 min  | calcofluor white |

Table: This is the table caption

This is the table caption

     sample     dose   time    agent
1    GSM91440   low    5 min   caffeine
2    GSM91893   low    5 min   caffeine
3    GSM91428   low    5 min   calcofluor white
4    GSM91881   low    5 min   calcofluor white

Making tables

There are some VSCode plug-ins that can make tables for you

Or you can make them in R using the knitr or pander packages

A good alternative is this website:

https://www.tablesgenerator.com/markdown_tables
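
Since a table is only text, a few lines of code can also write it. This is a minimal sketch with an assumed function name; it pads every column to the width of its longest cell, in the same pipe format shown earlier.

```python
def markdown_table(header, rows):
    """Build a pipe table from a list of column names and a list of rows."""
    table = [list(header)] + [[str(cell) for cell in row] for row in rows]
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]

    def format_row(cells):
        padded = (cell.ljust(width) for cell, width in zip(cells, widths))
        return "| " + " | ".join(padded) + " |"

    separator = "|" + "|".join("-" * (width + 2) for width in widths) + "|"
    lines = [format_row(table[0]), separator]
    lines += [format_row(row) for row in table[1:]]
    return "\n".join(lines)

print(markdown_table(
    ["sample", "dose", "time", "agent"],
    [["GSM91440", "low", "5 min", "caffeine"],
     ["GSM91428", "low", "5 min", "calcofluor white"]],
))
```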

Computer code

Programs are usually written in a monospaced font.
That is, all letters have the same width.

```
this <- is.computer(code) {
    # comment
}
```
this <- is.computer(code) {
    # comment
}

Nicer computer code

You can indicate the language, and get colors

```r
this <- is.computer(code) {
    # comment
}
```
this <- is.computer(code) {
    # comment
}
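
The language name written after the opening fence is available to any tool that reads the file, not only to the one that colors it. As a sketch with an assumed function name, this collects every fenced block together with its language. It handles only fences made of three backticks at the start of a line.

```python
FENCE = "`" * 3  # three backticks open and close a code block

def code_blocks(markdown):
    """Return a list of (language, code) pairs; language may be None."""
    blocks = []
    current = None   # lines of the block being read, or None outside a block
    language = None
    for line in markdown.splitlines():
        if line.startswith(FENCE):
            if current is None:
                # Opening fence: remember the language, if any
                language = line[len(FENCE):].strip() or None
                current = []
            else:
                # Closing fence: store the finished block
                blocks.append((language, "\n".join(current)))
                current = None
        elif current is not None:
            current.append(line)
    return blocks

example = "\n".join(["Some text", FENCE + "r", "x <- 1", FENCE])
print(code_blocks(example))
```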

Bibliography

Bibliographic References

There are hundreds of citation styles

Life is too short to sort references manually

There are many tools to manage your references

Pandoc can do it for you

References are stored in a separate text file, in BibTeX format

BibTeX format

@book{ RyderCarroll3260,
    title = "The Bullet Journal Method: Track Your Past, Order Your Present, Plan Your Future",
    author = "Ryder Carroll",
    year = "2018",
    month = "Oct",
    publisher = "HarperCollins Publishers" }

@article{Annesley2010c,
    author = {Annesley, Thomas M.},
    doi = {10.1373/clinchem.2010.150060},
    issn = {00099147},
    journal = {Clinical Chemistry},
    number = {8},
    pages = {1229--1233},
    pmid = {20551381},
    title = {{Put your best figure forward: Line graphs and scattergrams}},
    volume = {56},
    year = {2010} }
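
Each entry has a type, a key, and a list of fields, so it is easy to produce from data you already have. A minimal sketch, with an assumed function name, that writes every value between braces:

```python
def bibtex_entry(kind, key, fields):
    """Format one BibTeX entry from its type, its key and a dict of fields."""
    lines = [f"@{kind}{{{key},"]
    lines += [f"    {name} = {{{value}}}," for name, value in fields.items()]
    lines.append("}")
    return "\n".join(lines)

print(bibtex_entry("book", "RyderCarroll3260", {
    "author": "Ryder Carroll",
    "year": "2018",
    "publisher": "HarperCollins Publishers",
}))
```

The key (here RyderCarroll3260) is the name used to cite the entry in the text.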

Citations in the text

  • [@RyderCarroll3260] becomes (Carroll 2018)
  • [@RyderCarroll3260, pp. 33-35, 38-39] becomes (Carroll 2018, 33–35, 38–39).
  • [@RyderCarroll3260; @Annesley2010c] becomes (Carroll 2018; Annesley 2010).
  • @RyderCarroll3260 [p. 33] says … becomes Carroll (2018, 33) says …
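
A citation works only if its key exists in the references file. The sketch below compares the keys used in a text with the keys defined in a BibTeX text. The function names are assumptions, and it assumes that keys contain only letters, digits and underscores.

```python
import re

CITED = re.compile(r"(?<!\w)@(\w+)")                  # @key in the Markdown text
DEFINED = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")   # @type{key, in the .bib text

def undefined_citations(markdown, bibtex):
    """Return the sorted keys that are cited but not defined."""
    cited = set(CITED.findall(markdown))
    defined = set(DEFINED.findall(bibtex))
    return sorted(cited - defined)

markdown = ("See [@RyderCarroll3260, pp. 33-35] and "
            "[@Annesley2010c; @Missing2020].")
bibtex = ('@book{ RyderCarroll3260,\n    year = "2018" }\n'
          "@article{Annesley2010c,\n    year = {2010} }")
print(undefined_citations(markdown, bibtex))
```

Here Missing2020 is a made-up key, used to show what a mistake looks like.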

Telling pandoc to manage references

We need two things

  • tell pandoc to handle citations
  • tell pandoc where the references are
pandoc --citeproc --bibliography=references.bib input.md -o output.pdf
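
The same command can be run from a script, which helps when there are many documents. This sketch builds exactly the command shown above and runs it only if pandoc is installed. The function names are assumptions for this example.

```python
import shutil
import subprocess

def pandoc_command(source, bibliography, output):
    """Build the argument list for the pandoc call shown above."""
    return ["pandoc", "--citeproc", f"--bibliography={bibliography}",
            source, "-o", output]

def compile_document(source, bibliography, output):
    """Run pandoc; return False without running if pandoc is not installed."""
    if shutil.which("pandoc") is None:
        return False
    subprocess.run(pandoc_command(source, bibliography, output), check=True)
    return True

print(" ".join(pandoc_command("input.md", "references.bib", "output.pdf")))
```

Passing the arguments as a list avoids problems with spaces in file names.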

Next class we will do it better

Making BibTeX files

Many tools can create BibTeX files for you

Online resources

For your weekend

References

Annesley, Thomas M. 2010. “Put your best figure forward: Line graphs and scattergrams.” Clinical Chemistry 56 (8): 1229–33. https://doi.org/10.1373/clinchem.2010.150060.
Carroll, Ryder. 2018. The Bullet Journal Method: Track Your Past, Order Your Present, Plan Your Future. HarperCollins Publishers.