What is the value of a result that is not made public?
Most research is done in teams
Good practices help teamwork, by:
Even if we work alone, we are still communicating
Each one of these interactions can improve following a good practice
Research results are not enough
You must convince your boss (and the jury) that you deserve to be called “Doctor”
Make your work easy to understand
Make clear what your original contribution is
Referees are busy people who work for free
Give them all they need to replicate and validate your work
Being clear and transparent helps them to decide fast
You will get published faster
(or at least get good feedback)
…that will read your paper (and hopefully cite it)
The game does not end when you publish
50% of papers are read only by the referee
Evans, J. A. (2008). Electronic Publication and the Narrowing of Science and Scholarship. Science, 321(5887), 395–399.
Eventually, your work will have an impact outside academia
(the end goal is to make a better world, no?)
We need to be aware of the ethical implications
Nothing is more frustrating than reading your old work
As they say: “The past is a foreign country”
Undocumented code/protocols are hard to understand…
and you can only blame yourself
Someone unfamiliar with your project should be able to look at your computer files and understand in detail what you did and why
The ideas of this section are mostly based on
William Stafford Noble. “A Quick Guide to Organizing Computational
Biology Projects.” PLoS Computational Biology 5, no. 7 (2009): 1–5.
https://doi.org/10.1371/journal.pcbi.1000424.
Most commonly, however, that “someone” is you.
docs is where you write your paper/talk/thesis
data is anything that you get from outside the computer
results is what your code produces
code is where you write your code
bib to store documents cited in your document
extra for other documents without DOI
Cookiecutter is a Python tool to create new projects
You can search GitHub for recipes with a query like topic:cookiecutter topic:r
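The folder layout above can also be created with a few lines of plain Python. This is only a minimal sketch that uses the standard library instead of Cookiecutter; the function name create_project and the project name in the usage note are made up for illustration.

```python
from pathlib import Path

# Folder names taken from the layout described above.
PROJECT_FOLDERS = ["docs", "data", "results", "code", "bib", "extra"]


def create_project(root):
    """Create the project skeleton under `root` and return the folders."""
    root = Path(root)
    created = []
    for name in PROJECT_FOLDERS:
        folder = root / name
        # exist_ok=True makes the function safe to run more than once
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

For example, create_project("my-thesis") would create my-thesis/docs, my-thesis/data, and so on. A Cookiecutter recipe does the same job with templates, which is probably the better choice once the layout becomes more elaborate.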
Producing data is expensive and time consuming
You don’t want to lose it. Mark it read-only immediately (and make backups)
Never modify raw data. Use a script to make a clean version
Use folders raw and clean inside data/YYYY-MM-DD
Put the code for that in scripts
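One possible way to set up the dated raw and clean folders and to protect the raw files is sketched below. It assumes the raw files have already been copied into the raw folder; the function names make_data_folders and mark_read_only are hypothetical. On Windows, os.chmod can only toggle the read-only flag, which is enough for this purpose.

```python
import os
import stat
from datetime import date
from pathlib import Path


def make_data_folders(data_root, day=None):
    """Create data_root/YYYY-MM-DD/raw and .../clean; return both paths."""
    day = day or date.today()
    base = Path(data_root) / day.isoformat()  # isoformat() gives YYYY-MM-DD
    raw, clean = base / "raw", base / "clean"
    raw.mkdir(parents=True, exist_ok=True)
    clean.mkdir(parents=True, exist_ok=True)
    return raw, clean


def mark_read_only(folder):
    """Remove write permission from every file directly inside `folder`."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    for path in Path(folder).iterdir():
        if path.is_file():
            mode = path.stat().st_mode
            os.chmod(path, mode & ~write_bits)
```

A cleaning script would then read from raw and write only to clean, so the original files are never touched. Read-only permissions guard against accidents, not against disk failure, so backups are still needed.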
Good filenames help a lot to understand the project
But they are usually not enough
A README file in each folder can explain the purpose of each file
It takes time to write them, but it saves time in the long run
We can distinguish four categories
Each one requires a separate folder
Tiago Forte, Building a Second Brain, Simon and Schuster, 2022
Personally, I like to group my Projects/Areas/Resources/Archives by major topic
Decide when to use ., -, and _
Avoid spaces in filenames
Either John-Smith.txt or John_Smith.txt
Usually . separates file types, like .csv or .yml
Check periodically that you are following your standard
(maybe with a script)
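Such a check could look like the sketch below. The naming rule encoded in VALID_NAME is only an assumed example standard (letters, digits, - and _ in the name, at most one . before the file type, no spaces); adapt the pattern to whatever standard you actually chose. The function names are hypothetical.

```python
import re
from pathlib import Path

# Assumed example standard: letters, digits, "-" and "_" in the name,
# optionally followed by a single "." and the file type. No spaces.
VALID_NAME = re.compile(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9]+)?")


def follows_standard(filename):
    """Return True if `filename` matches the naming standard."""
    return VALID_NAME.fullmatch(filename) is not None


def find_bad_names(folder):
    """Return the sorted names of all files under `folder` that break the standard."""
    return sorted(
        path.name
        for path in Path(folder).rglob("*")
        if path.is_file() and not follows_standard(path.name)
    )
```

Running find_bad_names on the project folder from time to time gives a list of files to rename. Under this example rule, a name with a space or with a second dot is reported.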
1-Introduction.docx
2_Methods.docx
3.Results.docx
4 discussion.docx
10-conclusions.docx
results-01-03-09.txt
01_Introduction.docx
02_Methods.docx
03_Results.docx
04_Discussion.docx
10_Conclusions.docx
20090103results.txt
Both are good, but use only one
When was 8/3/1965? August or March?
Is today 6/10/2023 or 10/6/2023?
It is better to write YYYY-MM-DD. This is an ISO standard (ISO 8601)
There is no ambiguity of meaning
Sorting alphabetically, numerically, and chronologically gives the same result
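The sorting property is easy to verify. The sketch below takes four example dates (the two readings of 8/3/1965 and of 6/10/2023), writes them in both styles, and compares plain alphabetical sorting with true chronological order. The helper name sorts_chronologically is made up for illustration.

```python
from datetime import date

# The two possible readings of 8/3/1965 and of 6/10/2023
DATES = [date(2023, 10, 6), date(1965, 3, 8), date(2023, 6, 10), date(1965, 8, 3)]


def sorts_chronologically(date_format):
    """Return True if sorting the formatted text also sorts the dates by time."""
    as_text = [d.strftime(date_format) for d in DATES]
    in_time_order = [d.strftime(date_format) for d in sorted(DATES)]
    return sorted(as_text) == in_time_order


iso_ok = sorts_chronologically("%Y-%m-%d")  # YYYY-MM-DD
dmy_ok = sorts_chronologically("%d/%m/%Y")  # DD/MM/YYYY
```

Here iso_ok is True and dmy_ok is False: with the day first, alphabetical order mixes 1965 and 2023, while the ISO form keeps the file listing in time order.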
Sharing Word documents by email is a VERY BAD IDEA
It leads to chaos and confusion
You can share your document via Dropbox or Google Drive
You can edit online using Microsoft Office 365 or Google Docs
Several people can work in the same document at the same time
Advantage: better spelling and grammar correction
But they require a permanent internet connection
On the server only
Cloud drive like Dropbox, Google Drive
Version control system like GitHub, GitLab, Bitbucket
It can easily become corrupt
Hybrid, using symbolic links
Or use an online editor