The Best Ever Paper Written

Author

You and Your Awesome Collaborators

Welcome! In this demo, we will explore GitHub, reproducible workflows, and data visualisation using a small dataset of birds.

Introduction

GitHub and GitLab are online platforms built around Git, a version control system. They let you track changes, work with collaborators, and host your projects online.

FAIR principles (Wilkinson et al. 2016):

  • Findable: easy to locate your data/code (a persistent identifier such as a DOI helps!)
  • Accessible: retrievable through a standard, open protocol, with clear access conditions
  • Interoperable: standard formats and conventions
  • Reusable: documented, licensed, ready for reuse

This repository illustrates both version control and the FAIR principles through:

  • A clear project structure (data/, code/, figures/, docs/)
  • Reproducible analysis using Quarto
  • FAIR metadata in the README
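
The folder layout listed above could be set up from R along the following lines. This is only a minimal sketch: the project name `my_project`, the use of a temporary directory, and the README fields are illustrative assumptions, not part of the original repository.

```r
# Sketch: create the project skeleton in a temporary location.
# "my_project" and the README fields below are hypothetical placeholders.
project_root <- file.path(tempdir(), "my_project")
folders <- c("data", "code", "figures", "docs")

for (folder in folders) {
  dir.create(file.path(project_root, folder),
             recursive = TRUE, showWarnings = FALSE)
}

# Start a README with placeholder metadata to be filled in by hand
readme_lines <- c(
  "# my_project",
  "",
  "Authors: <names>",
  "License: <licence name>",
  "DOI: <added once the repository is archived>",
  "Data: data/ (describe each file and its columns)"
)
writeLines(readme_lines, file.path(project_root, "README.md"))

# List what was created
list.files(project_root)
```

In a real project you would replace `tempdir()` with the folder that holds your Git repository.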

For similar guidelines for code, see Ivimey-Cook et al. 2026; for additional information on reproducible repositories, see Pick et al. 2025.

Importing the most amazing dataset

We use a simple dataset (birds_map.csv) as an example. Yours will be much more awesome!

Code
# Read the example dataset (the path is relative to this document)
birds_map <- read.csv("../data/birds_map.csv", header = TRUE, sep = ",")

# Display the first few rows nicely in HTML
knitr::kable(head(birds_map), format = "html")
StudyID           Species              Developmental_mode  Study_area           Lat_study_area  Long_study_area
rayyan-123815011  Ficedula albicollis  altricial           Gotland; Sweden      57.17           18.33
rayyan-123815011  Ficedula albicollis  altricial           Gotland; Sweden      57.17           18.33
rayyan-123815011  Ficedula albicollis  altricial           Gotland; Sweden      57.17           18.33
rayyan-123815011  Ficedula albicollis  altricial           Gotland; Sweden      57.17           18.33
rayyan-123815144  Tachycineta bicolor  altricial           Watauga County; USA  36.20           -81.67
rayyan-123815659  Hirundo rustica      altricial           Badajoz; Spain       NA              NA
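
Before moving on, it can help to check that the import produced what you expect. The sketch below rebuilds three of the rows shown above as a small data frame, so it runs without the CSV file, and checks the column names and coordinate ranges. The helper name `check_birds_map` is an assumption made for illustration; it is not part of the original analysis.

```r
# Rows copied from the table above so the example runs without the CSV
birds_example <- data.frame(
  StudyID = c("rayyan-123815011", "rayyan-123815144", "rayyan-123815659"),
  Species = c("Ficedula albicollis", "Tachycineta bicolor", "Hirundo rustica"),
  Developmental_mode = "altricial",
  Study_area = c("Gotland; Sweden", "Watauga County; USA", "Badajoz; Spain"),
  Lat_study_area = c(57.17, 36.20, NA),
  Long_study_area = c(18.33, -81.67, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: stop early if columns are missing or
# coordinates fall outside the valid latitude/longitude ranges
check_birds_map <- function(df) {
  expected <- c("StudyID", "Species", "Developmental_mode",
                "Study_area", "Lat_study_area", "Long_study_area")
  missing_cols <- setdiff(expected, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # Missing coordinates are tolerated here; only impossible values fail
  stopifnot(all(abs(df$Lat_study_area) <= 90, na.rm = TRUE),
            all(abs(df$Long_study_area) <= 180, na.rm = TRUE))
  invisible(TRUE)
}

check_birds_map(birds_example)

# Number of rows without a latitude
sum(is.na(birds_example$Lat_study_area))
```

Running the same check on the full `birds_map` object would flag a renamed column or a swapped latitude/longitude early, before any plotting.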

Data Preparation

We extract Location and Country from the Study_area column and create a summary dataset.

Code
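library(stringr)
library(plyr)

# Split Study_area ("Location; Country") into two columns, dropping the space after ";"
birds_map[c("Location", "Country")] <- str_split_fixed(birds_map$Study_area, ";\\s*", 2)

# Count records per location before deduplicating (afterwards every count would be 1)
location_counts <- plyr::count(birds_map, "Location")

# Keep one row per unique location
birds_map_unique <- birds_map[!duplicated(birds_map$Location), ]

# Add size variables for plotting
birds_map_unique$size.freq <- location_counts$freq[
  match(birds_map_unique$Location, location_counts$Location)]
birds_map_unique$size1 <- birds_map_unique$size.freq + 1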
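
To make the splitting and counting steps concrete, here is a small base R sketch of the same idea on three example strings. It is not the code used in the analysis: `split_study_area` is a hypothetical helper, and base R functions stand in for `str_split_fixed` and `ddply` so the example runs without extra packages.

```r
# Example strings in the "Location; Country" format used by Study_area
study_area <- c("Gotland; Sweden", "Gotland; Sweden", "Watauga County; USA")

# Hypothetical helper: split on the first ";" and trim surrounding spaces
split_study_area <- function(x) {
  location <- trimws(sub(";.*$", "", x))
  country <- ifelse(grepl(";", x), trimws(sub("^[^;]*;", "", x)), "")
  data.frame(Location = location, Country = country,
             stringsAsFactors = FALSE)
}

areas <- split_study_area(study_area)

# Count records per location on the full data ...
counts <- as.data.frame(table(Location = areas$Location),
                        stringsAsFactors = FALSE)

# ... then keep one row per location and attach its count (column Freq)
areas_unique <- merge(areas[!duplicated(areas$Location), ],
                      counts, by = "Location")
areas_unique
```

The order matters: counting must happen on the full data, because after deduplication every location appears exactly once.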

Map Visualization

Let’s plot something, because we love plots. We create a world map with points showing data locations.

Code
library(ggplot2)  # map_data("world") also requires the maps package to be installed

# World map coordinates
world_coordinates <- map_data("world")

# Remove rows with missing coordinates
birds_map_unique <- subset(birds_map_unique,
                           !is.na(Long_study_area) & !is.na(Lat_study_area))

# Create world map using geom_polygon
fig_map <- ggplot() +
  geom_polygon(
    data = world_coordinates,
    aes(x = long, y = lat, group = group),
    color = "white", fill = "lightblue"
  ) +
  geom_point(
    data = birds_map_unique,
    aes(x = Long_study_area, y = Lat_study_area),
    color = "black", alpha = 0.7,
    size = 3
  ) +
  coord_quickmap() +                     # correct aspect ratio
  theme_classic() +
  labs(title = "Bird Data Locations") +
  theme(
    legend.position = "none",
    plot.title = element_text(hjust = 0.5)
  )

fig_map
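
Since the project structure includes a figures/ folder, the finished plot could also be written to disk. The sketch below uses a small stand-in plot so that it runs on its own; the output folder, file name, and dimensions are illustrative assumptions rather than settings from the original analysis.

```r
library(ggplot2)

# Stand-in plot so the example runs on its own;
# in this document you would pass fig_map instead
example_points <- data.frame(Long_study_area = c(18.33, -81.67),
                             Lat_study_area = c(57.17, 36.20))
example_plot <- ggplot(example_points,
                       aes(x = Long_study_area, y = Lat_study_area)) +
  geom_point(size = 3)

# Assumed output location; inside the project this could be "../figures"
figures_dir <- file.path(tempdir(), "figures")
dir.create(figures_dir, recursive = TRUE, showWarnings = FALSE)

# File name and size are illustrative choices
out_file <- file.path(figures_dir, "bird_data_locations.pdf")
ggsave(out_file, plot = example_plot, width = 8, height = 5)
```

Saving figures from code, rather than exporting them by hand, means the file is regenerated every time the analysis is rerun.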

The map shows the geographic locations of the bird data analyzed in "Do Egg Hormones Have Fitness Consequences in Wild Birds? A Systematic Review and Meta-Analysis" by L. Mentesana, M. Hau, P. B. D’Amelio, N. M. Adreani, and A. Sánchez-Tójar (2025).

The original dataset can be found in the Zenodo repository associated with the publication:

Publication: Mentesana et al. 2025, Ecology Letters 28: e70100

Zenodo repository (DOI): 10.5281/zenodo.14930059

Reproducibility

Always include your R session info so that others can see which R version and package versions produced the results.

Code
sessionInfo()
R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default


locale:
[1] LC_COLLATE=English_Germany.utf8  LC_CTYPE=English_Germany.utf8   
[3] LC_MONETARY=English_Germany.utf8 LC_NUMERIC=C                    
[5] LC_TIME=English_Germany.utf8    

time zone: Europe/London
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] maps_3.4.1.1  knitr_1.49    stringr_1.5.1 plyr_1.8.9    ggplot2_3.5.1
[6] pacman_0.5.1 

loaded via a namespace (and not attached):
 [1] vctrs_0.6.5       cli_3.6.1         rlang_1.1.4       xfun_0.49        
 [5] stringi_1.8.3     generics_0.1.3    jsonlite_1.8.8    labeling_0.4.3   
 [9] glue_1.8.0        colorspace_2.1-0  htmltools_0.5.8   scales_1.3.0     
[13] rmarkdown_2.29    grid_4.3.1        evaluate_1.0.3    munsell_0.5.1    
[17] tibble_3.2.1      fastmap_1.1.1     yaml_2.3.10       lifecycle_1.0.4  
[21] compiler_4.3.1    dplyr_1.1.4       Rcpp_1.1.0        htmlwidgets_1.6.4
[25] pkgconfig_2.0.3   rstudioapi_0.17.1 farver_2.1.2      digest_0.6.35    
[29] R6_2.6.1          tidyselect_1.2.1  pillar_1.10.1     magrittr_2.0.3   
[33] withr_3.0.2       tools_4.3.1       gtable_0.3.6
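
The session info can also be stored as a text file next to the analysis, so it travels with the repository. A minimal sketch, assuming a file called `session_info.txt` (the name and location are illustrative, not taken from the original project):

```r
# Assumed file name and location; adjust to your project layout
session_file <- file.path(tempdir(), "session_info.txt")

# capture.output() turns the printed session info into lines of text
writeLines(capture.output(sessionInfo()), session_file)

# Quick look at the first line of the saved file
readLines(session_file, n = 1)
```

Committing this file alongside the code gives collaborators a record of the environment even if they never render the document themselves.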