We randomly sampled articles published in ecological journals for which code-sharing has been either mandatory or encouraged since June 2015 at the latest. Sampling within these journals maximizes the code available for quality assessment (see below), and also helps us to understand where other ecological journals will likely be in the near future, as code archiving policies proliferate (Stodden, Guo, and Ma 2013). The sample of journals was obtained from Mislan et al. (2016), who identified 14 out of 96 (15%) ecological journals with mandatory or encouraging code-sharing policies in place since at least the 1st of June 2015 (Table S3). We retrieved all the articles published in these 14 journals using the Web of Science Core Collection (search performed at the Netherlands Institute of Ecology) for two distinct temporal periods: 1st of June 2015 to 9th of December 2016 (2015/16 dataset, N = 4,366 records), and 1st of January 2018 to 21st of May 2019 (2018/19 dataset, N = 4,291 records). A random sample of 200 articles from each period was then selected using the function “sample” in R v. 3.3.2 (R Core Team 2018).
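The random selection step can be sketched as follows. This is a minimal illustration only: the data frame `records.2015.16` is a hypothetical stand-in for the Web of Science export, and the seed is an assumption added here so that the example is repeatable (the text above does not state whether a seed was set).

```r
# Hypothetical stand-in for the 2015/16 Web of Science export (one row per record)
records.2015.16 <- data.frame(record_id = seq_len(4366))

# Assumed seed, added only to make this sketch repeatable
set.seed(2019)

# sample() draws without replacement by default, so no record is selected twice
sampled.rows <- sample(nrow(records.2015.16), size = 200)
sampled.2015.16 <- records.2015.16[sampled.rows, , drop = FALSE]

nrow(sampled.2015.16)
```

The same call, applied to the 4,291 records of the 2018/19 dataset, would give the second sample of 200 articles.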
For each of the 400 articles, we determined whether the article was relevant to our survey by screening all titles and abstracts using the software Rayyan (Ouzzani et al. 2016). Reviews, opinion pieces, and commentaries were excluded, whereas articles that conducted some kind of statistical analysis and/or simulations were kept for further screening. All the articles were double screened at this stage (2015/16 by AC and IvB; and 2018/19 by AST and SE). As a result, 195 of the 200 articles in the 2015/16 dataset, and 187 of the 200 articles in the 2018/19 dataset, were kept for further evaluation. The full list of these articles, with the results of the screening, is provided (see file data/code_availability_full_and_clean.csv).
In the next steps (summarized in Figure S1) AC evaluated 195 articles from 2015/16, and AST evaluated 187 articles from 2018/19. First, each article was read in detail to determine if it was a purely bioinformatical (i.e. molecular) study with solely bioinformatical analyses. Only studies that conducted at least some statistical analyses other than bioinformatics (including simulations) were considered further (hereafter referred to as ‘non-molecular’). These non-molecular articles (172 in 2015/16; 174 in 2018/19) were then evaluated based on the following (and as shown in Figure S1):
Was the analytical code published? (‘yes’, ‘no’, ‘some’). This was done by checking the article’s main text, data accessibility statements, and the supplementary materials, including archived data, if any. If no code (‘code’, ‘script’, ‘syntax’) was mentioned in any of these sections, then all links to archived data, and all supplementary materials were searched.
What software was used in the analysis? (e.g. R, Python, SAS…)
Were there data used in the analysis (‘yes’, ‘no’)? If yes, were the data published (‘yes’, ‘no’, ‘some’)?
For those articles that had published seemingly all or at least some of their code (40 in 2015/16; 52 in 2018/19), we evaluated the following:
Where was the code published (‘repository’, ‘supplementary material’, ‘version control platform’, ‘webpage’)?
Was the existence of code clearly mentioned (‘yes’, ‘no’), and if so, how (‘code’, ‘script’, ‘other’)?
Where was the reference to the code made, if at all (‘methods section’, ‘data accessibility statement’, ‘supplementary material’, ‘other’)?
Did the code include some kind of documentation such as README or HowTo files (‘yes’, ‘no’ or ‘with some comments’)?
Were there inline comments within the code (‘yes’, ‘no’)?
To ensure data reliability, we randomly selected and double-checked data extraction (two observers: AST and AC) for 16% of all the articles that had passed the title-and-abstract screening (29 out of 195 articles from 2015/16, and 32 out of 187 from 2018/19 dataset). The decision overlap between observers was very high (please see file data/code_availability_full_and_clean.csv for changes and explanations made to the table after the double screening). Some additional articles were further intentionally double-checked whenever the original screener was not sure about the assignment of some scores (42 articles) or when the original screener could not find any reference to the software used for the analyses (34 articles; more details in file data/code_availability_full_and_clean.csv).
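As a quick arithmetic check, the reported 16% follows from the counts given above. The object names in this sketch are illustrative and do not come from the original scripts.

```r
# Counts taken from the paragraph above
double.checked <- c("2015/16" = 29, "2018/19" = 32)
screened.in    <- c("2015/16" = 195, "2018/19" = 187)

# Overall share of randomly double-checked articles: 61 / 382
share.overall <- round(sum(double.checked) / sum(screened.in) * 100)

# Share within each sampling period
share.per.period <- round(double.checked / screened.in * 100)

share.overall      # 16
share.per.period   # 15 (2015/16) and 17 (2018/19)
```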
We explored whether the number of ecological journals with code-sharing policies had increased since 2015, when only 14 out of 96 (15%) had code-sharing policies (Mislan, Heer, and White 2016). In March 2020 we investigated the code-sharing policies of all 96 ecological journals identified by Mislan et al. (2016; Table S1). One observer (AC) read the ‘Authors Instructions’ and ‘Editorial Policies’ of each journal, and a second observer (AST) did the same, but only for the 14 journals reviewed in this study and for those journals scored as ‘No’ (see below) by the first observer; unclear cases were discussed together. If none of these sections explicitly mentioned data and/or code, we also checked other sections of the journal’s website, whenever possible. Based on the information collected, we scored the journals’ code-sharing policies as: ‘encouraged’ (publication of the code is explicitly encouraged, but not required), ‘mandatory’ (code must be published together with the article), or ‘encouraged/mandatory’ (when the wording made it difficult to judge if code publication is encouraged or required). If a journal did not explicitly encourage code-sharing (i.e. even when it mentioned research artefacts, but not ‘code’ or ‘script’ explicitly: 10 out of 96, 10%), the journal’s code-sharing policy was scored as ‘No’ (i.e. code-sharing is not mentioned explicitly; see comments in file data/Updated_Table_Mislan_2020_v2.xlsx).
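The way these scores translate into the share of journals with a code-sharing policy can be sketched on a handful of rows. The five journals and their scores are taken from Table S1, but the data frame `policies.subset` and its column names are illustrative assumptions, not the structure of the original spreadsheet.

```r
# Five rows copied from Table S1 (illustrative subset only)
policies.subset <- data.frame(
  Journal     = c("Biotropica", "Ecography", "Ecology", "Heredity", "Oikos"),
  policy.2015 = c("No", "No", "Yes", "Yes", "No"),
  policy.2020 = c("Encouraged", "Mandatory", "Mandatory",
                  "Encouraged/Mandatory", "No"),
  stringsAsFactors = FALSE
)

# Any score other than 'No' counts as having some type of code-sharing policy
n.with.policy.2015 <- sum(policies.subset$policy.2015 == "Yes")
n.with.policy.2020 <- sum(policies.subset$policy.2020 != "No")

# Percentage of journals with a policy in this subset
round(n.with.policy.2020 / nrow(policies.subset) * 100)

# Breakdown by score
table(policies.subset$policy.2020)
```

In this subset, two journals had a policy in 2015 and four in 2020; the full 96-journal tally is the one reported in Table S2.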
Table S1. List of 96 ecological journals and their code-sharing policies in June 2015 (Mislan, Heer, and White 2016) and in March 2020 (this study).
# printing Table S1
journal.policies %>% select(Full.Journal.Title,Require.computer.code.with.publication_2015,Require.computer.code.with.publication_2020) %>%
rename(Journal=Full.Journal.Title,'Code-sharing policy 2015'=Require.computer.code.with.publication_2015,'Code-sharing policy 2020'=Require.computer.code.with.publication_2020) %>% arrange(Journal) %>%
kable("html") %>% kable_styling() %>%
scroll_box(width = "100%", height = "500px")
Journal | Code-sharing policy 2015 | Code-sharing policy 2020 |
---|---|---|
Acta Oecologica-International Journal of Ecology | No | Encouraged |
Agriculture Ecosystems & Environment | No | Encouraged |
American Naturalist | Yes | Encouraged |
Animal Conservation | No | Encouraged/Mandatory |
Annual Review of Ecology Evolution and Systematics | No | No |
Applied Vegetation Science | No | Encouraged/Mandatory |
Aquatic Ecology | No | Encouraged |
Aquatic Microbial Ecology | No | No |
Austral Ecology | No | Encouraged |
Basic and Applied Ecology | No | No |
Behavioral Ecology | No | No |
Behavioral Ecology and Sociobiology | No | No |
Biodiversity and Conservation | No | Encouraged |
Biogeosciences | No | Encouraged |
Biological Conservation | No | Encouraged |
Biological Invasions | No | Encouraged |
Biology Letters | No | Mandatory |
Biotropica | No | Encouraged |
Chemoecology | No | Encouraged |
Community Ecology | No | Encouraged |
Conservation Biology | No | Encouraged |
Diversity and Distributions | No | Encouraged/Mandatory |
Ecography | No | Mandatory |
Ecohydrology | No | Encouraged |
Ecological Applications | Yes | Mandatory |
Ecological Complexity | No | Encouraged |
Ecological Economics | No | Encouraged |
Ecological Engineering | No | Encouraged |
Ecological Informatics | No | Encouraged |
Ecological Modelling | No | Encouraged |
Ecological Monographs | Yes | Mandatory |
Ecological Research | No | Encouraged |
Ecology | Yes | Mandatory |
Ecology and Evolution | No | No |
Ecology and Society | No | Mandatory |
Ecology Letters | No | Encouraged/Mandatory |
Ecosphere | Yes | Mandatory |
Ecosystems | No | No |
Ecotoxicology | No | Encouraged |
Environmental Biology of Fishes | No | Encouraged |
European Journal of Soil Biology | No | Encouraged |
European Journal of Wildlife Research | No | Encouraged |
Evolution | No | Mandatory |
Evolutionary Ecology | No | Encouraged |
Flora | No | Encouraged |
Freshwater Science | No | No |
Frontiers in Ecology and the Environment | No | No |
Functional Ecology | Yes | Encouraged/Mandatory |
Fungal Ecology | No | Encouraged |
Global Change Biology | No | No |
Global Ecology and Biogeography | No | Encouraged/Mandatory |
Heredity | Yes | Encouraged/Mandatory |
International Journal of Sustainable Development and World Ecology | No | No |
ISME Journal | No | Encouraged |
Journal for Nature Conservation | No | Encouraged |
Journal of Animal Ecology | Yes | Encouraged/Mandatory |
Journal of Applied Ecology | Yes | Encouraged/Mandatory |
Journal of Arid Environments | No | Encouraged |
Journal of Biogeography | No | Encouraged/Mandatory |
Journal of Ecology | Yes | Encouraged/Mandatory |
Journal of Evolutionary Biology | No | Mandatory |
Journal of Experimental Marine Biology and Ecology | No | Encouraged |
Journal of Plant Ecology | No | No |
Journal of Soil and Water Conservation | No | No |
Journal of the North American Benthological Society | No | No |
Journal of Tropical Ecology | No | Encouraged |
Journal of Vegetation Science | No | Encouraged/Mandatory |
Journal of Wildlife Management | No | Encouraged |
Landscape and Urban Planning | No | Encouraged |
Landscape Ecology | No | No |
Marine Ecology Progress Series | No | No |
Methods in Ecology and Evolution | Yes | Encouraged/Mandatory |
Microbial Ecology | No | No |
Molecular Ecology | Yes | Encouraged |
Molecular Ecology Resources | Yes | Encouraged |
Oecologia | No | No |
Oikos | No | No |
Oryx | No | No |
Paleobiology | No | No |
Pedobiologia | No | Encouraged |
Perspectives in Plant Ecology Evolution and Systematics | No | Encouraged |
Plant Ecology | No | Encouraged |
Plant Species Biology | No | Encouraged |
Polar Biology | No | Encouraged |
Polar Research | No | No |
Population Ecology | No | Encouraged |
Proceedings of the Royal Society B-Biological Sciences | Yes | Mandatory |
Rangeland Ecology & Management | No | Encouraged |
Restoration Ecology | No | Encouraged |
Theoretical Ecology | No | Encouraged |
Theoretical Population Biology | No | Encouraged |
Trends in Ecology & Evolution | No | No |
Urban Ecosystems | No | Encouraged |
Wetlands | No | Encouraged |
Wildlife Monographs | No | Encouraged |
Wildlife Research | No | No |
# number of eligible articles
eligible.articles.code <- as.numeric(data.full %>%
filter(!(is.na(statistical.analysis.and.or.simulations.2))) %>%
summarise(eligible_articles = sum(statistical.analysis.and.or.simulations.2 == "yes")))
# number of eligible articles per year
eligible.articles.code.year <- data.full %>%
filter(!(is.na(statistical.analysis.and.or.simulations.2))) %>%
group_by(Publication_year.2) %>%
summarise(eligible_articles = sum(statistical.analysis.and.or.simulations.2 == "yes"))
# number of articles that published at least some code
at.least.some.code <- as.numeric(data.full %>%
filter(!(is.na(CodePublished.3))) %>%
summarise(code_published = sum(CodePublished.3 == "yes")))
# number of articles per year that published at least some code
at.least.some.code.year <- data.full %>%
filter(!(is.na(CodePublished.3))) %>%
group_by(Publication_year.2) %>%
summarise(code_published = sum(CodePublished.3 == "yes"))
# number of articles that published seemingly all code
seemingly.all.code <- as.numeric(data.full %>%
filter(!(is.na(CodePublished.2))) %>%
summarise(code_published = sum(CodePublished.2 == "yes")))
# number of articles that published only some code
only.some.code <- as.numeric(data.full %>%
filter(!(is.na(CodePublished.2))) %>%
summarise(code_published = sum(CodePublished.2 == "some")))
# number of eligible articles that used data
eligible.articles.data <- as.numeric(data.full %>%
filter(!(is.na(statistical.analysis.and.or.simulations.2)) & !(is.na(DataUsed))) %>%
summarise(eligible_articles = sum(statistical.analysis.and.or.simulations.2 == "yes" & DataUsed == "yes")))
# number of articles that published at least some data
at.least.some.data <- as.numeric(data.full %>%
filter(!(is.na(DataShared.3))) %>%
summarise(data_published = sum(DataShared.3 == "yes")))
# number of articles that published seemingly all code and data (if any used)
seemingly.all.code.and.data <- as.numeric(data.full %>%
filter(!(is.na(CodePublished.2))) %>%
summarise(code_and_data_published = sum(CodePublished.2 == "yes" & (DataShared.2 == "yes" | is.na(DataShared.2)))))
# number of articles per year that published seemingly all code and data (if any used)
seemingly.all.code.and.data.year <-data.full %>%
filter(!(is.na(CodePublished.2))) %>%
group_by(Publication_year.2) %>%
summarise(code_and_data_published = sum(CodePublished.2 == "yes" & (DataShared.2 == "yes" | is.na(DataShared.2))))
# number of journals with some type of code-sharing policy in 2015
journals.with.policy.2015 <- as.numeric(table(journal.policies$Require.computer.code.with.publication_2015)['Yes'])
# number of journals reviewed in 2015
journals.2015 <- nrow(journal.policies)
# number of journals with some type of code-sharing policy in 2020
journals.with.policy.2020 <- nrow(journal.policies[!(is.na(journal.policies$Require.computer.code.with.publication_2020)) &
journal.policies$Require.computer.code.with.publication_2020!="No",])
# number of journals reviewed in 2015 that were still active in 2020
journals.2020 <- nrow(journal.policies[!(is.na(journal.policies$Require.computer.code.with.publication_2020)),])
# # number of journals covered each year
# number.journals.covered <- as.data.frame(data.full %>% group_by(Publication_year.2,Journal) %>% summarise(count = n_distinct(CodePublished.2)) %>% summarise(n = n()))
# counting number of articles per journal
articles.per.journal <- as.data.frame(data.full %>% group_by(Journal) %>% summarise(total = n()))
# counting number of articles with at least some code per journal
code.published.per.journal <- as.data.frame(data.full %>% filter(CodePublished.3=="yes") %>% group_by(Journal) %>% summarise(codepublished = n()))
# merging dataframes together
full.journal <- merge(code.published.per.journal,articles.per.journal)
full.journal$percentage <- round((full.journal$codepublished/full.journal$total)*100,0)
# saving journal percentages information to be used in script 004_plotting.R
write.csv(full.journal,"data/journal_percentages.csv",row.names=FALSE)
# number of articles mentioning that code was published
code.mentioned <- as.numeric(table(data.full$CodeMentioned.2)["yes"])
# number of articles not mentioning that code was published
code.not.mentioned <- as.numeric(table(data.full$CodeMentioned.2)["no"])
# number of articles mentioning that code was published using the terms 'code' and/or 'script'
code.mentioned.code.script <- as.numeric(sum(table(data.full$CodeMentioned)["yes, code"],
table(data.full$CodeMentioned)["yes, code and script"],
table(data.full$CodeMentioned)["yes, script"]))
# number of articles mentioning code availability in the data accessibility and/or materials and methods
code.mentioned.section.dataacc.methods <- as.numeric(sum(table(data.full$Location_CodeMentioned)["dataaccessibility"],
table(data.full$Location_CodeMentioned)["dataaccessibility and methods"],
table(data.full$Location_CodeMentioned)["dataaccessibility and methods and supplement"],
table(data.full$Location_CodeMentioned)["dataaccessibility and supplement"],
table(data.full$Location_CodeMentioned)["methods"],
table(data.full$Location_CodeMentioned)["methods and supplement"]))
# number of articles mentioning code availability only in the supplements
code.mentioned.section.supplements <- as.numeric(sum(table(data.full$Location_CodeMentioned)["supplement"],
table(data.full$Location_CodeMentioned)["supplementary files descriptions"]))
# number of articles using permanent repositories to host their code
code.hosted.repository <- as.numeric(table(data.full$LocationShared.2)["repository"])-as.numeric(table(data.full$LocationShared)["version control platform"])
# number of articles using only non-permanent repositories (i.e. version control platform = GitHub) to host their code
code.hosted.github.only <- as.numeric(table(data.full$LocationShared)["version control platform"])
# number of articles using repositories to host their code in 2015/2016
code.hosted.repository.2015.16 <- as.numeric(table(data.full[data.full$Publication_year<2018,"LocationShared.2"])["repository"])
# number of articles using repositories to host their code in 2018/2019
code.hosted.repository.2018.19 <- as.numeric(table(data.full[data.full$Publication_year>2016,"LocationShared.2"])["repository"])
# number of articles using only the supplements to host their code
code.hosted.supplements <- as.numeric(table(data.full$LocationShared.2)["supplementary file"])
# number of articles using exclusively free software
free.software <- as.numeric(table(data.full$FreeSoftware)["yes"])
# number of articles using R in combination with other software
R.and.others <- as.numeric(table(str_detect(data.full$Stat_analysis_software, "R "))["TRUE"])
# number of articles using only R
R.only <- nrow(data.full[data.full$Stat_analysis_software=="R" & !(is.na(data.full$Stat_analysis_software)),])
# number of articles reporting the software used
eligible.articles.code.reporting.software.used <- as.numeric(data.full %>%
filter(!(is.na(statistical.analysis.and.or.simulations.2))) %>%
summarise(eligible_articles = sum(statistical.analysis.and.or.simulations.2 == "yes" & Stat_analysis_software != "Not Stated")))
# number of articles not reporting the software used
eligible.articles.code.not.reporting.software.used <- as.numeric(data.full %>%
filter(!(is.na(statistical.analysis.and.or.simulations.2))) %>%
summarise(eligible_articles = sum(statistical.analysis.and.or.simulations.2 == "yes" & Stat_analysis_software == "Not Stated")))
# creating vectors to build the data frame with
# vector with a name for each percentage
all.names <- c("articles sharing at least some code",
"articles sharing at least some data",
"journals with code-sharing policies in 2015",
"journals with code-sharing policies in 2020",
"articles sharing seemingly all code",
"articles sharing only some code",
"articles sharing at least some code in 2015/2016",
"articles sharing at least some code in 2018/2019",
"articles sharing at least some code per journal (min)",
"articles sharing at least some code per journal (max)",
"articles sharing at least some code per journal (median)",
"articles sharing at least some code per journal (mean)",
"articles sharing at least some code and highlighting code availability",
"articles sharing at least some code and highlighting code availability using code and/or script",
"articles sharing at least some code and highlighting code availability in data accessibility and/or methods section",
"articles sharing at least some code and highlighting code availability only in the supplements",
"articles sharing at least some code and not highlighting code availability",
"articles sharing at least some code and hosting it in a permanent repository",
"articles sharing at least some code and hosting it in GitHub only",
"articles sharing at least some code and hosting it only in the supplements",
"articles sharing at least some code and hosting it in a repository in 2015/2016",
"articles sharing at least some code and hosting it in a repository in 2018/2019",
"articles with the potential to be computationally reproducible",
"articles with the potential to be computationally reproducible in 2015/2016",
"articles with the potential to be computationally reproducible in 2018/2019",
"articles using free (non-proprietary) software",
"articles using R",
"articles not reporting the software used")
# vector with all percentages
all.percentages <- round(
c((at.least.some.code/eligible.articles.code)*100,
(at.least.some.data/eligible.articles.data)*100,
(journals.with.policy.2015/journals.2015)*100,
(journals.with.policy.2020/journals.2020)*100,
(seemingly.all.code/eligible.articles.code)*100,
(only.some.code/eligible.articles.code)*100,
(at.least.some.code.year$code_published[1]/eligible.articles.code.year$eligible_articles[1])*100,
(at.least.some.code.year$code_published[2]/eligible.articles.code.year$eligible_articles[2])*100,
as.numeric(summary(full.journal$percentage)["Min."]),
as.numeric(summary(full.journal$percentage)["Max."]),
as.numeric(summary(full.journal$percentage)["Median"]),
as.numeric(summary(full.journal$percentage)["Mean"]),
(code.mentioned/at.least.some.code)*100,
(code.mentioned.code.script/code.mentioned)*100,
(code.mentioned.section.dataacc.methods/code.mentioned)*100,
(code.mentioned.section.supplements/code.mentioned)*100,
(code.not.mentioned/at.least.some.code)*100,
(code.hosted.repository/at.least.some.code)*100,
(code.hosted.github.only/at.least.some.code)*100,
(code.hosted.supplements/at.least.some.code)*100,
(code.hosted.repository.2015.16/at.least.some.code.year$code_published[1])*100,
(code.hosted.repository.2018.19/at.least.some.code.year$code_published[2])*100,
(seemingly.all.code.and.data/eligible.articles.code)*100,
(seemingly.all.code.and.data.year$code_and_data_published[1]/eligible.articles.code.year$eligible_articles[1])*100,
(seemingly.all.code.and.data.year$code_and_data_published[2]/eligible.articles.code.year$eligible_articles[2])*100,
(free.software/eligible.articles.code.reporting.software.used)*100,
((R.and.others+R.only)/eligible.articles.code.reporting.software.used)*100,
(eligible.articles.code.not.reporting.software.used/eligible.articles.code)*100),0)
# vector with all numerators for calculating the percentage
all.numerators <- c("#eligible articles sharing at least some code",
"#eligible articles sharing at least some data",
"#journals with code-sharing policies in 2015",
"#journals with code-sharing policies in 2020",
"#eligible articles sharing seemingly all code",
"#eligible articles sharing only some code",
"#eligible articles sharing at least some code in 2015/2016",
"#eligible articles sharing at least some code in 2018/2019",
"#eligible articles per journal sharing at least some code",
"#eligible articles per journal sharing at least some code",
"#eligible articles per journal sharing at least some code",
"#eligible articles per journal sharing at least some code",
"#eligible articles sharing at least some code that highlighted code availability",
"#eligible articles sharing at least some code that highlighted code availability using code and/or script",
"#eligible articles sharing at least some code that highlighted code availability in data accessibility and/or methods section",
"#eligible articles sharing at least some code that highlighted code availability only in the supplements",
"#eligible articles sharing at least some code that did not highlight code availability",
"#eligible articles sharing at least some code and hosting it in a repository",
"#eligible articles sharing at least some code and hosting it in GitHub only",
"#eligible articles sharing at least some code and hosting it only in the supplements",
"#eligible articles sharing at least some code and hosting it in a repository or GitHub only in 2015/2016",
"#eligible articles sharing at least some code and hosting it in a repository or GitHub only in 2018/2019",
"#eligible articles sharing seemingly all code and data (if any used)",
"#eligible articles with the potential to be computationally reproducible in 2015/2016",
"#eligible articles with the potential to be computationally reproducible in 2018/2019",
"#eligible articles using free (non-proprietary) software",
"#eligible articles using R alone or together with other software",
"#eligible articles not reporting the software used")
# vector with all denominators for calculating the percentage
all.denominators <- c("#eligible articles",
"#eligible articles that used data",
"#journals reviewed in 2015",
"#journals reviewed and still existing in 2020",
"#eligible articles",
"#eligible articles",
"#eligible articles in 2015/2016",
"#eligible articles in 2018/2019",
"#eligible articles per journal",
"#eligible articles per journal",
"#eligible articles per journal",
"#eligible articles per journal",
"#eligible articles sharing at least some code",
"#eligible articles sharing at least some code that highlighted code availability",
"#eligible articles sharing at least some code that highlighted code availability",
"#eligible articles sharing at least some code that highlighted code availability",
"#eligible articles sharing at least some code",
"#eligible articles sharing at least some code",
"#eligible articles sharing at least some code",
"#eligible articles sharing at least some code",
"#eligible articles sharing at least some code in 2015/2016",
"#eligible articles sharing at least some code in 2018/2019",
"#eligible articles",
"#eligible articles in 2015/2016",
"#eligible articles in 2018/2019",
"#eligible articles reporting software used",
"#eligible articles reporting software used",
"#eligible articles")
# putting together tableS2 and renaming columns
tableS2 <- as.data.frame(cbind(all.names,all.percentages,all.numerators,all.denominators))
names(tableS2) <- c("Name","%","Numerator","Denominator")
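One detail worth checking when counting R users is how the software string is matched. The script above combines an exact match on "R" with the pattern "R " (R followed by a space). The sketch below uses hypothetical values of `Stat_analysis_software` (the real entries are in the data file and may never trigger these cases) to show how a word-boundary pattern behaves differently. It uses base R's `grepl()` rather than `str_detect()` so that it runs without extra packages.

```r
# Hypothetical software strings, for illustration only
software <- c("R", "R and SAS", "SAS and R", "PRIMER and SAS", "Not Stated")

# Pattern used in the script: "R" followed by a space
space.match <- grepl("R ", software, fixed = TRUE)

# Alternative: "R" as a whole word, wherever it appears in the string
word.match <- grepl("\\bR\\b", software, perl = TRUE)

data.frame(software, space.match, word.match)
```

With these made-up strings, the space-based pattern misses "SAS and R" and matches "PRIMER and SAS", whereas the word-boundary pattern flags exactly the three entries that name R. Whether this matters for the reported percentages depends on the actual entries in the dataset.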
Table S2. List of all percentages calculated from our data and presented in the main manuscript, together with how each was calculated, to avoid any misunderstanding. Values are sorted by order of appearance in the main manuscript (including the abstract).
# printing Table S2
tableS2 %>% kable("html") %>% kable_styling() %>% scroll_box(width = "100%", height = "500px")
Name | % | Numerator | Denominator |
---|---|---|---|
articles sharing at least some code | 27 | #eligible articles sharing at least some code | #eligible articles |
articles sharing at least some data | 79 | #eligible articles sharing at least some data | #eligible articles that used data |
journals with code-sharing policies in 2015 | 15 | #journals with code-sharing policies in 2015 | #journals reviewed in 2015 |
journals with code-sharing policies in 2020 | 75 | #journals with code-sharing policies in 2020 | #journals reviewed and still existing in 2020 |
articles sharing seemingly all code | 22 | #eligible articles sharing seemingly all code | #eligible articles |
articles sharing only some code | 5 | #eligible articles sharing only some code | #eligible articles |
articles sharing at least some code in 2015/2016 | 23 | #eligible articles sharing at least some code in 2015/2016 | #eligible articles in 2015/2016 |
articles sharing at least some code in 2018/2019 | 30 | #eligible articles sharing at least some code in 2018/2019 | #eligible articles in 2018/2019 |
articles sharing at least some code per journal (min) | 7 | #eligible articles per journal sharing at least some code | #eligible articles per journal |
articles sharing at least some code per journal (max) | 53 | #eligible articles per journal sharing at least some code | #eligible articles per journal |
articles sharing at least some code per journal (median) | 22 | #eligible articles per journal sharing at least some code | #eligible articles per journal |
articles sharing at least some code per journal (mean) | 25 | #eligible articles per journal sharing at least some code | #eligible articles per journal |
articles sharing at least some code and highlighting code availability | 76 | #eligible articles sharing at least some code that highlighted code availability | #eligible articles sharing at least some code |
articles sharing at least some code and highlighting code availability using code and/or script | 96 | #eligible articles sharing at least some code that highlighted code availability using code and/or script | #eligible articles sharing at least some code that highlighted code availability |
articles sharing at least some code and highlighting code availability in data accessibility and/or methods section | 94 | #eligible articles sharing at least some code that highlighted code availability in data accessibility and/or methods section | #eligible articles sharing at least some code that highlighted code availability |
articles sharing at least some code and highlighting code availability only in the supplements | 4 | #eligible articles sharing at least some code that highlighted code availability only in the supplements | #eligible articles sharing at least some code that highlighted code availability |
articles sharing at least some code and not highlighting code availability | 24 | #eligible articles sharing at least some code that did not highlight code availability | #eligible articles sharing at least some code |
articles sharing at least some code and hosting it in a permanent repository | 51 | #eligible articles sharing at least some code and hosting it in a repository | #eligible articles sharing at least some code |
articles sharing at least some code and hosting it in GitHub only | 12 | #eligible articles sharing at least some code and hosting it in GitHub only | #eligible articles sharing at least some code |
articles sharing at least some code and hosting it only in the supplements | 34 | #eligible articles sharing at least some code and hosting it only in the supplements | #eligible articles sharing at least some code |
articles sharing at least some code and hosting it in a repository in 2015/2016 | 52 | #eligible articles sharing at least some code and hosting it in a repository or GitHub only in 2015/2016 | #eligible articles sharing at least some code in 2015/2016 |
articles sharing at least some code and hosting it in a repository in 2018/2019 | 71 | #eligible articles sharing at least some code and hosting it in a repository or GitHub only in 2018/2019 | #eligible articles sharing at least some code in 2018/2019 |
articles with the potential to be computationally reproducible | 21 | #eligible articles sharing seemingly all code and data (if any used) | #eligible articles |
articles with the potential to be computationally reproducible in 2015/2016 | 20 | #eligible articles with the potential to be computationally reproducible in 2015/2016 | #eligible articles in 2015/2016 |
articles with the potential to be computationally reproducible in 2018/2019 | 21 | #eligible articles with the potential to be computationally reproducible in 2018/2019 | #eligible articles in 2018/2019 |
articles using free (non-proprietary) software | 74 | #eligible articles using free (non-proprietary) software | #eligible articles reporting software used |
articles using R | 79 | #eligible articles using R alone or together with other software | #eligible articles reporting software used |
articles not reporting the software used | 10 | #eligible articles not reporting the software used | #eligible articles |
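The first rows of Table S2 can be cross-checked against the counts reported in the methods text. This sketch assumes that the eligible-article denominators correspond to the non-molecular articles (172 in 2015/16; 174 in 2018/19) and that the numerators are the articles sharing at least some code (40 and 52). The helper `percent()` is introduced here for illustration and is not part of the original script.

```r
# Illustrative helper mirroring the rounding used for Table S2
percent <- function(numerator, denominator) {
  round(numerator / denominator * 100, 0)
}

# Counts from the methods text (assumed to match the script's denominators)
code.shared <- c("2015/16" = 40, "2018/19" = 52)
eligible    <- c("2015/16" = 172, "2018/19" = 174)

percent(sum(code.shared), sum(eligible))  # 27, articles sharing at least some code
percent(code.shared, eligible)            # 23 (2015/16) and 30 (2018/19)
```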
Table S3. List of 14 ecological journals reviewed in this study and their code-sharing policies (updated in March 2020), number of articles we reviewed, number of articles sharing at least some code, and the percentage of articles sharing at least some code (see also Figure 2 in the main text).
# import journal information and abbreviations
journal.info <- read.table("data/journals_info_v2.csv",header=T,sep=",")
# merging journal info to percentages
full.journal.info <- merge(full.journal,journal.info)
# removing full capitalization of journal names using the following function
# which was obtained from: https://stackoverflow.com/questions/6364783/capitalize-the-first-letter-of-both-words-in-a-two-word-string
simpleCap <- function(x) {
s <- strsplit(x, " ")[[1]]
paste(toupper(substring(s, 1,1)), substring(s, 2),
sep="", collapse=" ")
}
full.journal.info$Journal<- sapply(tolower(full.journal.info$Journal),simpleCap)
full.journal.info$Journal <- ifelse(full.journal.info$Journal=="Proceedings Of The Royal Society B-biological Sciences",
"Proceedings Of The Royal Society B-Biological Sciences",
full.journal.info$Journal)
# printing table S3
full.journal.info %>% select(Journal,Policy,total,codepublished,percentage) %>% rename('Code-sharing policy'=Policy,'#articles reviewed'=total,'#articles sharing code'=codepublished, '% articles sharing code'=percentage) %>% arrange(Journal) %>% kable("html") %>% kable_styling() %>% scroll_box(width = "100%", height = "500px")
Journal | Code-sharing policy | #articles reviewed | #articles sharing code | % articles sharing code |
---|---|---|---|---|
American Naturalist | Encouraged | 24 | 7 | 29 |
Ecological Applications | Mandatory | 14 | 2 | 14 |
Ecological Monographs | Mandatory | 6 | 3 | 50 |
Ecology | Mandatory | 32 | 5 | 16 |
Ecosphere | Mandatory | 44 | 6 | 14 |
Functional Ecology | Encouraged/Mandatory | 28 | 2 | 7 |
Heredity | Encouraged/Mandatory | 21 | 6 | 29 |
Journal Of Animal Ecology | Encouraged/Mandatory | 21 | 5 | 24 |
Journal Of Applied Ecology | Encouraged/Mandatory | 27 | 5 | 19 |
Journal Of Ecology | Encouraged/Mandatory | 21 | 3 | 14 |
Methods In Ecology And Evolution | Encouraged/Mandatory | 19 | 10 | 53 |
Molecular Ecology | Encouraged | 48 | 18 | 38 |
Molecular Ecology Resources | Encouraged | 18 | 4 | 22 |
Proceedings Of The Royal Society B-Biological Sciences | Mandatory | 77 | 16 | 21 |
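A note on how the per-journal percentages are assembled: `merge()` with its default arguments keeps only journals present in both data frames, so a journal with no code-sharing articles would be dropped rather than reported as 0%. Table S3 lists all 14 journals, so this does not appear to have happened here. The sketch below shows the behaviour and one possible alternative; the first two rows use counts from Table S3, and "HYPOTHETICAL JOURNAL" is an invented entry added only for illustration.

```r
# Two journals with counts from Table S3, plus one invented journal without code
articles.per.journal <- data.frame(
  Journal = c("ECOLOGY", "HEREDITY", "HYPOTHETICAL JOURNAL"),
  total   = c(32, 21, 10),
  stringsAsFactors = FALSE
)
code.published.per.journal <- data.frame(
  Journal       = c("ECOLOGY", "HEREDITY"),
  codepublished = c(5, 6),
  stringsAsFactors = FALSE
)

# Default merge: inner join, the journal without code disappears
inner <- merge(code.published.per.journal, articles.per.journal)

# Keeping every journal from articles.per.journal and filling missing counts with 0
outer <- merge(code.published.per.journal, articles.per.journal, all.y = TRUE)
outer$codepublished[is.na(outer$codepublished)] <- 0
outer$percentage <- round(outer$codepublished / outer$total * 100, 0)

outer
```

Whether zero-code journals should enter the minimum, median, and mean reported in Table S2 is a design choice; the sketch only makes the difference between the two joins visible.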
R session information detailing the versions and packages used in this script for reproducibility purposes.
sessionInfo() %>% pander()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale: LC_COLLATE=English_United States.1252, LC_CTYPE=English_United States.1252, LC_MONETARY=English_United States.1252, LC_NUMERIC=C and LC_TIME=English_United States.1252
attached base packages: stats, graphics, grDevices, utils, datasets, methods and base
other attached packages: kableExtra(v.1.0.0), knitr(v.1.21), ggpubr(v.0.2.3.999), magrittr(v.1.5), pander(v.0.6.3), forcats(v.0.3.0), stringr(v.1.4.0), purrr(v.0.3.2), readr(v.1.3.1), tidyr(v.0.8.3), tibble(v.2.1.1), tidyverse(v.1.2.1), ggplot2(v.3.1.0), dplyr(v.0.8.5) and openxlsx(v.4.1.0)
loaded via a namespace (and not attached): tidyselect(v.0.2.5), xfun(v.0.5), haven(v.2.0.0), lattice(v.0.20-35), colorspace(v.1.3-2), viridisLite(v.0.3.0), htmltools(v.0.3.6), yaml(v.2.2.0), base64enc(v.0.1-3), rlang(v.0.4.0), pillar(v.1.3.1), glue(v.1.3.1), withr(v.2.1.2), modelr(v.0.1.2), readxl(v.1.2.0), plyr(v.1.8.4), ggsignif(v.0.5.0), munsell(v.0.5.0), gtable(v.0.2.0), cellranger(v.1.1.0), rvest(v.0.3.2), zip(v.2.0.1), evaluate(v.0.12), highr(v.0.7), broom(v.0.5.0), Rcpp(v.1.0.1), scales(v.1.0.0), backports(v.1.1.2), webshot(v.0.5.1), jsonlite(v.1.6), hms(v.0.4.2), digest(v.0.6.18), stringi(v.1.4.3), grid(v.3.5.1), cli(v.1.1.0), tools(v.3.5.1), lazyeval(v.0.2.1), pacman(v.0.5.0), crayon(v.1.3.4), pkgconfig(v.2.0.2), xml2(v.1.2.0), lubridate(v.1.7.4), assertthat(v.0.2.1), rmarkdown(v.1.11), httr(v.1.4.0), rstudioapi(v.0.8), R6(v.2.4.0), nlme(v.3.1-137) and compiler(v.3.5.1)
Mislan, K. A. S., Jeffrey M. Heer, and Ethan P. White. 2016. “Elevating the Status of Code in Ecology.” Trends in Ecology & Evolution 31 (1): 4–7. https://doi.org/10.1016/j.tree.2015.11.006.
Ouzzani, Mourad, Hossam Hammady, Zbys Fedorowicz, and Ahmed Elmagarmid. 2016. “Rayyan: A Web and Mobile App for Systematic Reviews.” Systematic Reviews 5 (1): 210. https://doi.org/10.1186/s13643-016-0384-4.
Stodden, Victoria, Peixuan Guo, and Zhaokun Ma. 2013. “Toward Reproducible Computational Research: An Empirical Analysis of Data and Code Policy Adoption by Journals.” PLOS ONE 8 (6): e67111. https://doi.org/10.1371/journal.pone.0067111.
R Core Team. 2018. “R: A Language and Environment for Statistical Computing.” Vienna, Austria: R Foundation for Statistical Computing.