Simulation study to determine defaults: Naive Bayes + TF-IDF

Active learning models were evaluated across four classification techniques (naive Bayes, logistic regression, support vector machines, and random forest) and two feature extraction strategies (TF-IDF and doc2vec). To assess how well the models generalize across research contexts, their performance was measured by running simulations on six systematic review datasets from various research areas. The models reduced the number of publications needed to screen by 63.9% to 91.7%. Overall, the Naive Bayes + TF-IDF model performed best.

Ferdinands, G., Schram, R., De Bruin, J., Bagheri, A., Oberski, D. L., Tummers, L., & Van de Schoot, R. (2020, September 16). Active learning for screening prioritization in systematic reviews – A simulation study. https://doi.org/10.31219/osf.io/w6qbg.
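To make the idea of "reducing the number of publications needed to screen" concrete, the sketch below computes a Work Saved over Sampling (WSS) style score from the order in which a simulated model presents records. This is an illustration under stated assumptions, not the study's code: the function name `work_saved_over_sampling`, the 95% recall level, and the toy labels are hypothetical, and this page does not state which metric underlies the reported percentages.

```python
import math


def work_saved_over_sampling(labels_in_screening_order, recall=0.95):
    """Return a WSS-style score for one simulated screening run.

    labels_in_screening_order: 1 (relevant) or 0 (irrelevant) for each record,
    in the order the model presented them to the reviewer.
    recall: fraction of relevant records that must be found (assumed 0.95).
    """
    n_total = len(labels_in_screening_order)
    n_relevant = sum(labels_in_screening_order)
    if n_total == 0 or n_relevant == 0:
        raise ValueError("need at least one record and one relevant record")

    # Number of relevant records that must be found to reach the target recall.
    target = math.ceil(recall * n_relevant)

    # Walk through the records in screening order until the target is reached.
    found = 0
    n_screened = 0
    for n_screened, label in enumerate(labels_in_screening_order, start=1):
        found += label
        if found >= target:
            break

    # Fraction of records left unscreened, minus the share that random
    # screening would also have saved at the same recall level.
    return (n_total - n_screened) / n_total - (1.0 - recall)


# Toy run (hypothetical data): 20 records, 4 relevant, all found after 6 records.
toy_order = [1, 1, 0, 1, 0, 1] + [0] * 14
print(f"WSS@95 = {work_saved_over_sampling(toy_order):.2f}")
```

A score close to 1 means the model surfaced nearly all relevant records early, so most of the dataset never had to be read; a score near 0 means the ordering saved little compared with screening in random order.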

Categories: Scientific Papers, Simulation Studies, Software

Check out similar projects

Optimizing ASReview simulations

Optimizing ASReview simulations with multiprocessing solutions for ‘light-data’ and ‘heavy-data’ users via a Kubernetes cluster.

The FORAS project

The FORAS project will replicate and extend an original review integrating advanced machine-learning techniques via the OpenAlex database.

The Noisy Label Filter procedure: a case study to address replication issues in systematic reviews

In this study, we addressed the lack of replicability of systematic review datasets. Using a case study format, we developed a procedure to optimize and finalize a reconstructed dataset, which is as a rule imperfect.

Reproducibility and Data storage for Active Learning-Aided Systematic Reviews

This systematic review focused on synthesizing information on studies that evaluated the performance of Active Learning compared to human reading.

Simulation Study Switching between Models

This systematic review focused on synthesizing information on studies that evaluated the performance of Active Learning compared to human reading.

Simulation Study on Risk Analysis Documents

ASReview conducted a simulation study on risk analysis documents to evaluate the time-benefit for the Royal Dutch Pharmacists Association.

The Mega Meta project: Substance use, anxiety and depression

The MegaMeta project is a large-scale project to review factors that contribute to substance use, anxiety, and depressive disorders. Read more about the search and screening protocol, hyperparameter tuning, and post-processing in this post.

Systematic Review on Studies Evaluating the Performance of Active Learning within Systematic Reviews

This systematic review focused on synthesizing information on studies that evaluated the performance of Active Learning compared to human reading.

Systematic Review on the Implementation of AI-aided Systematic Reviews in Clinical Guideline Development

The ASReview research team conducted a systematic review on the implementation of AI-aided Systematic Reviews within Clinical Guideline Development.

AI-aided literature screening in medical guideline development

In a time of exponential growth of new evidence supporting clinical decision making, combined with a labor-intensive process of selecting this evidence, there is a need for methods to speed up current processes in order to keep medical guidelines up-to-date.

Systematic reviews performed within the UU and UMC Utrecht

This dataset contains an overview of 117 systematic reviews published in 2020 by corresponding authors affiliated with Utrecht University (UU) or UMC Utrecht.

Wordcloud

ASReview-wordcloud is a supplemental package for ASReview. Wordclouds can help you to get a visual impression of the contents of datasets.