Mock-up of the autoDISCERN validator.

Patients increasingly turn to the web for health information. This is great news for patient engagement… but only if they find good information! Unfortunately, low-quality articles are common on the internet. They put patients at risk of misinformation and can strain the relationship with their physician. To address this, researchers at the University of Oxford developed the DISCERN instrument: a set of criteria that any layperson can use to evaluate the quality of online health information. However, patients are unlikely to take the time to apply these criteria to the health websites they visit. Enter machine learning!

We built an automated implementation of the DISCERN instrument (Brief version) using machine learning models. We compared the performance of a traditional model (Random Forest) with that of a hierarchical encoder attention-based neural network (HEA) model using two language embeddings, BERT and BioBERT. The figure below summarizes the architecture of the HEA model.

Architecture of the Hierarchical Encoder Attention-based model used to evaluate health articles according to the DISCERN criteria.

Overall, we found that our models were able to reproduce the DISCERN criteria reasonably well. The HEA architecture with BioBERT encodings achieved an average F1 score of 0.74 across all criteria, which corresponds to an accuracy of 81%. In comparison, human raters achieve an accuracy of 94% on this task.
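To make the two metrics concrete, here is a minimal sketch of how an F1 score and an accuracy can each be averaged across criteria. It assumes every criterion is treated as a binary pass/fail label; the function names and the toy labels are made up for illustration and are not our evaluation code or data.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels (1 = criterion fulfilled, 0 = not fulfilled)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of articles where the prediction matches the rater's label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def average_over_criteria(metric, labels_by_criterion):
    """Unweighted mean of a metric over all criteria."""
    scores = [metric(y_true, y_pred)
              for y_true, y_pred in labels_by_criterion.values()]
    return sum(scores) / len(scores)


# Hypothetical toy data: criterion name -> (rater labels, model predictions).
toy_labels = {
    "sources": ([1, 1, 0, 0, 1], [1, 0, 0, 0, 1]),
    "risks": ([0, 1, 0, 0, 0], [0, 1, 1, 0, 0]),
}

print("average F1:", average_over_criteria(f1_score, toy_labels))
print("average accuracy:", average_over_criteria(accuracy, toy_labels))
```

The two averages generally differ because F1 ignores true negatives while accuracy counts them, so one number cannot be derived from the other without the underlying predictions.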

The attention mechanism in the HEA architectures made the models more explainable: for documents fulfilling a Brief DISCERN criterion, it identified reasonable supporting sentences. It also improved the F1 score by 0.05 compared to the same architecture without an attention mechanism. The examples below pair each criterion with the sentence the model attended to.

DISCERN criterion: Is it clear what sources of information were used to compile the publication (other than the author or producer)?
Attended sentence: American Journal of Geriatric Psychiatry.

DISCERN criterion: Is it clear when the information used or reported in the publication was produced?
Attended sentence: Review Date: 3/8/2013.

DISCERN criterion: Does it describe how each treatment works?
Attended sentence: Gentler martial arts which focus on internal control, breathing and mental discipline can be especially useful for combating depressed thinking and improving relaxation skills.

DISCERN criterion: Does it describe the benefits of each treatment?
Attended sentence: The mindfulness approach uses meditation, yoga, and breathing exercises to focus awareness on the present moment and break negative thinking

DISCERN criterion: Does it describe the risks of each treatment?
Attended sentence: Common side effects of SSRIs include:
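For readers curious how an attended sentence can be read off a model, here is a minimal sketch of sentence-level attention pooling. It is a generic dot-product attention, not the exact HEA implementation: the `attend` and `most_attended` helpers, the two-dimensional sentence vectors, and the context vector are all hypothetical stand-ins (in a real model the vectors would come from an encoder such as BERT or BioBERT, and the context vector would be learned).

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(sentence_vectors, context_vector):
    """Score each sentence against a context vector and pool the document.

    Returns the attention weights and the weighted-sum document vector.
    """
    scores = [sum(a * b for a, b in zip(vec, context_vector))
              for vec in sentence_vectors]
    weights = softmax(scores)
    dim = len(context_vector)
    doc_vector = [sum(w * vec[i] for w, vec in zip(weights, sentence_vectors))
                  for i in range(dim)]
    return weights, doc_vector


def most_attended(sentences, weights):
    """Return the sentence with the largest attention weight."""
    best = max(range(len(sentences)), key=lambda i: weights[i])
    return sentences[best], weights[best]


# Toy document: the middle sentence is from the examples above, the others
# and all numeric values are invented for illustration.
sentences = [
    "Depression is common in older adults.",
    "Review Date: 3/8/2013.",
    "Talk to your doctor about treatment options.",
]
sentence_vectors = [[0.1, 0.0], [2.0, 1.0], [0.3, 0.2]]
context_vector = [1.0, 0.5]  # stands in for a learned, per-criterion vector

weights, doc_vector = attend(sentence_vectors, context_vector)
sentence, weight = most_attended(sentences, weights)
print(sentence, round(weight, 3))
```

The same weights serve two purposes: they build the document vector that a classifier would consume, and they point to the sentence that contributed most to the prediction.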

Our research suggests that it is feasible to automate online health information quality assessment, which is an important step towards empowering patients to become informed partners in the healthcare process.

Laura Kinkead
Senior Research Software Engineer

Building software tools for Data Science.