University of Zurich & University Hospital Zurich
$ health = f(data) $
Comparing neural networks with logistic regression for predicting readmission.
A novel computational method for predicting drug-drug interactions, which are an important consideration in patient treatment.
The amount of data that is stored in databases and must be analyzed is growing fast. Many analytical tasks are based on iterative methods that approximate optimal solutions. Propensity score matching is a technique that is used to reduce bias during cohort building. The main step is the propensity score computation, which is usually implemented via iterative methods such as gradient descent. Our goal is to support efficient and scalable propensity score computation over relations in a column-oriented database. To achieve this goal, we introduce shape-preserving iterations that update values in existing tuples until a fixed point is reached. Shape-preserving iterations enable gradient descent over relations and, thus, propensity score matching. We also show how to create appropriate input relations for shape-preserving iterations with randomly initialized relations. The empirical evaluation compares in-database iterations with the native implementation in MonetDB, where iterations are flattened.
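As a rough illustration of the iterative computation described above, the following sketch fits a logistic propensity model by gradient descent and stops once the weights stop changing (a fixed point up to a tolerance). It runs in plain Python rather than inside a database. The function names, learning rate, tolerance and iteration cap are assumptions made for this example, not the method from the paper.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def propensity_scores(rows, treated, lr=0.1, tol=1e-6, max_iter=10000):
    """Estimate P(treated = 1 | covariates) with logistic regression.

    rows:    list of covariate tuples, one per patient (hypothetical layout)
    treated: list of 0/1 treatment indicators, same length as rows
    """
    n, d = len(rows), len(rows[0])
    w = [0.0] * (d + 1)  # d covariate weights followed by an intercept
    for _ in range(max_iter):
        grad = [0.0] * (d + 1)
        for x, t in zip(rows, treated):
            # zip stops after the d covariate weights; w[d] is the intercept.
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[d])
            err = p - t
            for j in range(d):
                grad[j] += err * x[j]
            grad[d] += err
        # The weight vector keeps its shape; only its values are updated.
        new_w = [wi - lr * g / n for wi, g in zip(w, grad)]
        delta = max(abs(a - b) for a, b in zip(new_w, w))
        w = new_w
        if delta < tol:  # approximate fixed point reached
            break
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[d]) for x in rows]
```

Calling `propensity_scores([(0.0,), (1.0,), (2.0,), (3.0,)], [0, 1, 0, 1])` on this toy input returns one score per row; matching would then pair treated and untreated rows with similar scores.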
Artificial intelligence (AI) systems are increasingly being used in healthcare, thanks to the high level of performance that these systems have proven to deliver. So far, clinical applications have focused on diagnosis and on prediction of outcomes. It is less clear in what way AI can or should support complex clinical decisions that crucially depend on patient preferences. In this paper, we focus on the ethical questions arising from the design, development and deployment of AI systems to support decision-making around cardio-pulmonary resuscitation leading to the determination of a patient’s Do Not Attempt to Resuscitate (DNAR) status (also known as code status). The COVID-19 pandemic has made us keenly aware of the difficulties physicians encounter when they have to act quickly in stressful situations without knowing what their patient would have wanted. We discuss the results of an interview study conducted with healthcare professionals in a university hospital aimed at understanding the status quo of resuscitation decision processes while exploring a potential role for AI systems in decision-making around code status. Our data suggest that 1) current practices are fraught with challenges such as insufficient knowledge regarding patient preferences, time pressure and personal bias guiding care considerations and 2) there is considerable openness among clinicians to consider the use of AI-based decision support. We suggest a model for how AI can contribute to improving decision-making around resuscitation and propose a set of ethically relevant preconditions - conceptual, methodological and procedural - that need to be considered in further development and implementation efforts.
Background: MUC16 is a mucin marker that is frequently mutated in melanoma, but whether MUC16 mutations could be useful as a surrogate biomarker for tumor mutation burden (TMB) remains unclear. Methods: This study rigorously evaluates the MUC16 mutation as a clinical biomarker in cutaneous melanoma by utilizing genomic and clinical data from patient samples from The Cancer Genome Atlas (TCGA) and two independent validation cohorts. We further extended the analysis to studies with patients treated with immunotherapies. Results: Samples with MUC16 mutations had a higher TMB than wild-type samples, with strong statistical significance (P < 0.001) in all melanoma cohorts tested. Associations between MUC16 mutations and TMB remained statistically significant after adjusting for potential confounding factors in the TCGA cohort [OR, 9.28 (95% confidence interval (CI), 5.18–17.39); P < 0.001], Moffitt cohort [OR, 31.95 (95% CI, 8.71–163.90); P < 0.001], and Yale cohort [OR, 8.09 (95% CI, 3.12–23.79); P < 0.01]. MUC16 mutations were also found to be associated with overall survival in the TCGA [HR, 0.62 (95% CI, 0.45–0.85); P < 0.01] and Moffitt cohorts [HR, 0.49 (95% CI, 0.28–0.87); P = 0.014]. Strikingly, MUC16 is the only top frequently mutated gene for which prognostic significance was observed. MUC16 mutations were also found valuable in predicting anti–CTLA-4 and anti–PD-1 therapy responses. Conclusions: MUC16 mutation appears to be a useful predictive marker of global TMB and patient survival in melanoma.
Base editors are chimeric ribonucleoprotein complexes consisting of a DNA-targeting CRISPR-Cas module and a single-stranded DNA deaminase. They enable conversion of C•G into T•A base pairs and vice versa on genomic DNA. While base editors have vast potential as genome editing tools for basic research and gene therapy, their application has been hampered by a broad variation in editing efficiencies on different genomic loci. Here we perform an extensive analysis of adenine- and cytosine base editors on thousands of lentivirally integrated genetic sequences and establish BE-DICT, an attention-based deep learning algorithm capable of predicting base editing outcomes with high accuracy. BE-DICT is a versatile tool that in principle can be trained on any novel base editor variant, facilitating the application of base editing for research and therapy.
Advances in medical technology and IT infrastructure have led to increased availability of continuous patient data, which allows the longitudinal progression of novel and known diseases to be investigated in unprecedented detail. However, to accurately describe any underlying pathophysiology with longitudinal data, the individual patient trajectories have to be synchronized based on temporal markers. In this study, we use longitudinal data from 28 critically ill ICU COVID-19 patients to compare the commonly used alignment markers "onset of symptoms", "hospital admission" and "ICU admission" with a novel objective method based on the peak value of the inflammatory marker C-reactive protein (CRP). By applying our CRP-based method to align the progression of neutrophils and lymphocytes, we were able to define a pathophysiological window that allowed further mortality risk stratification in our COVID-19 patient cohort. Our data highlight that proper synchronization of patient data to the underlying pathophysiology is crucial to differentiate severity subgroups and to allow reliable interpatient comparisons.
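The alignment idea can be sketched in a few lines: each patient's time axis is shifted so that day 0 falls on the day of that patient's highest CRP value. The tuple layout and the function name below are assumptions for illustration only; the study's actual preprocessing is not described here.

```python
def align_to_crp_peak(series):
    """Shift one patient's time axis so that day 0 is the day of peak CRP.

    series: list of (day, crp, lab_value) tuples (hypothetical layout),
            where lab_value could be a neutrophil or lymphocyte count.
    Returns a list of (day_relative_to_peak, lab_value) tuples.
    """
    if not series:
        return []
    # max() returns the first maximal row, so with rows sorted by day
    # a tie in CRP resolves to the earliest peak day.
    peak_day = max(series, key=lambda row: row[1])[0]
    return [(day - peak_day, lab_value) for day, _crp, lab_value in series]
```

With made-up toy values, `align_to_crp_peak([(1, 50.0, 4.0), (2, 210.0, 9.5), (3, 120.0, 7.0)])` returns `[(-1, 4.0), (0, 9.5), (1, 7.0)]`, because the CRP peak falls on day 2. Applying the same shift to every patient puts all trajectories on a common, peak-centred time axis.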
To help patients find high-quality health information online, we developed a deep learning system that evaluates the quality of online health articles. The system implements the DISCERN criteria, which check for references, balanced writing, and more.
Choosing an optimal data fusion technique is essential when performing machine learning with multimodal data. In this study, we examined deep learning-based multimodal fusion techniques for the combined classification of radiological images and associated text reports. In our analysis, we (1) compared the classification performance of three prototypical multimodal fusion techniques: Early, Late, and Model fusion; (2) assessed the performance of multimodal compared to unimodal learning; and finally (3) investigated the amount of labeled data needed by multimodal vs. unimodal models to yield comparable classification performance. Our experiments demonstrate the potential of multimodal fusion methods to yield competitive results using less labeled training data than their unimodal counterparts. This was more pronounced using the Early and less so using the Model and Late fusion approaches. With an increasing amount of training data, unimodal models achieved comparable results to multimodal models. Overall, our results suggest the potential of multimodal learning to decrease the need for labeled training data, resulting in a lower annotation burden for domain experts.
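To make the distinction between two of these techniques concrete, here is a minimal sketch under common textbook definitions: early fusion joins the image and text feature vectors before a single classifier sees them, while late fusion combines the class probabilities of two separately trained unimodal classifiers. The function names and the weighted-average rule are assumptions for this example, and the study's own Early, Late and Model fusion implementations may differ. Model fusion is not sketched.

```python
def early_fusion(image_features, text_features):
    # Concatenate both modalities into one input vector for a single classifier.
    return list(image_features) + list(text_features)

def late_fusion(image_probs, text_probs, image_weight=0.5):
    # Weighted average of per-class probabilities from two unimodal classifiers.
    if len(image_probs) != len(text_probs):
        raise ValueError("both classifiers must predict the same classes")
    text_weight = 1.0 - image_weight
    return [image_weight * pi + text_weight * pt
            for pi, pt in zip(image_probs, text_probs)]
```

For instance, `early_fusion([1, 2], [3])` gives `[1, 2, 3]`, and `late_fusion([0.8, 0.2], [0.4, 0.6])` gives approximately `[0.6, 0.4]` with equal weights.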
Healthcare professionals have long envisioned using the enormous processing power of computers to discover new facts and medical knowledge locked inside electronic health records. These vast medical archives contain time-resolved information about medical visits, tests and procedures, as well as outcomes, which together form individual patient journeys. By assessing the similarities among these journeys, it is possible to uncover clusters of common disease trajectories with shared health outcomes. The assignment of patient journeys to specific clusters may in turn serve as the basis for personalized outcome prediction and treatment selection. This procedure is a non-trivial computational problem, as it requires the comparison of patient data with multi-dimensional and multi-modal features that are captured at different times and resolutions. In this review, we provide a comprehensive overview of the tools and methods that are used in patient similarity analysis with longitudinal data and discuss its potential for improving clinical decision making.
In recent years, advances in technology have enabled research with health data derived from large volumes of electronic health records (EHR) and other health-related data sources to improve innovation and quality in medicine. This has also been accelerated through national and international efforts offering access to repositories containing an increasing amount of clinical knowledge and collaborative platforms harmonizing not only the algorithms used, but also ontologies enabling better interoperability. At the same time, there is growing concern that the use of health data for publicly-funded research may lead to exposure of patients’ personal information, which potentially increases, among other things, risks for discrimination. Legislators have addressed this issue by implementing regulations to protect patient privacy, often focusing on data anonymization, i.e., the removal or masking of identifiable information. In this study, we analyze how the regulations in three jurisdictions (United States, European Union, Switzerland) distinguish between different levels of anonymization of health data, and assess whether and how these levels align with technical advancements.
Poly-ADP-ribose polymerase (PARP) inhibitors are active against cells and tumors with defects in homology-directed repair as a result of synthetic lethality. PARP inhibitors have been suggested to act by either catalytic inhibition or by PARP localization in chromatin. In this study, we treat human HCC1937 BRCA1 mutant and isogenic BRCA1-complemented cells for three weeks with veliparib, a PARP inhibitor. We show that long-term treatment with veliparib results in chromatin-bound PARP1 in the BRCA1 mutant cells, and that this correlates with significant upregulation of inflammatory genes and activation of the cyclic GMP–AMP synthase (cGAS)/signalling effector stimulator of interferon genes (STING) pathway. In contrast, long-term treatment of isogenic BRCA1-complemented cells with veliparib does not result in chromatin-associated PARP or significant upregulation of the inflammatory response. Our results suggest that long-term veliparib treatment may prime BRCA1 mutant tumors for positive responses to immune checkpoint blockade.
Uropathogenic Escherichia coli (UPEC) is the primary causative agent of uncomplicated urinary tract infections (UTIs). UPEC fitness and virulence determinants have been evaluated in a variety of laboratory settings that include a well-established mouse model of UTI. However, the extent to which bacterial physiology differs between experimental models and human infections remains largely understudied. To address this important question, we compared the transcriptomes of three different UPEC isolates in human infection and a variety of laboratory conditions including LB culture, filter-sterilized urine culture, and the UTI mouse model. We observed high correlation in gene expression between the mouse model and human infection in all three strains examined (Pearson correlation coefficient of 0.86-0.87). Only 175 of 3,266 (5.4%) genes shared by all three strains had significantly different expression levels, with the majority of them (145 genes) down-regulated in patients. Importantly, gene expression of both canonical virulence factors and metabolic machinery was highly similar between the mouse model and human infection, while the in vitro conditions displayed more substantial differences. Interestingly, comparison of gene expression between the mouse model and human infection hints at differences in bladder oxygenation as well as nutrient composition. In summary, our work strongly validates the continued use of this mouse model for the study of the pathogenesis of human UTI.
We conclude that data from patient timelines improve 30-day readmission prediction, that a logistic regression with LASSO performs as well as the best neural network model, and that the use of administrative data results in competitive performance compared to published approaches based on richer clinical datasets.
Clinical Data Science, Translational Bioinformatics, Cancer Genetics
Bioinformatics, Cancer Genomics, Long-read sequencing, cfDNA sequencing
Machine Learning, NLP, Image Processing
NLP, Machine Learning, Information Extraction, Domain Adaptation
Artificial Intelligence, Causal Inference
Bioinformatics, Computational Pharmacology, Deep Learning
Bioinformatics, Cancer Genomics
Human Disease Monitoring, Smart Sensors, IoT
Data Science Tooling, Reproducibility, Machine Learning
Machine Learning for Health Care
Sleeping, Hunting birds and mice, Scratching trees
Jumping all over the place 🦘, Knocking over Chess pieces ♟️♕, remote control 📺 and everything in front of him, Playing soccer ⚽ and chasing butterflies 🦋, Sleeping 😴 in front of laptops 💻, Participating in origami 📄🏮 and crafting activities 🧶🎨
Kinematics of toy mice, Hiding in cardboard boxes
Fluid dynamics, Burrowing under blankets
Belly rubs, Staying at home, Hanging around with her dog