Mechanistic Understanding of ML/AI Systems Through Interdisciplinary Scientific Applications
Abstract
Modern artificial intelligence (AI) systems have achieved remarkable success across various scientific domains. However, fundamental questions remain about how these systems learn, make decisions, and generalize across different applications. This dissertation addresses these questions by systematically analyzing and improving AI systems through applications in physics, chemistry, and healthcare, demonstrating how mechanistic understanding can enhance practical performance.
First, we develop a unifying framework for understanding convolutional neural networks (CNNs) in quantum physics applications. By connecting CNNs to maximum entropy models and correlator product states, we show how they efficiently approximate quantum wavefunctions in exponentially large Hilbert spaces using only linearly many parameters. This analysis reveals how CNNs leverage quantum system symmetries and entanglement properties, leading to a new training algorithm that significantly reduces convergence time or the number of parameters while maintaining accuracy. This work establishes a bridge between physics and machine learning, providing a template for analyzing other neural architectures and suggesting when they might succeed or fail in solving certain physics problems.
Second, we develop a series of machine learning approaches for chemical spectroscopy analysis. The Characteristic Peak Extraction algorithm improves the accuracy of identifying chemical components in complex mixtures, while our Characteristic Peak Similarity metric enables accurate matching between different types of spectroscopic measurements. These tools are being actively tested to detect harmful chemicals in environmental samples and human organs, including polycyclic aromatic hydrocarbons in soil and placenta samples. This work creates more accessible and efficient tools for environmental monitoring, addressing a longstanding challenge in analytical chemistry, where traditional chemical spectroscopy methods require extensive laboratory facilities, expert knowledge, and time-consuming analysis procedures.
Finally, we advance medical diagnostics by creating interpretable deep learning models for ECG analysis that achieve high accuracy in detecting junctional ectopic tachycardia. Through explainable AI techniques, we systematically analyze how these networks make decisions by identifying the key ECG features they rely on, categorizing error patterns, and conducting root cause analysis of misclassifications. This mechanistic understanding not only validates the models' reasoning against clinical expertise but also provides insights for model improvement and clinical deployment. Beyond the immediate clinical impact, this contribution provides a framework for developing trustworthy AI systems in healthcare, where understanding decision-making processes is crucial for clinical adoption.
Together, these contributions advance our understanding of AI systems while demonstrating their practical impact across multiple scientific disciplines. The frameworks and methodologies developed in this dissertation provide a foundation for building more interpretable, efficient, and reliable AI systems.