Available Projects
Welcome to the Project Catalog for the National Science Foundation (NSF) Mathematical Sciences Graduate Internship (MSGI) Program. Students submitting an application to the NSF MSGI program are required to select at least one, but no more than three, projects.
For technical assistance with navigating Zintellect, contact Zintellect Support at Zintellect@orau.org.
Project Title  Citizenship Required  Reference Code  Posted Date  Posted Datetime  Hosting Site  Internship Location  Disciplines  Description 

No  ANLJin1  11/17/2023  1700197200000  Argonne National Laboratory (ANL)  Lemont, IL or Virtual  Applied Mathematics, Combinatorics, Geometry, Topology 
Project Description: Attributed graphs, which carry side information on nodes and edges, are widely used in many fields within DOE applications, such as neuroscience, biological discovery, the power grid, and distributed computing facilities. Learning the similarities of graphs, also known as graph matching, is one of the fundamental problems in machine learning tasks with structured data. Even though the graph-matching problem has been studied over the last decades, research on learning the similarities between attributed graphs remains open. Moreover, methods inspired by optimal transport, such as the Gromov-Hausdorff and (Fused) Gromov-Wasserstein distances, have shown promising results for comparing not only graph structures but also attributed graphs.
Disciplines: Applied Mathematics, Combinatorics, Geometry, Topology
Hosting Site: Argonne National Laboratory (ANL)
Internship location: Lemont, IL or Virtual


No  LBNLLiu1  11/17/2023  1700197200000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA  Applied Mathematics, Computational Mathematics 
Project Description: Butterfly decompositions are numerical linear algebra tools well-suited to representing many highly oscillatory transforms and integrals encountered in, e.g., solving wave equations, signal processing, and Fourier transforms. Although they can efficiently compress operators in low dimensions, their application to high-dimensional problems, e.g., 6D transforms and 3D wave equations, yields large prefactors or non-optimal asymptotic complexities. The project aims to investigate the tensor form, instead of the matrix form, of butterfly decompositions to address these high-dimensional challenges. The student will gain familiarity with butterfly algorithms and with tensor computation in both MATLAB and HPC implementations. With the guidance of a mentor, the student will conduct preliminary complexity analysis and numerical experiments to validate the benefits of tensorized butterfly representations.
Disciplines: Applied Mathematics, Computational Mathematics
Hosting Site: Lawrence Berkeley National Laboratory (LBNL)
Internship location: Berkeley, CA


No  ANLMallick1  11/17/2023  1700197200000  Argonne National Laboratory (ANL)  Lemont, IL  Analysis, Applied Mathematics 
Project Description: Spatiotemporal graph neural networks (GNNs) are widely used in many different applications, including frequency prediction on the power grid, traffic forecasting, and weather forecasting. However, as data sets grow larger and models become more complex, there is a pressing need to accelerate spatiotemporal GNNs for effective training and inference. To that end, we will investigate graph sampling and sparsification strategies for spatiotemporal GNNs. In the sampling process, features are aggregated by choosing a specific number of neighbors for each node or a specific number of nodes per layer. As the depth of the GNN grows, features from more and more neighbors are aggregated. However, with neighborhood aggregation, task-irrelevant information is sometimes mixed into nodes, resulting in poor generalization performance for the learned models. Therefore, in this project, we will concentrate on creating sparsification algorithms for spatiotemporal GNNs, where the sparsification will be built as a learnable module. Sparsification and sampling on the sparse graph will aid in the removal of task-irrelevant edges as well as the reduction of subsequent computation and memory access.
Disciplines: Analysis, Applied Mathematics
Hosting Site: Argonne National Laboratory (ANL)
Internship location: Lemont, IL
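As a rough illustration of the sampling step described above (choosing a fixed number of neighbors per node before aggregating features), here is a minimal sketch in plain Python. The toy graph, the scalar features, and the function names `sample_neighbors` and `mean_aggregate` are assumptions made for illustration; this is not the project's method, and the learnable sparsification module is not shown.

```python
import random


def sample_neighbors(adjacency, num_samples, seed=0):
    """Keep at most `num_samples` randomly chosen neighbors per node."""
    rng = random.Random(seed)
    sampled = {}
    for node, neighbors in adjacency.items():
        if len(neighbors) <= num_samples:
            sampled[node] = list(neighbors)
        else:
            sampled[node] = rng.sample(neighbors, num_samples)
    return sampled


def mean_aggregate(features, sampled):
    """Average the scalar features of each node's sampled neighbors."""
    aggregated = {}
    for node, neighbors in sampled.items():
        if neighbors:
            aggregated[node] = sum(features[n] for n in neighbors) / len(neighbors)
        else:
            # Isolated node: fall back to its own feature.
            aggregated[node] = features[node]
    return aggregated


# Hypothetical toy graph: node 0 is a hub with four neighbors.
adjacency = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
features = {0: 1.0, 1: 2.0, 2: 4.0, 3: 6.0, 4: 8.0}

sampled = sample_neighbors(adjacency, num_samples=2)
aggregated = mean_aggregate(features, sampled)
```

With `num_samples=2`, the hub aggregates over two of its four neighbors, which bounds the per-node cost regardless of degree.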


No  ANLMallick2  11/17/2023  1700197200000  Argonne National Laboratory (ANL)  Lemont, IL  Analysis, Applied Mathematics 
Project Description: Convolutional Neural Networks (CNNs) offer an efficient architecture for machine learning problems where the coordinates of the underlying data representation have a regular or Euclidean structure. The ability of CNNs to learn local stationary structures and compose them to form multiscale hierarchical patterns has led to breakthroughs in image, video, and sound recognition tasks. Nevertheless, in several scientific domains, one cannot apply standard CNNs: material structure data, gene data from biological regulatory networks, and traffic data from road networks are important examples of data lying on irregular or non-Euclidean domains. These domains can be represented by graphs, which are universal representations of heterogeneous pairwise relationships. Representing the data in the form of directed or undirected graphs and applying convolution and pooling is not straightforward, as the convolution and pooling operators are only defined for regular grids. In this project, we will focus on neural networks that operate on graphs. We will develop domain-specific convolution and pooling operations that extract patterns from data defined on graphs. We will evaluate the efficacy of the developed methods on data from transportation networks and supercomputer interconnect networks.
Disciplines: Analysis, Applied Mathematics
Hosting Site: Argonne National Laboratory (ANL)
Internship location: Lemont, IL


Yes  USACEStyles1  11/17/2023  1700197200000  U.S. Army Corps of Engineers, Engineer Research and Development Center (ERDC)  Vicksburg, MS  Analysis, Applied Mathematics, Mathematics (General), Operations Research, Probability and Statistics, Topology 
U.S. citizenship is a requirement for this internship.
Project Description: The student will use spectrogram images and digital data to identify patterns that indicate the passage of watercraft. An extensive suite of vessel wake data is available for developing robust training algorithms, as well as sample data for developing and verifying a vessel wake detection algorithm. The student should possess working knowledge of ML concepts and be able to work independently in a MATLAB and/or Python environment. Experience with data analysis, including digital filtering, wavelet analysis, and higher-level ML tools and applications, is highly desirable. Work will mostly be in an office setting, with some possibility of field work during vessel wake collections for interested students.
Disciplines: Analysis, Applied Mathematics, Mathematics (General), Operations Research, Probability and Statistics, Topology
Hosting Site: U.S. Army Corps of Engineers, Engineer Research and Development Center (ERDC)
Internship location: Vicksburg, MS


No  ANLMaulik1  11/17/2023  1700197200000  Argonne National Laboratory (ANL)  Lemont, IL  Applied Mathematics, Probability and Statistics 
Project Description: In this project, novel deep learning algorithms will be constructed to learn solutions to the Fokker-Planck equations for stochastic dynamical systems. The key challenges to overcome include the possibility of nonlocality (i.e., when such systems are driven by Lévy noise), high dimensionality, and non-Markovian characteristics. Potential data-driven solutions for such systems include the use of normalizing flows, generative adversarial networks, and neural stochastic differential equations. Some preliminary work in this area has been done by our team (across ANL, IIT-Chicago, and Johns Hopkins University) here: https://arxiv.org/pdf/2107.13735.pdf
Disciplines: Applied Mathematics, Probability and Statistics
Hosting Site: Argonne National Laboratory (ANL)
Internship location: Lemont, IL


No  ORNLPasini1  11/17/2023  1700197200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN 
Project Description: Deep learning models are gaining wide interest within the scientific computing community due to their role in detecting and explaining underlying correlations between the different physical quantities that characterize a complex system. In particular, graph convolutional neural networks (GCNNs) have shown great potential to describe the behavior of materials at the microscopic scale by accurately capturing and describing interatomic interactions. The predictive performance of GCNNs is very sensitive to the choice of multiple architectural and training hyperparameters, such as the number of neurons per layer, the number of convolutional layers, the number of fully connected layers, the radius cutoff, the activation functions at each hidden layer, the learning rate, and the batch size used to iteratively train the model. All these hyperparameters strongly impact the predictions made by a GCNN model, and GCNNs with different hyperparameter setups may produce vastly different predictions for the same input data. In particular, some choices of hyperparameters may lead to poor predictive performance due to numerical artifacts such as overfitting or underfitting. Therefore, identifying an appropriate setting of hyperparameters is essential to ensure the model’s accuracy and generalizability. Identifying a hyperparameter configuration that makes a GCNN both accurate and robust requires performing an exhaustive search over a high-dimensional space, which, in general, is computationally expensive. High performance computing can be leveraged to alleviate the computational burden of hyperparameter optimization (HPO) by concurrently exploring several hyperparameter configurations with distributed computing resources. In this work, we will develop and implement scalable HPO algorithms for GCNNs. We will use the Ray Tune library for hyperparameter tuning, and we will integrate the Ray Tune functionalities into an existing implementation of GCNNs.
The performance of the HPO procedure will be assessed in terms of: (1) the scalability attained by distributing the hyperparameter search over hundreds of compute nodes on supercomputers at the Oak Ridge Leadership Computing Facility (OLCF) and (2) the validation accuracy of the trained GCNN model on ab-initio density functional theory (DFT) data generated by material scientists at Oak Ridge National Laboratory (ORNL) that describe the functional behavior of solid solution alloys at the atomic scale. The expected outcome is a scalable HPO framework, integrated with the existing implementation of the GCNN model, that attains linear scaling up to 100 compute nodes on the OLCF supercomputer Summit and improves accuracy by a factor of 10 with respect to existing GCNN models trained on the DFT data.
Hosting Site: Oak Ridge National Laboratory (ORNL)
Internship location: Oak Ridge, TN
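The idea of exploring several hyperparameter configurations concurrently can be sketched with the Python standard library alone. The example below uses plain random search with a thread pool as a stand-in; it does not use Ray Tune, and the search space, the `validation_loss` function, and all names are hypothetical placeholders rather than the project's configuration or GCNN code.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical search space; the names and values are illustrative only.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "num_conv_layers": [2, 4, 6],
    "batch_size": [16, 32, 64],
}


def sample_config(rng):
    """Draw one configuration uniformly at random from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def validation_loss(config):
    """Stand-in for training a model and returning its validation loss."""
    return (
        abs(config["learning_rate"] - 1e-3) * 100
        + abs(config["num_conv_layers"] - 4) * 0.1
        + abs(config["batch_size"] - 32) / 320
    )


def random_search(num_trials, max_workers=4, seed=0):
    """Evaluate randomly sampled configurations concurrently; return the best."""
    rng = random.Random(seed)
    configs = [sample_config(rng) for _ in range(num_trials)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        losses = list(pool.map(validation_loss, configs))
    best = min(range(num_trials), key=lambda i: losses[i])
    return configs[best], losses[best]


best_config, best_loss = random_search(num_trials=20)
```

Because the trials are independent, the same pattern extends from threads on one machine to workers spread across many compute nodes, which is the setting the description targets.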


No  ORNLPasini2  11/17/2023  1700197200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description: Scientific computing has recently shed light on the effectiveness of artificial neural network (ANN) models as surrogates for complex multiscale physics-based models in order to accelerate expensive scientific calculations without compromising their accuracy. However, to this day ANN outputs are still challenging to interpret and explain in terms that are immediately mappable back to the application domain. This lack of interpretability and explainability limits the deployment of ANN models in several research applications where extracting meaningful cause-effect relationships is essential. Towards building an explainable and interpretable ANN model, one important task is monitoring which sections of the ANN architecture are activated or deactivated when a new specific set of input-output features is processed during the learning phase. This project aims to build an interpretable ANN model for simple learning tasks such as image recognition. Through a series of experiments in which a small ANN is taught to distinguish simple geometric shapes, we will gain an understanding of the discernment procedure learned by the network as a first step towards a deeper understanding of why these structures perform well as classifiers. The participant will learn about artificial neural networks, become familiar with software packages for training these networks, and explore the fundamental structure of ANNs with the goal of elucidating how they perform their tasks.
Disciplines: Applied Mathematics, Computational Mathematics, Probability and Statistics
Hosting Site: Oak Ridge National Laboratory (ORNL)
Internship location: Oak Ridge, TN


Yes  LANLLei1  11/17/2023  1700197200000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics 
U.S. citizenship is a requirement for this internship.
Project Description: Numerical modeling of nonlinear processes plays a key role in understanding the near-field dynamics of underground explosions and in monitoring far-field seismic data for nuclear treaty enforcement. The objective of this project is to conduct near-source physics modeling for selected experiments conducted under the U.S. Department of Energy, National Nuclear Security Administration’s Source Physics Experiments (SPEs). The goal is to gain more insight into the generation of shear waves in explosions. In this project, the simulations will be conducted using LANL’s Hybrid Optimization Software Suite (HOSS). HOSS, a 2016 R&D 100 Finalist, is a hybrid multiphysics software package integrating computational fluid dynamics (CFD) with state-of-the-art combined finite-discrete element methodologies (FDEM). HOSS has been widely used to predict shock wave propagation, large material deformation, and failure under extreme conditions (e.g., underground explosions). Alongside a mentor, the successful applicant will have an excellent opportunity to perform research on the modeling of underground explosions under the supervision of LANL’s scientists; they will have the chance to enhance their knowledge in a wide range of fields, from material modeling, numerical methods, and computational physics to high performance computing.
Disciplines: Applied Mathematics, Computational Mathematics
Hosting Site: Los Alamos National Laboratory (LANL)
Internship location: Los Alamos, NM


No  LBNLWILD1  11/17/2023  1700197200000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA or Virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description: This summer experience revolves around algorithm creation and analysis for design problems in which goals are only available through observations of complex systems. Such settings commonly arise in the field of derivative-free (zeroth-order) optimization. Our objectives include some combination of modeling specific problems, implementing novel algorithms, and analyzing (non)asymptotic performance.
Disciplines: Applied Mathematics, Computational Mathematics, Probability and Statistics
Hosting Site: Lawrence Berkeley National Laboratory (LBNL)
Internship location: Berkeley, CA or Virtual


No  ORNLPasini3  11/17/2023  1700197200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description: This project aims to explore and implement new mathematical solutions that improve the predictive performance of message passing operations in the graph layers used by GNN models in material science applications. Graph neural networks (GNNs) naturally interpret the atomic structure of materials as graphs, where atoms are interpreted as graph nodes and interatomic bonds are interpreted as graph edges. The accuracy of a GNN model strongly depends on the mathematical operations performed in transferring information across adjacent nodes in the graph representation of the atomic structure, also called message passing operations.
Student Requirements: The student is required to have basic knowledge of deep learning concepts and familiarity with PyTorch. Competence using PyTorch Geometric is desirable, but not required.
Student Responsibilities: The student is expected to implement new message passing algorithms and test their performance on open-source material science datasets. The student will collaborate with ORNL staff to validate the impact of the work on relevant scientific problems addressed by the material science community at ORNL.
Disciplines: Applied Mathematics, Computational Mathematics, Probability and Statistics
Hosting Site: Oak Ridge National Laboratory (ORNL)
Internship location: Oak Ridge, TN
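To make the notion of message passing concrete, here is a minimal sketch of a single round on a tiny graph, written in plain Python rather than PyTorch or PyTorch Geometric. The three-atom chain, the scalar features, and the update rule (an equal-weight blend of a node's own feature with the mean of its neighbors' features) are assumptions chosen for illustration; they are not the operations used or proposed by the project.

```python
def message_passing_step(node_features, edges):
    """One round of mean-aggregation message passing on an undirected graph."""
    # Each node collects the current features of its neighbors as "messages".
    messages = {node: [] for node in node_features}
    for u, v in edges:
        messages[u].append(node_features[v])
        messages[v].append(node_features[u])

    # Each node blends its own feature with the mean of the received messages.
    updated = {}
    for node, own in node_features.items():
        received = messages[node]
        neighbor_mean = sum(received) / len(received) if received else 0.0
        updated[node] = 0.5 * own + 0.5 * neighbor_mean
    return updated


# Hypothetical three-atom chain A - B - C with one scalar feature per atom.
node_features = {"A": 1.0, "B": 3.0, "C": 5.0}
edges = [("A", "B"), ("B", "C")]

updated = message_passing_step(node_features, edges)
print(updated)  # {'A': 2.0, 'B': 3.0, 'C': 4.0}
```

After one round, each end atom has moved toward its single neighbor, while the middle atom is unchanged because its neighbors average to its own value. Stacking rounds lets information travel across more bonds.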


No  ANLHückelheim1  11/17/2023  1700197200000  Argonne National Laboratory  Lemont, IL  Applied Mathematics 
Project Description: Our group has decades of experience in developing and using automatic differentiation, which is known as backpropagation or autodiff in machine learning frameworks. We are developing alternatives to backpropagation that take a more flexible approach to computing gradients, inspired by techniques developed in the context of differential equations and related problems.
Disciplines: Applied Mathematics
Hosting Site: Argonne National Laboratory
Internship location: Lemont, IL


No  ANLHückelheim2  11/17/2023  1700197200000  Argonne National Laboratory  Lemont, IL  Applied Mathematics 
Project Description: Our group has developed methods for mapping the evaluation of certain mathematical functions to modern processors, for example by exploiting the associativity of operators to allow dynamic scheduling and accumulation of results. This allows us to compute these functions faster and using less energy.
Disciplines: Applied Mathematics
Hosting Site: Argonne National Laboratory
Internship location: Lemont, IL


Yes  NISTGrey1  11/27/2023  1701061200000  National Institute of Standards and Technology (NIST)  Boulder, CO  Applied Mathematics, Geometry, Probability and Statistics, Topology 
U.S. citizenship is a requirement for this internship.
Project Description: We will be assisting with perspectives and interpretations that apply differential geometry and competitive linear and nonlinear model-based dimension reductions to explore applications ranging from computer vision to materials science and next-generation communications technology. Example applications include statistics of material microstructures, general image classification, and high-dimensional bi-criteria optimization for next-generation wireless spectrum sharing. The position requires a candidate curious to explore some or all of the following topics in applied mathematics: (i) novel low-dimensional visualization and approximation methods, (ii) abstractions of computational differential geometry over matrix manifolds, (iii) linear and nonlinear model-parameter dimension reduction, and (iv) infinite-dimensional extensions of spaces of discrete shapes.
Disciplines: Applied Mathematics, Geometry, Probability and Statistics, Topology
Hosting Site: National Institute of Standards and Technology (NIST)
Internship location: Boulder, CO


No  USFSSkowronski1  11/27/2023  1701061200000  USDA Forest Service, Northern Research Station  Morgantown, WV  Applied Mathematics, Probability and Statistics, Topology 
Project Description: Wildland fire is a natural process that has become problematic for society because of the expansion of human developments, increased fuel loads due to past fire suppression activities, climate change, and a myriad of other factors. Solutions to this problem require a more advanced understanding of the fundamental physical processes of these fires and how they propagate from the very small scale (fuel particles) to landscapes. Large efforts are currently underway to integrate highly instrumented field experiments, machine learning, artificial intelligence, and computational fluid dynamics models to advance our decision making in the future. The applicant, with the guidance of several mentors, will have the opportunity to design an experience that focuses on their analytical strengths to help us disentangle and understand complex relationships of fire spread and behavior. The applicant will examine a set (n=30) of recent fire field experiments with data including multitemporal 3D laser scanning (LiDAR), infrared and color video, 3D wind fields, temperature profiles, and radiative fluxes. The primary objectives of the experience are to: 1) expand the applicant’s understanding of datasets of different spatial and temporal resolutions, 2) develop an approach to decompose and relate these data streams, and 3) present the techniques and results in a way that is understandable to scientists from other disciplines and to land managers. This internship will be based at the Forestry Sciences Laboratory in Morgantown, WV, in collaboration with scientists from the USDA Forest Service, Rochester Institute of Technology, West Virginia University, and other institutions. The applicant will have the opportunity to collect data (in a learning setting) with the same instruments used in the fire experiments to understand their intricacies and limitations.
Depending on COVID restrictions, the applicant may have the opportunity to visit several field sites, interact with other scientists and fire managers, and observe a prescribed burn.
Disciplines: Applied Mathematics, Probability and Statistics, Topology
Hosting Site: USDA Forest Service, Northern Research Station
Internship location: Morgantown, WV


Yes  USACEBarker1  11/27/2023  1701061200000  U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory (CRREL)  Fairbanks, AK  Analysis, Probability and Statistics 
U.S. citizenship is a requirement for this internship.
Project Description: Watersheds and soil systems in the Arctic are intensely seasonal. Seasonal transitions significantly impact watershed geochemistry and the soil thermal regime. In the spring, there is a large increase in water flow as a result of snowpack melting. During summer, the thawing of the active layer extends, and the majority of surface water flow is derived from rainfall and whatever small amount of baseflow may exist. In the late fall, pore waters are pushed deeper into the soil column and the surface of the soil is frozen, but the active layer is at its deepest yearly extent. This time of the year is often not studied because access becomes difficult in the winter and everything is assumed to be frozen. In reality, reactions do continue to occur, and moving into winter a portion of the active layer remains thawed while the surface is frozen. As deepening of the active layer into previously frozen material is expected with climate change, there remains a limited understanding of subsurface geochemistry and elemental behavior during this shoulder season. We have robust datasets that include thousands of thaw depth measurements tied to soil chemistry and soil temperature data, where specific correlations, relationships, and key variables need to be determined and may help to elucidate the complicated soil thermal regime in the Arctic. The intern will assist with statistical and multivariate analysis, including but not limited to principal component analysis, linear combination fitting, and variance analyses. The intern will join a well-rounded group of scientists in various disciplines such as geochemistry, engineering, geophysics, and sensor development, join office meetings, present results, and may assist with field work in Alaska, if interested.
The intern should have experience with algorithm development, MATLAB, and multivariate statistical analysis, as well as an interest in geochemistry, geology, and/or chemistry.
Disciplines: Analysis, Probability and Statistics
Hosting Site: U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory (CRREL)
Internship location: Fairbanks, AK


No  LLNLGuenther1  11/27/2023  1701061200000  Lawrence Livermore National Laboratory (LLNL)  Livermore, CA  Applied Mathematics 
Project Description: To enable practically useful quantum computing, this project aims to improve the error rate of logical gates on superconducting quantum devices by augmenting the quantum dynamical model with a data-driven approach to identify and incorporate latent dynamics. Quantum dynamics are often modeled by Schrödinger’s and Liouville-von Neumann’s equations, which evolve the quantum state and density matrix according to a Hamiltonian model. The Universal Differential Equations approach augments this model with a neural network that is trained on device data to account for unknown dynamics, such as drift in system parameters, control line losses, cross talk, and environmental interactions. The trained augmented model will provide a more accurate simulation of the quantum dynamics in superconducting quantum devices, which will then be used within our optimal control software stack to design control strategies that achieve higher fidelity of quantum gates on noisy quantum hardware. The trained neural network will further be analyzed using symbolic regression to provide a mathematical description and understanding of the latent quantum dynamics.
Disciplines: Applied Mathematics
Hosting Site: Lawrence Livermore National Laboratory (LLNL)
Internship location: Livermore, CA


No  NISTDOGAN1  11/27/2023  1701061200000  National Institute of Standards and Technology (NIST)  Gaithersburg, MD  Applied Mathematics, Geometry, Probability and Statistics 
Project Description: The goal of this project is to develop tools for image and data analysis by leveraging scientific computing and machine learning algorithms. Various research opportunities exist in the following topics:
Disciplines: Applied Mathematics, Geometry, Probability and Statistics
Hosting Site: National Institute of Standards and Technology (NIST)
Internship location: Gaithersburg, MD


No  LANLWang1  11/27/2023  1701061200000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics, Geometry, Probability and Statistics 
Project Description: Large language models (LLMs) such as ChatGPT and LLaMA have had a transformative impact on natural language processing, with significant implications for scientific knowledge processing. A better understanding of LLMs may be necessary for further advances in this highly interdisciplinary field. This project will involve learning about LLMs, their applications, and their mathematical foundations, and exploring further possibilities for applying LLMs to physics and other natural sciences, including the processing of scientific knowledge.
Disciplines: Applied Mathematics, Computational Mathematics, Geometry, Probability and Statistics
Hosting Site: Los Alamos National Laboratory (LANL)
Internship location: Los Alamos, NM


No  NISTJia1  11/27/2023  1701061200000  National Institute of Standards and Technology (NIST)  Gaithersburg, MD or virtual  Applied Mathematics, Computational Mathematics, Mathematical Biology Analysis 
Project Description: Over the past three centuries, significant attention has been dedicated to the study of how mechanical instabilities such as wrinkling, buckling, and blistering cause patterns to emerge in inanimate materials. Can these principles be applied to the analysis of living systems? From the fractal-like edges of a kale leaf to the undulating crinkles of the human brain, many of the striking morphologies emblematic of living entities do not arise exclusively from gene expression programs but rather from the presence of mechanical forces. Elucidating this relationship between applied stresses and geometric forms will help guide scientific innovations ranging from the creation of bio-inspired materials to the establishment of metrological standards. Students will gain experience with topics in applied and computational mathematics, particularly the calculus of variations, differential geometry, and nonlinear partial differential equations.
Disciplines: Applied Mathematics, Computational Mathematics, Mathematical Biology Analysis
Hosting Site: National Institute of Standards and Technology (NIST)
Internship location: Gaithersburg, MD or virtual


No  ANLGraziani1  11/27/2023  1701061200000  Argonne National Laboratory (ANL)  Lemont, IL  Applied Mathematics, Probability and Statistics 
Project Description: The purpose of this project is to study large language models (LLMs) that are founded on explicit probabilistic models, rather than on transformer-type architectures. The objective is to obtain predictive models for discrete sequence data such as natural language, DNA sequences, chemical formulae, etc. that are transparent in their operation and amenable to furnishing characterizations of predictive uncertainty. The MSGI intern will participate in the theoretical development and exploration of models, and in the exploration of corpora of sequential data to determine model structures appropriate to each type of data. As this is a computational statistics project, the intern should expect to write code implementing models and data exploration methods. Equally important is the expectation that the intern will prepare a report that should function as an outline or first draft of a paper on the methods developed in the project. The intern will learn techniques for probabilistic modeling of discrete sequences and for characterizing the statistical behavior of this type of data. A model currently in development uses Gaussian process (GP)-type methods, so the intern should expect to learn the theory and practice of GPs. Model training is likely to require use of Argonne's high-performance computing (HPC) facilities, and the intern should also expect to learn to use such facilities.
Disciplines: Applied Mathematics, Probability and Statistics
Hosting Site: Argonne National Laboratory (ANL)
Internship location: Lemont, IL


No  ANLLarson2  11/27/2023  1701061200000  Argonne National Laboratory (ANL)  Lemont, IL  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description: Expensive-to-evaluate stochastic oracles appear throughout the quantum information sciences research field. Unfortunately, optimizing such oracles is considerably more difficult than optimizing deterministic oracles, because repeated calls to the oracle produce varying output even with a fixed set of input variables. This project seeks to develop, analyze, and implement numerical methods for optimizing oracles that are stochastic in nature and relatively expensive to evaluate. Application problems include stochastic oracles that appear across the quantum information sciences. We are especially interested in using (polynomial and Bayesian) model-based methods to identify high-quality local optima for such problems.
Disciplines: Applied Mathematics, Computational Mathematics, Probability and Statistics
Hosting Site: Argonne National Laboratory (ANL)
Internship location: Lemont, IL
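The difficulty described above, namely that repeated calls at a fixed input return different values, can be seen in a small sketch. The oracle below is a hypothetical noisy quadratic invented for illustration, not one of the project's quantum information oracles, and averaging repeated calls is shown only as the simplest baseline for taming the noise, not as the model-based methods the project targets.

```python
import random
import statistics


def stochastic_oracle(x, rng, noise_std=1.0):
    """Hypothetical noisy objective: true value (x - 2)^2 plus Gaussian noise."""
    return (x - 2.0) ** 2 + rng.gauss(0.0, noise_std)


def averaged_oracle(x, rng, num_calls):
    """Estimate the objective at x by averaging repeated oracle calls."""
    return statistics.fmean(stochastic_oracle(x, rng) for _ in range(num_calls))


rng = random.Random(0)
x = 3.0  # the noise-free objective value at this point is 1.0

# Repeated calls at the same input give different outputs.
single_calls = [stochastic_oracle(x, rng) for _ in range(200)]

# Averaging 25 calls per estimate shrinks the spread, at 25 times the cost.
averaged_calls = [averaged_oracle(x, rng, num_calls=25) for _ in range(200)]

spread_single = statistics.stdev(single_calls)
spread_averaged = statistics.stdev(averaged_calls)
```

The trade-off is visible in the call counts: each averaged estimate spends 25 oracle evaluations, which is exactly what becomes prohibitive when every evaluation is expensive.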


No  ORNLTabassum1  11/27/2023  1701061200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Analysis, Applied Mathematics 
Project Description:In this project, the student will learn to work on generalizable deep causality representation learning for scientific data. Causality learning is a challenging task for many scientific domains due to the high dimensionality of the data and the limited availability of ground truth. Learning causality structure between multiple variables from low-fidelity simulations can aid in modeling and understanding high-fidelity data. This project will provide an opportunity to explore state-of-the-art deep learning-based causality techniques, especially transformers and graph neural networks, on large-scale, high-dimensional scientific data, e.g., neuroscience, climate, etc. Disciplines: Analysis, and Applied Mathematics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentors:
Internship Coordinator:


No  ANLRAGHAVAN1  12/7/2023  1701925200000  Argonne National Laboratory  Lemont, IL  Applied Mathematics 
Project Description:Dataset imbalance refers to the situation in which certain classes are represented by significantly more data points than others. It is a prevalent issue in machine learning, especially in classification problems in many scientific applications. The issue manifests itself when the final performance of a model is biased towards the class with a larger number of sample points. One way to correct this bias is to equalize the imbalance, and intelligent sampling strategies play a critical role in this procedure. However, due to a lack of efficient approaches, a common way to address the issue involves trial-and-error-driven uniform oversampling of the underrepresented class or undersampling of the overrepresented class. In this project, we will formulate the problem of imbalance in a data batch as an optimization problem and derive conditions which must be satisfied for sampling a balanced data batch. We will then integrate these conditions into the neural network learning problem. We will develop a game-theoretic approach to resolve the tradeoff between the performance of the neural network and the variance in the data. Disciplines: Applied Mathematics Hosting Site:Argonne National Laboratory Internship location: Lemont, IL Mentors:
Internship Coordinator:
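For readers unfamiliar with the baseline mentioned in this description, the following is a minimal sketch of uniform random oversampling, in which every class is resampled with replacement until it matches the largest class. It illustrates only the common trial-and-error baseline, not the optimization-based approach the project proposes; the function name oversample_minority and the toy data are assumptions made for the example.

```python
import numpy as np


def oversample_minority(X, y, seed=0):
    """Uniformly oversample every class up to the size of the largest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    selected = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # Draw the missing samples uniformly, with replacement, from this class.
        extra = rng.choice(idx, size=target - count, replace=True)
        selected.append(np.concatenate([idx, extra]))
    selected = np.concatenate(selected)
    return X[selected], y[selected]


# Toy data: 8 points in class 0 and 2 points in class 1.
X = np.arange(20).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)
X_balanced, y_balanced = oversample_minority(X, y)
```

Because the extra samples are exact duplicates, this baseline balances class counts without adding new information.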


No  USDAAmatya1  12/7/2023  1701925200000  USDA Forest Service, Southern Research Station  Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:With rapid development in computing and its ability to process big data combined with artificial intelligence theory, extreme learning machine (ELM) and adaptive neuro-fuzzy inference system (ANFIS) models are gaining attention in large applications. For this study, our focus is limited to computing evapotranspiration (ET, the loss of water to the atmosphere) using high-resolution weather data. ET is a critical climate variable that uniquely links the water cycle (evaporation), energy cycle (latent heat flux), and carbon cycle (transpiration-photosynthesis tradeoff), each of which is described by complex process-based mathematical equations and their physical parameters. ET for an ecosystem is a complex and nonlinear process that is difficult to measure accurately and to estimate or predict. This complexity can be addressed by applying machine learning techniques with different sets of hydrometeorological input variables. We hypothesize that machine learning models, including artificial neural network (ANN), support vector machine (SVM), and random forest (RF) with different optimization techniques, can predict ecosystem ET better than the myriad empirical models available in the literature. This study will investigate the performance of different machine learning and deep learning models to predict daily ET using available meteorological and ecohydrological data. The candidate will be introduced to the background of data-driven empirical statistical models and their parameters. With their background in Mathematics and Statistics, they will apply a suite of machine learning and deep learning models with optimization techniques to simulate ET values using weather data recorded at the USDA Forest Service Experimental Forest sites in coastal South Carolina, as well as Coweeta Hydrologic Laboratory in upland North Carolina, both being used for long-term silvicultural research involving hydrology, ecology, soils, and vegetation. 
The candidate will be mentored by two hydroinformaticians and will be provided with an opportunity to make the project live and publishable and to develop networking opportunities by presenting at the annual Santee Experimental Forest Research Forum and other venues near the completion of the project. The student will also learn about field experimental studies, hydrologic processes including ET represented by mathematical equations and their prediction uncertainties, real-time monitoring technology, and managing and analyzing big data sets using statistics at the host SEF study site. Disciplines: Analysis, Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:USDA Forest Service, Southern Research Station Mentors:


No  LANLBhattarai5  12/7/2023  1701925200000  Los Alamos National Laboratory (LANL)  Los Alamos, NM or virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Adversarial robustness is a pressing concern in multimodal frameworks, especially in large language models (LLMs) that solve a spectrum of problems like text-to-image, image-to-text, and text-to-text translations. Multimodal LLMs such as ChatGPT, despite their powerful capabilities, are vulnerable to adversarial attacks. These attacks subtly manipulate input modalities, compromising model responses, and can even be insidiously embedded within training datasets, affecting model embeddings and overall performance. Building upon our previous work, where we utilized the AdversarialTensors framework to effectively mitigate adversarial noises in an unsupervised manner for standard CNNs, we aim to extend this ability to enhance the robustness of multimodal frameworks. Our goal is to purify input modalities from adversarial contaminations before they are processed by LLMs and to fortify model robustness by integrating tensorial defense strategies alongside LoRA, a widely adopted, low-complexity fine-tuning method. Moreover, by integrating knowledge graphs, we can harness the structured and interlinked knowledge between input and corresponding embeddings to identify and counteract inconsistencies or adversarial perturbations, thereby enhancing the model's ability to reason and improve its resilience against malicious attacks. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM or virtual Mentors:
Internship Coordinator:


No  LANLKe1  12/7/2023  1701925200000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Computational Mathematics, Mathematical Biology 
Project Description:This project aims to develop spatial mechanistic models of respiratory virus infection (such as SARS-CoV-2, influenza) and the immune responses to understand how stochasticity and immune regulation affect viral infection and pathogenesis. Respiratory viruses (such as SARS-CoV-2 and influenza) cause high mortality and morbidity every year. Their infections usually result in a wide range of clinical outcomes, from asymptomatic to lethal. There is an urgent need to quantitatively define the specific factors that govern infection outcomes to develop better therapeutic and intervention strategies. Therefore, the work will be critical for rational design of vaccines to prepare for the current or next pandemic caused by respiratory viruses. The intern will learn mathematical modeling techniques using computer simulations as well as biological knowledge about the processes of respiratory virus infection. In addition, the intern will have the opportunity to develop data science skills through analyzing data from experimental collaborators and to develop machine learning models that will be trained on both simulated datasets and experimental datasets to make biologically relevant predictions. This internship will be ideal for a candidate who is interested in developing data science skills to address biological and health science questions. Disciplines: Computational Mathematics, and Mathematical Biology Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentor:


No  LANLTokareva1  12/7/2023  1701925200000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics 
Project Description:Matrix-free high-order finite element method (MFFEM) is a cutting-edge method to perform fast, robust, and accurate simulations of advection-diffusion partial differential equations (PDEs) that are ubiquitous in computational physics. However, currently the application of MFFEM is limited to algorithms that are explicit in time. The presence of diffusion or stiff source terms calls for implicit or implicit-explicit time-stepping methods, which negatively impacts the efficiency of MFFEM. To overcome this performance issue, we propose to couple fast multigrid (MG) methods for the discrete diffusion operators with MFFEM for advection operators. Geometric multigrid (GMG) methods are one of the fastest classes of linear and nonlinear solvers for implicit equations arising in numerical methods for PDEs. GMG methods typically rely on a hierarchy of nested structured meshes, and the physical equations are solved on "coarse" meshes to provide corrections to the solution on the original "fine" mesh. For unstructured meshes that are moving or adapting in time, it is not obvious how to construct a mesh hierarchy, and algebraic multigrid (AMG) methods provide a less intrusive multigrid option that is often used instead. However, AMG methods require assembling a full matrix for the discretization, and, although efficient, do not compete with GMG in terms of computational efficiency, particularly on GPUs. Some previous work has considered semi-geometric MG (sGMG) methods, where an initial simplicial (triangular) mesh is coarsened to construct a nested grid hierarchy on which to apply GMG. We are, however, interested in coupling implicit solvers to Arbitrary Lagrangian-Eulerian simulations of hydrodynamics, where quadrilateral and hexahedral (non-simplicial) meshes offer significantly higher physics fidelity. 
The goals of the proposed project are: (1) the development of novel sGMG solvers for the advection-diffusion PDEs on quadrilateral and hexahedral meshes and (2) efficient implementation in LANL’s open-source GPU-based code Fierro (https://lanl.github.io/Fierro). Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentors:
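To make the coarse-grid correction idea in this description concrete, here is a minimal two-grid sketch for the 1D Poisson problem -u'' = f on a uniform mesh with zero boundary values. It is a textbook geometric multigrid illustration under simplifying assumptions (dense matrices, weighted Jacobi smoothing, a direct solve on the coarse mesh); it is not the sGMG solver proposed in the project and is unrelated to the Fierro code. All function names are hypothetical.

```python
import numpy as np


def poisson_matrix(n):
    """Second-order finite-difference matrix for -u'' with n interior points."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2


def weighted_jacobi(A, u, f, sweeps=3, omega=2.0 / 3.0):
    """Damped Jacobi smoother: reduces the high-frequency error components."""
    d = np.diag(A)
    for _ in range(sweeps):
        u = u + omega * (f - A @ u) / d
    return u


def restrict(r):
    """Full-weighting restriction from n fine points to (n - 1) // 2 coarse points."""
    return 0.25 * (r[0:-2:2] + 2.0 * r[1:-1:2] + r[2::2])


def prolong(ec, n):
    """Linear interpolation from the coarse mesh back to n fine points."""
    e = np.zeros(n)
    e[1::2] = ec  # fine points that coincide with coarse points
    padded = np.concatenate(([0.0], ec, [0.0]))  # zero boundary values
    e[0::2] = 0.5 * (padded[:-1] + padded[1:])  # in-between points
    return e


def two_grid_cycle(A, Ac, u, f):
    """Pre-smooth, solve the residual equation on the coarse mesh, correct, post-smooth."""
    u = weighted_jacobi(A, u, f)
    coarse_error = np.linalg.solve(Ac, restrict(f - A @ u))
    u = u + prolong(coarse_error, len(u))
    return weighted_jacobi(A, u, f)


n = 63  # fine mesh; the coarse mesh has (n - 1) // 2 = 31 points
A, Ac = poisson_matrix(n), poisson_matrix((n - 1) // 2)
x = np.linspace(0.0, 1.0, n + 2)[1:-1]
f = np.pi**2 * np.sin(np.pi * x)  # exact solution is sin(pi * x)
u = np.zeros(n)
initial_residual = np.linalg.norm(f - A @ u)
for _ in range(5):
    u = two_grid_cycle(A, Ac, u, f)
final_residual = np.linalg.norm(f - A @ u)
```

A full multigrid method applies the same cycle recursively instead of solving the coarse problem directly, which requires the nested mesh hierarchy discussed in the description.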


No  ORNLMoriano3  12/7/2023  1701925200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN or virtual  Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Many complex systems are usually represented by networks (e.g., communication networks, power grids, social networks, etc.). Communities or clusters are key to understanding the structure and function of these complex systems because communities represent important functional modules in networked systems. Therefore, there is an increasing interest in understanding the limits of the robustness of the community structure. This is because maintaining the functionality of networked systems is heavily dependent on preserving their community structure. Given ORNL's expertise in modeling and simulation of complex systems using leadership computing facilities, this project will take advantage of modern data science, machine learning, and network science techniques, or any technique of interest to the participant that could help to better understand the limits of the robustness of the community structure in interconnected systems. We are highly interested in exploring data clustering techniques that use graph embeddings for clustering both synthetic and empirical real-world networks. We will seek to understand the limitations of current graph embedding methods for clustering tasks and propose ways to alleviate these limitations. This project will allow the participant to actively drive an exciting facet of an ongoing research project at ORNL, and have their contributions directly integrated into the Computer Science and Mathematics Division research priorities. A successful student has prior experience with data science techniques, machine learning, and network science, but is not expected to have deep experience with programming. Notably, prior projects at ORNL by interns in this team have led to published papers at top interdisciplinary/applied math journals. Examples include:
Based on the findings here, we will also seek to publish a paper in a major data science/applied math venue with the participant as the lead author. Feel free to reach out with any questions. Disciplines: Analysis, Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN or virtual Mentor:
Internship Coordinator:


No  ORNLMoriano4  12/7/2023  1701925200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN or virtual  Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:The promise of ubiquitous connectivity brought by emerging technologies (e.g., 5G/6G) will enable interconnected systems (e.g., Internet of things (IoT), smart cities, science facilities, among others) to work faster and more efficiently. However, ubiquitous connectivity comes at the expense of augmenting the attack surface and the risk of suffering from advanced cyber attacks. Detection and adaptation to such threats requires collecting, filtering, and analyzing event data of heterogeneous scale, speed, and modality. The design and deployment of trustworthy methods and algorithms that can accommodate volume, velocity, and variety, which is key for near-real-time detection, continuous adaptation, and quick attack recovery, is challenging and requires innovation. Thus, to address the emerging challenges of the next generation of cyber threats, the goal of this project is to design and implement a suite of algorithms that allow near-real-time detection of adaptive cyber attacks in interconnected systems. Given ORNL's expertise in acquisition and processing of data for enterprise and cyber-physical domains using leadership computing facilities along with leading researchers in applied mathematics, computer science, statistics, and cybersecurity, this project will take advantage of modern data science, machine learning, or any technique of interest to the participant (including but not limited to multivariate time series analysis, clustering, graph mining, etc.) that could help to design and implement algorithms that will enable quick detection, adaptation, and recovery from sophisticated and evolving cyberattacks targeting modern interconnected systems. 
Learning objectives for the applicant include: (1) develop a basic understanding of novel cyberattacks affecting modern interconnected systems; (2) design and develop data-driven models for detecting advanced cyberattacks; (3) validate the efficacy of the developed algorithms using empirical data and compare their performance against state-of-the-art classical approaches. This project will allow the participant to actively drive an exciting facet of an ongoing research project at ORNL, and have their contributions directly integrated into the Computer Science and Mathematics Division research priorities. A successful participant has prior experience with data science techniques and machine learning but is not expected to have deep experience with programming. Notably, prior projects at ORNL by interns in this team have led to papers published at top interdisciplinary/applied mathematics journals. Examples include:
Based on the findings here, we will also seek to publish a paper in a major data science/applied math venue with the participant as the lead author. Feel free to reach out with any questions. Disciplines: Analysis, Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN or virtual Mentor:
Internship Coordinator:


No  LLNLSarracino1  12/7/2023  1701925200000  Lawrence Livermore National Laboratory (LLNL)  Livermore, CA or virtual  Computational Mathematics, Foundations 
Project Description:Kleene algebra with tests (KAT) is an algebraic technique for mathematical reasoning about the behavior of computer programs. In this project we will explore the semantic foundations of KAT as applied to the task of program verification. Potential projects include:
Disciplines: Computational Mathematics, and Foundations Hosting Site:Lawrence Livermore National Laboratory (LLNL) Internship location: Livermore, CA or virtual Mentor:
Internship Coordinator:


No  ORNLKotevska2  12/7/2023  1701925200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Analysis, Applied Mathematics, Computational Mathematics, Mathematics (General), Probability and Statistics 
Project Description:Graph Neural Networks (GNNs) have gained significant attention owing to their ability to handle graph-structured data and their improvement of practical applications. However, many of these models prioritize high utility performance, such as accuracy, with little consideration of privacy, a significant concern in modern society where privacy attacks are rampant. In this project, we want to focus on privacy-preservation techniques. With guidance from a mentor, the student will help develop new defense mechanisms and design a comparison study with state-of-the-art methods. The student will learn about privacy-preservation algorithms, graph neural networks, numerical analysis, and writing/presentation skills.
Disciplines: Analysis, Applied Mathematics, Computational Mathematics, Mathematics (General), and Probability and Statistics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:
Internship Coordinator:


No  LANLTaitano1  12/7/2023  1701925200000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics 
Project Description:The Rosenbluth-Fokker-Planck (RFP) equation (also referred to as the Landau-Fokker-Planck equation) is a partial differential equation with an anisotropic-tensor-diffusion-advection form that is regarded as the first-principles model for describing the dynamical evolution of the plasma particle distribution function (PDF) undergoing a long-range binary Coulomb interaction. As such, the equation has important applications in modeling the evolution of plasmas in thermonuclear fusion devices. One of the challenges of numerically evolving the RFP equation is the multiscale nature of the equation, where the characteristic collisional relaxation timescale in which the PDF relaxes to a Gaussian can vary by many orders of magnitude within a real system. This difficulty can be cast mathematically as a singularly perturbed PDE, where a small parameter proportional to the relaxation timescale appears in the denominator before the RFP operator, making it stiff as the parameter vanishes. The other difficulty in numerically solving the RFP equation is retaining the equation's structural properties in the discrete setting, such as the Boltzmann H-theorem, where entropy monotonically increases until an equilibrium distribution is reached. To deal with these challenges, the student will learn from their mentors to develop a discretization scheme that satisfies the structural properties of the RFP equation and an efficient solver based on a nonlinear multigrid scheme to deal with the stiffness. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentors:
Internship Coordinator:


Yes  USACEAffleck1  12/7/2023  1701925200000  U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory (CRREL)  Hanover, NH or virtual  Applied Mathematics, Probability and Statistics 
U.S. Citizenship is a requirement for this internship Project Description:The Army maintains stringent requirements for equipment in the Arctic operational environment and expresses concerns about the risk of mission failure when sustaining the forces at temperatures down to -65°F, wind speeds greater than 100 mph, and 25 lb/ft² snow load. However, these conditions may only be encountered on rare occasions, and in certain subregions of the Arctic. In this new-start ESTCP project, we explore refining the characterization of Arctic climatic zones relative to these design thresholds. This includes identifying the circumpolar north climate breaks for environmental parameters to narrow the contingency, logistics, and resources needed, and developing site-focused requirements for the Services to operate across the region. To accomplish this, we intend to add three climate characteristics (air temperature extremes, snow depth, and wind speed limits) to the currently defined climate zones and conduct analysis against materiel and equipment exposure thresholds. This will provide mission-critical data and information for operations managers to make better risk-informed decisions and to operate equipment, materiel, and force projection in the circumpolar region. Disciplines: Applied Mathematics, and Probability and Statistics Hosting Site:U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory (CRREL) Internship location: Hanover, NH or virtual Mentor:
Internship Coordinator:


No  ANLLeyffer1  12/7/2023  1701925200000  Argonne National Laboratory (ANL)  Lemont, IL, or virtual  Computational Mathematics, Probability and Statistics 
Project Description:We are exploring new optimization methodologies to optimize with digital twins. Optimization methodologies are a critical component in making digital twins a reality. They are needed in the optimal design of experiments that integrate data acquisition with simulation, the optimal steering of processes or devices modeled as digital twins, and the connection to neighboring digital twins. In this project, we will explore the use of distributed optimization techniques to control a swarm of vehicles used to optimally acquire environmental data along different paths. The project combines the solution of PDEs with the design of experiments and optimal control. Our goal is to build a prototype model and computational simulation using Python and FEniCS. Disciplines: Computational Mathematics, and Probability and Statistics Hosting Site:Argonne National Laboratory (ANL) Internship location: Lemont, IL, or virtual Mentor:
Internship Coordinator:


No  LANLRomeroSeverson1  12/7/2023  1701925200000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics, Mathematical Biology 
Project Description:Viruses with segmented genomes such as Crimean-Congo hemorrhagic fever virus (CCHFV), hantaviruses, and influenza can form chimeric viruses when an individual host becomes coinfected by distinct variants of a given virus. This reassortment of genomic segments is postulated to be an important aspect in the emergence and re-emergence of both old and new infectious diseases. The ultimate goal of this project is to answer a scientific question: can we determine if a virus experienced a genomic reassortment event in the past that increased its fitness by looking at contemporary genomic samples of that virus? The challenge is that there are no robust off-the-shelf methods that can infer the presence of genomic reassortment (see https://doi.org/10.1101/2023.09.20.558687 for our preprint on this issue). Further, while there are interesting candidate methods that might allow for robust maximum likelihood inference of genomic reassortment from phylogenetic trees, it is unlikely that those methods will have reasonable computational efficiency. This project seeks to address that issue by 1) building a mechanistic model to simulate both the transmission dynamics and the viral evolution of a segmented virus in an idealized population, 2) building a Deep Neural Network mapping the simulated phylogenetic trees to the model parameters such as increased contagiousness of the reassorted virus, and 3) inferring the presence of relevant genomic reassortment using real genetic sequence data from CCHFV. Participants will gain hands-on experience in epidemiological research, enhance their mathematical modeling, statistical analysis, and machine learning skills, and prepare for future careers in academia, research, or related industries. This opportunity contributes to vital scientific discoveries and equips students with highly sought-after skills in computational biology and public health research. 
Disciplines: Applied Mathematics, Computational Mathematics, and Mathematical Biology Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentors:
Internship Coordinator:


No  FNALMrenna1  12/7/2023  1701925200000  Fermi National Accelerator Laboratory (FNAL)  Batavia, IL or virtual  Applied Mathematics, Computational Mathematics 
Project Description:This project will build models based on observed discrepancies between theory computer simulations and data at the Large Hadron Collider that fill in the physics behind these gaps. The models will be developed using artificial intelligence (AI) with two distinguishing and novel features: we will learn (1) corrections to an existing physics model to describe data using reweighting and (2) symbolic relationships between theory features to construct better physics models. A mature version of the developed tools will, for example, significantly reduce the measurement uncertainties on the top quark mass and possibly aid in the discovery of a new, weakly interacting particle. We propose to complete the early stages of this project with the following goals: (1) using techniques such as weakly supervised machine learning to automatically identify collider data in the Rivet analysis library that is poorly described by theory predictions and has no obvious physics explanation; (2) developing an AI-based regression model to describe the gap between theory and data using the Pythia event generator; (3) constructing metamodels, using simulation-based inference (SBI) for example, to determine correlations between the gap and the latent variables of the generator. These metamodels will be used to supplement the standard generator predictions and provide candidate physical explanations and descriptions of the data. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Fermi National Accelerator Laboratory (FNAL) Internship location: Batavia, IL or virtual Mentors:
Internship Coordinator:


No  FNALMrenna2  12/7/2023  1701925200000  Fermi National Accelerator Laboratory (FNAL)  Batavia, IL or virtual  Computational Mathematics 
Project Description:Monte Carlo-based event generators are essential tools for analyzing and interpreting data from particle collision experiments. Currently, the physical evolution of a particle collision, and hence the computing algorithms used to make predictions, are serial. On the other hand, many of our computations in the future will be done on chips with GPUs. This project aims to explore how the physics algorithms could be restructured to exploit acceleration. As a first case, we would explore the parton shower algorithm. Disciplines: Computational Mathematics Hosting Site:Fermi National Accelerator Laboratory (FNAL) Internship location: Batavia, IL or virtual Mentors:
Internship Coordinator:


No  LBNLLi2  12/7/2023  1701925200000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA or Virtual  Computational Mathematics, Probability and Statistics 
Project Description:Randomized algorithms, such as sketching, sparsification, and streaming, are powerful dimensionality reduction tools for analyzing large. The applications for DOE include computational linear algebra (such as solvers, preconditioners, and low-rank approximation) and graph analysis (e.g., graph partitioning, clustering, and graph learning). Many impressive advances in the theory of randomized algorithms have not been translated into practical demonstrations. This project aims to bridge this gap between theory and practice, and to produce high-quality software running on modern HPC hardware with demonstrations in application codes. Disciplines: Computational Mathematics, and Probability and Statistics Hosting Site:Lawrence Berkeley National Laboratory (LBNL) Internship location: Berkeley, CA or Virtual Mentor:
Internship Coordinator:
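As one small example of the sketching idea named in this description, the code below applies a Gaussian random sketch to find an approximate basis for the range of a matrix and then forms a low-rank approximation from it. This is a minimal NumPy sketch of the standard randomized range finder under simple assumptions (dense input, fixed oversampling); it is not the project's software, and the function name randomized_low_rank is hypothetical.

```python
import numpy as np


def randomized_low_rank(A, rank, oversample=5, seed=0):
    """Return Q, B with A approximately equal to Q @ B, using a Gaussian sketch."""
    rng = np.random.default_rng(seed)
    # Sketch: multiply A by a thin random matrix to sample its column space.
    omega = rng.standard_normal((A.shape[1], rank + oversample))
    # Orthonormal basis for the sketched range.
    Q, _ = np.linalg.qr(A @ omega)
    # Project A onto that basis; B has only rank + oversample rows.
    B = Q.T @ A
    return Q, B


# Test matrix of exact rank 8, so the approximation should be near-exact.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 8)) @ rng.standard_normal((8, 100))
Q, B = randomized_low_rank(A, rank=8)
relative_error = np.linalg.norm(A - Q @ B) / np.linalg.norm(A)
```

The expensive operations touch A only through two matrix products, which is part of why such methods map well onto HPC hardware.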

The name and contact information of the hosting site internship coordinator are provided for further assistance with questions regarding the hosting site; local housing availability, cost, or roommates; local transportation; security clearance requirements; internship start and end dates; and other administrative issues specific to that research facility. If you contact the internship coordinator, identify yourself as an applicant to the NSF Mathematical Sciences Graduate Internship (MSGI) Program.
Interns will not enter into an employee/employer relationship with the Hosting Site, ORAU/ORISE, NSF or DOE. No commitment with regard to later employment is implied or should be inferred.