Past Projects
Below is the list of archived projects from the 2024 summer cohort for the NSF Mathematical Sciences Graduate Internship.
For a list of currently available projects, visit the Project Catalog page.
2024 Projects
Project Title  Citizenship Required  Reference Code  Posted Date  Posted Datetime  Hosting Site  Internship Location  Disciplines  Description 

No  ANLJin1  11/17/2023  1700197200000  Argonne National Laboratory (ANL)  Lemont, IL or virtual  Applied Mathematics, Combinatorics, Geometry, Topology 
Project Description: Attributed graphs, which carry side information on nodes and edges, are widely used in many fields within DOE applications, such as neuroscience, biological discovery, the power grid, and distributed computing facilities. Learning the similarities of graphs, also known as graph matching, is one of the fundamental problems in machine learning tasks with structured data. Even though the graph-matching problem has been studied for decades, learning the similarities between attributed graphs remains an open research problem. Moreover, methods inspired by optimal transport, such as Gromov-Hausdorff and (Fused) Gromov-Wasserstein distances, have shown promising results for comparing not only graph structures but also attributed graphs.
Disciplines: Applied Mathematics, Combinatorics, Geometry, and Topology Hosting Site: Argonne National Laboratory (ANL) Internship location: Lemont, IL or virtual Mentor:
Internship Coordinator:


No  LBNLLiu1  11/17/2023  1700197200000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA  Applied Mathematics, Computational Mathematics 
Project Description: Butterfly decompositions are numerical linear algebra tools well-suited to representing many highly oscillatory transforms and integrals encountered in, e.g., solving wave equations, signal processing, and Fourier transforms. Although they can be used to efficiently compress the operators in low dimensions, their application to high-dimensional problems, e.g., 6D transforms and 3D wave equations, yields large prefactors or non-optimal asymptotic complexities. The project aims to investigate the tensor form, instead of the matrix form, of butterfly decompositions to address these high-dimensional challenges. The student will gain familiarity with butterfly algorithms and tensor computation, with both MATLAB and HPC implementations. With the guidance of a mentor, the student will conduct preliminary complexity analysis and numerical experiments to validate the benefits of tensorized butterfly representations. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site: Lawrence Berkeley National Laboratory (LBNL) Internship location: Berkeley, CA Mentor:


No  ANLMallick1  11/17/2023  1700197200000  Argonne National Laboratory (ANL)  Lemont, IL  Analysis, Applied Mathematics 
Project Description: Spatiotemporal graph neural networks (GNNs) are widely used in many different applications, including frequency prediction on the power grid, traffic forecasting, and weather forecasting. However, as data sets grow larger and models become more complex, there is a pressing need to accelerate spatiotemporal GNNs for effective training and inference. To that end, we will investigate graph sampling and sparsification strategies for spatiotemporal GNNs. In the sampling process, the features are aggregated by choosing a specific number of neighbors for each node or a specific number of nodes per layer. The features from numerous neighbors are aggregated as the depth of the GNN grows. However, with neighborhood aggregation, task-irrelevant information is sometimes intermingled into nodes, resulting in poor generalization performance for the learned models. Therefore, in this project, we will concentrate on creating sparsification algorithms for spatiotemporal GNNs, where the sparsification will be built as a learnable module. Sparsification and sampling on the sparse graph will aid in the removal of task-irrelevant edges as well as the reduction of subsequent computation and memory access. Disciplines: Analysis, and Applied Mathematics Hosting Site: Argonne National Laboratory (ANL) Internship location: Lemont, IL Mentors:


No  ANLMallick2  11/17/2023  1700197200000  Argonne National Laboratory (ANL)  Lemont, IL  Analysis, Applied Mathematics 
Project Description: Convolutional Neural Networks (CNNs) offer an efficient architecture in machine learning problems where the coordinates of the underlying data representation have a regular or Euclidean structure. The ability of CNNs to learn local stationary structures and compose them to form multiscale hierarchical patterns has led to breakthroughs in image, video, and sound recognition tasks. Nevertheless, in several scientific domains, one cannot apply standard CNNs: material structure data, gene data from biological regulatory networks, and traffic data from road networks are important examples of data lying on irregular or non-Euclidean domains. The irregular or non-Euclidean domains can be represented by graphs, which are universal representations of heterogeneous pairwise relationships. Representing the data in the form of directed/undirected graphs and applying convolution/pooling is not straightforward, as the convolution and pooling operators are only defined for regular grids. In this project, we will focus on neural networks that operate on graphs. We will develop domain-specific convolution and pooling operations that extract patterns from data defined on graphs. We will evaluate the efficacy of the developed methods on data from transportation and supercomputer interconnect networks. Disciplines: Analysis, and Applied Mathematics Hosting Site: Argonne National Laboratory (ANL) Internship location: Lemont, IL Mentors:


Yes  USACEStyles1  11/17/2023  1700197200000  U.S. Army Corps of Engineers, Engineer Research and Development Center (ERDC)  Vicksburg, MS  Analysis, Applied Mathematics, Mathematics (General), Operations Research, Probability and Statistics, Topology 
U.S. Citizenship is a requirement for this internship. Project Description: The student will utilize spectrogram images/digital data to identify patterns that indicate the passage of watercraft. An extensive suite of vessel wake data is available to develop robust training algorithms, as well as sample data to verify and develop a vessel wake detection algorithm. The student should possess working knowledge of ML concepts and be able to work independently in a MATLAB and/or Python environment. Experience with data analysis, including digital filtering, wavelet analysis, and higher-level ML tools/applications, is highly desirable. Work will mostly be in an office setting, but there is some possibility of field work during vessel wake collections for interested students. Disciplines: Analysis, Applied Mathematics, Mathematics (General), Operations Research, Probability and Statistics, and Topology Hosting Site: U.S. Army Corps of Engineers, Engineer Research and Development Center (ERDC) Internship location: Vicksburg, MS Mentor:
Internship Coordinator:


No  ANLMaulik1  11/17/2023  1700197200000  Argonne National Laboratory (ANL)  Lemont, IL  Applied Mathematics, Probability and Statistics 
Project Description: In this project, novel deep learning algorithms will be constructed to learn solutions to the Fokker-Planck equations for stochastic dynamical systems. The key challenges to overcome include the possibility of nonlocality (i.e., when such systems are driven by Lévy noise), high dimensionality, and non-Markovian characteristics. Potential data-driven solutions to such systems include the use of normalizing flows, generative adversarial networks, and neural stochastic differential equations. Some preliminary work in this area has been done by our team (across ANL, IIT-Chicago, Johns Hopkins University) here: https://arxiv.org/pdf/2107.13735.pdf Disciplines: Applied Mathematics, and Probability and Statistics Hosting Site: Argonne National Laboratory (ANL) Internship location: Lemont, IL Mentor:
Internship Coordinator:


No  ORNLPasini1  11/17/2023  1700197200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN 
Project Description: Deep learning models are gaining wide interest within the scientific computing community due to their role in detecting and explaining underlying correlations between the different physical quantities that characterize a complex system. In particular, graph convolutional neural networks (GCNNs) have been showing great potential to describe the behavior of materials at microscopic scale by accurately capturing and describing interatomic interactions. The predictive performance of GCNNs is very sensitive to the choice of the architecture for multiple hyperparameters such as the number of neurons per layer, the number of convolutional layers, the number of fully connected layers, the radius cutoff, the activation functions at each hidden layer, the learning rate, and the batch size used to iteratively train the model. All these hyperparameters strongly impact the predictions made by a GCNN model, and GCNNs with different hyperparameter setups may produce vastly different predictions for the same input data. In particular, some choices of hyperparameters may lead to poor predictive performance due to numerical artifacts such as overfitting or underfitting. Therefore, identifying an appropriate setting of hyperparameters is essential to ensure the model’s accuracy and generalizability. Identifying a hyperparameter configuration that would make a GCNN both accurate and robust requires performing an exhaustive search over a high-dimensional space, which, in general, is computationally expensive. High-performance computing can be leveraged to alleviate the computational burden of hyperparameter optimization (HPO) by concurrently exploring several hyperparameter configurations with distributed computing resources. In this work, we will develop and implement scalable HPO algorithms for GCNNs. We will use the RayTune library for hyperparameter tuning and we will integrate the RayTune functionalities into an existing implementation of GCNNs.
The performance of the HPO procedure will be assessed in terms of: (1) scalability attained by distributing the hyperparameter search over hundreds of compute nodes on supercomputers at the Oak Ridge Leadership Computing Facility (OLCF) and (2) validation accuracy of the trained GCNN model on ab initio density functional theory (DFT) data generated by material scientists at Oak Ridge National Laboratory (ORNL) that describe the functional behavior of solid solution alloys at atomic scale. The expected outcome is a scalable HPO framework integrated with the existing implementation of the GCNN model that attains linear scaling up to 100 compute nodes on the OLCF supercomputer Summit, with accuracy improved by a factor of 10 with respect to existing GCNN models trained on the DFT data. Hosting Site: Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:


No  ORNLPasini2  11/17/2023  1700197200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description: Scientific computing has recently shed light on the effectiveness of artificial neural network (ANN) models as surrogates for complex multiscale physics-based models in order to accelerate expensive scientific calculations without compromising their accuracy. However, to this day ANN outputs are still challenging to interpret and explain in terms that are immediately mappable back to the application domain. This lack of interpretability and explainability limits the deployment of ANN models in several research applications where extracting meaningful cause-effect relationships is essential. Towards building an explainable and interpretable ANN model, one important task to complete is monitoring which sections of the ANN architecture are activated or deactivated when a new specific set of input-output features is processed during the learning phase. This project aims to build an interpretable ANN model for simple learning tasks such as image recognition. Through a series of experiments in which a small ANN is taught to distinguish simple geometric shapes, we will gain an understanding of the discernment procedure learned by the network as a first step towards a deeper understanding of why these structures perform well as classifiers. The participant will learn about artificial neural networks, become familiar with software packages for training these networks, and explore the fundamental structure of ANNs with the goal of elucidating how they perform their tasks. Learning objectives:
Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site: Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentors:


Yes  LANLLei1  11/17/2023  1700197200000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics 
U.S. Citizenship is a requirement for this internship. Project Description: Numerical modeling of nonlinear processes plays a key role in understanding near-field dynamics of underground explosions and the monitoring of far-field seismic data for nuclear treaty enforcement. The objective of this project is to conduct near-source physics modeling for selected experiments conducted under the U.S. Department of Energy, National Nuclear Security Administration’s Source Physics Experiments (SPEs). The goal is to gain more insight into the generation of shear waves in explosions. In this project, the simulations will be conducted using LANL’s Hybrid Optimization Software Suite (HOSS). HOSS, a 2016 R&D 100 Finalist, is a hybrid multiphysics software package integrating computational fluid dynamics (CFD) with state-of-the-art combined finite-discrete element methodologies (FDEM). HOSS has been widely used to predict shock wave propagation, large material deformation, and failure under extreme conditions (e.g., underground explosions). Alongside a mentor, the successful applicant will have an excellent opportunity to perform research on modeling of underground explosions under the supervision of LANL’s scientists; they will have the chance to enhance their knowledge across a wide range of fields, from material modeling, numerical methods, and computational physics to high-performance computing. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site: Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentor:


No  LBNLWILD1  11/17/2023  1700197200000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA or virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description: This summer experience revolves around algorithm creation and analysis for design problems for which goals are only available through observations of complex systems. Such settings commonly arise in the field of derivative-free (zeroth-order) optimization. Our objectives include some combination of modeling specific problems, implementing novel algorithms, and analyzing (non)asymptotic performance. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site: Lawrence Berkeley National Laboratory (LBNL) Internship location: Berkeley, CA or virtual Mentor:


No  ORNLPasini3  11/17/2023  1700197200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description: This project aims at exploring and implementing new mathematical solutions to improve the predictive performance of message passing operations in graph layers used by GNN models in material science applications. Graph neural networks (GNNs) naturally interpret the atomic structure of materials as graphs, where atoms are interpreted as graph nodes and interatomic bonds are interpreted as graph edges. The accuracy of the GNN model strongly depends on the mathematical operations performed in transferring information across adjacent nodes in the graph representation of the atomic structure, also called message passing operations. Student Requirements: The student is required to have preliminary fundamental knowledge of deep learning concepts and familiarity with PyTorch. Competence using PyTorch Geometric is desirable, but not required. Student Responsibilities: The student is expected to implement new message passing algorithms and test the performance of such algorithms on open-source material science datasets. The student will collaborate with ORNL staff to validate the impact of the work performed to solve relevant scientific problems addressed by the material science community at ORNL. Learning objectives:
Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site: Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentors:
Internship Coordinator:


No  ANLHückelheim1  11/17/2023  1700197200000  Argonne National Laboratory (ANL)  Lemont, IL  Applied Mathematics 
Project Description: Our group has decades of experience in developing and using automatic differentiation, which is known as backpropagation or autodiff in machine learning frameworks. We are developing alternatives to backpropagation that take a more flexible approach to computing gradients, inspired by techniques developed in the context of differential equations and related problems. Disciplines: Applied Mathematics Hosting Site: Argonne National Laboratory (ANL) Internship location: Lemont, IL Mentors:
Internship Coordinator:


No  ANLHückelheim2  11/17/2023  1700197200000  Argonne National Laboratory (ANL)  Lemont, IL  Applied Mathematics 
Project Description: Our group has developed methods for mapping the evaluation of certain mathematical functions to modern processors, for example by exploiting the associativity of operators to allow dynamic scheduling and accumulation of results. This allows us to compute these functions faster and using less energy. Disciplines: Applied Mathematics Hosting Site: Argonne National Laboratory (ANL) Internship location: Lemont, IL Mentors:
Internship Coordinator:


Yes  NISTGrey1  11/27/2023  1701061200000  National Institute of Standards and Technology (NIST)  Boulder, CO  Applied Mathematics, Geometry, Probability and Statistics, Topology 
U.S. Citizenship is a requirement for this internship. Project Description: We'll be assisting with perspectives and interpretations applying differential geometry and competitive linear and nonlinear model-based dimension reductions to explore applications ranging from computer vision to materials science and next-generation communications technology. Example applications include: statistics of material microstructures, general image classification, and high-dimensional bi-criteria optimization for next-generation wireless spectrum sharing. The position requires a candidate curious to explore some or all of the following topics in applied mathematics: (i) novel low-dimensional visualization and approximation methods, (ii) abstractions of computational differential geometry over matrix manifolds, (iii) linear and nonlinear model-parameter dimension reduction, and (iv) infinite-dimensional extensions of spaces of discrete shapes. Disciplines: Applied Mathematics, Geometry, Probability and Statistics, and Topology Hosting Site: National Institute of Standards and Technology (NIST) Internship location: Boulder, CO Mentor:
Internship Coordinator:


No  USFSSkowronski1  11/27/2023  1701061200000  USDA Forest Service, Northern Research Station  Morgantown, WV  Applied Mathematics, Probability and Statistics, Topology 
Project Description: Wildland fire is a natural process that has become problematic in society because of the expansion of human developments, increased fuel loads due to past fire suppression activities, climate change, and a myriad of other factors. Solutions for this problem require a more advanced understanding of the fundamental physical processes of these fires and how they propagate from the very small scale (fuel particles) to landscapes. Large efforts are currently underway to integrate highly instrumented field experiments, machine learning, artificial intelligence, and computational fluid dynamics models to advance our decision making in the future. The applicant, with the guidance of several mentors, will have the opportunity to design an experience that focuses on their analytical strengths to help us to disentangle and understand complex relationships of fire spread and behavior. The applicant will examine a set (n=30) of recent fire field experiments with data including multitemporal 3D laser scanning (LiDAR), infrared and color video, 3D wind fields, temperature profiles, and radiative fluxes. The primary objectives of the experience are to: 1) expand the applicant’s understanding of datasets of different spatial and temporal resolutions, 2) develop an approach to decompose and relate these data streams, and 3) present the techniques and results in a way that is understandable to scientists from other disciplines and land managers. This internship will be based at the Forestry Sciences Laboratory in Morgantown, WV in collaboration with scientists from the USDA Forest Service, Rochester Institute of Technology, West Virginia University, and other institutions. The applicant will have the opportunity to collect data (in a learning setting) with the same instruments used in the fire experiments to understand their intricacies and limitations.
Depending on COVID restrictions, the applicant may have the opportunity to visit several field sites, interact with other scientists and fire managers, and observe a prescribed burn. Disciplines: Applied Mathematics, Probability and Statistics, and Topology Hosting Site: USDA Forest Service, Northern Research Station Internship location: Morgantown, WV Mentors:


Yes  USACEBarker1  11/27/2023  1701061200000  U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory (CRREL)  Fairbanks, AK  Analysis, Probability and Statistics 
U.S. Citizenship is a requirement for this internship. Project Description: For watersheds and soil systems in the Arctic, there is an intense seasonality. Seasonal transitions significantly impact watershed geochemistry and the soil thermal regime. In the spring, there is a large increase in water flow as a result of snowpack melting. During summer, the thawing of the active layer extends and the majority of surface water flow is derived from rainfall and the little bit of baseflow that may exist. In the late fall, pore waters are pushed deeper in the soil column and the surface of the soil is frozen, but the active layer is at its deepest yearly extent. This time of the year is often not studied because access gets tricky in the winter and everything is assumed to be frozen, but in reality, reactions do still continue to occur, and as you move into winter there is a portion of the active layer that remains thawed while the surface is frozen. As deepening of the active layer into previously frozen material is expected with climate change, there remains a limited understanding of subsurface geochemistry and elemental behavior during this shoulder season. We have robust datasets that include thousands of thaw depth measurements tied to soil chemistry and soil temperature data, where specific correlations, relationships, and key variables need to be determined and may help to elucidate the complicated soil thermal regime in the Arctic. The intern will assist with statistical and multivariate analysis, including but not limited to principal component analysis, linear combination fitting, and variance analyses. The intern will join a well-rounded group of scientists in various disciplines like geochemistry, engineering, geophysics, and sensor development, join office meetings, present results, and may assist with field work in Alaska, if interested.
The intern should have experience with algorithm development, MATLAB, multivariate statistical analysis, as well as an interest in geochemistry, geology, and/or chemistry. Disciplines: Analysis, and Probability and Statistics Hosting Site: U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory (CRREL) Internship location: Fairbanks, AK Mentors:
Internship Coordinator:


No  LLNLGuenther1  11/27/2023  1701061200000  Lawrence Livermore National Laboratory (LLNL)  Livermore, CA  Applied Mathematics 
Project Description: To enable practically useful quantum computing, this project aims to improve the error rate of logical gates on superconducting quantum devices by augmenting the quantum dynamical model with a data-driven approach to identify and incorporate latent dynamics. Quantum dynamics are often modeled by Schrödinger’s and Liouville-von Neumann’s equations, which evolve the quantum state and density matrix according to a Hamiltonian model. The Universal Differential Equations approach augments this model with a neural network that is trained from device data to account for unknown dynamics, such as drift in system parameters, control line losses, cross talk, and environmental interactions. The trained augmented model will provide a more accurate simulation of the quantum dynamics in superconducting quantum devices, which will then be used within our optimal control software stack to design control strategies that achieve higher fidelity of quantum gates on noisy quantum hardware. The trained neural network will further be analyzed using symbolic regression to provide a mathematical description and understanding of the latent quantum dynamics. Disciplines: Applied Mathematics Hosting Site: Lawrence Livermore National Laboratory (LLNL) Internship location: Livermore, CA Mentors:
Internship Coordinator:


No  NISTDOGAN1  11/27/2023  1701061200000  National Institute of Standards and Technology (NIST)  Gaithersburg, MD  Applied Mathematics, Geometry, Probability and Statistics 
Project Description: The goal of this project is to develop tools for image and data analysis, by leveraging scientific computing and machine learning algorithms. Various research opportunities exist in the following topics: Disciplines: Applied Mathematics, Geometry, and Probability and Statistics Hosting Site: National Institute of Standards and Technology (NIST) Internship location: Gaithersburg, MD Mentor:
Internship Coordinator:


No  LANLWang1  11/27/2023  1701061200000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics, Geometry, Probability and Statistics 
Project Description: Large language models (LLMs) such as ChatGPT and LLaMA have had a transformative impact on natural language processing, with significant implications for scientific knowledge processing. Better understanding of LLMs may be necessary for further advances in this highly interdisciplinary field. This project will involve learning about LLMs, their applications, and their mathematical foundations, and exploring further possibilities for applying LLMs to physics and other natural sciences, including the processing of scientific knowledge. Disciplines: Applied Mathematics, Computational Mathematics, Geometry, and Probability and Statistics Hosting Site: Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentor:
Internship Coordinator:


No  NISTJia1  11/27/2023  1701061200000  National Institute of Standards and Technology (NIST)  Gaithersburg, MD or virtual  Applied Mathematics, Computational Mathematics, Mathematical Biology Analysis 
Project Description: Over the past three centuries, significant attention has been dedicated to the study of how mechanical instabilities such as wrinkling, buckling, and blistering cause patterns to emerge in inanimate materials. Can these principles be applied to the analysis of living systems? From the fractal-like edges of a kale leaf to the undulating crinkles of the human brain, many of the striking morphologies emblematic of living entities do not arise exclusively from gene expression programs but rather from the presence of mechanical forces. Elucidating this relationship between applied stresses and geometric forms will help guide scientific innovations ranging from the creation of bio-inspired materials to the establishment of metrological standards. Students will gain experience with topics in applied and computational mathematics, particularly the calculus of variations, differential geometry, and nonlinear partial differential equations. Disciplines: Applied Mathematics, and Computational Mathematics, Mathematical Biology Analysis Hosting Site: National Institute of Standards and Technology (NIST) Internship location: Gaithersburg, MD or virtual Mentor:
Internship Coordinator:


No  ANLGraziani1  11/27/2023  1701061200000  Argonne National Laboratory (ANL)  Lemont, IL  Applied Mathematics, Probability and Statistics 
Project Description: The purpose of this project is to study large language models (LLMs) that are founded on explicit probabilistic models, rather than on transformer-type architectures. The objective is to obtain predictive models for discrete sequence data such as natural language, DNA sequences, chemical formulae, etc. that are transparent in their operation and amenable to furnishing characterizations of predictive uncertainty. The MSGI Intern will participate in theoretical development and exploration of models, and in exploration of corpuses of sequential data to determine model structures appropriate to each type of data. As this is a computational statistics project, the intern should expect to write code implementations of models, and also to code data exploration methods. Equally important is the expectation that the intern will prepare a report that should function as an outline or first draft of a paper on the methods developed in the project. The intern will learn techniques for probabilistic modeling of discrete sequences, and for characterizing the statistical behavior of this type of data. A model in current development uses Gaussian process (GP)-type methods, so the intern should expect to learn the theory and practice of GPs. Model training is likely to require use of Argonne's high-performance computing (HPC) facilities, and the intern should also expect to learn to utilize such facilities. Disciplines: Applied Mathematics, and Probability and Statistics Hosting Site: Argonne National Laboratory (ANL) Internship location: Lemont, IL Mentor:


No  ANLLarson2  11/27/2023  1701061200000  Argonne National Laboratory (ANL)  Lemont, IL  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description: Expensive-to-evaluate stochastic oracles appear throughout the quantum information sciences research field. Unfortunately, optimizing such oracles is considerably more difficult than optimizing deterministic oracles. This is because of the varying output observed for repeated calls to the oracle, even with a fixed set of input variables. This project seeks to develop, analyze, and implement numerical methods for optimizing oracles that are stochastic in nature and that are relatively expensive to evaluate. Application problems include stochastic oracles that appear across the quantum information sciences. We are especially interested in using (polynomial and Bayesian) model-based methods to identify high-quality local optima for such problems. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site: Argonne National Laboratory (ANL) Internship location: Lemont, IL Mentors:
Internship Coordinator:


No  ORNLTabassum1  11/27/2023  1701061200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Analysis, Applied Mathematics 
Project Description: The project aims to explore and implement mathematical computations for causal representation learning in scientific data, leveraging deep-learning models such as transformers and graph neural networks. Causality in scientific data mostly denotes what process, event, or state of a variable may cause a change of that event or state of one/multiple variables in low/high fidelity scientific data. Causality learning is a challenging task for many scientific domains due to high-dimensional data, unobserved or partially observed physics knowledge, and the availability of limited ground truth. Learning causality or dependency structure among different features (e.g., particle-particle interactions in fluids, neuron connectivity in neuroscience, etc.) that can be directly obtained from low-fidelity scientific simulations can aid in modeling and understanding high-fidelity data. Student Responsibilities: The student is expected to have a foundational knowledge of training deep learning models with PyTorch and of the Python data analysis and visualization libraries pandas, NumPy, scikit-learn, and Matplotlib. The student is expected to implement transformer or graph neural networks and test the performance of such models on open-source scientific datasets. The student will collaborate with ORNL staff to validate the performance of causality and address multiple scientific communities at ORNL. Learning objectives:
Disciplines: Analysis, and Applied Mathematics Hosting Site: Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentors:
Internship Coordinator:


No  ANLRAGHAVAN1  12/7/2023  1701925200000  Argonne National Laboratory (ANL)  Lemont, IL  Applied Mathematics 
Project Description:Dataset imbalance refers to the situation in which certain classes are represented by significantly more data points than others. It is a prevalent issue in machine learning, especially in classification problems, in many scientific applications. The issue manifests itself when the final performance of a model is biased towards the class with the larger number of sample points. One way to correct this bias is to equalize the imbalance, and intelligent sampling strategies play a critical role in this procedure. However, due to a lack of efficient approaches, a common way to address the issue involves trial-and-error-driven uniform oversampling of the underrepresented class or undersampling of the overrepresented class. In this project, we will formulate the problem of imbalance in a data batch as an optimization problem and derive conditions that must be satisfied for sampling a balanced data batch. We will then integrate these conditions into the neural network learning problem. We will develop a game-theoretic approach to resolve the tradeoff between the performance of the neural network and the variance in the data. Disciplines: Applied Mathematics Hosting Site:Argonne National Laboratory (ANL) Internship location: Lemont, IL Mentors:
Internship Coordinator:
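
For readers unfamiliar with the baseline mentioned in the description above, uniform oversampling of the underrepresented class can be sketched as follows. This is only an illustration of the common trial-and-error baseline, not the optimization-based method the project proposes; the function name `oversample_minority` and the toy data are assumptions made for this example.

```python
import random
from collections import Counter


def oversample_minority(samples, labels, seed=0):
    """Uniformly resample smaller classes until every class matches the largest one.

    Hypothetical helper for illustration only; not part of the project.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw the missing points uniformly, with replacement, from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy data: class "a" has five points, class "b" has one.
samples = [[0.1], [0.2], [0.3], [0.4], [0.5], [2.0]]
labels = ["a", "a", "a", "a", "a", "b"]
balanced_x, balanced_y = oversample_minority(samples, labels)
print(Counter(balanced_y))
```

Because the minority points are simply duplicated, this baseline adds no new information, which is one reason a principled sampling condition may be preferable.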


No  USDAAmatya1  12/7/2023  1701925200000  USDA Forest Service, Southern Research Station  Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:With the rapid development of computing systems and their ability to process big data, combined with artificial intelligence theory, extreme learning machine (ELM) and adaptive neuro-fuzzy inference system (ANFIS) models are gaining attention in large applications. For this study, our focus is limited to computing evapotranspiration (ET, the loss of water to the atmosphere) using high-resolution weather data. ET is a critical climate variable that uniquely links the water cycle (evaporation), energy cycle (latent heat flux), and carbon cycle (transpiration-photosynthesis trade-off), each of which is described by complex process-based mathematical equations and their physical parameters. ET for an ecosystem is a complex and nonlinear process that is difficult to measure accurately and to estimate or predict. This complexity can be addressed by applying machine learning techniques with different sets of hydrometeorological input variables. We hypothesize that machine learning models, including artificial neural network (ANN), support vector machine (SVM), and random forest (RF) models with different optimization techniques, can predict ecosystem ET better than the myriad empirical models available in the literature. This study will investigate the performance of different machine learning and deep learning models in predicting daily ET using available meteorological and ecohydrological data. The candidate will be introduced to the background of data-driven empirical statistical models and their parameters. With their background in Mathematics and Statistics, they will apply a suite of machine learning and deep learning models with optimization techniques to simulate ET values using weather data recorded at the USDA Forest Service Experimental Forest sites in coastal South Carolina, as well as Coweeta Hydrology Laboratory in upland North Carolina, both being used for long-term silvicultural research involving hydrology, ecology, soils, and vegetation. 
The candidate will be mentored by two hydroinformaticians and will be provided with an opportunity to make the project live and publishable and develop networking opportunities by presenting at the annual Santee Experimental Forest Research Forum and others near the end of the project. The student will also learn about field experimental studies, hydrologic processes including ET represented by mathematical equations and their prediction uncertainties, real-time monitoring technology, and managing and analyzing Big Data sets using statistics at the host SEF study site. Disciplines: Analysis, Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:USDA Forest Service, Southern Research Station Mentors:


No  LANLBhattarai5  12/7/2023  1701925200000  Los Alamos National Laboratory (LANL)  Los Alamos, NM or virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Adversarial robustness is a pressing concern in multimodal frameworks, especially in large language models (LLMs) that solve a spectrum of problems such as text-to-image, image-to-text, and text-to-text translation. Multimodal LLMs such as ChatGPT, despite their powerful capabilities, are vulnerable to adversarial attacks. These attacks subtly manipulate input modalities, compromising model responses, and can even be insidiously embedded within training datasets, affecting model embeddings and overall performance. Building upon our previous work, where we utilized the AdversarialTensors framework to effectively mitigate adversarial noise in an unsupervised manner for standard CNNs, we aim to extend this ability to enhance the robustness of multimodal frameworks. Our goal is to purify input modalities of adversarial contamination before they are processed by LLMs and to fortify model robustness by integrating tensorial defense strategies alongside LoRA, a widely adopted, low-complexity fine-tuning method. Moreover, by integrating knowledge graphs, we can harness the structured and interlinked knowledge between inputs and their corresponding embeddings to identify and counteract inconsistencies or adversarial perturbations, thereby enhancing the model's ability to reason and improving its resilience against malicious attacks. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM or virtual Mentors:
Internship Coordinator:


No  LANLKe1  12/7/2023  1701925200000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Computational Mathematics, Mathematical Biology 
Project Description:This project aims to develop spatial mechanistic models of respiratory virus infection (such as SARS-CoV-2 and influenza) and the immune responses to understand how stochasticity and immune regulation impact viral infection and pathogenesis. Respiratory viruses (such as SARS-CoV-2 and influenza) cause high mortality and morbidity every year. Their infections usually result in a wide range of clinical outcomes, from asymptomatic to lethal. There is an urgent need to quantitatively define the specific factors that govern infection outcomes in order to develop better therapeutic and intervention strategies. Therefore, the work will be critical for the rational design of vaccines to prepare for the current or next pandemic caused by respiratory viruses. The intern will learn mathematical modeling techniques using computer simulations as well as biological knowledge about the processes of respiratory virus infection. In addition, the intern will have the opportunity to develop data science skills through analyzing data from experimental collaborators and to develop machine learning models that will be trained on both simulated and experimental datasets to make biologically relevant predictions. This internship will be ideal for a candidate who is interested in developing data science skills to address biological and health science questions. Disciplines: Computational Mathematics, and Mathematical Biology Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentor:


No  LANLTokareva1  12/7/2023  1701925200000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics 
Project Description:The matrix-free high-order finite element method (MFFEM) is a cutting-edge method for performing fast, robust, and accurate simulations of advection-diffusion partial differential equations (PDEs), which are ubiquitous in computational physics. However, the application of MFFEM is currently limited to algorithms that are explicit in time. The presence of diffusion or stiff source terms calls for implicit or implicit-explicit time-stepping methods, which negatively impacts the efficiency of MFFEM. To overcome this performance issue, we propose to couple fast multigrid (MG) methods for the discrete diffusion operators with MFFEM for advection operators. Geometric multigrid (GMG) methods are one of the fastest classes of linear and nonlinear solvers for implicit equations arising in numerical methods for PDEs. GMG methods typically rely on a hierarchy of nested structured meshes, and the physical equations are solved on "coarse" meshes to provide corrections to the solution on the original "fine" mesh. For unstructured meshes that are moving or adapting in time, it is not obvious how to construct a mesh hierarchy, and algebraic multigrid (AMG) methods provide a less intrusive multigrid option that is often used instead. However, AMG methods require assembling a full matrix for the discretization and, although efficient, do not compete with GMG in terms of computational efficiency, particularly on GPUs. Some previous work has considered semi-geometric MG (sGMG) methods, where an initial simplicial (triangular) mesh is coarsened to construct a nested grid hierarchy on which to apply GMG. We are, however, interested in coupling implicit solvers to Arbitrary Lagrangian-Eulerian simulations of hydrodynamics, where quadrilateral and hexahedral (non-simplicial) meshes offer significantly higher physics fidelity. 
The goals of the proposed project are: (1) the development of novel sGMG solvers for the advection-diffusion PDEs on quadrilateral and hexahedral meshes and (2) efficient implementation in LANL’s open-source GPU-based code Fierro (https://lanl.github.io/Fierro). Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentors:


No  ORNLMoriano3  12/7/2023  1701925200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN or virtual  Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Many complex systems are represented by networks (e.g., communication networks, power grids, social networks, etc.). Communities or clusters are key to understanding the structure and function of these complex systems because communities represent important functional modules in networked systems. Therefore, there is increasing interest in understanding the limits of the robustness of the community structure, because maintaining the functionality of networked systems is heavily dependent on preserving their community structure. Given ORNL's expertise in modeling and simulation of complex systems using leadership computing facilities, this project will take advantage of modern data science, machine learning, and network science techniques, or any technique of interest to the participant, that could help to better understand the limits of the robustness of the community structure in interconnected systems. We are highly interested in exploring data clustering techniques that use graph embeddings for clustering both synthetic and empirical real-world networks. We will seek to understand the limitations of current graph embedding methods for clustering tasks and propose ways to alleviate these limitations. This project will allow the participant to actively drive an exciting facet of an ongoing research project at ORNL and have their contributions directly integrated into the Computer Science and Mathematics Division research priorities. A successful student has prior experience with data science techniques, machine learning, and network science, but is not expected to have deep experience with programming. Notably, prior projects at ORNL by interns in this team have led to published papers in top interdisciplinary/applied math journals. Examples include:
Based on the findings here, we will also seek to publish a paper in a major data science/applied math venue with the participant as the lead author. Feel free to reach out with questions. Disciplines: Analysis, Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN or virtual Mentor:
Internship Coordinator:


No  ORNLMoriano4  12/7/2023  1701925200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN or virtual  Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:The promise of ubiquitous connectivity brought by emerging technologies (e.g., 5G/6G) will enable interconnected systems (e.g., the Internet of Things (IoT), smart cities, and science facilities, among others) to work faster and more efficiently. However, ubiquitous connectivity comes at the expense of a larger attack surface and a greater risk of advanced cyber attacks. Detecting and adapting to such threats requires collecting, filtering, and analyzing event data of heterogeneous scale, speed, and modality. Designing and deploying trustworthy methods and algorithms that can accommodate this volume, velocity, and variety is key for near real-time detection, continuous adaptation, and quick attack recovery, but it is challenging and requires innovation. Thus, to address the emerging challenges of the next generation of cyber threats, the goal of this project is to design and implement a suite of algorithms that allow near real-time detection of adaptive cyber attacks in interconnected systems. Given ORNL's expertise in the acquisition and processing of data for enterprise and cyber-physical domains using leadership computing facilities, along with leading researchers in applied mathematics, computer science, statistics, and cybersecurity, this project will take advantage of modern data science, machine learning, or any technique of interest to the participant (including but not limited to multivariate time series analysis, clustering, graph mining, etc.) that could help to design and implement algorithms enabling quick detection of, adaptation to, and recovery from sophisticated and evolving cyberattacks targeting modern interconnected systems. 
Learning objectives for the applicant include: (1) developing a basic understanding of novel cyberattacks affecting modern interconnected systems; (2) designing and developing data-driven models for detecting advanced cyberattacks; (3) validating the efficacy of the developed algorithms using empirical data and comparing their performance against state-of-the-art classical approaches. This project will allow the participant to actively drive an exciting facet of an ongoing research project at ORNL and have their contributions directly integrated into the Computer Science and Mathematics Division research priorities. A successful participant has prior experience with data science techniques and machine learning but is not expected to have deep experience with programming. Notably, prior projects at ORNL by interns in this team have led to papers published in top interdisciplinary/applied mathematics journals. Examples include:
Based on the findings here, we will also seek to publish a paper in a major data science/applied math venue with the participant as the lead author. Feel free to reach out with questions. Disciplines: Analysis, Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN or virtual Mentor:
Internship Coordinator:


No  LLNLSarracino1  12/7/2023  1701925200000  Lawrence Livermore National Laboratory (LLNL)  Livermore, CA or virtual  Computational Mathematics, Foundations 
Project Description:Kleene algebra with tests (KAT) is an algebraic technique for mathematical reasoning about the behavior of computer programs. In this project we will explore the semantic foundations of KAT as applied to the task of program verification. Potential projects include:
Disciplines: Computational Mathematics, and Foundations Hosting Site:Lawrence Livermore National Laboratory (LLNL) Internship location: Livermore, CA or virtual Mentor:
Internship Coordinator:
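
As brief background on how KAT models programs (standard material from the literature, not a statement of this project's specific scope): with programs p, q as elements of a Kleene algebra and tests b, c as elements of an embedded Boolean subalgebra, the usual encodings of conditionals, loops, and partial-correctness assertions are

```latex
\begin{align*}
\textbf{if } b \textbf{ then } p \textbf{ else } q \;&=\; b\,p + \bar{b}\,q \\
\textbf{while } b \textbf{ do } p \;&=\; (b\,p)^{*}\,\bar{b} \\
\{b\}\; p\; \{c\} \;&\Longleftrightarrow\; b\,p\,\bar{c} = 0
\end{align*}
```

Here + is choice, juxtaposition is sequential composition, * is iteration, and the bar denotes negation of a test; verifying a program property then reduces to proving an equation in the algebra.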


No  ORNLHauck1  01/4/2024  1704344400000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Combinatorics, Computational Mathematics 
Project Description:The goal of this project is to investigate the use of machine learning tools for solving combinatorial optimization problems. Such problems seek to optimize an objective function defined over a discrete set of possible choices, the size of which is typically too large to exhaustively search. These problems are often approached using principled heuristics, techniques to reduce the search space, or approximate solution strategies. In this project, the student will learn about optimization, machine learning, and the use of related software tools. The student will be exposed to applications in material science and enzyme engineering. They will collaborate with research staff in mathematics and scientific computing and have the opportunity to present their research at the end of the summer.
Disciplines: Applied Mathematics, Combinatorics, and Computational Mathematics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:
Internship Coordinator:


No  LANLTaitano1  12/7/2023  1701925200000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics 
Project Description:The Rosenbluth-Fokker-Planck (RFP) equation (also referred to as the Landau-Fokker-Planck equation) is a partial differential equation with an anisotropic-tensor-diffusion-advection form that is regarded as the first-principles model for describing the dynamical evolution of the plasma particle distribution function (PDF) undergoing long-range binary Coulomb interactions. As such, the equation has important applications in modeling the evolution of plasmas in thermonuclear fusion devices. One of the challenges of numerically evolving the RFP equation is its multiscale nature, where the characteristic collisional relaxation timescale on which the PDF relaxes to a Gaussian can vary by many orders of magnitude within a real system. This difficulty can be cast mathematically as a singularly perturbed PDE, where a small parameter proportional to the relaxation timescale appears in the denominator before the RFP operator, making it stiff as the parameter vanishes. The other difficulty in numerically solving the RFP equation is retaining the equation's structural properties in the discrete setting, such as the Boltzmann H-theorem, whereby entropy monotonically increases until an equilibrium distribution is reached. To deal with these challenges, the student will learn from their mentors to develop a discretization scheme that satisfies the structural properties of the RFP equation and an efficient solver based on a nonlinear multigrid scheme to deal with the stiffness. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentors:
Internship Coordinator:


Yes  USACEAffleck1  12/7/2023  1701925200000  U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory (CRREL)  Hanover, NH or virtual  Applied Mathematics, Probability and Statistics 
U.S. Citizenship is a requirement for this internship. Project Description:The Army maintains stringent requirements for equipment in the Arctic operational environment and expresses concern about the risk of mission failure when sustaining forces at temperatures down to -65°F, wind speeds greater than 100 mph, and snow loads of 25 lb/ft². However, these conditions may only be encountered on rare occasions and in certain subregions of the Arctic. In this new-start ESTCP project, we explore refining the characterization of Arctic climatic zones relative to these design thresholds. This includes identifying the circumpolar north climate breaks for environmental parameters to narrow the contingency, logistics, and resources needed, and developing site-focused requirements for the Services to operate across the region. To accomplish this, we intend to add three climate characteristics (air temperature extremes, snow depth, and wind speed limits) to the currently defined climate zones and conduct analysis against materiel and equipment exposure thresholds. This will provide mission-critical data and information for operations managers to make better risk-informed decisions and to operate equipment, materiel, and force projection in the circumpolar region. Disciplines: Applied Mathematics, and Probability and Statistics Hosting Site:U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory (CRREL) Internship location: Hanover, NH or virtual Mentor:
Internship Coordinator:


No  ANLLeyffer1  12/7/2023  1701925200000  Argonne National Laboratory (ANL)  Lemont, IL or virtual  Computational Mathematics, Probability and Statistics 
Project Description:We are exploring new optimization methodologies to optimize with digital twins. Optimization methodologies are a critical component in making digital twins a reality. They are needed in the optimal design of experiments that integrate data acquisition with simulation, the optimal steering of processes or devices modeled as digital twins, and the connection to neighboring digital twins. In this project, we will explore the use of distributed optimization techniques to control a swarm of vehicles used to optimally acquire environmental data along different paths. The project combines the solution of PDEs with the design of experiments and optimal control. Our goal is to build a prototype model and computational simulation using Python and FEniCS. Disciplines: Computational Mathematics, and Probability and Statistics Hosting Site:Argonne National Laboratory (ANL) Internship location: Lemont, IL or virtual Mentor:
Internship Coordinator:


No  LANLRomeroSeverson1  12/7/2023  1701925200000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics, Mathematical Biology 
Project Description:Viruses with segmented genomes, such as Crimean-Congo hemorrhagic fever virus (CCHFV), hantaviruses, and influenza, can form chimeric viruses when an individual host becomes coinfected by distinct variants of a given virus. This reassortment of genomic segments is postulated to be an important aspect of the emergence and re-emergence of both old and new infectious diseases. The ultimate goal of this project is to answer a scientific question: can we determine whether a virus experienced a genomic reassortment event in the past that increased its fitness by looking at contemporary genomic samples of that virus? The challenge is that there are no robust off-the-shelf methods that can infer the presence of genomic reassortment (see https://doi.org/10.1101/2023.09.20.558687 for our preprint on this issue). Further, while there are interesting candidate methods that might allow for robust maximum likelihood inference of genomic reassortment from phylogenetic trees, it is unlikely that those methods will have reasonable computational efficiency. This project seeks to address that issue by 1) building a mechanistic model to simulate both the transmission dynamics and the viral evolution of a segmented virus in an idealized population, 2) building a deep neural network that maps the simulated phylogenetic trees to model parameters such as the increased contagiousness of the reassorted virus, and 3) inferring the presence of relevant genomic reassortment using real genetic sequence data from CCHFV. Participants will gain hands-on experience in epidemiological research, enhance their mathematical modeling, statistical analysis, and machine learning skills, and prepare for future careers in academia, research, or related industries. This opportunity contributes to vital scientific discoveries and equips students with highly sought-after skills in computational biology and public health research. 
Disciplines: Applied Mathematics, Computational Mathematics, and Mathematical Biology Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentors:
Internship Coordinator:


No  FNALMrenna1  12/7/2023  1701925200000  Fermi National Accelerator Laboratory (FNAL)  Batavia, IL or virtual  Applied Mathematics, Computational Mathematics 
Project Description:This project will build models based on observed discrepancies between theory computer simulations and data at the Large Hadron Collider that fill in the physics behind these gaps. The models will be developed using artificial intelligence (AI) with two distinguishing and novel features: we will learn (1) corrections to an existing physics model to describe data using reweighting and (2) symbolic relationships between theory features to construct better physics models. A mature version of the developed tools will, for example, significantly reduce the measurement uncertainties on the top quark mass and possibly aid in the discovery of a new, weakly interacting particle. We propose to complete the early stages of this project with the following goals: (1) using techniques such as weakly supervised machine learning to automatically identify collider data in the Rivet analysis library that is poorly described by theory predictions and has no obvious physics explanation; (2) developing an AI-based regression model to describe the gap between theory and data using the Pythia event generator; (3) constructing metamodels, using simulation-based inference (SBI) for example, to determine correlations between the gap and the latent variables of the generator. These metamodels will be used to supplement the standard generator predictions and provide candidate physical explanations and descriptions of the data. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Fermi National Accelerator Laboratory (FNAL) Internship location: Batavia, IL or virtual Mentors:
Internship Coordinator:


No  FNALMrenna2  12/7/2023  1701925200000  Fermi National Accelerator Laboratory (FNAL)  Batavia, IL or virtual  Computational Mathematics 
Project Description:Monte Carlo-based event generators are essential tools for analyzing and interpreting data from particle collision experiments. Currently, the physical evolution of a particle collision, and hence the computing algorithms used to make predictions, are serial. On the other hand, many of our computations in the future will be done on chips with GPUs. This project aims to explore how the physics algorithms could be restructured to exploit acceleration. As a first case, we would explore the parton shower algorithm. Disciplines: Computational Mathematics Hosting Site:Fermi National Accelerator Laboratory (FNAL) Internship location: Batavia, IL or virtual Mentors:
Internship Coordinator:


No  LBNLLi2  12/7/2023  1701925200000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA or virtual  Computational Mathematics, Probability and Statistics 
Project Description:Randomized algorithms, such as sketching, sparsification, and streaming, are powerful dimensionality reduction tools for analyzing large. The applications for DOE include computational linear algebra (such as solvers, preconditioners, and low-rank approximation) and graph analysis (e.g., graph partitioning, clustering, and graph learning). Many impressive advances in the theory of randomized algorithms have not been translated into practical demonstrations. This project aims to bridge this gap between theory and practice, and to produce high-quality software running on modern HPC hardware with demonstrations in application codes. Disciplines: Computational Mathematics, and Probability and Statistics Hosting Site:Lawrence Berkeley National Laboratory (LBNL) Internship location: Berkeley, CA or virtual Mentor:
Internship Coordinator:
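
To give a flavor of sketching as a dimensionality reduction tool, the following minimal example applies a dense Gaussian random projection and checks that a vector's Euclidean norm is approximately preserved. It is a textbook illustration under stated assumptions (the function name `gaussian_sketch`, the dimensions, and the seeds are chosen for this example) and makes no claim about the software the project will produce.

```python
import math
import random


def gaussian_sketch(vectors, k, seed=0):
    """Project d-dimensional vectors to k dimensions with a Gaussian sketching matrix.

    Entries of the k-by-d matrix are i.i.d. N(0, 1/k), so squared norms are
    preserved in expectation. Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    d = len(vectors[0])
    scale = 1.0 / math.sqrt(k)
    sketch_matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]
    return [[sum(row[j] * v[j] for j in range(d)) for row in sketch_matrix] for v in vectors]


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


# Compress a 2000-dimensional vector to 200 dimensions.
data_rng = random.Random(1)
x = [data_rng.uniform(-1.0, 1.0) for _ in range(2000)]
sketched = gaussian_sketch([x], k=200)[0]
ratio = norm(sketched) / norm(x)
print(f"norm ratio after sketching: {ratio:.3f}")
```

The ratio concentrates around 1 as k grows; practical libraries typically favor sparse or structured sketching matrices, which are cheaper to apply than the dense matrix used here.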


No  ORNLLaiu2  12/14/2023  1702530000000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Federated learning builds global predictive models on a central server from data distributed across multiple local devices without sharing local data. Standard federated learning algorithms train models on each local device and require communication of the model parameters between the central server and the local devices. For learning complicated models with many parameters, the communication cost often becomes significant. This project aims to investigate both randomized and deterministic approaches for reducing the communication cost in federated learning algorithms. The student will learn about basic federated learning algorithms, dimensionality reduction techniques, numerical analysis, and writing/presentation skills. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:
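
As a rough illustration of the setting described above, the sketch below shows a size-weighted parameter average on the server (in the style of the widely used federated averaging scheme) and top-k sparsification of a client update, one simple deterministic way to send fewer numbers per round. The function names and toy values are assumptions for this example and are not taken from the project.

```python
def federated_average(client_params, client_sizes):
    """Server step: average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(n * p[i] for p, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


def top_k_sparsify(update, k):
    """Client step: keep only the k largest-magnitude entries as (index, value) pairs.

    Sending k pairs instead of the full vector reduces communication when k is
    much smaller than the model size. Hypothetical helper for illustration only.
    """
    largest = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)[:k]
    return sorted((i, update[i]) for i in largest)


# Two clients with 1 and 3 local samples, respectively.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_params)  # [2.5, 3.5]

# A client transmits 2 of its 4 update entries.
compressed = top_k_sparsify([0.1, -2.0, 0.5, 3.0], k=2)
print(compressed)  # [(1, -2.0), (3, 3.0)]
```

Sparsification discards information, so practical schemes usually pair it with some form of error compensation; that trade-off between accuracy and communication is the kind of question the project description points to.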


No  LLNLCHOI1  12/14/2023  1702530000000  Lawrence Livermore National Laboratory (LLNL)  Livermore, CA  Analysis, Applied Mathematics, Mathematics (General), Operations Research, Probability and Statistics 
Project Description:We are developing efficient physics-informed neural network reduced order models (NN-ROMs) to accelerate complicated, large-scale physical simulations. Currently, our physics-informed NN-ROM can reduce the dimensionality of an advection-dominated 2D Burgers simulation to a latent space of dimension 5, with a relative error with respect to the corresponding full order model of less than 1%, and accelerate the full order model simulation by a factor of 10, which cannot be achieved by any machine-learning black-box approach. We plan to extend the ROM to large-scale problems, such as advection-dominated hydrodynamics, transport problems, turbulence, and Rayleigh–Taylor instability simulations. We expect our NN-ROM to achieve a higher speedup when it is applied to larger-scale problems. A student participating in our research project will first learn what our NN-ROM can do for the 2D Burgers simulation and then extend it to a new physics problem by training an autoencoder neural network and implementing NN-ROM on more complex problems, such as shock-moving hydrodynamics, pore-collapse dynamics, particle transport, plasma physics, and earthquake inverse problems. Depending on the results, we will write a journal paper together. The experience of implementing NN-ROM will let the student apply it to other problems, including those that may be part of the student’s Master's or PhD thesis. Disciplines: Analysis, Applied Mathematics, Mathematics (General), Operations Research, and Probability and Statistics Hosting Site:Lawrence Livermore National Laboratory (LLNL) Internship location: Livermore, CA Mentor:
Internship Coordinator:


No  LLNLChoi2  12/14/2023  1702530000000  Lawrence Livermore National Laboratory (LLNL)  Livermore, CA  Analysis, Applied Mathematics, Mathematics (General), Operations Research, Probability and Statistics 
Project Description:We are developing efficient latent-space dynamics identification (LaSDI) algorithms to accurately accelerate parametric and complex physical systems. The reduced space dynamics after compression are often much simpler than the corresponding full space dynamics. Therefore, various models can be fit to identify the hidden dynamics in the reduced space, which in turn can be used to predict the system response to new input parameters. We have successfully applied the latent-space learning algorithm, so-called LaSDI, to accurately accelerate various benchmark problems, such as the advection equation, Burgers’ equation, and heat conduction problems. A student participating in our research project will first learn our existing toolbox, LaSDI and gLaSDI. Then he or she will further improve LaSDI by exploiting other latent-space models and extend it to more complex problems, such as shock-moving hydrodynamics, pore-collapse dynamics, particle transport, plasma physics, and earthquake inverse problems. Depending on the results, we will write a journal paper together. Our LaSDI is application-agnostic, so by the end of summer, the student will be able to apply the improved LaSDI method to a broad range of physical simulations, including those that may be part of the student’s Master's or PhD thesis. Disciplines: Analysis, Applied Mathematics, Mathematics (General), Operations Research, and Probability and Statistics Hosting Site:Lawrence Livermore National Laboratory (LLNL) Internship location: Livermore, CA Mentor:
Internship Coordinator:
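
The compress-then-identify idea described in the LLNLChoi2 project can be illustrated with a minimal sketch. It is not the LaSDI or gLaSDI code: it assumes a linear POD basis (via the SVD) as a stand-in for the compression step and a linear latent model dz/dt = A z fitted by least squares. The function name and parameters are hypothetical.

```python
import numpy as np

def fit_latent_dynamics(snapshots, dt, rank):
    """Compress snapshots with POD, then fit dz/dt = A z by least squares.

    Assumption: columns of `snapshots` are full-order states sampled every `dt`.
    """
    U, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    basis = U[:, :rank]                      # POD basis (stand-in for an autoencoder)
    Z = basis.T @ snapshots                  # latent trajectory, shape (rank, n_times)
    dZ = np.gradient(Z, dt, axis=1, edge_order=2)  # finite-difference time derivative
    # Solve dZ ≈ A Z for A in the least-squares sense.
    A = np.linalg.lstsq(Z.T, dZ.T, rcond=None)[0].T
    return basis, A
```

Once `A` is known, the latent system can be integrated cheaply and mapped back to the full space with `basis`. Richer latent models would replace the linear fit.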


No  LANLDesantis2  01/3/2024  1704258000000  Los Alamos National Laboratory (LANL)  Los Alamos, NM or virtual  Analysis, Applied Mathematics, Probability and Statistics, Topology 
Project Description: The Earth system represents a sophisticated, nonlinear dynamic system. Traditionally, climate modeling is conducted through extensive Earth System Models, which are comprehensive computational structures designed to simulate interactions among Earth's various elements. Due to their enormity, these models encompass large datasets, posing challenges for direct analysis. In recent years, Koopman operator theory has emerged as a mathematically rigorous method for analyzing and modeling dynamic systems. It is increasingly favored for its data-driven approach. The focus of this summer's research project is to enhance modern Koopman techniques for a more effective characterization of the nonequilibrium and chaotic dynamics present in the Earth system. Our goals include developing and implementing innovative Koopman machine learning models tailored for Earth system applications. Disciplines: Analysis, Applied Mathematics, Probability and Statistics, and Topology Hosting Site: Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM or virtual Mentor:
Internship Coordinator:
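
As a toy illustration of the data-driven Koopman approach mentioned in the LANLDesantis2 project, the sketch below uses extended dynamic mode decomposition (EDMD), a standard technique that is not necessarily the one used by the project. The nonlinear map, its coefficients, and the dictionary of observables are all assumptions chosen so that the lifted dynamics are exactly linear.

```python
import numpy as np

def lift(x):
    """Hypothetical dictionary of observables: (x1, x2, x1^2)."""
    return np.array([x[0], x[1], x[0] ** 2])

def simulate(x0, a=0.9, b=0.5, c=1.0, n_steps=30):
    """Toy nonlinear map: x1 <- a*x1, x2 <- b*x2 + c*x1^2."""
    xs = [np.array(x0, dtype=float)]
    for _ in range(n_steps):
        x1, x2 = xs[-1]
        xs.append(np.array([a * x1, b * x2 + c * x1 ** 2]))
    return xs

def edmd(trajectory):
    """Least-squares matrix K with lift(x_{k+1}) ≈ K @ lift(x_k)."""
    Y = np.array([lift(x) for x in trajectory]).T   # shape (3, n_steps + 1)
    Y0, Y1 = Y[:, :-1], Y[:, 1:]
    return Y1 @ np.linalg.pinv(Y0)

K = edmd(simulate([1.0, 0.3]))
koopman_eigenvalues = np.sort(np.linalg.eigvals(K).real)
```

For this toy map the lifted state evolves linearly, so the eigenvalues of `K` approximate `b`, `a**2`, and `a`. For chaotic systems no finite dictionary is exact, and the choice of observables becomes the central modeling question.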


No  FNALKurkcuoglu3  01/3/2024  1704258000000  Fermi National Accelerator Laboratory (FNAL)  Batavia, IL or virtual  Algebra or Number Theory, Applied Mathematics 
Project Description: Quantum computing is a promising avenue for solving many problems that are classically complex. It has been demonstrated that some quantum algorithms have a clear advantage over classical algorithms. However, quantum errors prevent the scalability of quantum computers. We will theoretically explore various generalized quantum errors in qubits and qudits, as well as classical tensor network algorithms, and find the interplay between these errors. Time permitting, the errors could be implemented on real quantum hardware, along with a simple classical tensor network algorithm. For this position, prior knowledge of quantum computation or tensor networks is not needed, although exposure could be helpful. A strong background in linear algebra and programming is preferred, although the candidate is free to choose which programming language to use. This project will be conducted in a team setting under the primary direction of researchers at Fermilab. The entire project may be done remotely, with frequent video meetings and the use of other communication tools (e.g., Slack, email), or we may offer housing on-site at Fermilab, based on availability. Disciplines: Algebra or Number Theory, and Applied Mathematics Hosting Site: Fermi National Accelerator Laboratory (FNAL) Internship location: Batavia, IL or virtual Mentors:


Yes  USACEEllison1  01/3/2024  1704258000000  U.S. Army Corps of Engineers, Geospatial Research Laboratory (GRL)  Applied Mathematics 
U.S. Citizenship is a requirement for this internship. Project Description: Battle damage assessment involves assessing the physical damage to buildings and infrastructure from satellite images taken before and after an event. This project has focused on developing automated methods of doing so using neural networks. However, lack of generalization from one geographic region to another, limited labeled data, and the need for a confidence measure make this a complex problem. An intern on this project will have the opportunity to explore one of these factors (or another related aspect). We are seeking candidates with experience in programming (preferably Python) and an interest in applying mathematical or statistical principles to computer vision and machine learning. Disciplines: Applied Mathematics Hosting Site: U.S. Army Corps of Engineers, Geospatial Research Laboratory (GRL) Mentors:
Internship Coordinator:


Yes  NISTSu1  01/3/2024  1704258000000  National Institute of Standards and Technology (NIST)  Applied Mathematics, Computational Mathematics 
U.S. Citizenship is a requirement for this internship. Project Description: This project will explore visual analytics aspects of perspectives and interpretations applying Riemannian geometry and competitive linear and nonlinear model-based dimension reductions in applications ranging from wind energy infrastructure and computer vision to materials science and next-generation communications technology. Example applications include statistics of material microstructures, image and time series classification, and bicriteria optimization for next-generation wind turbine design/measurement as well as coexistence in wireless spectrum sharing. Using a scientific visualization package like ParaView, the project will attempt to visualize and analyze applied mathematics. The position requires a candidate curious to explore some or all of the following topics in applied mathematics: (i) novel low-dimensional (< 4) visualization and approximation methods, (ii) computational differential geometry over matrix manifolds, (iii) linear and nonlinear dimension reduction, (iv) sensitivity analysis over Riemannian manifolds, and (v) regularized representations of planar curves and embedded surfaces. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site: National Institute of Standards and Technology (NIST) Mentors:


Yes  USACEPan1  01/3/2024  1704258000000  U.S. Army Corps of Engineers, Geospatial Research Laboratory (GRL)  Alexandria, VA or virtual  Applied Mathematics, Biometrics and Biostatistics 
U.S. Citizenship is a requirement for this internship. Project Description: The identification of fallow fields from remote sensing images is challenging because their spectral properties are similar to those of recultivation or transitional classes. However, the temporal changes in spectral properties in each field help to identify whether a field is fallow or not. Fallow fields are possible indicators of several socioeconomic and climatic changes, including drought, changes in policies, and war. For these reasons, the ability to detect where and when fields are transitioning to or from a fallow state is imperative to mitigate potential conflict and humanitarian disasters. This project will implement a temporal segmentation algorithm to detect fallow fields using multispectral time series. The time series will then be enumerated to spatial segments, using the Segment Anything Model (SAM) or other open-source methods. The last step is to apply a temporal segmentation algorithm to the spatially enumerated time series to detect changes in the spectral signature, defining a conversion from cultivated to fallow. The student should have a working knowledge of ML and experience using Google Earth Engine or, preferably, Python with geospatial datasets. The student does not need to be an expert coder; one outcome of the internship is that the student will strengthen their coding skills and familiarity with open-source libraries and their application to geospatial problems. What is more important is that the student can participate independently and is self-motivated. *This topic is flexible. Replace ‘fallow fields’ with any possible feature (forest disturbance, wildfires, wetlands, etc.). What is important is to develop the Earth Engine or STAC framework to handle remotely sensed time-series data and research a topic that is of interest to you. The intern can be virtual or on-site in Alexandria, VA. The mentors are remote.
Disciplines: Applied Mathematics, and Biometrics and Biostatistics Hosting Site: U.S. Army Corps of Engineers, Geospatial Research Laboratory (GRL) Internship location: Alexandria, VA or virtual Mentors:
Internship Coordinator:


Yes  USACELasko1  01/3/2024  1704258000000  U.S. Army Corps of Engineers, Geospatial Research Laboratory (GRL)  Analysis, Applied Mathematics 
U.S. Citizenship is a requirement for this internship. Project Description: The selected intern will be placed within a research team focused on geospatial analytics and prediction. The team is composed of several PhD- and MS-level scientists with backgrounds in satellite remote sensing, GIS, or the physical sciences. The intern should expect to gain valuable experience in remote sensing methods, applied machine learning, and cloud computing during the course of this internship project. Deliverables are a key aspect of a professional portfolio and enable research to be more impactful within the organization. Accordingly, the intern will be supported by the mentor to publish the findings from this project either in a peer-reviewed journal article within the remote sensing discipline or in a governmental technical report published online by the organization. The specific project topic is flexible based on the interests and skills of the intern, as long as it aligns with the general topic of geospatial analytics and prediction. One possible research project idea would combine Synthetic Aperture Radar (SAR) with multispectral imagery observations in order to gap-fill cloud cover in satellite imagery and generate more complete time series of land surface observations. While some research has been done in this domain, more advanced deep learning algorithms, such as long short-term memory networks, are needed to increase the accuracy of this approach. Ultimately, the output from this project would be impactful and directly incorporated into decision support tools and/or transition of technology to our Army Enterprise customers. It should be noted that the primary mentor is remote; however, he is available for virtual support and guidance. Other co-mentors and interns will be located on-site. Disciplines: Analysis, and Applied Mathematics Hosting Site: U.S. Army Corps of Engineers, Geospatial Research Laboratory (GRL) Mentor:
Internship Coordinator:


Yes  USACELasko2  01/3/2024  1704258000000  U.S. Army Corps of Engineers, Geospatial Research Laboratory (GRL)  Analysis, Applied Mathematics 
U.S. Citizenship is a requirement for this internship. Project Description: The selected intern will be placed within a research team focused on geospatial analytics and prediction. The team is composed of several PhD- and MS-level scientists with backgrounds in satellite remote sensing, GIS, or the physical sciences. The intern should expect to gain valuable experience in remote sensing methods, applied machine learning, and cloud computing during the course of this internship project. Deliverables are a key aspect of a professional portfolio and enable research to be more impactful within the organization. Accordingly, the intern will be supported by the mentor to publish the findings from this project either in a peer-reviewed journal article within the remote sensing discipline or in a governmental technical report published online by the organization. The specific project topic is flexible based on the interests and skills of the intern, as long as it aligns with the general topic of geospatial analytics and prediction. One research idea would use readily available cloud computing platforms with time series of satellite imagery observations to quantify trends in land cover changes and disturbance events across broad geographic areas. Some specific inquiries to address would include: 1) How have changes to Earth's land cover varied over time within protected conservation areas versus non-protected areas? 2) How accurately can we model building heights from satellite remote sensing imagery using multiple satellite sensor images and ancillary geospatial data? For both of these example ideas, geospatial datasets, ground truth, and a computing platform are all available for analysis. The mentors are available to guide the intern and provide technical assistance for those who may be unfamiliar with GIS or remote sensing. The intern is not expected to have GIS or remote sensing experience.
Ultimately, the output from this project would be impactful and directly incorporated into decision support tools and/or transition of technology to our Army Enterprise customers. The primary mentor is remote; however, he is available for virtual support and guidance. Other co-mentors and interns will be located on-site and available as well. Disciplines: Analysis, and Applied Mathematics Hosting Site: U.S. Army Corps of Engineers, Geospatial Research Laboratory (GRL) Mentor:
Internship Coordinator:


No  ANLRaghavan3  01/3/2024  1704258000000  Argonne National Laboratory (ANL)  Lemont, IL  Analysis, Applied Mathematics, Probability and Statistics 
Project Description: Continual learning (CL) is a field concerned with learning a series of interrelated tasks that are typically defined in the sense of either regression or classification. In recent years, CL has been studied extensively when these tasks are defined using Euclidean data, that is, data such as images that can be described by a set of vectors in an n-dimensional real space. However, when the data corresponding to a CL task is non-Euclidean (for example, graphs, point clouds, or manifolds), the notion of similarity in the sense of a Euclidean metric does not hold. For instance, a graph is described by a tuple of vertices and edges, and the similarity between two graphs is not well defined through a Euclidean metric. Due to this fundamental nature of graphs, developing and demonstrating the efficacy of CL for non-Euclidean data presents several theoretical and methodological challenges. In particular, CL for graphs requires explicit modelling of the nonstationary behavior of vertices and edges and their effects on the learning problem. To obviate this necessity, in our prior research we have developed an adaptive dynamic programming viewpoint for CL with graphs. Here, we formulate a two-player sequential game between the act of learning new tasks (generalization) and remembering previously learned tasks (forgetting). The goal of this new project is to apply the theoretical principles of graph continual learning to scientific applications and demonstrate that the two-player principle formulated theoretically will improve performance. In particular, we will start with a literature survey and identify the state-of-the-art methods and datasets where graph continual learning is relevant. Next, we will formulate a comparative study between the present method and the state of the art. Finally, we will consider the spatiotemporal traffic analysis problem and demonstrate the applicability of the graph continual learning approach to it.
The applicant will be exposed to the world of lifelong learning, which has recently become very popular. Moreover, the applicant will be able to understand the impact of theoretical principles on real applications and explore the influence of parameter choices on such applications. The internship will culminate in a research paper that will potentially be submitted to AAAI 2025. Disciplines: Analysis, Applied Mathematics, and Probability and Statistics Hosting Site: Argonne National Laboratory (ANL) Internship location: Lemont, IL Mentors:
Internship Coordinator:


No  ANLZhang3  01/3/2024  1704258000000  Argonne National Laboratory (ANL)  Lemont, IL  Applied Mathematics, Computational Mathematics 
Project Description: Diffusion probabilistic models (DPMs) offer state-of-the-art performance for various tasks, including high-fidelity image generation, image editing, text-to-image generation, voice synthesis, and data compression. The sample quality of DPMs depends critically on the guided sampling technique, which requires solving the diffusion ODEs or SDEs defined by the data prediction model. In this project, we aim to develop efficient ODE solvers to accelerate DPMs. We will exploit the semilinear structure of the model, investigate new time integration methods, and analyze their convergence and stability properties. The participant will obtain hands-on experience with diffusion models and high-performance ODE solvers with full GPU support. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site: Argonne National Laboratory (ANL) Internship location: Lemont, IL Mentor:
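
To illustrate what exploiting semilinear structure can mean for the ANLZhang3 project, the sketch below applies the exponential Euler method to a scalar toy equation dx/dt = lam*x + N(x, t). This is a generic textbook integrator, not the solver the project will develop. The function name and the test equation are assumptions.

```python
import numpy as np

def exponential_euler(lam, nonlinear, x0, t0, t1, n_steps):
    """Integrate dx/dt = lam*x + nonlinear(x, t) with exponential Euler.

    The linear part is integrated exactly; the nonlinear term is frozen
    at the start of each step.
    """
    h = (t1 - t0) / n_steps
    decay = np.exp(lam * h)
    phi = np.expm1(lam * h) / lam   # integral of exp(lam*s) over one step
    x, t = x0, t0
    for _ in range(n_steps):
        x = decay * x + phi * nonlinear(x, t)
        t += h
    return x

# Stiff toy problem: dx/dt = -50*x + 50, solved with only 5 steps.
x_end = exponential_euler(-50.0, lambda x, t: 50.0, 0.0, 0.0, 1.0, 5)
```

Because the stiff linear term is handled exactly, the step size is not limited by `lam`. An explicit Euler step of the same size (0.2) would be unstable on this problem.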


No  ORNLEndeve2  01/3/2024  1704258000000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Computational Mathematics 
Project Description: The goal of this project is to design and analyze numerical methods for modeling electrically conducting fluids as described by the equations of magnetohydrodynamics (MHD). The MHD model is frequently used to model fusion and astrophysical plasmas. In this project, we aim to design and implement structure-preserving (SP) methods for MHD. SP methods aim to capture key properties of continuum models at the discrete level (such as conservation, symmetries, asymptotic limits, and maximum principles), and often improve robustness and accuracy in long-term simulations. For the MHD model, we aim to maintain divergence-free magnetic fields and conservation properties, and to retain accuracy and efficiency when the system is close to equilibrium, characterized by low flow and Alfvénic Mach numbers. We will design and analyze a method based on discontinuous Galerkin phase-space discretization. The student will implement numerical methods in a high-level programming language (e.g., Python or Julia), analyze results, be introduced to computational MHD, and have the opportunity to interact with postdocs and lab staff. Disciplines: Computational Mathematics Hosting Site: Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:
Internship Coordinator:


No  LANLMolinaParis1  01/3/2024  1704258000000  Los Alamos National Laboratory (LANL)  Los Alamos, NM or virtual  Applied Mathematics, Mathematical Biology, Probability and Statistics 
Project Description: RNA segmented viruses have independent genomic segments, or cassettes, that are readily swapped when assembling viral particles. Thus, when a host is coinfected with two or more circulating viral strains, it is possible for a novel virus to emerge from the reassortment of cassettes. This mode of evolution is more than just academic: for influenza A, the best characterized RNA segmented virus, reassortment has facilitated the formation of pandemic strains in 1957, 1968, and 2009. Genome segmentation has important implications for viral gene expression control and RNA assembly into nascent virions. It also creates the potential for reassortment: the exchange of intact gene segments between viruses that coinfect the same cell. Reassortment is different from recombination since it allows many distinct genotypes to emerge from a single coinfected cell. Not only does segmentation enhance genetic diversification, but it also plays a unique role in the evolutionary history of segmented viruses due to the rare occasions when a reassortant is successful at a population scale. A striking example from the Bunyaviridae family of the emergence of a novel virus through reassortment is that of Ngari virus. There are six groups of segmented RNA viruses that infect vertebrates. Of relevance to this project are the Bunyavirales, given that of the seven epidemic-prone diseases prioritised by the WHO 2018 R&D Blueprint as public health emergencies with an urgent need for accelerated research, three are caused by Bunyaviruses: Lassa, Rift Valley, and Crimean-Congo hemorrhagic fevers. The intern will assist with the development of a mathematical model to quantify the reassortment probability in Bunyavirus coinfection. The intern will also assist with statistical and multivariate analysis of potential data sets (published in the literature), including but not limited to principal component analysis, fitting, and Bayesian inference.
The intern will join a well-rounded group of scientists in various disciplines, such as mathematical biology, statistical epidemiology, theoretical immunology, and theoretical virology, join office meetings, present results, and may assist with writing a research paper. The intern should have experience with the study of differential equations, branching processes, Markov birth and death processes, algorithm development, computational methods (Python), Mathematica (or equivalent), and multivariate statistical analysis, as well as an interest in virology and immunology. Disciplines: Applied Mathematics, Mathematical Biology, and Probability and Statistics Hosting Site: Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM or virtual Mentors:
Internship Coordinator:


Yes  SNLQuadros1  01/3/2024  1704258000000  Sandia National Laboratories (SNL)  Virtual  Analysis, Applied Mathematics, Computational Mathematics, Geometric Analysis, Algebra or Number Theory 
U.S. Citizenship is a requirement for this internship. Project Description: The research project Auto Hex-Dominant Meshing combines the merits of state-of-the-art geometric reasoning and artificial intelligence (AI) techniques in Sandia’s well-known geometry and meshing toolkit Cubit®. Hex-dominant meshing has the potential to automate the discretization of a contiguous 3D domain, as tet meshing does, while still providing accurate analysis results similar to those of hex meshing. Contiguous 3D domains in many industries are represented using the boundary representation (B-Rep) in many CAD formats. As the B-Rep captures only the skin of a 3D domain, we first extract a skeletal representation called the Chordal Axis Transform (CAT). The CAT reduces the 3D shape to a 2D shape with associated thickness field information. The CAT representation assists in identifying thin-wall regions, T-junctions, holes, extremities, etc. The CAT is further reduced to a weighted graph representation, and a few graph algorithms are used to automate geometry decomposition for hex-dominant meshing. As this is a new approach, further research is required to enhance geometric reasoning and artificial intelligence techniques on the graphs to improve the efficiency and robustness of hex-dominant meshing. Students will be exposed to Cubit®’s geometry, meshing, and AI capabilities, as well as programming, and will potentially write up research findings for an international conference. Disciplines: Analysis, Applied Mathematics, Computational Mathematics, and Geometric Analysis, Algebra or Number Theory Hosting Site: Sandia National Laboratories (SNL) Internship location: Virtual Mentor:


No  LANLHentgartner1  01/3/2024  1704258000000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Mathematical Biology, Probability and Statistics 
Project Description: Modeling the dynamics of a disease propagating in a population is important for predicting health care needs and the effectiveness of interventions and mitigation strategies. Standard models of epidemics rely either on mean-field approximations that abstract away real-world complexities or on agent-based simulations requiring extensive fitting to real-world spatial correlations that are often difficult to measure accurately. This project seeks to develop models that bridge the gap between mean-field and agent-based models. The starting point of this project is to view the epidemic from the point of view of a single individual. For that individual, we want to describe statistically the time to infection by modeling the hazard of getting infected. Such a model involves knowledge of the dynamics of the disease in the general population, but it will also account for individual characteristics such as the connectedness of the individual to the population and their susceptibility to infection. Our project will seek to aggregate an ensemble of “single individual hazard models” to build a stochastic description of disease evolution in a population, using the language of point processes. The benefits of this new formalism are a better accounting of individual heterogeneity and opportunities to leverage ideas and analysis developed for point processes. Such a formalism is particularly useful for modeling coinfections. A mathematical description of the aggregation involves modeling dependencies between random processes, approximating the distribution of sums of dependent variables, and applying ideas from point processes to disease modeling. As such, the ideal intern should have a background in applied mathematics, statistics, or probability, some experience modeling stochastic processes (and point processes in particular), and an interest in disease modeling.
The successful candidate will have the opportunity to work with a team of disease modelers at Los Alamos National Laboratory, will be encouraged to give talks and poster presentations on the project, and will be expected to contribute to a manuscript to be published in a peer-reviewed journal. Disciplines: Applied Mathematics, Mathematical Biology, and Probability and Statistics Hosting Site: Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentors:
Internship Coordinator:
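
The single-individual view in the LANLHentgartner1 project rests on the standard relation between a hazard rate h(t) and the time to infection: the probability of being infected by time t is 1 - exp(-H(t)), where H is the cumulative hazard. The sketch below evaluates this numerically. The linear hazard used in the example is purely an assumption for illustration, not a model from the project.

```python
import numpy as np

def infection_probability(hazard, times):
    """Return P(infected by t) = 1 - exp(-cumulative hazard) on a time grid.

    `hazard` is a vectorized function of time; the integral uses the
    trapezoid rule.
    """
    h = hazard(times)
    increments = 0.5 * (h[1:] + h[:-1]) * np.diff(times)
    cumulative = np.concatenate([[0.0], np.cumsum(increments)])
    return 1.0 - np.exp(-cumulative)

# Hypothetical hazard that grows linearly as the epidemic spreads.
times = np.linspace(0.0, 10.0, 101)
prob = infection_probability(lambda t: 0.1 * t, times)
```

Individual heterogeneity could enter by scaling the hazard with per-person factors such as connectedness or susceptibility.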


No  NRELKnueven1  01/3/2024  1704258000000  National Renewable Energy Laboratory (NREL)  Golden, CO or virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description: Under the direction of the mentor, the participant will learn about optimization under uncertainty and its application to helping the bulk electricity system both mitigate and adapt to climate change. They will also have the opportunity to learn about deploying optimization algorithms on DOE high-performance computing systems. Under the mentor’s guidance, the participant will develop mathematically sound approaches for transmission and capacity expansion as applied to the bulk electricity system to enhance the economics, reliability, resilience, and security of real-time grid operations as power systems both decarbonize and adapt to climate change. A final report and presentation on the project will be delivered at the end of the appointment. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site: National Renewable Energy Laboratory (NREL) Internship location: Golden, CO or virtual Mentors:


No  BNLJantre1  01/3/2024  1704258000000  Brookhaven National Laboratory (BNL)  Upton, NY  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description: In science and engineering domains, we often turn to mechanistic computer simulation models to predict the behavior of natural systems, such as fluid dynamics, molecular dynamics, or particle interactions in a collider. These simulations are often based on differential equations and provide critical information about the underlying governing equations, but they are often computationally expensive. To address this limitation, one machine-learning-based approach is the neural partial differential equation (NPDE), which seeks to emulate a complex computer model with a simpler partial differential equation (PDE) whose governing equation or “right-hand side” is learned from data in the form of a neural network. However, existing work typically does not address the uncertainty associated with the neural-network-based emulation model arising from limited training data and function approximation errors. In this project, we will first explore the potential of NPDEs to describe various physical systems, such as those governed by the heat or wave equations, or more complex systems like reaction-diffusion processes describing spatially distributed chemical reactions, or phase separation in self-assembling nanomaterials. Subsequently, we will explore ways to quantify the neural network uncertainty via a Bayesian deep learning framework. To recover useful physical equations from limited simulation data, we will also explore the reduction of overparameterized neural network weight spaces through linear subspace methods and Bayesian inference on this reduced parameter space. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site: Brookhaven National Laboratory (BNL) Internship location: Upton, NY Mentors:
Internship Coordinator:


No  LBNLIbrahim1  01/3/2024  1704258000000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA or virtual  Applied Mathematics, Computational Mathematics 
Project Description: Graph Neural Networks (GNNs) have a wide range of applications in the social and physical sciences. Improving the execution efficiency of these networks is notoriously difficult because performance relies on the sparsity and connectivity pattern of these networks and on their mapping to the underlying system. This project aims at analyzing the performance of GNNs in HPC environments. We also aim to apply transformations to the GNN model to improve its representation and execution efficiency, considering application and system constraints. Students will gain knowledge of executing GNNs in leadership-class HPC environments and of performance analysis and tuning techniques, both algorithmic and through transformations of the model and data representation. Student Requirements: Basic knowledge of deep learning frameworks, such as TensorFlow or PyTorch. Familiarity with performance analysis techniques is desirable but not required. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site: Lawrence Berkeley National Laboratory (LBNL) Internship location: Berkeley, CA or virtual Mentor:


No  ANLRao3  01/3/2024  1704258000000  Argonne National Laboratory (ANL)  Lemont, IL or virtual  Applied Mathematics, Computational Mathematics 
Project Description: This project aims to develop scalable randomized algorithms for data collection and processing in Bayesian inverse problems. Inverse problems use experimental data to infer parameters governing physical models (e.g., initial or boundary conditions of a partial differential equation). For scenarios where data collection is time-consuming or expensive, optimal experimental design (OED) for inverse problems seeks to determine experimental conditions for data acquisition that minimize uncertainty (or maximize information gain) in parameters, predictions, or other quantities of interest, subject to budgetary constraints. We will explore the use of randomization to mitigate the computational challenges posed by OED problems. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site: Argonne National Laboratory (ANL) Internship location: Lemont, IL or virtual Mentors:
Internship Coordinator:


No  ANLRao4  01/3/2024  1704258000000  Argonne National Laboratory (ANL)  Lemont, IL or virtual  Applied Mathematics, Computational Mathematics 
Project Description: This project will explore efficient solution methods for Bayesian inverse problems (i.e., recovering model parameters from observations) in systems modeled by stochastic PDEs with high-dimensional input data. The solution methods will rely on a surrogate model of the PDE solver, that is, a cheap-to-evaluate function that maps a sample of the stochastic input to the dependent variable in the PDE. In order to efficiently construct a surrogate with limited data, we leverage several key building blocks from state-of-the-art deep learning that encode desirable inductive biases into the model. Additionally, we will explore methods such as normalizing flows to sample from the posterior. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site: Argonne National Laboratory (ANL) Internship location: Lemont, IL or virtual Mentors:
Internship Coordinator:


Yes  USACEBragdon2  01/3/2024  1704258000000  U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory (CRREL)  Hanover, NH  Applied Mathematics, Computational Mathematics 
U.S. Citizenship is a requirement for this internship. Project Description: Object detection algorithms have broad applications, and this project focuses on detecting objects sitting on the ground surface against a cluttered background. The images have low signal-to-noise ratios, and our aim is to minimize false negatives. The datasets for this project are collected using longwave IR and polarized longwave IR. The MSGI intern will explore approaches that combine traditional computer vision detection with machine learning for optimal performance in a cluttered background. Traditional computer vision approaches behave in a predictable way on out-of-domain datasets, and this project aims to incorporate computer vision features into a machine learning object detection algorithm to obtain a more robust detector. The intern will participate in a multidisciplinary research group, participate in laboratory meetings, and have the opportunity to showcase results. The intern should have some basic experience with algorithm development and machine learning (PyTorch). Experience with computer vision is beneficial, but not required. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site: U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory (CRREL) Internship location: Hanover, NH Mentors:
Internship Coordinator:


Yes  USACEBragdon3  01/3/2024  1704258000000  U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory (CRREL)  Hanover, NH  Applied Mathematics, Computational Mathematics 
U.S. Citizenship is a requirement for this internship Project Description:Understanding the operating conditions that lead to a high or low probability of detecting objects with a given sensor is important in the development of a sensor decision aid tool. This project will focus on developing a machine learning algorithm to predict likely sensor performance given a set of operating conditions. This will require developing a reliable metric for probability of detection that depends on the dataset of interest and the sensor being used. This project will consider datasets collected with longwave IR and polarized longwave IR cameras, with images of objects placed on the ground surface in a cluttered environment. The MSGI intern will explore appropriate metrics to represent probability of detection, which may make use of algorithms developed in-house. The intern will use these metrics to develop a machine learning algorithm that takes the operating conditions (e.g., meteorological data) as input and outputs a probability of detection. A model will be developed for each sensor type considered, and the results of the project will feed into the development of a reliability measure for using a sensor in a given environment for object detection. The intern will participate in a multidisciplinary research group, participate in laboratory meetings, and have the opportunity to showcase results. The intern should have some basic experience with algorithm development, machine learning (PyTorch), and data science. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory (CRREL) Internship location: Hanover, NH Mentors:
Internship Coordinator:


No  LLNLPetersson1  01/3/2024  1704258000000  Lawrence Livermore National Laboratory (LLNL)  Livermore, CA  Computational Mathematics, Probability and Statistics 
Project Description:In the current era of noisy intermediate-scale quantum (NISQ) computing devices, the control pulses for realizing logical gates frequently need to be recalibrated due to environmental noise and drift of system parameters. The dynamics of a quantum device can be modeled by Schrödinger's or Lindblad's differential equations, based on parameterized Hamiltonian and decoherence models in which the parameters are, for example, transition frequencies and decoherence times. Because of the probabilistic nature of the system parameters, they need to be represented by distributions rather than values. These distributions can then be used in a risk-neutral optimal control methodology to design control pulses for realizing logical gates that are more resilient to noise and therefore require less frequent recalibration. In this project, the student will learn about probabilistic characterization of superconducting quantum devices and risk-neutral optimal control. The student will also implement a risk-neutral algorithm within our open-source C++ code Quandary. As a stretch goal, the student will learn how to apply their control pulses on the Quantum Device Integration Testbed (QuDIT) at Lawrence Livermore National Laboratory. Disciplines: Computational Mathematics, and Probability and Statistics Hosting Site:Lawrence Livermore National Laboratory (LLNL) Internship location: Livermore, CA Mentors:


No  PNNLHoward2  01/3/2024  1704258000000  Pacific Northwest National Laboratory (PNNL)  Richland, WA or Virtual  Applied Mathematics 
Project Description:This project focuses on modeling the behavior of complex systems using physics-informed neural networks and operator networks. Physics-informed machine learning has shown great promise for a wide range of applications; however, training physics-informed neural networks and deep operator networks accurately is difficult and computationally expensive. This project will develop unique network architectures and weighting schemes to improve the training of physics-informed neural networks and operator networks, with applications including fluid dynamics and climate modeling. Disciplines: Applied Mathematics Hosting Site:Pacific Northwest National Laboratory (PNNL) Internship location: Richland, WA or Virtual Mentors:


No  BNLKelly1  01/3/2024  1704258000000  Brookhaven National Laboratory (BNL)  Upton, NY or virtual  Applied Mathematics, Probability and Statistics 
Project Description:Particle accelerators are a central tool for scientific discovery in various fields, from unraveling the governing laws of fundamental particles to advancing medical treatments and understanding the universe's origins. Ensuring the accuracy and reliability of these experiments is critically important and requires the development of robust beam control methodologies. Accurately predicting beam positions in particle accelerators via first-principles models is crucial for achieving optimal performance and safety in the physical experiment, potentially improving scientific outcomes. However, uncertainties in model parameters can compromise the reliability and robustness of beam position predictions. In this research project, we propose to address this challenge by applying worst-case optimization under uncertainty to guarantee the robustness of beam position predictions made with Bayesian neural networks (BNNs). The project aims to develop an optimization framework that minimizes the worst-case impact of uncertainty on the beam position predictions, ensuring reliable and robust control recommendations to experiment operators despite the uncertainty that insufficient data induces in imperfect data-driven models. To achieve this, we will first build a BNN model that maps the current applied at alignment magnets to the particle beam position, wherein model-form uncertainties are captured by inferred probability distributions over the weights. This allows BNNs to capture and propagate uncertainty throughout the network's computations. Next, we will formulate an optimization problem that leverages worst-case uncertainty optimization principles to minimize the worst-case impact of uncertainties in the BNN model on the beam position predictions.
By treating BNN uncertainties as bounded uncertainties and employing robust optimization techniques, we will identify optimal model parameter values that minimize the worst-case uncertainty impact, ensuring reliable performance even under our imperfect models. The proposed framework will be evaluated and validated using particle accelerator beam position data. The student intern on this project will gain practical experience in cutting-edge machine learning topics (i.e., Bayesian neural networks) and applied mathematics (i.e., optimization under uncertainty, uncertainty quantification). The intern will also gain a practical understanding of the challenges and “art” involved in applying optimization under uncertainty and machine learning techniques to real-world applications, specifically in the field of particle accelerator control. Given the student’s specific interests, we can determine primary areas of focus for the scope of their internship. Disciplines: Applied Mathematics, and Probability and Statistics Hosting Site:Brookhaven National Laboratory (BNL) Internship location: Upton, NY or virtual Mentor:
Internship Coordinator:


No  BNLQian1  01/3/2024  1704258000000  Brookhaven National Laboratory (BNL)  Upton, NY or virtual  Applied Mathematics, Computational Mathematics, Mathematical Biology, Probability and Statistics 
Project Description:With growing evidence of the success of large-scale foundation models (deep learning neural networks trained with massive datasets) in many science and engineering applications, these advances can be leveraged to model complex systems more efficiently and accurately, enabling optimal decision making that steers system behavior toward specific operational objectives. The recent GraphCast AI model has shown superior prediction performance compared to existing state-of-the-art methods for global weather forecasting, a problem for which previous models are computationally expensive to simulate and whose dynamics are substantially uncertain. Such large AI surrogate models open a door to complex spatiotemporal dynamic modeling in a more computationally tractable manner. When developing such AI models for highly complex real-world systems and problems, major concerns include the enormous amount of training data required and the extremely demanding computational resources needed to train and run them. This project aims to better understand such AI models, using GraphCast as a representative, so that we can further improve data and computational efficiency for complex system modeling. We will develop scientific AI/ML techniques that integrate physics first principles into AI foundation models to enable data-efficient spatiotemporal dynamic modeling. We also aim to equip these models with objective-driven uncertainty quantification (UQ) in a Bayesian paradigm, developing theories and algorithms that help us understand the limitations of AI foundation models for complex system modeling and ultimately lead to an effective uncertainty-aware procedure for learning surrogates of complex systems.
Specific research topics of interest include effective strategies for integrating ODE/PDE/mechanistic models with data-driven models and developing corresponding Bayesian inference to account for prediction uncertainty arising from inherent data and model uncertainty. Potential applications of this methodology will be discussed with the student and can span multiple science and engineering applications, including climate/weather forecasting and dynamic modeling of other natural and human-engineered systems. Disciplines: Applied Mathematics, Computational Mathematics, Mathematical Biology, and Probability and Statistics Hosting Site:Brookhaven National Laboratory (BNL) Internship location: Upton, NY or virtual Mentors:


Yes  USACEHaley1  01/3/2024  1704258000000  U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory (CRREL)  Hanover, NH  Applied Mathematics, Computational Mathematics, Probability and Statistics 
U.S. Citizenship is a requirement for this internship Project Description:This project will develop a hybrid-AI-based sensor fusion framework for auto-navigation of unmanned aircraft systems (UAS), which incorporates physics and human cognitive logic into AI to make the AI more predictable, explainable, and robust. Hybrid AI is a cutting-edge technology initiated very recently to address the unpredictable behavior of AI and training bias, which are just emerging as its limitations. Reducing dependence on the quality of the training dataset, which constrains performance, will increase hybrid-AI accuracy and reliability, especially in unseen scenarios, provided that AI-generated features based either on human cognitive logic or on fused physics-based approaches can be effectively controlled. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory (CRREL) Internship location: Hanover, NH Mentors:
Internship Coordinator:


No  LBNLLadiges1  01/3/2024  1704258000000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA  Applied Mathematics, Computational Mathematics 
Project Description:State-of-the-art fluid separation technologies, such as those used in gas purification, water desalination, and chemical processing, involve flows of fluid mixtures across nanoporous graphene and graphene oxide membranes. Fluid dynamics at the nanoscale is predominantly governed by thermal fluctuations and Knudsen effusion, where the classical Navier-Stokes equations are not valid, and one has to rely on a molecular description. Our group has developed numerical methods for simulating continuum fluctuating hydrodynamics (FHD) for fluids at the nanoscale by incorporating stochastic fluxes that correctly account for intrinsic thermal fluctuations. We also have expertise in using Direct Simulation Monte Carlo (DSMC) methods for a high-fidelity, but computationally expensive, molecular representation of the nanoscale fluid. We propose implementing an adaptive algorithm refinement (AAR) hybrid numerical method to simulate gas permeation across these membranes. With this approach, the nanoscale fluid dynamics will have a high-fidelity DSMC representation in the region near the membranes, whereas continuum FHD will be used far from the membrane for better computational performance. In this project, we will collaborate to develop and implement numerical methods that couple the DSMC and continuum FHD representations of nanoscale fluid dynamics using AAR. This project will involve collaboration with a team of applied mathematicians and computational physicists in the Center for Computational Sciences and Engineering at Lawrence Berkeley National Laboratory. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Lawrence Berkeley National Laboratory (LBNL) Internship location: Berkeley, CA Mentor:
Internship Coordinator:


No  LANLTang1  01/3/2024  1704258000000  Los Alamos National Laboratory (LANL)  Los Alamos, NM or virtual  Analysis, Applied Mathematics, Computational Mathematics 
Project Description:We propose to develop structure-preserving ML-based surrogates for singularly perturbed dynamical systems. Conventional solvers for singularly perturbed systems, such as those arising from the semi-discretization of a kinetic PDE, are generally too expensive for deployment in the inner loop of an optimizer. Reduced-order and ML-based models are typically used as surrogates for these solvers, but they often lack the robustness necessary to accurately model multiple timescales and resolve the limiting dynamics. Our approach leverages structure-preserving properties in ML-based surrogates as a novel strategy to address these challenges. We intend to embed specific mathematical or physical structures robustly to capture the asymptotic behavior associated with the singular limit. Our ML-based methodology will employ key mathematical tools such as numerical linear algebra, advanced time integrators, perturbation methods, and geometric singular perturbation theory. Our applications will focus on kinetic models relevant to fusion devices and accelerators, aligning closely with several DOE Office of Science programs. Disciplines: Analysis, Applied Mathematics, and Computational Mathematics Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM or virtual Mentors:


No  ORNLZhang1  01/3/2024  1704258000000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN or virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Generative machine learning models, including variational autoencoders (VAEs), normalizing flows (NFs), generative adversarial networks (GANs), and diffusion models, have dramatically improved the quality and realism of generated content, whether it's images, text, or audio. In science and engineering, generative models can be used as powerful tools for probability density estimation or high-dimensional sampling, which are critical capabilities in uncertainty quantification (UQ), e.g., Bayesian inference for parameter estimation. Studies on generative models for image/audio synthesis focus on improving the quality of individual samples, which often makes the generative models complicated and difficult to train. UQ tasks, on the other hand, usually focus on accurately approximating statistics of interest without regard to the quality of any individual sample, so direct application of existing generative models to UQ tasks may lead to inaccurate approximations or unstable training. To alleviate those challenges, this project will develop several new generative models for various UQ tasks, including a pseudo-reversible conditional normalizing flow model and its convergence analysis, diffusion-model-assisted supervised learning of generative models, and a score-based nonlinear filter for recursive Bayesian inference. We will demonstrate the effectiveness of those methods in various UQ tasks, including density estimation, learning stochastic dynamical systems, and data assimilation problems. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN or virtual Mentors:
Internship Coordinator:


No  LANLFinkelstein1  01/3/2024  1704258000000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics 
Project Description:This project spans multiple disciplines: applied mathematics, chemistry, physics, and software development. Students will be supervised by field experts and only need basic coding skills and linear algebra knowledge. We will develop a fast, mixed-numerical-precision electronic structure solver that combines cutting-edge high-performance computing with computational science. Standard approaches to quantum chemistry, such as Density Functional Theory, encounter several computational challenges when calculating the electronic structure of molecular systems. Often, the main computational bottleneck is solving a nonlinear Schrödinger-type eigenvalue problem, which requires calculating the so-called density matrix describing the electron density. Analytically, this density matrix is obtained by applying the Fermi-Dirac function to the Hamiltonian matrix, which in practice translates into matrix diagonalization, an expensive and time-consuming operation. A robust and useful alternative is to approximate the Fermi-Dirac function with a Chebyshev polynomial expansion instead. Recently, a fast Chebyshev expansion method for GPUs was developed [12]. However, this algorithm is extremely ill-conditioned, with a condition number that scales exponentially in the polynomial expansion order. For small-order expansions calculated in double precision, this does not pose an issue. But when large-order expansions are desired, or a low numerical precision is used, the accuracy of the resulting density matrix suffers substantially. In this project, we seek to develop and study strategies that mitigate this undesirable numerical behavior in order to obtain high accuracy. The GPU and AI hardware code will be written using CUDA and HIP (for Nvidia and AMD devices, respectively) or using the hardware-agnostic MAGMA library [3]. We will develop and test this on LANL’s Darwin compute cluster, which has Nvidia A100 GPUs and AMD MI250X GPUs.
It may also be necessary to use vendor libraries such as Nvidia’s cuBLAS or AMD’s rocBLAS for linear algebra operations. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentor:
Internship Coordinator:
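As a rough illustration of the Chebyshev alternative described in the project above, the following Python sketch approximates the Fermi-Dirac function of a small Hamiltonian with a truncated Chebyshev series and compares the result with direct diagonalization. It uses the textbook three-term recurrence, not the fast GPU method the description refers to. The function names, the inverse temperature `beta`, the chemical potential `mu`, the Gershgorin spectral bounds, and the toy tight-binding Hamiltonian are all assumptions chosen for illustration.

```python
import numpy as np

def fermi_dirac(energy, mu=0.0, beta=2.0):
    """Fermi-Dirac occupation at chemical potential mu and inverse temperature beta."""
    return 1.0 / (1.0 + np.exp(beta * (energy - mu)))

def chebyshev_coefficients(func, order, num_nodes=200):
    """Chebyshev coefficients of func on [-1, 1], sampled at Chebyshev-Gauss nodes."""
    k = np.arange(num_nodes)
    theta = np.pi * (k + 0.5) / num_nodes
    samples = func(np.cos(theta))
    coeffs = np.array([2.0 / num_nodes * np.sum(samples * np.cos(j * theta))
                       for j in range(order + 1)])
    coeffs[0] *= 0.5  # the series uses c_0 / 2 for the constant term
    return coeffs

def density_matrix_chebyshev(hamiltonian, mu=0.0, beta=2.0, order=40):
    """Approximate f(H) with a Chebyshev series evaluated by the three-term recurrence."""
    n = hamiltonian.shape[0]
    # Gershgorin discs give cheap bounds on the spectrum (an illustrative choice).
    diag = np.diag(hamiltonian)
    radii = np.sum(np.abs(hamiltonian), axis=1) - np.abs(diag)
    e_min, e_max = np.min(diag - radii), np.max(diag + radii)
    half_width, center = 0.5 * (e_max - e_min), 0.5 * (e_max + e_min)
    # Map the spectrum into [-1, 1], where the Chebyshev polynomials are defined.
    scaled = (hamiltonian - center * np.eye(n)) / half_width
    coeffs = chebyshev_coefficients(
        lambda x: fermi_dirac(half_width * x + center, mu, beta), order)
    t_prev, t_curr = np.eye(n), scaled.copy()
    result = coeffs[0] * t_prev + coeffs[1] * t_curr
    for j in range(2, order + 1):
        t_next = 2.0 * scaled @ t_curr - t_prev  # T_{j} = 2 x T_{j-1} - T_{j-2}
        result += coeffs[j] * t_next
        t_prev, t_curr = t_curr, t_next
    return result

def density_matrix_diagonalization(hamiltonian, mu=0.0, beta=2.0):
    """Reference result: apply the Fermi-Dirac function to the eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(hamiltonian)
    return (eigvecs * fermi_dirac(eigvals, mu, beta)) @ eigvecs.T

# Toy example: an 8-site tight-binding chain (hypothetical test system).
hamiltonian = -(np.eye(8, k=1) + np.eye(8, k=-1))
p_cheb = density_matrix_chebyshev(hamiltonian)
p_diag = density_matrix_diagonalization(hamiltonian)
max_error = np.max(np.abs(p_cheb - p_diag))
print("max abs difference:", max_error)
```

Casting the matrices to a lower precision (for example `np.float32`) or raising `order` is one simple way to start probing how accuracy depends on precision and expansion order in this plain recurrence.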


No  ORNLFattebert2  01/3/2024  1704258000000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Computational Mathematics 
Project Description:The limited-memory BFGS (L-BFGS) algorithm, a variant of the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method, is a very efficient optimization algorithm. It uses an approximation of the inverse Hessian matrix associated with the degrees of freedom being optimized to accelerate the search for a local minimum or maximum. It has been used to optimize the geometry of molecules using the forces obtained from the quantum electronic structure surrounding the atoms. One issue in this context, though, is that function evaluations and their derivatives can be noisy and cause the algorithm to fail when it gets close to the energy minimum. In this project, the student will develop a simple L-BFGS code and couple it to an existing electronic structure library that will provide function evaluations and their derivatives, in order to explore optimal strategies for memory length and line minimization in the presence of noise. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:
Internship Coordinator:
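For readers unfamiliar with the method named in the project above, here is a minimal Python sketch of L-BFGS built on the standard two-loop recursion with a simple backtracking (Armijo) line search. It is an illustrative toy, not the project's code: the function names, the default `memory` length, the tolerance, and the quadratic test function are assumptions, and the electronic structure library mentioned in the description is replaced by a hypothetical `func_and_grad` callback.

```python
from collections import deque
import numpy as np

def lbfgs_direction(grad, history):
    """Two-loop recursion: apply the implicit inverse-Hessian approximation to grad."""
    q = grad.copy()
    alphas = []
    for s, y in reversed(history):            # newest pair first
        rho = 1.0 / np.dot(y, s)
        alpha = rho * np.dot(s, q)
        alphas.append((alpha, rho))
        q -= alpha * y
    if history:                               # scale by an initial Hessian guess
        s, y = history[-1]
        q *= np.dot(s, y) / np.dot(y, y)
    for (s, y), (alpha, rho) in zip(history, reversed(alphas)):  # oldest pair first
        beta = rho * np.dot(y, q)
        q += (alpha - beta) * s
    return -q

def lbfgs_minimize(func_and_grad, x0, memory=5, max_iter=200, tol=1e-6):
    """Minimize a function given a callback returning (value, gradient)."""
    x = np.asarray(x0, dtype=float)
    history = deque(maxlen=memory)            # keeps only the last `memory` pairs
    f, g = func_and_grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        d = lbfgs_direction(g, history)
        if np.dot(d, g) >= 0.0:               # not a descent direction: restart
            history.clear()
            d = -g
        step = 1.0
        while True:                           # backtracking Armijo line search
            x_new = x + step * d
            f_new, g_new = func_and_grad(x_new)
            if f_new <= f + 1e-4 * step * np.dot(g, d) or step < 1e-12:
                break
            step *= 0.5
        s, y = x_new - x, g_new - g
        if np.dot(s, y) > 1e-12:              # keep only pairs with positive curvature
            history.append((s, y))
        x, f, g = x_new, f_new, g_new
    return x

# Toy example: a convex quadratic standing in for an energy surface.
A = np.diag([1.0, 4.0, 9.0])
b = np.ones(3)

def quadratic(x):
    return 0.5 * x @ A @ x - b @ x, A @ x - b

x_min = lbfgs_minimize(quadratic, np.zeros(3))
print(x_min)
```

Adding a small random perturbation to the values returned by `quadratic` and varying `memory` is one simple way to begin exploring the noise sensitivity the project description mentions.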


No  ORNLFattebert3  01/3/2024  1704258000000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Computational Mathematics 
Project Description:The interaction between two rigid bodies can generally be described by six degrees of freedom when their orientations are taken into account. This is specifically the case for two rigid molecules, identical or not. In this project, the student will (i) explore what the interaction potential between two molecules looks like, using a recently proposed compact, symmetric, quaternion-based representation of that space, and (ii) investigate efficient numerical approaches to represent such a function, possibly using machine learning techniques. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:
Internship Coordinator:
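To make the six degrees of freedom in the project above concrete, the sketch below computes the pose of one rigid body relative to another using unit quaternions: three numbers for the relative position and a unit quaternion (three independent parameters) for the relative orientation. This is the standard relative-pose construction, not the compact symmetric representation the description refers to; all function names are hypothetical.

```python
import numpy as np

def quat_multiply(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

def quat_conjugate(q):
    """Conjugate, which is the inverse for a unit quaternion."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def quat_from_axis_angle(axis, angle):
    """Unit quaternion for a rotation by `angle` (radians) about `axis`."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    return np.concatenate(([np.cos(0.5 * angle)], np.sin(0.5 * angle) * axis))

def rotate(q, v):
    """Rotate vector v by unit quaternion q via q * (0, v) * conj(q)."""
    p = np.concatenate(([0.0], np.asarray(v, dtype=float)))
    return quat_multiply(quat_multiply(q, p), quat_conjugate(q))[1:]

def relative_pose(pos_a, q_a, pos_b, q_b):
    """Position and orientation of body B expressed in the body frame of A."""
    q_a_inv = quat_conjugate(q_a)
    r_rel = rotate(q_a_inv, np.asarray(pos_b, dtype=float) - np.asarray(pos_a, dtype=float))
    q_rel = quat_multiply(q_a_inv, q_b)
    return r_rel, q_rel

# Example: both bodies rotated by 90 degrees about z, with B displaced along y.
q = quat_from_axis_angle([0.0, 0.0, 1.0], 0.5 * np.pi)
r_rel, q_rel = relative_pose([0.0, 0.0, 0.0], q, [0.0, 1.0, 0.0], q)
print(r_rel, q_rel)
```

Note that `q` and `-q` describe the same rotation, so any function defined on this space, such as an interaction potential, has to respect that sign ambiguity.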


No  LANLNegre2  01/3/2024  1704258000000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics 
Project Description:The advent of high-performance computing capabilities, accompanied by highly optimized algorithms, has presented scientific computing with the formidable task of handling vast volumes of generated data. This trend has received substantial impetus from the Exascale Computing Project (ECP). In the field of Quantum Molecular Dynamics (QMD), the ECP has enabled the simulation of considerably larger systems over significantly longer timescales. The challenge now revolves around analyzing the generated data to extract crucial information, including, for instance, the identification of chemical reactions. When a reaction happens, electrons are reorganized across the molecule, which translates into an increase in the electronic current flowing through the chemical bonds. In this project, we will use the Liouville-von Neumann equation to compute the currents between bonds and detect reactions by monitoring sudden changes in current flow over the course of a QMD simulation. We will use the LATTE code [1] to perform QMD simulations. LATTE will be compiled with the BML [2] and PROGRESS [3] libraries, which enable the use of GPUs such as the new Nvidia A100 GPUs recently incorporated into LANL’s Darwin HPC cluster. The student will learn to develop scientific code using best software practices, including version control and regression testing. We will use this code to produce simulations and visualize chemical reactions. This project can lead quickly to a publication that could be written after the summer appointment. Some previous experience with HPC and Fortran programming is recommended, as well as a fair background in linear algebra.
[1] Bock, N., M. J. Cawkwell, J. D. Coe, A. Krishnapriyan, M. P. Kroonblawd, A. Lang, C. Liu, et al. 2008. “LATTE.” https://github.com/lanl/LATTE.
[2] Bock, Nicolas, Christian F. A. Negre, Susan M. Mniszewski, Jamaludin Mohd-Yusof, Bálint Aradi, Jean-Luc Fattebert, Daniel Osei-Kuffuor, Timothy C. Germann, and Anders M. N. Niklasson. 2018. “The Basic Matrix Library (BML) for Quantum Chemistry.” The Journal of Supercomputing 74 (11): 6201–19.
[3] openhub.net. n.d. “The QmdProgress Open Source Project on Open Hub.” Accessed September 2, 2019. https://www.openhub.net/p/qmdprogress.
Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentors:
Internship Coordinator:


No  ANLFadikar1  01/3/2024  1704258000000  Argonne National Laboratory (ANL)  Virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Surrogate models are increasingly used in calibrating complex and expensive epidemic simulations. Such surrogate models are used to estimate unknown parameters in the simulation model when only a limited number of simulation runs are possible. Surrogate models are generally statistical or machine learning models that aim to learn the input-output relationships from limited simulation outputs. A number of surrogate techniques exist; some focus on a specific set of aspects of the simulation, but all serve as faster alternatives to running the expensive simulation the many times needed to characterize model parameter spaces. In this project, we will investigate an AI-based surrogate technique called neural ordinary differential equations (NODEs) and apply it to an epidemic simulation setting. Since typical compartmental epidemic models can be expressed as a set of ordinary differential equations (ODEs), NODEs map naturally onto the underlying epidemic dynamics. Having an efficient surrogate model is particularly useful for large simulation models, such as the city-scale agent-based models (ABMs) that our research group at Argonne has developed. In particular, the CityCOVID ABM models all of the residents of Chicago (2.7 million individuals) going to and from specific geographic locations (1.2 million locations) on an hourly basis. For this project, we will use a simpler (yet highly detailed) mechanistic epidemic model to generate synthetic disease trajectories on which the NODE surrogate will be trained. Later, we will apply the same technique to the more expensive CityCOVID model. This will require tuning the NODE surrogate structure and learning parameters (so-called hyperparameters) to develop accurate time series emulation of the model’s epidemic trajectories.
Upon successful development of the NODE surrogate for the model, we will be able to apply it to develop fast model calibration approaches that exploit the computational efficiency of the NODE surrogate and its ability to provide derivatives. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Argonne National Laboratory (ANL) Internship location: Virtual Mentor:
Internship Coordinator:
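The project description above notes that compartmental epidemic models can be written as a set of ODEs. As a minimal sketch of that idea, the Python code below integrates a basic SIR model with a fixed-step fourth-order Runge-Kutta scheme to produce a synthetic trajectory of the kind a surrogate could be trained on. The SIR model, the parameter values, and the function names are illustrative assumptions and are not the mechanistic model used in the project; in a neural ODE, the hand-written right-hand side `sir_rhs` would be replaced by a trainable network.

```python
import numpy as np

def sir_rhs(state, beta, gamma):
    """Right-hand side of the SIR equations for state = (S, I, R)."""
    s, i, r = state
    population = s + i + r
    new_infections = beta * s * i / population
    recoveries = gamma * i
    return np.array([-new_infections, new_infections - recoveries, recoveries])

def rk4_trajectory(rhs, state0, dt, num_steps, **params):
    """Integrate d(state)/dt = rhs(state) with classical fixed-step RK4."""
    states = np.empty((num_steps + 1, len(state0)))
    states[0] = state0
    for k in range(num_steps):
        y = states[k]
        k1 = rhs(y, **params)
        k2 = rhs(y + 0.5 * dt * k1, **params)
        k3 = rhs(y + 0.5 * dt * k2, **params)
        k4 = rhs(y + dt * k3, **params)
        states[k + 1] = y + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return states

# Synthetic trajectory: 1000 people, 10 initially infected (illustrative values).
trajectory = rk4_trajectory(
    sir_rhs, np.array([990.0, 10.0, 0.0]),
    dt=0.1, num_steps=1000, beta=0.3, gamma=0.1)
print("peak number infected:", trajectory[:, 1].max())
```

Each row of `trajectory` holds (S, I, R) at one time step; sweeping `beta` and `gamma` would give a simple family of synthetic trajectories for training and testing a surrogate.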


No  LLNLTekriwal1  01/3/2024  1704258000000  Lawrence Livermore National Laboratory (LLNL)  Virtual  Computational Mathematics, Geometry 
Project Description:Mesh generation has always been a time-consuming and error-prone process. This is especially true in computational sciences, where meshes must be generated for three-dimensional geometries of various levels of complexity. Inspired by past program synthesis work in constructive geometry, we will use concepts from programming languages and formal logic to facilitate rule-based synthesis of automated mesh generation algorithms. Some of the goals of the project are:
* Formalize the mesh specifications and geometric characteristics of a mesh, including element size, element shape, and element orientation.
* Formalize geometric transformations of the mesh and the mesh quality enhancement techniques used by most production-level meshing software.
* Study state-of-the-art mesh generation techniques, such as advancing front methods and Delaunay triangulation, and develop a refinement rule-based synthesis of the corresponding algorithms.
* Formalize the refinement method used to go from specification to code, proofs of correctness, and assertions related to mesh quality in a state-of-the-art theorem prover such as Coq, Lean 4, or Isabelle/HOL.
This project will investigate the intersection of automated proof generation, numerical analysis, and computational geometry. Disciplines: Computational Mathematics, and Geometry Hosting Site:Lawrence Livermore National Laboratory (LLNL) Internship location: Virtual Mentors:
Internship Coordinator:


No  ANLMadireddy2  01/3/2024  1704258000000  Argonne National Laboratory (ANL)  Lemont, IL or virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:This project involves developing and evaluating robustness and uncertainty quantification methods for large language models (LLMs) and foundation models in scientific domains. The goal is to integrate probabilistic modeling techniques (e.g., conformal statistics, kernel attention) and advanced training procedures to enhance the reliability and controllability of these models in tasks related to climate modeling, chemistry, or cosmology. The work will also explore adversarial robustness, uncertainty quantification, and fine-tuning strategies to mitigate inherent vulnerabilities and sensitivities in these models, contributing to the emerging field of robust and reliable AI in scientific research. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Argonne National Laboratory (ANL) Internship location: Lemont, IL or virtual Mentor:
Internship Coordinator:


No  ANLMadireddy3  01/3/2024  1704258000000  Argonne National Laboratory (ANL)  Lemont, IL or virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:This project aims to leverage a novel unified Bayesian framework for network architecture search and ensembling, developed recently in our group, to build physics-preserved neural networks for coupled PDE-driven problems such as nuclear fusion and fission. By integrating sparse priors and innovative pruning strategies, the framework not only discovers optimal architectures but also quantifies uncertainty at the neural architecture level. The focus will be on using this approach to incorporate physics information from differential equations directly into the neural architecture, enhancing the model's performance and reliability. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Argonne National Laboratory (ANL) Internship location: Lemont, IL or virtual Mentor:
Internship Coordinator:


No  LBNLKirst1  01/3/2024  1704258000000  Lawrence Berkeley National Laboratory (LBNL)  Virtual  Applied Mathematics, Computational Mathematics, Mathematical Biology, Probability and Statistics 
Project Description:Social attachments play a fundamental role in most, if not all, levels of human interaction. Secure attachments promote resilience, where insecure or disrupted attachments increase the risk of multiple neuropsychiatric conditions. Despite its clinical significance, social attachment has been difficult to study because traditional model systems like mice do not display longterm social attachment behaviors. Prairie voles strongly engage in social behavior, and further allow to study genetic effects on the social repertoire, including mutations implicated e.g. in autism spectrum disorder (ASD). Understanding the relationship between the brain and behavior requires an integration of neurobiology and ethology in the study of naturalistic behaviors. Progress toward this goal requires the development of tools and analytical frameworks capable of measuring pose and describing pose dynamics. Recent advances in computational neuroethology have enabled highfidelity, highthroughput tracking of animal pose as well as the detection of recurrent behavior modules. However, these existing methods are geared toward the study of individual, rather than social, behaviors. Quantification of the social interactions would permit the identification of shared structures in behavior syllables and neural recording data, yielding hypotheses about the neural mechanisms that underlie behavior selection. In this project we aim (i) to develop advanced machine learning methods to extract and track animal postures from (single or multiview) video recordings of socially interacting animals and (ii) to develop novel algorithms to identify socially interacting motifs. In particular, we will design and train a attention based neuronal networks that track animal postures and skeletons over time by extending current deep object transformer neuronal networks which provide state of the art video object segmentation. 
A focus will be on extracting postures of interacting animals that frequently occlude each other and applying this to our data sets of socially interacting prairie voles. We will further develop machine learning methods based on inference of switching dynamical systems to detect and extract behavioral motifs from the tracked data. In particular, we will employ hierarchical Dirichlet process hidden Markov autoregressive modeling and extend existing approaches to the situation of multiple interacting agents. The tools developed here will have a larger range of applications in the analysis of interaction dynamics in complex systems science. Disciplines: Applied Mathematics, Computational Mathematics, Mathematical Biology, and Probability and Statistics Hosting Site:Lawrence Berkeley National Laboratory (LBNL) Internship location: Virtual Mentor:
Internship Coordinator:


No  NRELGraf1  01/4/2024  1704344400000  National Renewable Energy Laboratory (NREL)  Golden, CO or virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Although increasingly general models like GPT-4 have made remarkable strides to wide acclaim, subsequent research has revealed surprising limitations in their ability to perform basic logic, planning, and reasoning. This summer experience will provide a curious student the opportunity to explore neurosymbolic AI and its application to the design and control of energy systems. The student will be mentored by a team of NREL AI scientists to conduct and analyze approaches to AI that combine modern, state-of-the-art neural network-based methods with classical symbolic approaches. The research will involve integrating AI techniques, such as reinforcement learning, with classical mathematical optimization approaches and implementations, where symbolic and subsymbolic representations are freely intertwined, and learning is ubiquitous. Approaches inspired by cognitive science are especially welcome. Successful students will have creative problem-solving skills and interest in cross-disciplinary collaboration. Familiarity with mathematical and statistical foundations of both neural-based AI (statistical learning) and symbolic AI (reasoning) is a plus. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:National Renewable Energy Laboratory (NREL) Internship location: Golden, CO or virtual Mentors:
Internship Coordinator:


No  NRELEmami1  01/4/2024  1704344400000  National Renewable Energy Laboratory (NREL)  Golden, CO or virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Machine learning methods for time series play an important role across a spectrum of scientific applications, including renewable energy forecasting and studying the dynamics of complex nonlinear systems. The success of deep learning and the increasing availability of large datasets have created interest in global forecasting models. These models are pretrained on a collection of related time series and can be fine-tuned after pretraining on target time series. Current research challenges for global time series modeling are as follows: 1) related time series can have dynamic ranges that vary widely, even within a single dataset, 2) time series may be continuous, discrete, or mixed, and each may be distributed in arbitrarily complex ways. At the same time, probabilistic modeling over future scenarios remains difficult, yet is critical for many scientific applications. This summer project will engage a student passionate about deep learning, applied math/probability/stats, and energy/climate applications to design a new probabilistic approach for global time series forecasting. Learning opportunities can include (but are not limited to): 1) implementing and evaluating existing advanced probabilistic deep learning methods for global time series forecasting (e.g., C2FAR (Bergsma et al., 2022)), 2) integrating these approaches into an existing benchmark for generalizable building energy forecasting (BuildingsBench: Emami et al., 2023), and 3) proposing a novel method to be written up in a report (possibly for publication). Familiarity with PyTorch is a prerequisite for success. The student will be paired with an NREL machine learning scientist as a mentor to guide them along the way. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:National Renewable Energy Laboratory (NREL) Internship location: Golden, CO or virtual Mentors:
Internship Coordinator:


No  LANLAiken1  01/4/2024  1704344400000  Los Alamos National Laboratory (LANL)  Los Alamos, NM or virtual  Applied Mathematics, Probability and Statistics 
Project Description:Aerosols are small particles (<~10 microns in diameter) that are suspended in a gas, like the Earth’s atmosphere. These particles are central to understanding the water cycle and transport of nutrients within the Earth System. A complete understanding cannot be provided without ground-based and vertically resolved observations, particularly for aerosol-cloud interactions. The goal of this study is to analyze diverse and large datasets collected by the U.S. DOE Atmospheric Radiation Measurement (ARM) mobile facility during campaigns that we have deployed instrumentation to and are actively involved in with funding from the Atmospheric System Research (ASR). Two examples include the Surface Atmosphere Integrated Field Laboratory (SAIL) campaign in Colorado, focused on aerosol impacts on mountain hydrology, and the Eastern Pacific Cloud Aerosol Precipitation Experiment (EPCAPE) in California, focused on aerosol-cloud interactions. Data include but are not limited to those collected by the ARM, including the Aerosol Observing System (AOS) and the tethered balloon system (TBS), as well as those deployed by LANL with complex 2D and 3D datasets of size-resolved and single-particle chemical composition, such as those collected using online and high-time-resolution aerosol mass spectrometry. Specific aims are directed at answering key aerosol process science objectives related to identifying different aerosol regimes, the processes controlling their lifecycles, quantifying impacts on the radiative budget, and the sensitivity of cloud phase and precipitation to cloud condensation nuclei (CCN) and ice-nucleating particle (INP) concentrations. The intern will learn the mathematical modeling techniques using computer simulations as well as biological knowledge about the processes of respiratory virus infection.
In addition, the intern will have the opportunity to develop data science skills through analyzing data from experimental collaborators and develop machine learning models that will be trained on both simulated datasets and experimental datasets to make biologically relevant predictions. This internship will be ideal for a candidate who is interested in developing data science skills to address biological and health science questions. Disciplines: Applied Mathematics, and Probability and Statistics Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM or virtual Mentor:
Internship Coordinator:


No  ORNLYuan1  11/17/2023  1700197200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Adaptive signal timing at signalized intersections crucially relies on precise vehicle arrival forecasting. In this project, we present a cutting-edge Physics-Informed Language Model (PILM) specifically formulated to improve these predictions. The PILM harnesses observable patterns from traffic data across upstream and downstream detectors, fusing them with the immutable laws of vehicle conservation and sophisticated language processing algorithms that decode both temporal and spatial aspects of traffic sequences. A novel tokenization approach is proposed to tokenize the vehicle arrivals by time along the prediction horizon. The model dynamically incorporates signal timing and queuing theory to account for the complex interactions at intersections. Our model adopts a data-oriented paradigm, allowing it to internalize and anticipate spatiotemporal vehicular arrival patterns, while its physics-aware framework guarantees conformity with tangible traffic movements. Through extensive data modeling and simulations, we demonstrate that our model outperforms existing prediction methods in accuracy and robustness, especially during peak traffic conditions and incidents that disrupt normal flow patterns. The PILM’s predictive insights are critical for intelligent traffic management systems, facilitating improved adaptive traffic signal timing, reducing congestion, and enhancing road safety. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentors:
Internship Coordinator:


No  ORNLHauck3  01/4/2024  1704344400000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Mathematical Biology 
Project Description:This project is to explore different graph neural networks and message passing algorithms for functional annotation of protein residues from protein three-dimensional (3D) structures. Proteins are polymers of length N built from 20 different residue types, where N ranges from tens to thousands of residues. Different arrangements of the 20 residues along the protein sequence affect the 3D structure and function. Engineering a protein for a specific function requires altering residue types at certain positions along the sequence. Identifying the positions and residue types to change is a combinatorial problem where the search space is prohibitively large, with 20^N possible choices. A solution to address this challenge is to accurately predict function at the residue level so that the search space is narrowed down to specific residue targets that contribute the most to the function of interest. Graph neural networks have been used to represent protein 3D structures, where atoms of residues are nodes and connections between atoms are edges. However, to use graphs for function prediction, it is necessary for graphs to capture relevant physicochemical properties of residues and residue-residue interactions. Scientifically, it remains unclear what the relevant residue properties and interactions are, and mathematically, it is unclear which graph models are suitable for such representations. To this end, the student will develop and apply different graph neural networks and message passing algorithms to study the protein structure-function relationship at the residue level. The student will use a dataset comprised of human proteins involved in three biological functions as a benchmark. The goal is to understand how different graph neural networks and message passing algorithms affect the resulting residue function prediction and to identify key residue properties and interactions for the three functions of interest.
Learning objectives: 1) Learn the impact of different graph neural networks and message passing algorithms on functional annotation of protein residues by participating in model implementation and performance analysis 2) Learn the relationship between protein structure and function 3) Meet staff, present research findings, and practice scientific communication skills with a broad audience Disciplines: Applied Mathematics, and Mathematical Biology Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentors:
Internship Coordinator:
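
To give a sense of scale for the combinatorial search space described in this project, the following minimal Python sketch counts the sequences obtained by choosing one of 20 residue types at each of N positions. The function names are illustrative assumptions and are not part of the project.

```python
import math

# Number of distinct residue types, as stated in the project description.
NUM_RESIDUE_TYPES = 20


def sequence_space_size(n_positions: int) -> int:
    """Exact number of sequences of the given length (20 choices per position)."""
    return NUM_RESIDUE_TYPES ** n_positions


def log10_sequence_space_size(n_positions: int) -> float:
    """Base-10 logarithm of the sequence count, convenient for large lengths."""
    return n_positions * math.log10(NUM_RESIDUE_TYPES)


if __name__ == "__main__":
    # Hypothetical sequence lengths, chosen only to show how fast the count grows.
    for n in (2, 10, 100):
        print(n, f"about 10^{log10_sequence_space_size(n):.1f} sequences")
```

Even a short protein of 100 residues gives a count on the order of 10^130, which is why narrowing the search to a few residue targets matters.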


No  NRELJain1  01/4/2024  1704344400000  National Renewable Energy Laboratory (NREL)  Golden, CO or virtual  Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics, Topology 
Project Description:Transformers are neural networks that use attention mechanisms to capture long-range dependencies and context in sequential data. They have revolutionized natural language processing, computer vision, and other domains that deal with complex and high-dimensional data. Power systems are also dynamic and nonlinear systems that require efficient and robust methods for analysis and control. This project will involve utilizing the Transformer architecture and adapting its applications and mathematical foundations for power system applications, such as short-term voltage stability assessment, renewable energy integration, and fault detection. Disciplines: Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics, and Topology Hosting Site:National Renewable Energy Laboratory (NREL) Internship location: Golden, CO or virtual Mentor:


No  NRELJain2  01/4/2024  1704344400000  National Renewable Energy Laboratory (NREL)  Golden, CO or virtual  Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics, Topology 
Project Description:Large Language Models (LLMs) are powerful neural networks that can process natural language and generate meaningful text. However, how LLMs make the correlations between mathematical computations and the topological information in graph networks is still an open question. Graph networks can represent entities and their relationships as nodes and edges, respectively, and are widely used to model complex systems and phenomena in various clean energy domains. This project will involve learning about LLMs and their applications, mathematical foundations, and graph algorithms. It will also explore the methods and prospects of integrating LLMs with graph networks, such as using LLMs to enhance node attributes, perform graph reasoning, and generate graph structures. Disciplines: Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics, and Topology Hosting Site:National Renewable Energy Laboratory (NREL) Internship location: Golden, CO or virtual Mentor:


No  ORNLSchotthoefer1  01/4/2024  1704344400000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Computational Mathematics 
Project Description:This project is to explore acceleration strategies for low-rank training of neural networks. From a mathematical perspective, training neural networks is a large-scale optimization problem, involving billions of degrees of freedom. Training of neural networks is usually performed in PyTorch or similar Python APIs. Low-rank compression and training strategies aim to find low-rank representations of a neural network at hand to enable memory- and runtime-efficient training procedures. Here the neural networks are decomposed in an SVD-like fashion. Training, i.e., updating the parameters with a numerical optimizer, is performed only on the factors of the low-rank matrix representation. One key aspect of low-rank compression is the augmentation and truncation of the basis that is used to represent the low-rank network. Numerically, this requires the repeated evaluation of linear algebra routines such as QR decompositions and SVDs. Although many aspects of neural network training with PyTorch can be efficiently performed on a GPU, the implementations of QR and SVD algorithms are lacking in terms of wall-time efficiency. An efficient implementation of these linear algebra routines is an important aspect of low-rank training that determines the broad applicability of the method. To this end, the student will assess the currently available standard implementations of linear algebra routines in PyTorch on modern general-purpose GPUs. The student will perform a broad runtime comparison for different low-rank neural network architectures and hardware configurations. After mapping the landscape of available methods, the student will integrate the best-performing packages into the codebase for dynamical low-rank training. The goal is to gain an understanding of high-performance neural network training at scale and contribute to the advancement of memory- and wall-time-efficient machine learning.
Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:
Internship Coordinator:
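
As a rough illustration of why training only the factors of an SVD-like representation can save memory, the sketch below compares parameter counts for a dense m x n weight matrix and for a rank-r factorization U S V^T. The layer shape, the rank, and the function names are assumptions chosen for illustration; the project's actual codebase may organize the factors differently.

```python
def dense_param_count(m: int, n: int) -> int:
    """Parameters in a full m x n weight matrix."""
    return m * n


def low_rank_param_count(m: int, n: int, r: int) -> int:
    """Parameters in U (m x r), S (r x r), and V (n x r) of a rank-r factorization."""
    if not 0 < r <= min(m, n):
        raise ValueError("rank must satisfy 0 < r <= min(m, n)")
    return m * r + r * r + n * r


def compression_ratio(m: int, n: int, r: int) -> float:
    """How many times fewer parameters the factorized layer stores."""
    return dense_param_count(m, n) / low_rank_param_count(m, n, r)


if __name__ == "__main__":
    # Hypothetical layer size and rank, not taken from the project.
    m, n, r = 1024, 1024, 16
    print(dense_param_count(m, n), low_rank_param_count(m, n, r))
    print(f"compression ratio: {compression_ratio(m, n, r):.2f}")
```

The saving comes at the price of the repeated QR and SVD calls mentioned in the description, since the basis must be augmented and truncated as training proceeds.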


No  ORNLTabassum2  11/27/2023  1701061200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Analysis, Applied Mathematics 
Project Description:Predicting alloy chemical properties given a set of material characteristics and environmental conditions like temperature is a challenging task due to noisy data, missing features, and multimodality. An experimental approach to evaluating such alloys is expensive and highly inefficient. Generative AI can aid domain scientists in inverse design by predicting feasible chemical properties under unseen or partially observed material characteristics and environmental conditions. This project aims to (i) explore and implement a transformer architecture, (ii) learn techniques to tackle missing and noisy data samples, and (iii) explore data-driven architectures for multimodal features. Student responsibilities: The student is expected to be comfortable with PyTorch and Python data analysis and visualization libraries such as NumPy, pandas, Matplotlib, and scikit-learn. The student should have some foundational knowledge of deep learning. Having experience with transformer architectures or multimodal data is a plus, but not necessary. Learning Objectives:
Disciplines: Analysis, and Applied Mathematics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentors:
Internship Coordinator:


No  LBNLYe1  01/4/2024  1704344400000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA or virtual  Applied Mathematics, Computational Mathematics 
Project Description:In recent years, there have been several works investigating the use of quantized tensor networks (QTNs) to approximately solve nonlinear partial differential equations (PDEs) with significantly reduced computational cost. Example PDEs include the Navier-Stokes and the Vlasov equations; solving these equations using traditional numerical methods is often impractical or even infeasible due to extremely large computational costs. While the initial investigations appear quite promising, further work is needed for practical usage. For example, current works typically consider box-like geometries, though many problems of interest may be better suited for a cylindrical or spherical coordinate system. However, these coordinate systems require specialized methods to treat the singularity that appears in the grid. This project will involve investigating how these specialized methods can be translated into the QTN framework, and then developing QTN-specific algorithms that offer similar performance but with reduced cost. Students will have the opportunity to become familiar with both traditional numerical simulation algorithms and the growing field of quantized tensor network algorithms. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Lawrence Berkeley National Laboratory (LBNL) Internship location: Berkeley, CA or virtual Mentor:


No  PNNLKay1  01/3/2024  1704258000000  Pacific Northwest National Laboratory (PNNL)  Virtual  Combinatorics, Probability and Statistics 
Project Description:Introduced by Claude Shannon in the 1940s, entropy is a tool that measures the uncertainty in a given random variable. Specifically, the entropy of a random variable is the expected number of bits one would need to efficiently convey the outcome. In this project, we will apply entropy to the study of radio frequency signals. In particular, we will look at data sets consisting of transmitted signals and study how much of their information content can be revealed by entropic methods. This project will have a theory aspect (understanding entropy, information theory, and general analytical frameworks) as well as an applied aspect (data manipulation, researching scientific data, and data-driven experimentation). No background in entropy or radio frequency will be required; a general familiarity with programming and mathematical reasoning are the only prerequisites. Disciplines: Combinatorics, and Probability and Statistics Hosting Site:Pacific Northwest National Laboratory (PNNL) Internship location: Virtual Mentor:
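
For readers unfamiliar with the definition used in this description, the following minimal Python sketch estimates Shannon entropy in bits from the empirical symbol frequencies of a sequence. The function name and the toy inputs are assumptions for illustration only; real radio frequency data would first need to be discretized into symbols, which this sketch does not address.

```python
import math
from collections import Counter


def shannon_entropy(symbols) -> float:
    """Empirical Shannon entropy, in bits, of a sequence of hashable symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    # H = -sum_x p(x) * log2 p(x), with p(x) estimated by relative frequency.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Toy sequences: a constant signal carries no uncertainty, while a
    # sequence with two equally frequent symbols needs one bit per symbol.
    print(shannon_entropy("aaaa"))
    print(shannon_entropy("aabb"))
```

A sequence drawn uniformly from four symbols gives two bits per symbol under this estimate, matching the "expected number of bits" reading of entropy.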


No  LBNLMorrow1  01/4/2024  1704344400000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA or virtual  Applied Mathematics, Probability and Statistics 
Project Description:Obstructive sleep apnea (OSA) affects 24% of all Veterans or 1 in every 15 Americans and is associated with increased risk for developing cardiovascular and metabolic comorbidities such as heart disease, stroke, hypertension, and type 2 diabetes mellitus. A vast number of patients are already experiencing at least one of these comorbidities by the time they are diagnosed with OSA, suggesting that OSA is being diagnosed well after the onset. OSA shares many symptoms with other diagnoses such as depression, increasing the diagnosis gap. With the use of machine learning and natural language processing, as well as statistical modeling and deep learning, we can better understand and identify OSA and target patients for treatment closer to the onset, reducing the likelihood of patients developing comorbidities. Applicants will have the opportunity to work with many clinicians and subject matter experts. Applicants will have the opportunity to use Perlmutter, one of the largest supercomputers in the world. Applicants will have the opportunity to learn, explore, and deploy large language models, predictive models, and more on healthcare data. Disciplines: Applied Mathematics, and Probability and Statistics Hosting Site:Lawrence Berkeley National Laboratory (LBNL) Internship location: Berkeley, CA or virtual Mentor:


No  BNLUrban1  01/4/2024  1704344400000  Brookhaven National Laboratory (BNL)  Upton, NY or virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Many scientific machine learning (SciML) methods seek to quickly approximate the solutions of partial differential equation (PDE) systems. These include physics-informed neural networks, convolutional neural networks, graph neural networks, attention-based Transformer models, and neural PDEs. Generative models of the Transformer type, in particular, have revolutionized natural language processing (such as ChatGPT), and by now several "foundation" models exist for scientific applications such as climate/weather, biomedicine, materials science, etc. However, while these data-driven models may serve as useful predictors or proxies for numerical PDE simulation, it is not yet clear what they learn about physical systems or why. General results from the "explainable AI" (XAI) field have provided means to study how input features influence outputs (e.g., how the initial state is related to the predicted state); how individual neural network layers act on inputs to produce predictions; and even how Transformer layers learn to approximately perform iterative numerical algorithms such as first- or second-order optimization. This project will involve applying XAI techniques to generative SciML models trained on linear and/or nonlinear PDE solution data. One objective could be to relate properties of the trained ML model (such as its accuracy and precision) back to properties of the underlying dynamical system, such as its eigenspectrum, stiffness, degree and type of nonlinearity, symmetry and conservation laws, etc. Another could be to study the error of the neural network's predictions as they relate to the amount and type of training data provided. The project will provide opportunities to train neural networks, analyze their mathematical structure and properties, as well as to use numerical PDE solvers and analyze the behavior of dynamical systems.
Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Brookhaven National Laboratory (BNL) Internship location: Upton, NY or virtual Mentors:
Internship Coordinator:


No  BNLUrban2  01/4/2024  1704344400000  Brookhaven National Laboratory (BNL)  Upton, NY or virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Large-scale simulation models are pervasive throughout science, used in climate, materials science, nuclear physics, quantum chemistry, etc. Because these simulations often require large amounts of computation time, scientific machine learning (SciML) methods have been developed to create AI-based "surrogate" models that make fast, approximate predictions of physical systems. These ML surrogates are trained on simulation output and learn to reproduce it under new input settings (initial conditions, boundary conditions, parameter values). However, once trained, the ML surrogate may not make further use of simulation data. Another approach, known as active learning, involves selecting new simulations to perform and then retraining the ML surrogate. An alternative that has not yet been widely explored is whether the ML surrogate can be given access to a simulator that it can invoke on demand, to generate limited amounts of new data, each time it makes predictions. The new simulation data would become additional inputs (predictors or covariates) to the trained surrogate, rather than a source of data to retrain the model. This would allow the ML model to generate, via physical simulation, additional information about the physical system that is expected to most improve its predictions. This project would explore whether additional ML-generated data of this form are useful in physical predictions, and how an ML model can learn a policy that controls which new simulation information is requested for a given prediction. The project would involve the opportunity to develop and train SciML surrogate models, integrate them with physical simulations, and analyze and validate their predictions. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Brookhaven National Laboratory (BNL) Internship location: Upton, NY or virtual Mentors:
Internship Coordinator:


Yes  ORNLBall1  01/4/2024  1704344400000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Computational Mathematics, Probability and Statistics 
U.S. Citizenship is a requirement for this internship. Project Description:Researchers at ORNL use chemical vapor deposition (CVD) to produce TRistructural ISOtropic particle (TRISO) fuel. There are numerous variables in the CVD process that affect the uniformity of the TRISO fuel surface. This project will focus on developing statistical, machine learning, and reduced order models to describe the relationship between the variables in the CVD process and the uniformity of TRISO fuel and how we can use these models to optimize the fuel surface. The student is required to have a strong background in mathematical/statistical/machine learning modeling. The student will learn to apply mathematical and statistical modeling knowledge to an engineering process by studying the existing data and building models, and will learn new optimization methods by using the models as surrogates. The student will help produce a code that will act as a library of mathematical/statistical/machine learning models to be applied to datasets to build surrogate models. The project will be summarized in a section of a technical report and should be presented in talks. Events and opportunities within our directorate or division that are not specifically related to this internship project, but that the student will be able to participate in and learn from, include seminars, lunch and learns, and crosscut forums. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:
Internship Coordinator:


No  NRELdeFrahan1  01/4/2024  1704344400000  National Renewable Energy Laboratory (NREL)  Golden, CO or virtual  Applied Mathematics, Computational Mathematics 
Project Description:As renewable energy systems, from mobility to human behavior models, become larger and more interconnected, with agents’ actions triggering a cascade of reactions nearly instantaneously, there is an increasingly urgent need for a method to simulate event-driven complex systems on high performance computing resources. Parallel discrete event simulations (PDES) are used to model a wide range of complex systems. NREL’s High Performance Algorithms and Complex Fluids (HPACF) Group is particularly interested in the use of PDES for solving large-scale event-driven renewable energy systems. These systems include mobility and transport technology, the electrical grid, and models of material growth (e.g., crystals in chemical vapor deposition reactors). The objective of this research is to develop a scalable engine for PDES by investigating novel numerical methods and algorithms suitable for heterogeneous computing architectures, e.g., Graphics Processing Units (GPUs), in high performance computing systems. This project will build on ongoing work to accelerate solutions in PDES. The focus of this research will be on the use of GPUs to solve discrete event systems. The student will build a fundamental understanding of PDES algorithms and how these can be modified to target new computing architectures. Research will be tied to actual renewable energy technology research, including applications to material growth and mobility. This is a relatively open field with significant opportunities for future research and publication. The intern can expect to build skills in the following areas: 1) High-performance computing. 2) Algorithm development for PDES. 3) Applications of PDES to renewable energy technology. Interns in HPACF generally are encouraged to present internally and externally and submit results for peer review.
Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:National Renewable Energy Laboratory (NREL) Internship location: Golden, CO or virtual Mentor:
Internship Coordinator:


No  NISTKusne1  01/4/2024  1704344400000  National Institute of Standards and Technology (NIST)  Gaithersburg, MD or virtual  Applied Mathematics, Probability and Statistics 
Project Description:Autonomous experiment design comes at a high computational cost. In this project the intern will investigate the mathematical equivalence or difference between experiment design strategies. The goal is to write a paper providing the math of the equivalence or difference between strategies and then demonstrate using a real autonomous application. Disciplines: Applied Mathematics, and Probability and Statistics Hosting Site:National Institute of Standards and Technology (NIST) Internship location: Gaithersburg, MD or virtual Mentor:


No  ORNLOsorio1  01/4/2024  1704344400000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Mathematical Biology, Probability and Statistics 
Project Description:Resilience refers to the ability of a system to maintain a certain degree of well-being in the face of disturbances. Complex systems, natural or engineered, are resilient if they possess feedback mechanisms of control that allow them to adapt after disturbances. Understanding the properties of those adaptation mechanisms is a key problem in many scientific areas in engineering, earth and social sciences. Resilience in complex systems can be mathematically studied within the theory of random dynamical systems. Specifically, we are interested in the probabilistic response of high-dimensional noisy dynamical systems to exogenous perturbations. Here, randomness is included to account for the presence of uncertainty in initial conditions and/or model parameterizations. The goal is to study models of how systems can adapt dynamically in order to steer the probability distribution of their state away from states of low well-being, functionality, or productivity. In this project, we will model and analyze complex systems through a variety of mathematical tools. These include differential equations, probability, stochastic processes, and network theory. The student can expect to be exposed to and learn theoretical, applied, and computational aspects of each of these subjects. Specific areas of application can be decided according to the student’s background and interests. However, there are some subjects in which we already have ongoing projects: 1) Biology: resilience of photosynthesis pathways to temperature changes. 2) Ecology: resilience of trophic networks to changes in species variability or network topology. 3) Energy: resilience in electrical grids through improved power restoration protocols. Disciplines: Applied Mathematics, Mathematical Biology, and Probability and Statistics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:
Internship Coordinator:


No  ORNLLaiu3  01/4/2024  1704344400000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Analysis, Applied Mathematics, Computational Mathematics 
Project Description:This project aims to investigate the use of physics-informed surrogate models for partial differential equations (PDEs) in computational fluid dynamics (CFD) applications. We will explore various strategies to incorporate adjoint information from PDE solvers in surrogate model training and develop optimization schemes that leverage adjoint information from surrogate models to accelerate PDE-constrained optimizations for CFD applications. In this project, we will learn about standard physics-informed surrogate modeling approaches for PDEs, basics of PDE solvers for CFD applications, and adjoint-based PDE-constrained optimization strategies. We will analyze the performance and efficiency of the developed surrogate-based optimization scheme using numerical analysis tools. Disciplines: Analysis, Applied Mathematics, and Computational Mathematics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:
Internship Coordinator:


No  ORNLRestrepo1  01/4/2024  1704344400000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN or virtual  Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Neural networks have complex connections, a variety of architectures, and tunable parameters. The capacity of these networks to find nontrivial connections between inputs and outputs, or to accomplish complex tasks, is attributed to their complex structure. This complexity delivers remarkable fidelity, but often at the expense of understanding. Moreover, the results are sometimes very fragile, often dependent on the richness of the training set. For certain and very common networks, one can frame the process of tuning parameters and of training as a dynamic learning problem of complex interacting agents. There is an existing framework for understanding systems of this sort, namely, statistical physics. In this project we will apply ideas from equilibrium and nonequilibrium statistical physics to tune simple networks and endow them with some degree of explainability. The project combines theory, computation, and numerical experimentation. The essential skills required to participate in this research project are probability (there is no requirement for proficiency in measure theory), linear algebra, and some expertise in coding in Python, Julia, or MATLAB. The research performed by the student will eventually be reported in peer-reviewed scientific journals. Disciplines: Analysis, Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN or virtual Mentor:
Internship Coordinator:


No  BNLReyes1  01/3/2024  1704258000000  Brookhaven National Laboratory (BNL)  Upton, NY or virtual  Probability and Statistics 
Project Description:In this project, the student will explore methods to fuse a heterogeneous set of materials science data using Bayesian statistics. The student will implement specific prior distributions using kernel-based assumptions and perform sample-based inference on marginal distributions. The student will implement models using standard Python libraries and Python-based Markov chain Monte Carlo (MCMC) packages. Students will explore the effectiveness of various kernels and MCMC parameters, applying their developed tools to data from the Materials Project database. When developing such AI models for highly complex real-world systems and problems, major concerns include the requirement of enormous training data as well as the extremely demanding computational resources needed to train and run them. This project aims to better understand such AI models, using GraphCast as a representative, so that we can further improve data and computational efficiency for complex system modeling. We will develop scientific AI/ML techniques that integrate physics first principles into AI foundation models to enable data-efficient spatiotemporal dynamic modeling. We also aim to equip these models with objective-driven uncertainty quantification (UQ) in a Bayesian paradigm to develop theories and algorithms that help understand the limitations of AI foundation models for complex system modeling and ultimately lead to an effective uncertainty-aware learning procedure for surrogates of complex systems. Specific research topics of interest include effective strategies for integrating ODE/PDE/mechanistic models with data-driven models and developing corresponding Bayesian inference to account for potential prediction uncertainty due to inherent data and model uncertainty. 
Potential applications of this methodology will be discussed with the student and can focus on multiple science and engineering applications, including climate/weather forecasting, as well as dynamic modeling in other natural and human-engineered systems. Disciplines: Probability and Statistics Hosting Site:Brookhaven National Laboratory (BNL) Internship location: Upton, NY or virtual Mentor:
Internship Coordinator:


No  LANLSingh1  01/4/2024  1704344400000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Analysis, Applied Mathematics, Biometrics and Biostatistics, Computational Mathematics, Geometry, Mathematics (General), Operations Research, Probability and Statistics, Topology 
Project Description:Crystal shapes play a pivotal role in determining the properties of energetic materials, impacting their performance and safety [1, 2]. Various theoretical methods, such as the attachment energy model and kinetic Monte Carlo simulations, exist for predicting crystal shapes [3, 4]. However, discerning whether the predicted shape is at equilibrium remains a challenge. This project aims to address this gap by obtaining the crystal shape of an energetic material, pentaerythritol tetranitrate (PETN), at equilibrium. The use of free energy methods, such as umbrella sampling (US), in molecular simulations is powerful for obtaining the equilibrium state of a system [5]. In this project, we will perform biased molecular dynamics (MD) simulations using the US method to determine the equilibrium shape of the crystal. The student undertaking this internship will have the opportunity to develop essential skills in computational chemistry, including thermodynamics, classical mechanics, and statistical mechanics. The student will learn to perform molecular simulations on high-performance computers. These skills will not only be valuable for this project but will also lay the groundwork for future research endeavors in computational chemistry. The student is expected to have an interest in physics and coding skills in Python, C++, or Fortran.
[1] Handley, C. A., et al. "Understanding the shock and detonation response of high explosives at the continuum and meso scales." Applied Physics Reviews 5.1 (2018).
[2] Perry, W. Lee, et al. "Relating microstructure, temperature, and chemistry to explosive ignition and shock sensitivity." Combustion and Flame 190 (2018): 171-176.
[3] Ibrahim, S. Fatimah, et al. "Prediction of the mechanical deformation properties of organic crystals based upon their crystallographic structures: case studies of pentaerythritol and pentaerythritol tetranitrate." Pharmaceutical Research 39.12 (2022): 3063-3078.
[4] Zepeda-Ruiz, Luis A., et al. "Size and habit evolution of PETN crystals: a lattice Monte Carlo study." Journal of Crystal Growth 291.2 (2006): 461-467.
[5] Singh, Himanshu, et al. "Determination of equilibrium adsorbed morphologies of surfactants at metal-water interfaces using a modified umbrella sampling-based methodology." Journal of Chemical Theory and Computation 18.4 (2022): 2513-2520.
Disciplines: Analysis, Applied Mathematics, Biometrics and Biostatistics, Computational Mathematics, Geometry, Mathematics (General), Operations Research, Probability and Statistics, and Topology Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentors:
Internship Coordinator:


No  ORNLKotevska2  12/7/2023  1701925200000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Analysis, Applied Mathematics, Computational Mathematics, Mathematics (General), Probability and Statistics 
Project Description:Graph Neural Networks (GNNs) have gained significant attention owing to their ability to handle graph-structured data and the improvements they bring to practical applications. However, many of these models prioritize utility metrics, such as accuracy, with little consideration for privacy, a significant concern in modern society where privacy attacks are rampant. In this project, we want to focus on privacy-preservation techniques. With guidance from a mentor, the student will help develop new defense mechanisms and design a comparison study against state-of-the-art methods. The student will learn about privacy-preservation algorithms, graph neural networks, numerical analysis, and writing/presentation skills.
Disciplines: Analysis, Applied Mathematics, Computational Mathematics, Mathematics (General), and Probability and Statistics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:
Internship Coordinator:


No  LLNLDzanic1  01/4/2024  1704344400000  Lawrence Livermore National Laboratory (LLNL)  Livermore, CA or virtual  Computational Mathematics 
Project Description:Standard computational fluid dynamics approaches have widely relied on solving the Navier-Stokes equations governing fluid flow. These approaches assume that the fluid can be treated as a continuum, which can be ill-posed for complex fluid flows such as rarefied gases and hypersonic aeronautics. In such cases, it becomes necessary to revert to the governing equations of molecular gas dynamics which underpin the macroscopic behavior of the fluid. One such example, the Boltzmann equation, provides a statistical description of particle transport and collision which can seamlessly recover the hydrodynamic equations in the continuum limit while offering a more detailed description of fluid flow in the nonequilibrium regime. However, its high-dimensional nature drastically increases the associated computational cost of solving complex fluid flows. The goal of this project is to design and apply robust and efficient numerical schemes for solving the Boltzmann equation, namely high-order unstructured finite element methods which retain structure-preserving properties such as positivity of probability density and the conservation of macroscopic quantities. The student’s research will use an established codebase on projects such as developing discretely-conservative particle velocity domain discretizations and collision models, nonlinear limiting and shock capturing methods, and adaptive mesh refinement techniques for the velocity domain. Further project ideas include the application of these techniques to complex three-dimensional flows including hypersonic reentry vehicles and fundamental fluid flow phenomena such as transition to turbulence, developing data reduction methods and analysis for high-dimensional phase space representations of fluid flow instabilities, and any potential self-proposed project ideas by the student. Students should have basic familiarity with high-order finite element methods, scientific computing, and computational fluid dynamics. 
Familiarity with molecular gas dynamics/Boltzmann equation is preferred but not necessary. Disciplines: Computational Mathematics Hosting Site:Lawrence Livermore National Laboratory (LLNL) Internship location: Livermore, CA or virtual Mentor:
Internship Coordinator:


No  LBNLPerciano1  01/4/2024  1704344400000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA or virtual  Applied Mathematics, Computational Mathematics 
Project Description:Our goal is to ignite research on quantum data analysis, quantum image processing, and quantum machine learning. We aim to develop innovative quantum data representations and quantum analysis methods applicable to different types of data by exploiting the recent progress in quantum information science (QIS), using quantum hardware and quantum simulators on high-performance computing (HPC) systems. We aim to optimize the new algorithms for current and future noisy intermediate-scale quantum (NISQ) devices, which are limited by low numbers of qubits with short decoherence times and by high gate error rates, both of which hinder the fidelity of quantum computations. The main goals of the project are to (1) develop and evaluate efficient quantum data representations and (2) develop implementations of quantum data analysis and quantum machine learning (ML) targeting essential data types (images, sequences, and time series). We will explore hybrid quantum/classical algorithms and study their suitability and efficiency for specific analysis tasks. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Lawrence Berkeley National Laboratory (LBNL) Internship location: Berkeley, CA or virtual Mentor:
Internship Coordinator:


No  LBNLPerciano2  01/4/2024  1704344400000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA or virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:A graphical model, also called a probabilistic graphical model (PGM) or structured probabilistic model, is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. Such models are commonly used in probability theory, statistics (particularly Bayesian statistics), and machine learning. Their main advantage is the ability to use prior information (physical constraints) related to the data. This becomes very important when analyzing scientific data. This project aims to develop efficient PGM-based algorithms to tackle problems such as classification, regression, image segmentation, image denoising, feature tracking, data reduction, and data fusion. We use mainly Markov Random Fields and Conditional Random Fields, and some of our approaches combine these methods with deep learning algorithms to solve problems in different scientific fields such as chemistry, material sciences, and biosciences. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Lawrence Berkeley National Laboratory (LBNL) Internship location: Berkeley, CA or virtual Mentor:
Internship Coordinator:


Yes  NETLGamwo1  01/4/2024  1704344400000  National Energy Technology Laboratory (NETL)  Virtual  Applied Mathematics 
U.S. Citizenship is a requirement for this internship. Project Description:Large quantities of CO2 will be transported over large distances from CO2 capture plants to underground storage sites. There are growing safety concerns across communities that face the prospect of more CO2 pipelines, primarily due to the accidental release of odorless, toxic CO2 into the atmosphere caused by cracks in the pipe or pipe failure. It is necessary to have a validated risk assessment model to predict the safety distance from the pipeline beyond which accidental releases cannot cause unacceptable risk to people. The objective of this project is to extensively review existing dispersion models of carbon dioxide release in case of a leak or rupture, then select the best model to deepen our knowledge through extensive simulations of a combination of several scenarios. These models are important for regulating the safe construction and operation of CO2 pipelines through populated areas and remote communities. Disciplines: Applied Mathematics Hosting Site:National Energy Technology Laboratory (NETL) Internship location: Virtual Mentor:
Internship Coordinator:


No  NRELMartin2  01/4/2024  1704344400000  National Renewable Energy Laboratory (NREL)  Golden, CO or virtual  Applied Mathematics, Computational Mathematics 
Project Description:Iron and steelmaking are complex processes, with reacting multiphase fluid flow and heat transfer occurring at extreme temperatures. The need to decarbonize manufacturing, which accounts for as large a share of carbon emissions in the US as electricity generation, has led to a renewed interest in understanding the physics of not just iron and steel manufacturing, but all manufacturing processes involving metals. Because of the complex physics and extreme temperatures, simulation is critical to understanding these systems. These simulations then allow the impact of process changes, such as substituting hydrogen for fossil fuels in iron ore reduction, to be understood. This, in turn, accelerates re-engineering these processes to make them more sustainable. NREL’s High Performance Algorithms and Complex Fluids (HPACF) Group is host to multiple projects simulating these processes, using both commercial and open-source software. The intern will have the chance to contribute to these projects by (1) adding complex physics to advanced simulations, (2) using these simulation tools to model the physics of sustainable manufacturing, and (3) creating reduced-order models that can be incorporated into design decisions. Research will be tied to operating conditions defined by industrial partners. This is an emerging research topic with significant opportunities for future investigation and publication. The intern can expect to build skills in the following areas: 1) High-performance computing. 2) Numerical analysis including computational fluid dynamics. 3) Energy systems modeling. Interns in HPACF are encouraged to present at both local and national conferences, and to submit their research to peer-reviewed journals. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:National Renewable Energy Laboratory (NREL) Internship location: Golden, CO or virtual Mentor:
Internship Coordinator:


No  LBNLRouson1  01/4/2024  1704344400000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA  Applied Mathematics 
Project Description:Training a neural network involves selecting numerous model parameters (e.g., hidden-layer width and depth, weight and bias initialization, activation function choice, and network connectivity) and training-configuration parameters (e.g., optimizer choice, learning rate, regularization, tensor normalizations, and mini-batch size). These choices can greatly impact the convergence behavior of the training process. We have developed the InferenceEngine deep learning library to support the incorporation of neural-network-based model surrogates into high-performance computing applications. InferenceEngine offers concurrent inference and the state-of-the-art Adam optimizer, and also provides a platform for studying language-based parallel and GPU programming in Fortran 2023. In our first target application, the Intermediate Complexity Atmospheric Research (ICAR) model, the cloud microphysics model that predicts precipitation occupies approximately 90% of the runtime. We have recently demonstrated convergence of the training process for a neural network that is now a candidate surrogate for ICAR's most widely used cloud microphysics model. This project will involve studying techniques for tuning the aforementioned model parameters and training parameters to improve the convergence rate of neural-network training. Disciplines: Applied Mathematics Hosting Site:Lawrence Berkeley National Laboratory (LBNL) Internship location: Berkeley, CA Mentor:
Internship Coordinator:


No  NRELGoldwyn1  01/4/2024  1704344400000  National Renewable Energy Laboratory (NREL)  Golden, CO or virtual  Applied Mathematics, Combinatorics, Computational Mathematics, Probability and Statistics 
Project Description:NREL’s Institutional Knowledge Graph project aims to organize institutional knowledge for collective scientific advancement. Until now, scientific knowledge has been decentralized across disparate databases, webpages, publications, and the minds of researchers. But with a graph containing the complex network of relationships between the projects, people, data, and tools that make up applied research at NREL, we can leverage what is known collectively to do better science. Organizing NREL’s knowledge is just the first step. As part of the MSGI, the applicant will contribute to the development of a knowledge recommendation system, ensuring that researchers across NREL have access to data, tools, methods, and people who could be useful to their efforts. The successful applicant will learn to understand and implement graph completion and graph question-answering algorithms using state-of-the-art machine learning on heterogeneous directed graphs (for example, graph neural networks and graph attention networks). The applicant may also have the opportunity to contribute to a journal publication on the scientific knowledge recommendation system. Disciplines: Applied Mathematics, Combinatorics, Computational Mathematics, and Probability and Statistics Hosting Site:National Renewable Energy Laboratory (NREL) Internship location: Golden, CO or virtual Mentors:
Internship Coordinator:


No  ORNLSchnake1  01/4/2024  1704344400000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Computational Mathematics 
Project Description:The main objective of this project is to develop algorithms for data reduction of solutions to high-dimensional dynamical systems. Specifically, under the direction of the mentor, the student will help design and implement hybrid methods for evolving low-rank approximations to solutions of partial differential equations (PDEs) including kinetic models. The student will learn modern techniques for evolving low-rank data in a dynamical system; determine physical regimes where solutions to kinetic problems are low-rank through numerical simulations; and become more familiar with PDE solvers and numerical discretizations for kinetic problems. The student will give a presentation of the project at the end of the appointment. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:
Internship Coordinator:


No  ANLMadireddy4  01/4/2024  1704344400000  Argonne National Laboratory (ANL)  Lemont, IL or virtual  Applied Mathematics, Operations Research, Probability and Statistics 
Project Description:Continual learning in machine learning refers to the ability of models to learn continuously from new data, adapting to new tasks while retaining previously learned information. These capabilities are key for applications in edge computing and the continual adaptation of language models without full retraining. However, such models seem to be very sensitive to the order in which the tasks arrive. The goal of this project is to develop robust optimization techniques to address the combinatorial challenges of task ordering through "Worst-Case Scenario Analysis" for optimal performance in challenging conditions, "Adversarial Training" to build resilience against unexpected task sequences, and "Uncertainty Modeling" to inform decision-making in dynamic environments. Disciplines: Applied Mathematics, Operations Research, and Probability and Statistics Hosting Site:Argonne National Laboratory (ANL) Internship location: Lemont, IL or virtual Mentors:
Internship Coordinator:


No  ORNLHauck2  01/4/2024  1704344400000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Combinatorics, Computational Mathematics 
Project Description:The goal of this project is to investigate the behavior of numerical algorithms under the assumption of statistical variations and uncertainty in the computational state. With the emergence of new microelectronic materials and new system architectures, it is expected that the state of a computer system will no longer be deterministic. Rather, the state may vary due to fluctuations in material properties or due to a lack of precision in the underlying algorithm. To address these challenges, resilient and probabilistic computational models will be required. In this project, the student will learn about probability, fundamental kernels in scientific computing, and basic numerical analysis. The student will collaborate with research staff in mathematics and scientific computing and have the opportunity to present their research at the end of the summer. Disciplines: Applied Mathematics, Combinatorics, and Computational Mathematics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:
Internship Coordinator:


No  ORNLXue2  01/4/2024  1704344400000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Algebra or Number Theory, Analysis, Computational Mathematics, Probability and Statistics, Topology 
Project Description:Topologically, electrical power grids can be treated as large-scale complex networks or graphs. Therefore, graph theory can be applied to analyze power grids and to address problems such as network decomposition, clustering, feeder routing, and dynamic microgrid formulation. The physical motivation is to enhance grid reliability, stability, and structural resilience by effective means. This project will introduce mathematics PhD students to power grid networks, survey and identify one of the most promising problems using directed graph theory, and perform a successful case study, e.g., decomposing a test grid with hundreds of nodes into multiple clusters based on certain criteria or features, using a programming language. This project will offer a learning experience for future applied "mathematicians" to explore power grid modeling and stability analysis problems, and the cutting-edge R&D in power grid modeling and controls. Disciplines: Algebra or Number Theory, Analysis, Computational Mathematics, Probability and Statistics, and Topology Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentor:
Internship Coordinator:


No  NISTIyer1  01/4/2024  1704344400000  National Institute of Standards and Technology (NIST)  Gaithersburg, MD or virtual  Analysis, Applied Mathematics, Biometrics and Biostatistics, Computational Mathematics, Geometry, Mathematics (General), Operations Research, Probability and Statistics, Topology 
Project Description:Forensic science's role in the legal system is pivotal. Yet NAS (2009) and PCAST (2016) found that forensic methods in some pattern comparison disciplines (such as fingerprints, firearms, and footwear impressions) lacked scientific validation. As a response to such criticisms, the forensic science community has been conducting studies, using scenarios in which the ground truth is known, in which forensic science practitioners and methods are tested in order to understand how well the systems perform. Such studies are generally called black-box studies. Analyses of data from black-box studies aimed at assessing accuracy and reproducibility in pattern comparison forensic disciplines face a challenge when participants who sign up for the study either drop out or fail to respond to some or all of the test items. Some Bayesian hierarchical "not missing at random" models, developed outside of the forensic science context, have recently received attention and been applied to analyze data from two recently conducted black-box studies. Results of these analyses suggest that the performance of forensic methods in pattern disciplines may be much worse than previously thought. The goal of this project is to address the strengths and weaknesses of these hierarchical Bayesian models with primary focus on parameter identifiability and the effect of statistical modeling assumptions on the results. This will allow us to evaluate the trustworthiness of the results obtained from applying such models. This will have significant impact on how the forensic science community will assess the reliability of various forensic pattern discipline methods. Disciplines: Analysis, Applied Mathematics, Biometrics and Biostatistics, Computational Mathematics, Geometry, Mathematics (General), Operations Research, Probability and Statistics, and Topology Hosting Site:National Institute of Standards and Technology (NIST) Internship location: Gaithersburg, MD or virtual Mentors:


No  LANLSkurikhin3  01/4/2024  1704344400000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Probability and Statistics 
Project Description:Foundation models (FMs) open new opportunities for generalization of neural networks across different data modalities. FMs are large-scale deep learning neural network models (e.g., vision transformers) that are trained on vast amounts of data and are expected to be tuned to a wide range of downstream tasks with relatively little additional task-specific training. Most FMs are currently restricted to two or three modalities such as text, RGB color images and video. Transfer of FMs to processing other data modalities is an exciting direction for development of multimodal FMs. This project will focus on adopting vision transformer models and transferring them to the interpretation of synthetic aperture radar (SAR) imagery. This is a fundamentally harder problem than optical imagery interpretation. SAR sensors can penetrate clouds and work in any illumination setting (day or night). As such, SAR image data could be particularly valuable in a disaster-response scenario where cloud cover and adverse conditions often limit the value of traditional optical imagery. The project will target development of a transformer model to extract objects from SAR imagery. In particular, the recently introduced SpaceNet 6 (Multi-Sensor All-Weather Mapping) challenge will provide a baseline and an open overhead imagery dataset acquired in two modalities over the same area, both quad-polarized X-band SAR and electro-optical imagery. The challenge is semantic segmentation, specifically extraction of building footprints. While model training can be done using both modalities, testing will have to be done using only SAR data. This is done to follow a real-world scenario, in which concurrent collection of electro-optical and SAR imagery is often not possible. The participant will advance their knowledge in computer vision, machine learning, remote sensing and statistics. 
They will take an active role in developing transformer model(s) and approaches to transfer them to SAR data interpretation. The participant is expected to implement modelling in Python, PyTorch, and other machine learning and statistical related packages. They will also evaluate transformer model(s) on real-world aerial and satellite image data, will learn and use the appropriate evaluation metrics, and identify remaining challenges and how they can potentially be addressed. By learning alongside a multidisciplinary team with expertise in statistics, machine learning, and remote sensing, they will gain expertise in technical communication and better appreciate team-based research efforts. Disciplines: Applied Mathematics, and Probability and Statistics Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentor:
Internship Coordinator:


Yes  LANLMONROE1*  01/18/2024  1705554000000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Algebra or Number Theory, Applied Mathematics, Mathematics (General) 
U.S. Citizenship is a requirement for this internship. Project Description:Inexact computing is any kind of computing where one does not get the exact numerical result. This can include approximate and probabilistic computation. This will be applicable to a wide range of post-Moore’s era architectures, because of reliability issues, potential power savings, increased resilience to faults and architectural changes. Some combination of general processors, general inexact processors and specialized inexact processors will have to be developed, as well as efficient ways to use them. LANL has an ongoing exploration of inexact computing techniques, with projects in a range of areas of inexact computing. We are exploring reduced precision, machine learning approaches, advanced error detection and correction methods and other techniques, and applying these to problems in computational mathematics, basic mathematics and computer science. The specific project we address with an NSF MSGI intern will depend on intern interests and background. Our current projects include:
We encourage publication of results. LANL has a wide range of compute systems, and students will have access to cutting-edge devices of interest. If on-site activity is possible at the time of the internship, the intern will sit in the Ultrascale Systems Research Center, which supports a wide range of research in computer science. We are happy to discuss the project in more detail upon request. For further information, please contact: Dr. Laura Monroe (lmonroe@lanl.gov). Disciplines: Algebra or Number Theory, Applied Mathematics, and Mathematics (General) Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentor:
Internship Coordinator:


Yes  LANLMONROE2  01/18/2024  1705554000000  Los Alamos National Laboratory (LANL)  Los Alamos, NM or virtual  Algebra or Number Theory, Combinatorics, Geometry 
U.S. Citizenship is a requirement for this internship Project Description:This project is a search for graph topologies that are suited to the new generation of emerging routers coming from industry. This is true cross-disciplinary work between mathematics and computer science, like that which took place in the 40s and 50s. We are addressing the degree-diameter problem of graph theory and applying our results to post-exascale computer networks, in collaboration with a major vendor. We have already used this approach to create PolarFly, a new family of diameter-2 topologies. This topology supports radixes suited to the new high-radix routers, asymptotically approaches the maximum number of nodes for the radix and diameter, exploits mathematical symmetries for modularity, and outperforms other networks in terms of scalability, cost and performance. This has resulted in a paper presented at a top-tier conference, another submitted to a top-tier conference, and a patent application. Diameter 2 is suited to smaller systems, but not exascale. In this work, we hope to introduce further mathematical advances to develop new diameter-3 topologies, which would position such networks to address exascale and post-exascale systems. We want to survey the wide area of graph theory to find graphs well suited to post-exascale networks. We hope that advances will result in further publications, and more importantly, we hope to influence a mathematical approach to network design over the next decade or more. Background: Photonic technology has improved greatly over the last few years. Recent advances in co-packaged optics make it possible to drive multiple terabytes per second out of a single socket. In addition, the photonic ecosystem is advancing rapidly, making co-packaged optics an excellent candidate for upcoming generations of post-exascale systems. 
The primary advantage of this technology is performance – speed-of-light latency for short- and long-reach communication, combined with an exponential growth of communication bandwidth. An important aspect of integrated optics is the level of connectivity: it is now possible to drive 32-64 optical connections out of a single high-radix device. This level of connectivity is a real advancement, but current network designs do not fully exploit this opportunity. Without advances in system design, these systems will not reach their potential. Such advances are especially needed in network topology and system design, which are still open areas of research. In particular, this calls for successful approaches to the degree-diameter problem, a classical but very open problem in graph theory. We propose to use mathematical graph theory and projective geometry to design very large and compact interconnection networks that are optimally tailored to this emerging technology. Disciplines: Algebra or Number Theory, Combinatorics, and Geometry Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM or virtual Mentor:
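As a rough illustration of the degree-diameter problem mentioned above, the following sketch computes the Moore bound, the classical upper limit on the number of nodes in a graph with a given maximum degree and diameter. The function names are hypothetical, and the comparison count q^2 + q + 1 at degree q + 1 (the size of a graph built on the points of a projective plane of order q) is an assumption chosen for illustration, not a figure taken from the project description.

```python
def moore_bound(degree: int, diameter: int) -> int:
    """Upper bound on the node count of a graph with the given
    maximum degree and diameter: 1 + d * sum_{i<k} (d - 1)^i."""
    if degree < 1 or diameter < 0:
        raise ValueError("degree must be >= 1 and diameter >= 0")
    return 1 + degree * sum((degree - 1) ** i for i in range(diameter))


def projective_plane_nodes(q: int) -> int:
    """Number of points in a projective plane of order q.
    Assumption for illustration: one node per point, degree q + 1."""
    return q * q + q + 1


def fraction_of_bound(q: int) -> float:
    """Ratio of the illustrative diameter-2 node count to the Moore bound."""
    return projective_plane_nodes(q) / moore_bound(q + 1, 2)


if __name__ == "__main__":
    for q in (7, 31, 63):
        print(q + 1, projective_plane_nodes(q), moore_bound(q + 1, 2),
              round(fraction_of_bound(q), 4))
```

Under these assumptions, a degree of 32 (q = 31) gives 993 nodes against a diameter-2 Moore bound of 1025, and the ratio tends to 1 as q grows. For diameter 3 the bound grows roughly as the cube of the degree, which is why diameter-3 constructions matter for larger systems.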


Yes  USACESava1  01/18/2024  1705554000000  U.S. Army Corps of Engineers, Geospatial Research Laboratory (GRL)  Analysis, Applied Mathematics, Computational Mathematics 
U.S. Citizenship is a requirement for this internship Project Description:Understanding local change over time is a multifaceted effort that often requires an integrated approach. This process involves identifying and understanding shifts, transformations and developments within a specific geographic area at the neighborhood, town, or region scale. As an intern, you will be part of a multidisciplinary team which focuses on understanding the interrelationship of Earth's physical features and human influence in a particular location by leveraging geospatial capabilities, image analysis, and advanced object-oriented change. The goal of this team is to extract information and knowledge about tangible features often found in urban/suburban environments such as roads, buildings, water, and high vegetation. The primary responsibilities include preliminary data analysis and leveraging applied math frameworks to train and validate deep learning models in order to enhance model generalization. The intern will conduct training of multiple deep learning models, assess each model’s performance on predicting desired features, and carry out preliminary accuracy assessments using various statistical approaches. Additionally, the intern will identify areas for model improvement and fine-tuning. Lastly, the intern is expected to create clear and concise documentation of findings and model performance for the team. The ideal candidate has strong programming skills, preferably in Python, and familiarity with deep learning frameworks such as TensorFlow or PyTorch; knowledge of image processing techniques and computer vision principles; excellent problem-solving and analytical skills; and effective communication and collaboration abilities. Disciplines: Analysis, Applied Mathematics, and Computational Mathematics Hosting Site:U.S. Army Corps of Engineers, Geospatial Research Laboratory (GRL) Mentors:
Internship Coordinator:


Yes  LANLBell1  01/18/2024  1705554000000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics, Foundations, Geometry, Topology 
U.S. Citizenship is a requirement for this internship Project Description:This project connects machine-learning theory with machine-learning practice. Statistical bounds for nonlinear function approximation have not provided tools to accurately predict empirical performance of neural network models. As a result, practitioners rely on computational evaluation of models, which does not guarantee robust prediction accuracy. A recent focus has been on representing large parametric models with kernel formulations, which open them up to analysis by well-established mathematical tools. At LANL, we have developed an exact kernel-based formulation for arbitrary Artificial Neural Networks (ANNs) which has a wide variety of applications both to produce theoretical guarantees, and to provide a robust mathematical foundation to empirical methods in Out-of-Distribution (OOD) Detection, Adversarial Robustness, Signal Manifold Approximation, and more. In particular, these formulations allow decompositions of predictions into combinations of gradients attributed to each training input. This decomposition is implicitly the foundation of much of modern interpretability, OOD, robustness, and other techniques, and yet the connection between these techniques and these decompositions is only now being realized. Building upon our previous work, this project will further refine the mathematical theory for these representations and build approximation algorithms that can make near-exact kernel surrogate models that can enhance the performance and reliability of large, real-world neural network models. Disciplines: Applied Mathematics, Computational Mathematics, Foundations, Geometry, and Topology Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentors:
Internship Coordinator:


No  LLNLDudson1  01/18/2024  1705554000000  Lawrence Livermore National Laboratory (LLNL)  Livermore, CA or virtual  Applied Mathematics, Probability and Statistics 
Project Description:We are developing efficient methods to solve inverse problems in fusion energy, enabling complex simulation codes to be routinely applied to experimental data with Uncertainty Quantification. The most complete physical models are computationally expensive and have a high-dimensional input space, but may be accelerated using initial conditions derived from lower-fidelity models. We have successfully used this approach in specific cases and are now generalizing and automating these tools. A student joining our multidisciplinary team will learn both statistical techniques and applied mathematics relevant to plasma physics simulations of tokamak fusion devices. First, the student will learn Bayesian Optimization methods for high-dimensional problems, building on existing examples and tools. The student will then evaluate the impact of model fidelity on computational efficiency, and learn methods for mapping between model fidelities to accelerate high-dimensional inverse problems. Depending on results, we anticipate that by the end of the summer we will write a journal publication together. Disciplines: Applied Mathematics, and Probability and Statistics Hosting Site:Lawrence Livermore National Laboratory (LLNL) Internship location: Livermore, CA or virtual Mentors:
Internship Coordinator:


No  FNALLi1  01/18/2024  1705554000000  Fermi National Accelerator Laboratory (FNAL)  Batavia, IL or virtual  Applied Mathematics, Computational Mathematics 
Project Description:The objective of this project is to investigate how the design of cost functions affects the trainability of variational quantum algorithms. Near-term quantum processors are noisy, and errors vastly limit their ability to execute lengthy algorithms. Variational quantum algorithms offer a promising path toward practical quantum advantage in several specific applications utilizing noisy quantum processors without error correction. The error tolerance comes from offloading part of the work to classical optimization of cost functions measured by quantum processors. The cost functions are generally nonconvex, and their gradients may decrease exponentially with increasing system size. Therefore, the trainability of the cost functions is one of the most significant challenges for achieving quantum advantage using the variational approach. In this project, we will explore how different choices of cost function design, e.g., the operators to measure, affect trainability and how optimization strategies can be adjusted accordingly. Knowledge of physics and quantum computing is not required for this project. Students should be comfortable with optimization problems and have some experience (or a willingness to learn) programming in Python or Julia. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Fermi National Accelerator Laboratory (FNAL) Internship location: Batavia, IL or virtual Mentors:
Internship Coordinator:


No  FNALLi2  01/18/2024  1705554000000  Fermi National Accelerator Laboratory (FNAL)  Batavia, IL or virtual  Applied Mathematics, Computational Mathematics 
Project Description:This project aims to develop efficient protocols for synthesizing qudit gates across diverse hardware implementations. Qudits are multilevel quantum systems that can serve as fundamental units for encoding information in quantum computing. Compared to qubits, which have two levels, qudits can offer advantages in specific applications by executing certain computational operations more efficiently. Synthesizing efficient qudit gates requires optimizing native gate or pulse sequences. This optimization task is generally challenging because the cost function landscapes are nonconvex and exhibit barren plateaus, especially when dealing with a large number of qudit levels. An efficient protocol for synthesizing qudit gates is crucial for the success of quantum computation utilizing qudits. This project will investigate the performance and optimization complexity of qudit gate synthesis protocols. The outcomes will provide valuable insights into the design of qudit gates and their control protocols. Knowledge of physics and quantum computing is not required for this project. Students should be comfortable with linear algebra and optimization problems and have some programming experience (or a willingness to learn). Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Fermi National Accelerator Laboratory (FNAL) Internship location: Batavia, IL or virtual Mentors:
Internship Coordinator:


No  NRELPaul1  01/18/2024  1705554000000  National Renewable Energy Laboratory (NREL)  Golden, CO or virtual  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Background and Learning Experience: This project is a unique and exhilarating journey into the intersection of applied mathematics, machine learning, and cybersecurity within the realm of Cyber-Physical Systems (CPS). Participants will have the opportunity to be at the vanguard of technological innovation, focusing on enhancing the robustness of ML algorithms against adversarial attacks in critical CPS domains. It offers a rich blend of theoretical exploration and hands-on practical experience, allowing participants to delve deep into the intricacies of securing ML models in an increasingly interconnected and technology-driven world. By engaging in this project, participants will not only gain invaluable knowledge and skills in a cutting-edge field but also contribute to shaping the future of technology, making them invaluable assets in sectors ranging from autonomous transportation to smart infrastructure. This project isn’t just an educational endeavor; it’s an adventure at the forefront of technological advancement, offering a unique chance to tackle some of the most pressing cybersecurity challenges of our time. Learning Objectives: Participants will:
Activities and Participation Participants in this project will have the opportunity to:
Impact: This project aims not only to contribute significantly to the field of cybersecurity in CPS but also to provide participants with a comprehensive learning experience. By engaging in this project, participants will enhance their understanding and skills in applied mathematics, machine learning, cybersecurity, and the practical application of these disciplines in real-world scenarios. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:National Renewable Energy Laboratory (NREL) Internship location: Golden, CO or virtual Mentors:


No  NRELPaul2  01/18/2024  1705554000000  National Renewable Energy Laboratory (NREL)  Golden, CO or virtual  Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Background and Learning Experience This exploratory project offers a unique foray into the application of Generative Artificial Intelligence (AI) within the realm of cybersecurity, particularly focusing on the protection of critical infrastructure. It stands at the exciting intersection of AI innovation and cybersecurity, a field that is rapidly gaining significance in our increasingly digital world. Participants will delve into the world of generative models, exploring how these advanced AI tools can be leveraged to predict, detect, and counter cyber threats against essential infrastructure systems. This project offers a chance not only to engage with cutting-edge AI technology but also to apply these innovations in practical, real-world scenarios that have a direct impact on national and global security. Learning Objectives Participants will:
Activities and Participation: Participants in this project will have the opportunity to:
Impact This project is designed to not only advance the field of AI in cybersecurity but also to provide participants with a rich and comprehensive learning experience. Participants will be at the forefront of exploring how generative AI can revolutionize the way we protect our most critical digital and physical assets, preparing them for future challenges and opportunities in this vital field. Disciplines: Analysis, Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:National Renewable Energy Laboratory (NREL) Internship location: Golden, CO or virtual Mentors:


No  NRELPaul3  01/18/2024  1705554000000  National Renewable Energy Laboratory (NREL)  Golden, CO or virtual  Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Background and Learning Experience This innovative project introduces the concept of 'Prompt Engineering' as a tool for enhancing cybersecurity in Cyber-Physical Systems (CPS). Prompt engineering, primarily used in the realm of advanced AI and machine learning, involves crafting precise and effective input prompts to elicit desired responses or behaviors from AI systems. In the context of CPS cybersecurity, this technique can be pivotal in identifying, mitigating, and defending against sophisticated cyberattacks. Participants in this project will explore the intersection of AI, cybersecurity, and CPS, gaining hands-on experience in using prompt engineering to strengthen the security of critical infrastructure against cyber threats. Learning Objectives Participants will:
Activities and Participation Participants will have the opportunity to: Participants will engage in:
Impact This project aims to pioneer the use of prompt engineering in cybersecurity, offering participants a unique chance to contribute to an emerging field. The knowledge and skills gained will be crucial in advancing cybersecurity measures for CPS, preparing participants for future roles in safeguarding critical digital and physical infrastructures.
Disciplines: Analysis, Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:National Renewable Energy Laboratory (NREL) Internship location: Golden, CO or virtual Mentors:


No  NRELPaul4  01/18/2024  1705554000000  National Renewable Energy Laboratory (NREL)  Golden, CO or virtual  Applied Mathematics, Computational Mathematics, Foundations, Probability and Statistics 
Project Description:Background and Learning Experience: This pioneering project investigates the phenomenon of hallucinations in Large Language Models (LLMs) and their potential exploitation by cyber adversaries to initiate attacks on Cyber-Physical Systems (CPS). Hallucinations in LLMs refer to instances where these models generate incorrect, often nonsensical or unrelated information, which can be a significant vulnerability in cybersecurity. Participants in this project will explore this less-charted territory, understanding how LLM hallucinations can be manipulated for cyberattacks and developing strategies for prevention and mitigation. Learning Objectives Participants will:
Activities and Participation Participants will:
Impact: This project aims to contribute significantly to the understanding of a novel aspect of AI security, particularly in the context of CPS. Participants will not only gain specialized knowledge and skills but also contribute to advancing the field of cybersecurity, addressing emerging threats in an increasingly AI-integrated world. Disciplines: Applied Mathematics, Computational Mathematics, Foundations, and Probability and Statistics Hosting Site:National Renewable Energy Laboratory (NREL) Internship location: Golden, CO or virtual Mentors:


No  NRELNag1  01/18/2024  1705554000000  National Renewable Energy Laboratory (NREL)  Golden, CO or virtual  Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Background and Learning Experience: This project is centered on the generation and validation of synthetic data for cybersecurity applications in Cyber-Physical Systems (CPS). It aims to address a critical challenge in the field: ensuring that the decisions made by models trained on synthetic data are applicable and effective in real-world scenarios. Participants will explore innovative methods for creating realistic synthetic datasets that can safely and effectively simulate cyber-physical environments, as well as develop validation frameworks to assess the feasibility and reliability of these datasets in real-world applications. Learning Objectives Participants will:
Activities and Participation Participants will:
Impact: This project aims to significantly advance the field of CPS cybersecurity by addressing one of its key challenges: the development of reliable and applicable synthetic data for model training and validation. Participants will contribute to enhancing the safety and security of CPS by ensuring that cybersecurity measures are robust and effective in real-world conditions. Disciplines: Analysis, Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:National Renewable Energy Laboratory (NREL) Internship location: Golden, CO or virtual Mentors:
Internship Coordinator:


No  ORNLSpannaus1  01/18/2024  1705554000000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:AI models are gaining wide interest within the medical community due to their increasing ability to rapidly assimilate text in the form of electronic health records (EHR) and their predictive skill. The predictive performance of any AI model depends on the choice of model architecture and hyperparameters, such as the activation function in each layer, learning rates, batch sizes, and whether to fine-tune the word embeddings during model training. Indeed, some sets of hyperparameters may lead to overfitting or underfitting a model. As a direct consequence, these models will have poor predictive performance and/or fail to generalize outside of the training dataset. Consequently, identifying an appropriate set of hyperparameters is crucial to ensure the model’s accuracy and generalizability. Identifying a hyperparameter set that would make our NLP models both more accurate and robust requires a prohibitively expensive search over a high-dimensional space, as large models must be trained and validated at each hyperparameter combination. High performance computing systems may be leveraged to alleviate some of the computational burden of this search by exploring multiple hyperparameter sets simultaneously. Due to the coupled nature of some hyperparameters, however, this problem is not trivially parallel. As part of this project, we will develop and implement scalable hyperparameter optimization (HPO) algorithms for models from the clinical NLP library FrESCO, deployed as part of the EHRLICH project, which may include a multitask convolutional neural network, a multitask hierarchical self-attention network, and a clinical Transformer architecture. We will use the Ray Tune library for hyperparameter tuning and will integrate these Ray Tune functionalities as modules into our FrESCO clinical NLP library. 
The performance of our HPO procedure will be assessed in terms of: (1) scalability attained by distributing the hyperparameter search over multiple compute nodes at the Oak Ridge Leadership Computing Facility (OLCF) and (2) validation accuracy of the trained clinical NLP model on MIMIC-IV data. The expected outcome is a scalable HPO framework integrated within the existing FrESCO library which attains linear scaling over multiple compute nodes on the OLCF Frontier supercomputer, with improved accuracy or decreased time to solution as compared with existing clinical NLP models trained on the same EHR dataset. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN Mentors:
Internship Coordinator:


Yes  NISTCohl1  01/18/2024  1705554000000  National Institute of Standards and Technology (NIST)  Gaithersburg, MD or virtual  Applied Mathematics 
U.S. Citizenship is a requirement for this internship Project Description:This project is to work with the original Fortran 77 code developed by Norman Lebovitz and Alexander Lifschitz to perform a high-resolution numerical ellipsoidal harmonic instability analysis of the Riemann S-type ellipsoids. The Riemann S-type ellipsoids are uniform density (incompressible) inviscid self-gravitating equilibrium triaxial rotating fluid ellipsoids. These ellipsoids, which have been studied for many centuries, were popularized by Nobel Laureate Subrahmanyan Chandrasekhar in his 1969 book "Ellipsoidal Figures of Equilibrium". It was noticed by Chandrasekhar and Lebovitz (Chandrasekhar's student) that these equilibrium fluid ellipsoids become unstable to a dynamical shape-changing instability as the angular momentum of the ellipsoids (eccentricity) is increased. Using a numerical code originally developed by Lebovitz and Lifschitz, we would like to further explore the beautiful properties of these nonlinear instabilities by exploiting the original Lebovitz-Lifschitz code. To find more detail about the instability analysis, code and algorithms, which are due to Norman Lebovitz (d. 2022) and Alexander Lifschitz (now Alexander Lipton), see the following papers: (1) Lebovitz & Lifschitz, New Global Instabilities of the Riemann ellipsoids, The Astrophysical Journal (1996) 458, 699-713; (2) Lebovitz & Lifschitz, Short-wavelength instabilities of Riemann ellipsoids, Philosophical Transactions of the Royal Society of London. Series A. Mathematical, Physical Sciences and Engineering, 354 (1996), no. 1709, 927–950. Disciplines: Applied Mathematics Hosting Site:National Institute of Standards and Technology (NIST) Internship location: Gaithersburg, MD or virtual Mentor:


No  LLNLSolis1  01/18/2024  1705554000000  Lawrence Livermore National Laboratory (LLNL)  Livermore, CA or virtual  Analysis, Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:A longstanding problem in the electricity sector is the lack of transparent reflection of outage and insufficiency risks in energy pricing. This leads to mostly flat yet spiky price signals, with sudden price spikes arising during extreme scarcity, as observed during the 2021 winter storms in Texas. To mitigate these price spikes, policymakers introduce ancillary products, such as capacity markets and operating reserve curves, aimed at providing more stable price signals to infrastructure investors. These ancillary products, however, are often criticized for their discretionary nature. Their quantities and prices are computed based on idiosyncratic formulas that do not adapt to new trends or technologies. For example, conventional capacity markets disregarded renewable energy resources as suppliers until recently. Our approach seeks to address this issue by utilizing risk measures, such as Conditional Value at Risk (CVaR), to assess the impact of potential outages or sufficiency events during normal operating conditions. This approach offers the advantage of being technology-agnostic, requiring only the setting of two parameters based on historical engineering standards, e.g., the 1 day/10 years security criterion. It directly informs energy pricing through duality, without the need for additional products or services. The main challenge of the proposed approach lies in computational complexity: optimizing power scheduling problems with risk measures can be complicated, especially when handling industrial-scale power scheduling problems. To address this challenge, we will expand on recent research on low-complexity algorithmic approaches for CVaR-constrained optimization problems. We will adapt these algorithms to a CVaR-constrained version of the security-constrained DC (linearized) optimal power flow problem (CVaR-DCOPF) where we restrict the total load loss across the worst k contingencies to be less than a certain prespecified quantity. 
We will implement a parallel numerical solver for CVaR-DCOPF, building on prior tools developed by the team at LLNL, and test the performance of the approach on industrial-scale instances. On the theoretical side, we will investigate techniques to replace the projection operator in current algorithms by functional constraints, which would allow us to capture nonlinear problems in future research. Finally, we will numerically study the effect of parametrizing CVaR constraints in energy prices and assess the impact of such prices on investment over the long term, to assess the viability of the approach for ensuring security of electricity supply to end customers. As an intern on this project, the applicant will have the opportunity to learn about cutting-edge research at the intersection of energy economics and computational optimization. The learning goals involve acquiring hands-on experience in:
Throughout the internship, the selected candidate will engage in diverse activities, including data analysis, algorithm development, software implementation, and collaborative research discussions. This experience will provide valuable insights into real-world energy challenges and equip the intern with practical skills in computational optimization, energy economics, and algorithm development. Disciplines: Analysis, Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Lawrence Livermore National Laboratory (LLNL) Internship location: Livermore, CA or virtual Mentors:
Internship Coordinator:
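The worst-k restriction described in this project can be illustrated with a small sketch. It assumes N equally likely contingencies, in which case the mean of the k largest losses equals the CVaR at level 1 - k/N. The function names and the sample load-loss values are hypothetical, and the sketch only evaluates the constraint for fixed losses; it is not the team's solver and does not optimize anything.

```python
import heapq
from typing import Sequence


def worst_k_total(losses: Sequence[float], k: int) -> float:
    """Total load loss across the k worst (largest-loss) contingencies."""
    if not 1 <= k <= len(losses):
        raise ValueError("k must be between 1 and the number of contingencies")
    return sum(heapq.nlargest(k, losses))


def cvar_equiprobable(losses: Sequence[float], k: int) -> float:
    """CVaR at level 1 - k/N for N equally likely contingencies,
    i.e. the average of the k largest losses."""
    return worst_k_total(losses, k) / k


def satisfies_limit(losses: Sequence[float], k: int, limit: float) -> bool:
    """Check the worst-k constraint: total loss over the k worst
    contingencies must not exceed the prespecified limit."""
    return worst_k_total(losses, k) <= limit


if __name__ == "__main__":
    # Hypothetical load loss (MW) under six contingencies.
    load_loss = [0.0, 1.5, 0.2, 4.0, 0.0, 2.5]
    print(worst_k_total(load_loss, 2))         # sum of the two largest losses
    print(cvar_equiprobable(load_loss, 2))     # their average
    print(satisfies_limit(load_loss, 2, 7.0))  # constraint check
```

In an optimization setting the losses depend on the dispatch decision, so the sum of the k largest values becomes a convex function of the decision variables rather than a number computed after the fact.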


No  LBNLFornace1  01/18/2024  1705554000000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA or virtual  Applied Mathematics, Computational Mathematics, Mathematical Biology 
Project Description:Understanding the folding of biopolymers (including nucleic acids and proteins) is vital to ongoing research in biochemistry and bioengineering. Recent models built on deep learning can often accurately predict equilibrium structures. Nevertheless, full inference and summation in the Boltzmann ensembles of these systems is still an outstanding and vital research goal. If the ensemble of folding topologies is suitably restricted, algorithms based on dynamic programming can achieve exact summation at relatively low cost. We are seeking to extend the applicability of these algorithms to general folding topologies using ideas from random matrices and matrix product states. A student in this internship will have the opportunity to learn sophisticated theories related to polymer physics, random walks, tensor networks, and random matrices with an eye towards efficient algorithm design and computational implementation. Disciplines: Applied Mathematics, Computational Mathematics, and Mathematical Biology Hosting Site:Lawrence Berkeley National Laboratory (LBNL) Internship location: Berkeley, CA or virtual Mentor:
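To make the idea of exact summation by dynamic programming concrete, here is a minimal sketch for one restricted ensemble: nested (non-crossing) base pairings of a nucleic acid sequence. The energy model is a deliberate simplification and an assumption of this sketch (one fixed energy per base pair, no stacking or loop terms), and the function name, parameters, and default values are hypothetical rather than taken from the project.

```python
import math
from functools import lru_cache

# Assumed pairing rules for illustration: Watson-Crick plus G-U wobble.
DEFAULT_PAIRS = frozenset(
    {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
)


def partition_function(seq, pair_energy=-1.0, kT=1.0, min_loop=3,
                       allowed_pairs=DEFAULT_PAIRS):
    """Sum Boltzmann weights over all nested pairings of seq.

    Each pair contributes exp(-pair_energy / kT); at least min_loop
    unpaired positions must separate the two bases of a pair.
    """
    weight = math.exp(-pair_energy / kT)

    @lru_cache(maxsize=None)
    def z(i, j):
        # Partition function of positions i..j inclusive.
        if j <= i:  # empty or single-base interval: one (unpaired) state
            return 1.0
        total = z(i, j - 1)  # case 1: position j is unpaired
        for k in range(i, j - min_loop):  # case 2: j pairs with k
            if (seq[k], seq[j]) in allowed_pairs:
                total += z(i, k - 1) * weight * z(k + 1, j - 1)
        return total

    return z(0, len(seq) - 1)


if __name__ == "__main__":
    # Hairpin with one possible G-C pair: states are "open" and "closed".
    print(partition_function("GAAAC"))
```

The recursion visits O(n^2) intervals with O(n) work each, so the sum over exponentially many structures costs O(n^3) time. Because the sketch is recursive, it is suited to short sequences only. Pairings that cross (pseudoknots) break the interval decomposition, which is the kind of restriction that more general folding topologies would have to relax.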


No  LBNLVan Beeumen1  01/18/2024  1705554000000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA  Applied Mathematics, Computational Mathematics 
Project Description:Current and near-term quantum computers, often referred to as noisy intermediate-scale quantum (NISQ) computers, are characterized by low qubit counts, short qubit decoherence times, and high gate error rates. Despite these limitations, there is significant promise in harnessing NISQ computers through the use of hybrid quantum-classical algorithms. These algorithms leverage both quantum and classical hardware to perform specific computational tasks. The goal of this project is to leverage the capabilities of NISQ computers to estimate eigenenergies in various scientific domains, including physics, chemistry, and materials science. Eigenenergies are critical in understanding the behavior of quantum systems, such as molecules and materials, and they have wide-ranging applications in predicting properties, reactions, and behaviors. We aim to develop approaches that are resilient to noise and can accurately estimate eigenenergies from quantum dynamics. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Lawrence Berkeley National Laboratory (LBNL) Internship location: Berkeley, CA Mentor:


No  LANL Matsekh1  01/22/2024  1705899600000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics, Probability and Statistics 
Project Description:Most particle transport codes fall into two major categories: Monte Carlo (MC) codes, which rely on stochastic simulations, and deterministic codes, which rely on numerical approximation and discretization methods. MC codes are memory-bound, being limited by the enormous memory footprints that arise from the need to store and read very large data sets describing simulation particles. In deterministic transport, one of the major computational bottlenecks is the ‘curse of dimensionality’ of the underlying quadrature integration routines, such as the Discrete Ordinates, or Sn, integration method. In both instances, thermal radiation transport (TRT) codes can be dramatically improved when the underlying methods are coupled to the latest statistical and machine learning (ML) algorithms. The goal of this project is to explore advanced statistical learning methods, such as Stochastic Expectation Maximization, Gibbs Sampling, and Markov Chain Monte Carlo, in order to address two distinct problems arising in TRT simulations: reducing memory usage in Monte Carlo TRT codes, and reducing the computational complexity of deterministic TRT codes. Disciplines: Applied Mathematics, Computational Mathematics, and Probability and Statistics Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentor:


Yes  USACERyder1  01/22/2024  1705899600000  U.S. Army Corps of Engineers, Engineer Research and Development Center (ERDC)  Vicksburg, MS or virtual  Analysis, Applied Mathematics, Mathematics (General), Probability and Statistics 
U.S. Citizenship is a requirement for this internship Project Description:USACE reservoirs are experiencing more frequent and more severe algal blooms. Harmful algal blooms (HABs), those that include toxin-producing cyanobacteria and phytoplankton species, are also increasing. These disruptive events are believed to be caused by changes in flow and thermal conditions, as well as in nutrient water quality, that are induced by the combined effects of climate and land use change. In short, the loading rates and internal storage of critical factors in algae growth such as nutrients, carbon, and sediments are changing concurrently with precipitation, wind, and air temperature. Thus, the relationships between algal growth and water quality parameters that have traditionally been applied to reservoirs can be outdated in terms of parameter value ranges and fit to individual reservoirs.
Planned activities: The participant will utilize historical multiparameter water quality observations to develop statistical tools for deriving site-specific driver relationships to be used in numerical water quality models. This will involve compiling and preparing data sets, developing sorting algorithms, applying multiple types of regression analysis (linear, quantile, etc.), analyzing for common patterns, and documenting the process and results.
Necessary skills: The intern should have experience with data handling; algorithm development, coding, and scripting; multivariate statistical analysis; and stochastic modeling. An interest in complex natural systems is preferred but not required. Use of R is required. Experience with Python, Excel, Visual Basic, and MATLAB may also be helpful. This position requires formal scientific writing. The participant will interact regularly with a water quality engineering team and have opportunities to learn about ongoing research in other emerging water quality topics such as wildfires, PFAS, and microplastics.
Disciplines: Analysis, Applied Mathematics, Mathematics (General), and Probability and Statistics Hosting Site:U.S. Army Corps of Engineers, Engineer Research and Development Center (ERDC) Internship location: Vicksburg, MS or virtual Mentor:
Internship Coordinator:


No  ORNLBurkovska1  01/22/2024  1705899600000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN or virtual  Analysis, Applied Mathematics, Computational Mathematics 
Project Description:Partial differential equations (PDEs) that involve inequality constraints on the solution, the so-called obstacle problems, arise in various applications, including contact mechanics, phase-field modeling in material science, and mathematical finance. Simulating these models accurately requires efficient and accurate discretization methods. The aim of this project is to design approximation algorithms for obstacle-type problems using a neural network discretization approach. Under the guidance of the mentor, the student will help design and implement the neural network architecture for the approximation of elliptic and time-dependent obstacle problems. During the internship, the student will learn about the theory of obstacle-type problems, the discretization of such models, and their use in scientific applications. The student will be able to interact and engage with other staff members and will give a presentation summarizing the internship project at the end of the appointment. Disciplines: Analysis, Applied Mathematics, and Computational Mathematics Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN or virtual Mentors:
Internship Coordinator:
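To make the notion of an obstacle problem concrete, the sketch below solves a one-dimensional model case, -u'' = f on (0, 1) with u(0) = u(1) = 0 and the constraint u >= psi, using a finite-difference grid and projected Gauss-Seidel. This is a classical baseline swapped in for illustration only, not the neural network discretization the project proposes; the function name `solve_obstacle_1d` and its parameters are hypothetical.

```python
def solve_obstacle_1d(psi, f, h, sweeps=5000):
    """Projected Gauss-Seidel for -u'' = f with u >= psi.

    psi and f hold values at the interior grid points x_i = (i + 1) * h;
    the boundary values u(0) = u(1) = 0 are built into the stencil.
    """
    n = len(psi)
    u = [max(p, 0.0) for p in psi]  # feasible starting guess
    for _ in range(sweeps):
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            # Ordinary Gauss-Seidel update for the unconstrained equation.
            unconstrained = 0.5 * (left + right + h * h * f[i])
            # Projection step: never let the solution drop below the obstacle.
            u[i] = max(psi[i], unconstrained)
    return u
```

At convergence the result satisfies the discrete complementarity conditions: the equation holds wherever u lies strictly above the obstacle, and u equals psi on the contact set.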


Yes  NETLWright1  01/22/2024  1705899600000  National Energy Technology Laboratory (NETL)  Applied Mathematics, Computational Mathematics 
U.S. Citizenship is a requirement for this internship Project Description:This project is dedicated to optimizing the applicability of fiber optic sensors (FOS) in industrial monitoring, with a specific focus on addressing challenges related to modeling the behavior of these sensors in the presence of physical defects observed in pipelines, such as corrosion, cracks, and welding issues. Despite the inherent advantages of fiber optic sensors in precisely measuring various parameters, the need for an effective simulation platform for modeling their behavior under different scenarios has become evident. To tackle this challenge, the project aims to develop a sophisticated simulation platform employing scientific machine learning methodologies. This platform will specifically focus on modeling guided wave propagation on pipelines to simulate various physical defects. The simulation platform will generate synthetic data, which, when combined with experimental data, will be used to train machine learning models. These models will contribute to the development of an overall health index for the pipeline. Moreover, the simulation platform will be extended to create a digital twin of the gas pipeline, replicating real-life defects and gas transport conditions. The primary objectives include the creation of an efficient real-time simulation platform for modeling physical defects in pipelines under various scenarios. The project will explore recent advancements in scientific machine learning methodologies, such as physics-informed neural networks and Fourier neural operators. This initiative represents the initial step towards establishing a comprehensive digital twin of a pipeline. Through these efforts, the project aims to unleash the full potential of fiber optic sensors, transforming them into highly effective monitoring tools across diverse industries, including aerospace, defense, security, civil engineering, energy, and healthcare. 
Project Objectives:
Weeks 1 & 2: Intern orientation with relevant members of the team. The intern will be provided with materials and ongoing research related to the project to build their understanding of it. The intern will also study existing platforms for scientific machine learning.
Weeks 3 & 4: The intern will settle on the platform to use for the research, such as NVIDIA Modulus or DeepXDE, and will work through tutorial problems similar to guided wave propagation.
Weeks 5 & 6: The intern will apply the learned methodologies to replicate the guided wave propagation problem.
Weeks 7 & 8: The intern will continue to investigate the problem, adding physical defects.
Weeks 9 & 10: Iterate on the model with feedback from team members and debug any issues. Finalize the results, prepare the relevant documentation and reports, and present the findings to the overall team.
Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:National Energy Technology Laboratory (NETL) Mentors:
Internship Coordinator:


No  SNLHarper1  01/29/2024  1706504400000  Sandia National Laboratories (SNL)  Albuquerque, NM  Applied Mathematics, Computational Mathematics 
Project Description:Preconditioning and solving large-scale linear systems underlies nearly all large-scale science and engineering simulations. There are approaches that work well for many cases like diffusion and electromagnetics, but there are many difficult problems such as turbulent flow and magnetohydrodynamics which require care to solve effectively. Multigrid methods are a class of solvers and preconditioners with optimal computational complexity, which solve a linear system by recursively coarsening the problem size and improving the solution estimate. Construction of a coarse problem (also referred to as a coarse grid) may use geometric data via geometric multigrid (GMG) or use algebraic data from the matrix operator via algebraic multigrid (AMG). The interpolation between grids is supplemented by simple solver iterations called smoothers. Interested participants will learn how multigrid methods interact with partial differential equations and discretizations through many examples involving practical applications. Participants will develop mathematical theory and practical intuition for improving linear solvers for a variety of problems. The team will collaborate to develop effective and performant approaches for difficult applications. This project will utilize the parallel C++ library MueLu in the Trilinos framework (https://github.com/trilinos/Trilinos) for multigrid methods on Linux supercomputers, including some of the largest supercomputers in the world. Additionally, a Sandia summer proceedings article will be written about the completed project. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Sandia National Laboratories (SNL) Internship location: Albuquerque, NM Mentors:
Internship Coordinator:
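The recursive coarsening, interpolation, and smoothing described above can be sketched with a small geometric multigrid V-cycle for the 1D Poisson problem -u'' = f on (0, 1) with zero boundary values. This is a teaching sketch in plain Python, unrelated to the MueLu API; the function names, the weighted Jacobi smoother, and the choice of full-weighting restriction with linear interpolation are all assumptions chosen for simplicity. The number of interior points is assumed to be of the form 2^k - 1.

```python
def residual(u, f, h):
    """Return r = f - A u for the standard 3-point Laplacian stencil."""
    n = len(u)
    r = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r[i] = f[i] - (2.0 * u[i] - left - right) / (h * h)
    return r


def smooth(u, f, h, sweeps=2, omega=2.0 / 3.0):
    """Weighted Jacobi smoother; damps the oscillatory part of the error."""
    n = len(u)
    for _ in range(sweeps):
        old = u[:]
        for i in range(n):
            left = old[i - 1] if i > 0 else 0.0
            right = old[i + 1] if i < n - 1 else 0.0
            jacobi = 0.5 * (left + right + h * h * f[i])
            u[i] = (1.0 - omega) * old[i] + omega * jacobi
    return u


def v_cycle(u, f, h):
    """One V-cycle: smooth, correct from the coarse grid, smooth again."""
    n = len(u)
    if n == 1:
        return [f[0] * h * h / 2.0]  # coarsest grid: solve exactly
    u = smooth(u, f, h)
    r = residual(u, f, h)
    nc = (n - 1) // 2
    # Full-weighting restriction; coarse point j sits at fine index 2j + 1.
    rc = [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
          for j in range(nc)]
    # Recursively solve the coarse error equation A_c e_c = r_c.
    ec = v_cycle([0.0] * nc, rc, 2.0 * h)
    # Linear interpolation of the coarse correction back to the fine grid.
    for j in range(nc):
        u[2 * j + 1] += ec[j]
    for j in range(nc + 1):
        left_c = ec[j - 1] if j > 0 else 0.0
        right_c = ec[j] if j < nc else 0.0
        u[2 * j] += 0.5 * (left_c + right_c)
    return smooth(u, f, h)
```

Calling `v_cycle` repeatedly on the same right-hand side reduces the residual by a roughly constant factor per cycle, independent of the grid size, which is the sense in which multigrid has optimal complexity.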


No  SNLHarper2  01/29/2024  1706504400000  Sandia National Laboratories (SNL)  Albuquerque, NM  Applied Mathematics, Computational Mathematics 
Project Description:Finite element methods (FEMs) are some of the most ubiquitous approaches for discretizing and solving partial differential equations (PDEs). FEMs allow one to construct a finite-dimensional space approximating an infinite-dimensional space and solve a corresponding linear system arising from a PDE. FEMs are utilized for a wide variety of applications, from diffusion and electromagnetics to turbulence and magnetohydrodynamics. In such applications, there are often additional desired properties such as fluid incompressibility or solenoidal magnetic fields. However, many multiphysics applications are constrained in size due to memory limitations. To that end, techniques such as basis compression and dimension reduction are applied in order to reduce computational burden and increase simulation speed. Interested participants will learn how FEMs are used to discretize partial differential equations and how they are used to solve multiphysics applications. Participants will then help develop numerical analysis for FEM basis compression and software to validate the theory. The team will collaborate to advance the state of the art in basis compression for FEMs. This project will utilize the parallel C++ library MrHyDE (https://github.com/sandialabs/mrhyde) developed utilizing the Trilinos framework (https://github.com/trilinos/Trilinos) for finite element methods on Linux supercomputers, including some of the largest supercomputers in the world. Additionally, a Sandia summer proceedings article will be written about the completed project. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Sandia National Laboratories (SNL) Internship location: Albuquerque, NM Mentors:
Internship Coordinator:


No  LANLSvyatsky1  01/29/2024  1706504400000  Los Alamos National Laboratory (LANL)  Los Alamos, NM  Applied Mathematics, Computational Mathematics 
Project Description:Flooding is a natural hazard that causes major economic and social impacts worldwide. Urban hydrology includes the complex interaction of surface flows, subsurface infiltration, and flows in sewer drainage networks. Urban flooding often occurs when rainfall overwhelms the sewer system, causing water to pool in streets, yards, and house basements. Accurate modeling of urban hydrology is of utmost importance for understanding water movement in urbanized areas and planning mitigation strategies. In this project, the development and simulation will be conducted using the multi-lab Advanced Terrestrial Simulator (ATS) project to provide an efficient and accurate process-based model for urban hydrology. The model will include overland flows (diffusive wave and/or shallow water models), subsurface flows (Richards equation), and pipe network flows (1D shallow water Saint-Venant equations). Alongside a mentor, the successful applicant will have an excellent opportunity to perform research on the advanced spatial and time discretization methods needed for high-fidelity predictive simulation at scale. They will have the chance to enhance their knowledge of a wide range of fields, including integrated hydrology modeling, numerical methods, and computational physics, as well as high performance computing. Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Los Alamos National Laboratory (LANL) Internship location: Los Alamos, NM Mentors:


No  SNLRidzal1  01/29/2024  1706504400000  Sandia National Laboratories (SNL)  Albuquerque, NM  Applied Mathematics, Computational Mathematics 
Project Description:Maxwell's equations are an important modeling component in a variety of modern engineering applications, ranging from nanoscale optical devices to magnetic confinement fusion. Despite the significant advances in the mathematical analysis and numerical methods for the solution of Maxwell's equations, little progress has been made in our understanding of optimal control and optimal design problems governed by Maxwell's equations. This project seeks to develop rigorous, robust and efficient solution methods for such optimization problems, to help realize the full potential of modern electromagnetic devices and systems. The project will focus on optimal control, in both space and time, of electromagnetic sources (currents). First, a rigorous mathematical formulation will be developed and analyzed. The analysis will combine elements of semigroup theory, to handle the temporal aspects of the problem, with the de Rham complex for the gradient, curl and divergence operators, for a proper spatial setting. The analysis will inform the problem formulation in the sense of defining an appropriate control regularization strategy. Second, the problem will be discretized in space and time, using finite element spaces and time steppers that yield robust and accurate optimal controls. Third, the problem will be solved numerically using Sandia's open-source software MrHyDE (https://github.com/sandialabs/MrHyDE), which combines a variety of scientific computing tools developed in the Trilinos project (https://github.com/Trilinos). Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Sandia National Laboratories (SNL) Internship location: Albuquerque, NM Mentor:


No  LBNLPopovici1  01/29/2024  1706504400000  Lawrence Berkeley National Laboratory (LBNL)  Berkeley, CA  Applied Mathematics, Computational Mathematics 
Project Description:Density Functional Theory (DFT) is a computational quantum mechanical modeling approach used in fields like physics, chemistry and material science to investigate the structure of many-body systems, focusing on atoms and molecules and their interactions. DFT is among the most popular and versatile methods available in condensed matter physics, computational physics and computational chemistry. This is emphasized by the fact that more than 25% of the HPC workload at leading supercomputing centers is spent performing DFT simulations. Therefore, understanding the DFT algorithms and providing scalable and highly efficient implementations of them becomes essential, especially now that GPUs and customized hardware accelerators are used to improve performance and execution time. Building frameworks that allow DFT codes to be executed on a myriad of systems, offering automatic optimizations, can provide crucial benefits to scientists working on problems that are becoming larger and more complicated.
Goals:
Provide an in-depth study and comparison of different algorithms for the eigensolver problem. Most DFT calculations have an eigenvalue solver as their main computation. Depending on the system, some algorithms may perform better than others; as such, we would like to study three of these algorithms (Conjugate Gradient, Davidson, RMM) and understand their overall performance and convergence rate.
Provide an automatic mechanism to parallelize and distribute the data and computation across the network. All eigensolvers require linear algebra and Fourier transforms (in the case of plane-wave Density Functional Theory). These components may exhibit different communication patterns, which will affect the overall performance in a distributed setting. We want to investigate that and build models to guide our implementation using information from the algorithms and the hardware.
Each algorithm may have different convergence rates. The three mentioned algorithms are iterative, and may require multiple iterations to converge to a solution. As such, we would also like to investigate the different convergence rates and possibly improve them by implementing better preconditioners. This will add another dimension to our model.
Approach and methodology: Our study will concentrate on three key aspects:
Disciplines: Applied Mathematics, and Computational Mathematics Hosting Site:Lawrence Berkeley National Laboratory (LBNL) Internship location: Berkeley, CA Mentor:


No  BNLLopezMarrero1  02/14/2024  1707886800000  Brookhaven National Laboratory (BNL)  virtual  Applied Mathematics, Computational Mathematics, Mathematical Biology 
Project Description:In this project we will develop and study stochastic models describing the growth and spread of pathogens in plant-pathogen interactions. The models will form part of a digital twin framework for plant health monitoring. In addition to working with the mentor in defining the models, project tasks will include implementation of numerical solution methods for systems of stochastic differential equations. During the internship the student will gain experience with numerical methods, which may include neural network approaches combined with traditional numerical methods for solving differential equations. Python programming will be required. Therefore, the student will be expected to learn Python programming during the course of the internship if they do not already meet this requirement. Disciplines: Applied Mathematics, Computational Mathematics, and Mathematical Biology Hosting Site:Brookhaven National Laboratory (BNL) Internship location: virtual Mentor:
Internship Coordinator:


No  ORNLWong1  02/14/2024  1707886800000  Oak Ridge National Laboratory (ORNL)  Oak Ridge, TN or virtual  Applied Mathematics, Computational Mathematics, Foundations 
Project Description:The discovery of matrix multiplication schemes has recently been a hot topic and has produced a few record-breaking results for multiplying small matrices. In particular, the study of discrete objects known as 'flip graphs', which consist of sets of orbits of multiplication schemes and edges representing actions on those schemes, enables searches for new ways to multiply two matrices. Thus far, these searches have been pursued naively and only on CPUs, which quickly becomes limiting in the large search space. The goal of this project will be to study how to adapt these flip graphs and develop algorithms that would effectively perform the search on GPUs, which would leverage the supercomputing power available to us at ORNL. Learning Objectives:
Disciplines: Applied Mathematics, Computational Mathematics, and Foundations Hosting Site:Oak Ridge National Laboratory (ORNL) Internship location: Oak Ridge, TN or virtual Mentors:
Internship Coordinator:
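For readers unfamiliar with the term, a matrix multiplication scheme is a fixed recipe of scalar products whose combinations give the entries of the product matrix. The best-known example is Strassen's scheme, which multiplies two 2x2 matrices with 7 scalar multiplications instead of the usual 8. The sketch below implements it directly; the function name `strassen_2x2` is a hypothetical label, and the code illustrates only what a scheme is, not flip graphs or the search algorithms the project targets.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices using Strassen's 7-multiplication scheme."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # The seven products; each is a single multiplication of two
    # linear combinations of the input entries.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # The output entries are sums and differences of the seven products.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]
```

For example, `strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]])` gives the same result as the standard row-by-column product.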


No  BNLLi1  02/14/2024  1707886800000  Brookhaven National Laboratory (BNL)  Upton, NY  Applied Mathematics 
Project Description:Machine learning (ML) has shown great success in many domains, and increasing efforts are applying ML in the field of computer science and engineering, for instance, to design more efficient hardware and software. Because ML techniques are data-driven, they require thousands, if not millions, of programs to generate training data, and the computing research community does not have such a collection of programs in hand. In particular, traditional benchmarks are few in number, and source code hosting websites (e.g., GitHub) have abundant code snippets rather than executable programs. This project aims to bridge this gap by creating a database of a large number of executable programs. The main goal of this project is to collect and write programs that stretch both the computing power and the memory capacity of modern computers, leveraging coding exercise websites such as LeetCode and CodeForces. This database will subsequently be used for training and testing models that predict program performance. Besides the main purpose mentioned above, this project will also help students sharpen their programming skills and benefit their future careers in both academia and industry, given that these coding exercise websites are the de facto means of preparing for technical interviews. Disciplines: Applied Mathematics Hosting Site:Brookhaven National Laboratory (BNL) Internship location: Upton, NY Mentor:
Internship Coordinator:

The name and contact information of the hosting site internship coordinator is provided for further assistance with questions regarding the hosting site; local housing availability, cost, or roommates; local transportation; security clearance requirements; internship start and end dates; and other administrative issues specific to that research facility. If you contact the internship coordinator, identify yourself as an applicant to the NSF Mathematical Sciences Graduate Internship (MSGI) Program.
Interns will not enter into an employee/employer relationship with the Hosting Site, ORAU/ORISE, NSF or DOE. No commitment with regard to later employment is implied or should be inferred.