Meet a Participant: Chitrak Banerjee
Helping researchers save time and effort through computer programming
Computer programming can often seem objective: a straightforward way to convert human input into a language computers can understand. Hundreds of programming languages exist. The technical act of communicating with a computer may be objective, but the choice of programming language can be riddled with bias.
Many programming languages have not been widely used in the past because of stigma within the programming community, despite offering advantages such as simpler syntax and easier user access. Fortunately, that trend is changing as more people across different disciplines integrate computer programming into their lifestyles, scientific research and business ventures.
The programming language R, for instance, has grown substantially in popularity in recent years. R offers a free software environment for statistical computing and graphics and has an active programming community supporting it. Many R users have created and published their own R packages tailored to specific areas of research, making R even more accessible to a variety of users.
Now that certain programming languages are becoming more commonly used, popular software tools and packages need to catch up to be compatible with these languages.
“The growing popularity of scripting languages, such as R and Python™, among end users has made it all the more urgent to provide a toolbox for automatic differentiation to use with these languages,” said Chitrak Banerjee, a fourth-year doctoral student in statistics at Michigan State University. Banerjee, a recent intern with the National Science Foundation’s (NSF) Mathematical Sciences Graduate Internship (MSGI) Program, was tasked with helping address some of these compatibility problems.
The NSF MSGI Program offers mathematical sciences doctoral students the opportunity to participate in research internships at national laboratories, industries and other facilities. NSF MSGI seeks to provide hands-on experience in the use of mathematics in a nonacademic setting.
For his internship, Banerjee was stationed at Argonne National Laboratory in Lemont, Illinois, in the Mathematics and Computer Science Division. Under the mentorship of Sri Hari Krishna Narayanan, Ph.D., Banerjee researched how to effectively deploy ADOL-C (Automatic Differentiation by OverLoading in C++) with the R language. ADOL-C, a tool created for use with the programming languages C and C++, evaluates first- and higher-order derivatives of functions at a desired precision level and within a reasonable computation time.
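To make the idea of automatic differentiation by operator overloading concrete, here is a minimal sketch in Python. It is not ADOL-C's implementation or interface: it uses simple forward-mode "dual numbers" as a stand-in for the general technique, and the names Dual, sin, derivative and f are hypothetical, chosen only for this illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one input."""
    value: float
    deriv: float = 0.0

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants (derivative 0).
        return other if isinstance(other, Dual) else Dual(float(other))

    def __add__(self, other):
        other = Dual._lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def sin(x):
    """Sine that propagates derivatives via the chain rule."""
    x = Dual._lift(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def derivative(f, x):
    """Evaluate f'(x) by seeding the input with derivative 1."""
    return f(Dual(x, 1.0)).deriv


def f(x):
    # Ordinary-looking code; the overloaded operators do the calculus.
    return x * x * x + 2 * x + sin(x)


if __name__ == "__main__":
    print(derivative(f, 2.0))  # analytically 3*x**2 + 2 + cos(x) at x = 2
```

Because the arithmetic operators are overloaded, the function f is written as ordinary code and the derivative is accurate to floating-point precision rather than approximated by finite differences.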
Banerjee also explored how to link supporting libraries such as Boost, which provides fast computation, and ColPack, a graph-coloring library that extends ADOL-C to more general optimization settings. In addition, he wrote scripts that automatically download, build and install packages such as ADOL-C, Boost and ColPack. Together, these efforts make it easier for end users to install and use the packages and help them avoid complications that can arise during installation.
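One part of automating such an installation is deciding the order in which packages are handled. The Python sketch below is illustrative only and is not the script Banerjee wrote: the dependency map is an assumption (that ADOL-C is built after the libraries it links against), and DEPENDS_ON, STEPS and plan are hypothetical names. It only prints a plan and does not download or build anything.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumed dependency map: each package lists what must be installed first.
DEPENDS_ON = {
    "Boost": set(),
    "ColPack": set(),
    "ADOL-C": {"Boost", "ColPack"},
}

# The three stages named in the article, applied to every package.
STEPS = ("download", "build", "install")


def plan(depends_on):
    """Return (package, step) pairs in an order that respects dependencies."""
    order = TopologicalSorter(depends_on).static_order()
    return [(pkg, step) for pkg in order for step in STEPS]


if __name__ == "__main__":
    for pkg, step in plan(DEPENDS_ON):
        print(f"{step:>8}  {pkg}")
```

Separating the plan from its execution is a design choice for the sketch: the ordering can be checked without touching the network or a compiler.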
As a result of his research, users who combine ADOL-C with Boost and ColPack in R can save significant time and effort. Using the installation process that Banerjee and his mentor created, researchers now have access to an R package that provides an interface for automatic differentiation in the R environment.
The path for Banerjee and his mentor wasn’t always easy. “The project required us to explore and try new things in order to build the R package. Most of the time, there were bugs in every step of the process,” said Banerjee. “I had no prior experience building an R package from C or C++ source code, so I had to do a lot of learning upfront, particularly about the intricacies of shell scripting.”
By the end of the program, however, Banerjee had succeeded in several respects. “I can say I’m very confident on delivering a positive result in a computer science project, which is quite different than my major area of study, statistics.” Banerjee continued, “I have gained skills which I didn’t have before, and I believe these skills will help me in the future to solve similar problems.”
The R package that Banerjee helped create will be published on CRAN (the Comprehensive R Archive Network), where other researchers can find and use it. Since the conclusion of his internship, Banerjee has returned to Michigan State University to finish his doctoral degree.
“I consider myself extremely lucky to have had this experience. Without a doubt, this program provides a unique experience more than just a typical internship. The best part of the program is learning through contribution. Working on this project has motivated me and my future research,” Banerjee said.
The NSF MSGI Program is funded by NSF and administered through the U.S. Department of Energy’s (DOE) Oak Ridge Institute for Science and Education (ORISE). ORISE is managed for DOE by ORAU.