By Heather Wilber
This July marked the end of my first year as an assistant professor in the Applied Mathematics Department. I was invited to write an article for the newsletter to share how that first year went, as well as to introduce myself to those I haven’t yet had a chance to meet. I’ll also take this opportunity to broadly discuss an area of computational mathematics central to my own research: the development of fast algorithms that take advantage of data-sparse structures.
Let me begin by reflecting on why I am here. By the end of my first year in graduate school, I knew I belonged in academia. Many people imagine that academics love ideas and join academia because they desire the space to pursue ideas freely. This is in part true, but I’ve come to appreciate that ideas are inseparable from people. My love of applied mathematics began as a love of conversations with inventive, inspiring, and sometimes refreshingly odd people. Whether it occurs in person, across the page, scrawled over chalkboards in the classroom, or inscribed into the forms of built things, I cherish good conversations. I especially love conversations about math, and my first year at UW has been filled with them. I am so grateful to be part of this welcoming and wonderful community, and I’m looking forward to all the conversations to come.
A brief introduction
I grew up in a small town in southwest Idaho. I completed undergraduate degrees in mathematics and
English at Boise State University, and then took a long break from school to travel and explore the
wider world. I eventually came back to Idaho and earned an M.S. in applied mathematics under the
supervision of Prof. Grady Wright. It was Grady who introduced me to numerical analysis and
approximation theory. I was incurably hooked by Grady’s Chebfun-based lectures and haven’t left the
area since. I completed my PhD in applied mathematics at Cornell University under the supervision of
Prof. Alex Townsend in 2021. My PhD work applies rational approximation methods to the
development of fast and highly accurate algorithms for scientific computing. A major theme is low rank
approximation. Before arriving at UW, I completed a short postdoc with Prof. Gunnar Martinsson at
the University of Texas at Austin, where I worked on developing fast direct solvers for partial
differential equations for modeling phenomena such as acoustic scattering. From all of these
experiences, I picked up a love for blazing fast higher-order methods, an obsession with understanding
how data-sparse structures arise from physical and mathematical assumptions, and the desire to
understand and design highly effective numerical methods through the application of approximation
theory.
Reflections on my first year
When I look back on my first year at UW, I am surprised and delighted by how much fun it was. My
(admittedly ridiculous) enthusiasm for numerical linear algebra was matched, and dare I say exceeded,
by some of the graduate and undergraduate students that I had the privilege of teaching. I’ll quote one
anonymous undergraduate from AMATH 352: “The SVD reveals the meaning of life. Thank you for
sharing it with me.” One of my big goals for the first year was to start up a research group, and I am
very happy that multiple students were enthusiastic about joining me and diving into the challenging
worlds of hierarchical methods and approximation theory. These include PhD students Wietse Vaes,
Levent Batakci, and Arjun Sethi-Olowin, all working on projects connected in various ways to hierarchical numerical linear algebra, as well as M.S. student Daniel Dou, who is working on the development of new methods in computational rational approximation.
This summer has been especially lively with research and reading discussions. I’ve been spending my
days in lovely conversation with a crew of enthusiastic UW students, including those listed above, as
well as undergraduates Josephine Noone, Maria Elena Protogerou, and Henry Ramstad, M.S. students
Tianbo Zhang, Xinze Ren, Jeffrey Xu, and Eric Wan, and PhD student Payton Howell. Over the past
year, I also served as the faculty advisor for both the DEI committee and our AMATH chapter of the
AWM. I additionally served as a faculty mentor for the WAMM program. I was amazed by the
dedication and talent of our students, acting instructors, and staff members in these service roles; they far exceeded my expectations.
As this year wraps up and the new one begins, I am looking forward to continued opportunities for
collaboration and research with colleagues and students. I’ll be teaching the AMATH 301 course in
scientific computing this Autumn, where I’m thrilled to have the opportunity to welcome hundreds of
undergraduate students into the kinds of conversations I was similarly welcomed into years ago.
Fast algorithms in a fortunately-structured world
If you give scientists a big computer and enough time, they will inevitably use that computer to ask
new questions that require even bigger computers. And if you give them even bigger computers...well,
you end up in a world like ours, where in 2022 the first public exascale supercomputer was introduced¹. This is simply the nature of discovery and innovation. While hardware development is very important, we’ll never keep up with the pace of human curiosity by focusing on hardware alone. Effective and efficient algorithms and numerical methods can dramatically change the landscape of possibilities in scientific computing and engineering. A famous example is the fast Fourier transform (FFT), which in its modern form was introduced by Cooley and Tukey in 1965². Problems that required days of computation with the naive discrete Fourier transform (DFT) could suddenly be done in mere seconds on the same machine. This was world-changing, and it continues to shape innumerable aspects of our lives today [1].
Much of the work I and other computational mathematicians do is focused on designing fast (and
storage-efficient) algorithms. By fast, we loosely mean that practitioners can expect substantial
improvement over what is possible with a standard method run on the same machine. I motivate my
work with a sentiment borrowed from supercomputing expert Prof. David Keyes (KAUST).
Commenting on the environmental costs of modern supercomputing, he implores us to strive for
computational methods that wring out the most “science-per-Joule” in all we do. Even at the smaller
scale, this is a worthy goal that motivates excellence.
Fast methods can often be developed if the matrices, models, signals or data sets involved in the
problem possess useful structures. Common structures we make use of include low rank properties,
various recursive and hierarchical structures, sparsity properties, symmetry structures, and properties
related to smoothness and analyticity. How common are these kinds of structures in practice?
Fortunately for us, the kinds of questions scientists and engineers like to ask are not arbitrary. They
arise from physical and biological phenomena that seem to engender mathematical objects with useful structures. This could be because we naturally impose structures via the assumptions we make in science to simplify the world, or it could imply that the world itself is, at least up to some relevant order of approximation, highly mathematically structured in computationally useful ways. In any case, the abundance of these structures in applications has been critical to our success in designing all kinds of technologies that we now rely on in our everyday lives.
The fast Fourier transform is a classic example of how structure can be wielded to create a fast
algorithm. The basic action at the heart of the discrete Fourier transform (DFT) is a matrix-vector
product. The matrix involved is called a DFT matrix. If the DFT matrix were an arbitrary m-by-m
matrix, it would take m² work just to write it into memory, and on the order of m² floating point
operations to compute the transform. However, DFT matrices are not arbitrary. Every DFT matrix
encodes an expansion in terms of trigonometric polynomials, and this makes the matrix highly
structured. The structure requires very little storage and admits a lovely recursive strategy for
evaluating the matrix-vector product. The work required for this recursive process is on the order of only m log(m) operations. To get a sense of the speedup, consider that an early computer like the IBM 7090 (the computer used by NASA in the movie Hidden Figures) could crunch about 100,000 floating point operations per second. At that rate, an m² procedure with m = 10,000 requires about 16 minutes to complete. In contrast, an m log(m) procedure takes only about a second!
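To make the contrast concrete, here is a minimal sketch of the comparison (written in Python with NumPy purely as an illustration; the particular library calls are my choice of stand-in, not anything tied to the history above). It forms the DFT matrix explicitly, applies it the slow way, and then computes the same transform with the FFT:

    # Minimal illustrative sketch: naive O(m^2) DFT versus the O(m log m) FFT.
    import time
    import numpy as np

    m = 2048
    x = np.random.randn(m)

    # Naive route: explicitly form the m-by-m DFT matrix and multiply.
    # Writing the matrix down already costs m^2 storage, and the product
    # takes on the order of m^2 floating point operations.
    k = np.arange(m)
    F = np.exp(-2j * np.pi * np.outer(k, k) / m)
    t0 = time.perf_counter()
    y_naive = F @ x
    t_naive = time.perf_counter() - t0

    # FFT route: the same transform, computed by the recursive strategy
    # in on the order of m log(m) operations.
    t0 = time.perf_counter()
    y_fft = np.fft.fft(x)
    t_fft = time.perf_counter() - t0

    # The two answers agree to rounding error; only the cost differs.
    print("max difference:", np.max(np.abs(y_naive - y_fft)))
    print("naive DFT time: %.4f s, FFT time: %.6f s" % (t_naive, t_fft))

Even at this modest size the two results agree to rounding error while the FFT finishes orders of magnitude faster, and the gap only widens as m grows.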
How does one discover or develop structure-exploiting algorithms? There’s no set formula, but we
often begin by trying to make sense of when and why mathematical structures appear in an application. Theorems and paradigms are then developed that can explain and formally characterize these structures. This stage of the work can involve tools and techniques from classical areas of mathematics, such as approximation theory and complex analysis. From there, algorithms can be devised, implemented, tested, revised, and so on. It’s often a two-way street, with computational tests and tinkering informing and guiding further refinements to theoretical work. Like everything, developments in computational mathematics take the shape of longstanding discussions threaded through literature, pedagogy, history, and technology. Even breakthrough ideas like the FFT and other game-changing algorithms tend to have a longer history than what might seem obvious on the surface. A version of the FFT was developed by Gauss in the 1800s, and used by several other mathematicians long before
modern computers ever entered the scene [1]. Many modern questions in scientific computing and
engineering have roots in the physics and mathematics of centuries ago [5].
My own work in this area in the past year has been largely focused on fast computational methods that
involve matrices with hierarchical low-rank structures. For an overview of hierarchical matrices, see
[2]. In recent work with Ethan Epperly (Caltech) and Alex Barnett (Flatiron Institute), we tackle a
variation of the discrete Fourier transform problem known as the inverse nonuniform discrete Fourier
transform [6]. Unlike the setup for the usual DFT, the output of the transform is sampled at a collection
of nonuniform points. From this, one seeks to recover the input, which in the nonuniform case requires
solving a linear system³. The nonuniformity of the sample locations destroys the special butterfly
structure that makes the FFT possible. However, many additional structures exist that we can turn to.
We show that nonuniform DFT matrices are linked to special matrices with hierarchical low rank
structures. These structures lead to a fast scheme (near linear in the degrees of freedom) for computing the inverse transform. In ongoing work with UW students Levent Batakci and Arjun Sethi-Olowin, we
seek to develop inversion algorithms that can exploit special block structures that appear in higher
dimensional nonuniform DFT matrices. The structures held by the nonuniform DFT matrices are
shared by many other important matrices in scientific computing, including Toeplitz, Hankel, Cauchy, and generalized Vandermonde matrices [3]. Other ongoing work includes the development of a new
time-frequency method for solving the acoustic scattering problem. In this work with UW student
Wietse Vaes, Gunnar Martinsson (UT Austin), and Abi Gopal (UC Davis), we make use of hierarchical
low rank structures that arise from properties of the discretized PDE, and we also make use of
analyticity properties held by the damped solution to the acoustic wave equation in frequency space.
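As a toy illustration of the kind of structure at play (a hypothetical sketch in Python/NumPy, not the algorithm from [6]), one can check numerically that the block of a Cauchy matrix coupling two well-separated clusters of points has rapidly decaying singular values, and hence a tiny numerical rank. Hierarchical methods organize a large matrix so that most of it consists of blocks like this:

    # Toy illustration (hypothetical example, for illustration only):
    # the block coupling two well-separated point clusters in a Cauchy
    # matrix C[j, k] = 1 / (x_j - y_k) is numerically low rank.
    import numpy as np

    n = 500
    x = np.linspace(0.0, 1.0, n)   # one cluster of points
    y = np.linspace(2.0, 3.0, n)   # a second, well-separated cluster
    C = 1.0 / (x[:, None] - y[None, :])

    # The singular values of this n-by-n block decay rapidly, so the block
    # can be stored and applied using a rank far smaller than n.
    s = np.linalg.svd(C, compute_uv=False)
    rank = int(np.sum(s > 1e-12 * s[0]))
    print("block size: %d x %d, numerical rank at tol 1e-12: %d" % (n, n, rank))

Compressing many such blocks at once is what makes near-linear complexity possible.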
In addition to the work I do related to fast solvers, I have lately been focused on questions related to
computational methods of rational approximation. This includes ongoing work with Daniel Dou on
rational representations in coefficient space, as well as recent work with Nick Trefethen (Harvard) on
computational rational methods for solving Zolotarev problems [4]. I’m looking forward to continued
developments!
¹ Exascale computers execute a quintillion floating point operations per second.
² The FFT debuted alongside new advancements in analog-to-digital signal processing hardware. The pairing of these two advances afforded spectacular change.
³ For the uniform case, the DFT matrix (suitably normalized) is unitary, so its inverse is its Hermitian transpose, and inversion reduces to a matrix-vector product.
References:
[1] “Gauss and the history of the Fast Fourier transform”, Heideman, Johnson, and Burrus. Archive for
History of Exact Sciences, Vol. 35, pp. 265-277, 1985.
[2] “Matrices with hierarchical low-rank structures”, Kressner and Ballani. Exploiting Hidden
Structure in Matrix Computations: Algorithms and Applications, Eds. Benzi and Simoncini, Springer,
pp. 161-209, 2015.
[3] Structured Matrices and Polynomials: Unified Superfast Algorithms, Pan. Springer Science and
Business Media, 2012.
[4] “Computation of Zolotarev rational functions”, Trefethen and Wilber.
https://arxiv.org/abs/2408.14092, 2024.
[5] Approximation Theory and Approximation Practice, Extended Edition, Trefethen. SIAM, 2019.
[6] “A superfast direct inversion method for the nonuniform discrete Fourier transform”, Wilber,
Epperly and Barnett. arXiv preprint arXiv:2404.13223, 2024.