
Cornell University

Research

Statistics is the science of data. Estimation facilitates learning from data. Quantifying, controlling, and describing uncertainty provide a basis for the inference, modeling, and decision making that are central to all theoretical and applied science. Quantitative analysis and statistical reasoning have become ubiquitous, and the prevalence of "big data" is rapidly changing the way we study and view our world. CAM's statistics research programs span from mathematical statistics, computational statistics, and machine learning to the development of statistical methods for astrophysics, ecology, economics, emergency medical services, financial modeling, genomics, high-dimensional and functional data, neurobiology, risk management, and spatiotemporal data.

Imaging and other sensing modalities are increasingly key to science, medicine, engineering, and many other fields, and hence computational methods for processing and extracting information from sensor data are of critical importance. Extracting useful information from raw, noisy data involves a wide range of mathematical techniques including inverse problems, optimization, modeling and prediction, discrete algorithms, and methods for high-level image understanding. At CAM, researchers are designing algorithms for a range of applications, from studying bird and insect flight, to reconstructing volume data from medical scans, to reconstructing 3D geometry from 2D images.
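As a toy illustration of the inverse-problem viewpoint, the sketch below blurs a 1-D signal with a known averaging operator and then recovers it by gradient descent on the least-squares misfit. The signal, blur weights, step size, and iteration count are all hypothetical choices made for illustration.

```python
# Toy 1-D deblurring as a linear inverse problem: given blurred data
# b = A x, recover x by gradient descent on ||A x - b||^2.
n = 5
x_true = [0.0, 1.0, 3.0, 1.0, 0.0]

# Symmetric blur matrix A: weight 0.5 on each pixel, 0.25 on each
# in-range neighbor.
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 0.5
    if i > 0:
        A[i][i - 1] = 0.25
    if i < n - 1:
        A[i][i + 1] = 0.25

def matvec(M, v):
    return [sum(row[j] * v[j] for j in range(n)) for row in M]

b = matvec(A, x_true)              # noiseless blurred measurements

# Gradient of ||A x - b||^2 is 2 A^T (A x - b); A is symmetric here.
x = [0.0] * n
eta = 0.4
for _ in range(4000):
    r = [ri - bi for ri, bi in zip(matvec(A, x), b)]
    g = matvec(A, r)               # A^T r, using symmetry
    x = [xi - 2 * eta * gi for xi, gi in zip(x, g)]

max_err = max(abs(xi - ti) for xi, ti in zip(x, x_true))
```

Real sensor data are noisy, so practical reconstructions add regularization; this noiseless example shows only the basic optimization loop.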

Scientific computing can be thought of as the application of high-performance numerical algorithms to large-scale computational problems arising in science and engineering, and is therefore ubiquitous in the work of applied mathematicians. Numerical analysis is the development of such methods as well as the study of their accuracy, stability, and complexity. Typical problems in numerical analysis involve ordinary and partial differential equations, continuous optimization, and the linear algebraic tasks that underlie computations in these and other areas. All of these topics are being investigated in various settings by CAM researchers. The range of scientific computing at Cornell can be gauged from the homepages of the faculty of the graduate field of computational science and engineering at https://cornell-cse.github.io/people.html. Examples include the study of solids and structures under uncertainty; the role of rapid evolution in the dynamics of food webs in evolutionary biology; simulation optimization in call center staffing and ambulance deployment; understanding turbulent and reactive flows in combustion; the investigation of material structure across multiple length and time scales; and the study of complex systems applied to insect flight.
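A minimal numerical-analysis experiment in this spirit uses the model problem y' = -y, y(0) = 1, whose exact solution e^{-t} is known, so the discretization error of a method can be measured directly. Forward Euler is first-order accurate: halving the step size should roughly halve the error.

```python
import math

def euler(f, y0, t1, n):
    # Forward Euler with n steps on [0, t1].
    h = t1 / n
    y = y0
    for _ in range(n):
        y = y + h * f(y)
    return y

exact = math.exp(-1.0)
err_coarse = abs(euler(lambda y: -y, 1.0, 1.0, 100) - exact)
err_fine = abs(euler(lambda y: -y, 1.0, 1.0, 200) - exact)
ratio = err_coarse / err_fine   # ~2 for a first-order method
```

Studying how such error ratios behave as the step size shrinks is exactly the kind of accuracy analysis numerical analysts carry out for far more elaborate methods.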

Probability is the study of randomness and uncertainty. The field of stochastic processes deals with randomness as it develops dynamically and can be thought of as the study of collections of related, uncertain events. Research in this area involves finding laws governing randomness; familiar examples include the law of large numbers and the central limit theorem. Some research in probability is highly theoretical and is connected to a number of areas of mathematics including functional analysis, measure theory, and partial differential equations; other work is more applied, involving the modeling of service or queuing systems, financial phenomena, networks, manufacturing processes, and other real-world systems. There is significant interaction between the research being done in probability and stochastic processes and that in other research areas at CAM.
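Both laws mentioned above can be observed directly by simulation. The sketch below (sample sizes and seed are arbitrary choices) checks the law of large numbers and the central limit theorem's 95% prediction for Uniform(0,1) draws.

```python
import random
random.seed(0)

def sample_mean(n):
    # Mean of n independent Uniform(0,1) draws (true mean 1/2).
    return sum(random.random() for _ in range(n)) / n

# Law of large numbers: the deviation from 1/2 shrinks as n grows.
deviation = abs(sample_mean(100_000) - 0.5)

# Central limit theorem: standardized sample means fall within
# +/- 1.96 standard errors about 95% of the time.
sigma = (1.0 / 12.0) ** 0.5          # std of Uniform(0,1)
n, reps = 100, 1000
inside = sum(
    abs((sample_mean(n) - 0.5) / (sigma / n ** 0.5)) <= 1.96
    for _ in range(reps)
)
frac = inside / reps                 # should be near 0.95
```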

Optimization is the act of finding the best solution to a given problem, subject to problem-specific constraints. Optimization problems arise in engineering when trying to choose the best among different system designs or courses of action, and in the physical sciences when trying to predict how nature will behave. They also arise in statistics, when trying to describe reality in a way that best fits the available data. Optimization becomes challenging -- and mathematically interesting -- when the number of options is large enough to render the consideration of every option prohibitively expensive. In such situations, optimization methods can use a problem's structure to quickly find the best option without evaluating each one. Challenges in optimization also arise when options cannot be evaluated perfectly or when the time needed to evaluate an option is significant. Researchers within CAM consider many different kinds of optimization problems, each with its own theory and applications, including convex optimization, combinatorial optimization, continuous optimization, optimization via simulation, and global optimization of expensive functions.
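As a small example of exploiting structure rather than enumerating options, gradient descent on a smooth convex function follows the gradient straight to the unique minimizer. The quadratic, starting point, and step size below are arbitrary illustrations.

```python
# Minimize f(x, y) = (x - 1)^2 + 10 (y + 2)^2, whose unique
# minimizer is (1, -2), by following the negative gradient.
def grad(x, y):
    return 2.0 * (x - 1.0), 20.0 * (y + 2.0)

x, y, eta = 0.0, 0.0, 0.05
for _ in range(500):
    gx, gy = grad(x, y)
    x, y = x - eta * gx, y - eta * gy
```

No grid of candidate points is ever examined; the gradient alone steers the search, which is what makes such methods viable when the option space is continuous and high-dimensional.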

Researchers at CAM apply tools from probability theory, quantum theory, nonlinear dynamics and bifurcation theory, and differential geometry to wide-ranging problems of interest in physics and complex systems. These techniques have been central to our understanding of the synchronization of fireflies, infectious disease dynamics in complex populations, algorithms for 3D X-ray imaging of cellular structures, categorization of topological quantum materials, and the effectiveness of simple models in describing our complex world.
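A standard abstraction for synchronization phenomena such as flashing fireflies is the Kuramoto model of coupled oscillators. In the sketch below the population size, coupling strength, and step size are illustrative choices, and the natural frequencies are identical, so full synchrony is expected: the order parameter r climbs toward 1.

```python
import math, random
random.seed(1)

N, K, dt = 20, 2.0, 0.05
theta = [random.uniform(-math.pi, math.pi) for _ in range(N)]

def order_parameter(th):
    # r in [0, 1]; r = 1 means all phases coincide.
    c = sum(math.cos(t) for t in th) / len(th)
    s = sum(math.sin(t) for t in th) / len(th)
    return math.hypot(c, s)

r0 = order_parameter(theta)
for _ in range(400):                 # forward Euler on the phase ODEs
    new = []
    for i in range(N):
        coupling = sum(math.sin(tj - theta[i]) for tj in theta) / N
        new.append(theta[i] + dt * K * coupling)
    theta = new
r1 = order_parameter(theta)          # approaches 1 under strong coupling
```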

Mathematical finance is the field of mathematics that studies financial markets. Topics studied include market trading mechanisms (market microstructure), corporate management decision making (corporate finance), investment management, and derivative securities. In each of these areas, sophisticated mathematics is utilized for modeling purposes. The theories of stochastic processes, stochastic optimization, partial differential equations, and simulation methods are just some of the mathematical tools employed. For example, in the area of derivatives, stochastic calculus is used to price a call option on a common stock. A call option is a financial security that gives its owner the right to buy a common stock at a fixed price on or before a fixed future date. Using stochastic calculus, the price of a call option can be characterized as the expected value of a nonlinear and random payoff at a future date. Numerical methods, such as Monte Carlo simulation, are often used to compute these expected values.
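The pricing recipe described above can be sketched directly under the standard Black-Scholes model, where the stock follows geometric Brownian motion. The market parameters below are hypothetical; the Monte Carlo average of discounted payoffs is compared against the closed-form price.

```python
import math, random
random.seed(0)

# Hypothetical parameters: spot, strike, rate, volatility, maturity.
S0, K, r, sigma, T = 100.0, 105.0, 0.03, 0.2, 1.0

def bs_call(S0, K, r, sigma, T):
    # Closed-form Black-Scholes price of a European call.
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S0 * N(d1) - K * math.exp(-r * T) * N(d2)

def mc_call(S0, K, r, sigma, T, n=200_000):
    # Monte Carlo: discounted average of the call payoff over
    # simulated terminal prices of geometric Brownian motion.
    drift = (r - 0.5 * sigma**2) * T
    vol = sigma * math.sqrt(T)
    total = 0.0
    for _ in range(n):
        ST = S0 * math.exp(drift + vol * random.gauss(0.0, 1.0))
        total += max(ST - K, 0.0)
    return math.exp(-r * T) * total / n

exact = bs_call(S0, K, r, sigma, T)
estimate = mc_call(S0, K, r, sigma, T)
```

The Monte Carlo error shrinks like one over the square root of the number of samples, which is why variance-reduction techniques matter in practice.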

Mechanism design is the problem of designing a game so as to guarantee that rational players will produce outcomes that are desirable from the point of view of the mechanism designer. Auctions can be viewed as instances of mechanisms. We may be interested in designing auctions that, for example, maximize the revenue of a seller or encourage truthful behavior on the part of buyers. Computer scientists have become interested in algorithmic mechanism design, where the focus is on designing mechanisms with good properties that can be implemented in polynomial time. More recently, there has been work on designing extremely simple mechanisms that may not be optimal but perform well in equilibrium, in some appropriate sense.

Combining learning theory with game theory: Online learning theory provides powerful models (such as the famous multi-armed bandit problem) for reasoning about processes that adapt their behavior based on past observations. New research questions emerge when one places online learning in a game-theoretic context. For example, what can be said about the dynamics of systems composed of multiple learners interacting in either a competitive or a cooperative environment? If the learner's observations depend partly on data provided by selfish users, can we design learning algorithms whose behavior is aligned with the users' incentives, and what impact does this have on convergence rates? How should one design algorithms for settings where the learner cannot take actions directly, but instead depends on myopic agents who can be encouraged via incentive payments to explore the space of alternatives, as when an online retailer attempts to learn the quality of products by eliciting consumer ratings?

Adding computation and language issues to game theory: We consider models of game theory that explicitly charge for computation and take seriously the language used by agents. This turns out to have a major impact on solution concepts like Nash equilibrium. This approach seems to capture important intuitions and can deal with a number of well-known problematic examples. Moreover, there are deep connections between computational games and cryptographic protocol security.
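The multi-armed bandit setting mentioned above admits a very short illustration. The sketch below implements epsilon-greedy learning, one of the simplest bandit strategies; the arm success probabilities (unknown to the learner) and the exploration rate are arbitrary choices.

```python
import random
random.seed(42)

# Two-armed Bernoulli bandit; arm 1 is the better arm.
p_true = [0.3, 0.6]
counts = [0, 0]
values = [0.0, 0.0]                  # running estimate of each arm's mean
eps, T = 0.1, 5000
best_pulls = 0

for _ in range(T):
    if random.random() < eps:
        arm = random.randrange(2)                    # explore
    else:
        arm = 0 if values[0] >= values[1] else 1     # exploit
    reward = 1.0 if random.random() < p_true[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]
    best_pulls += arm                                # arm index 1 is best

best_fraction = best_pulls / T       # share of pulls on the better arm
```

Once selfish agents generate the observed rewards, even a simple loop like this raises the incentive questions described above.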

Lee Segel, an eminent applied mathematician and a founder of modern mathematical biology, observed that “mathematical biology sounds like a narrow specialty, but in fact it encompasses all of biology and most of mathematics”. Much current research is driven by the flood of “big data” (e.g., next-generation genomic and whole-transcriptome sequencing, dozens of environmental variables measured globally in nearly continuous time) and “small data” (e.g., individual ion channels in nerve cells, motion tracking of insect wings) that open new phenomena to quantitative modeling, and by the increasing challenges of biosphere sustainability. Mathematics plays an important role at all levels of biological organization. Research areas represented at CAM include: identifying the structure and parameters of biochemical and genetic networks; identifying genes associated with complex diseases and differences among individuals; how insects walk, hover, and turn in flight; the biomechanics of muscular tissue and the effects of muscular degeneration; how infectious diseases spread through contact networks; how rapid evolution affects population and ecosystem dynamics; and finding improved solutions to conservation, environmental, and sustainability problems such as nature reserve design, invasive species management, pest and disease management in agriculture, environmental remediation, and carbon sequestration.
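For instance, the spread of an infectious disease can be sketched with the classic SIR compartment model, here under homogeneous mixing rather than an explicit contact network; the parameter values are hypothetical.

```python
# SIR model: dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I,
# dR/dt = gamma*I, integrated by forward Euler.
beta, gamma = 0.3, 0.1          # basic reproduction number R0 = 3
S, I, R = 0.999, 0.001, 0.0     # susceptible/infected/recovered fractions
dt, steps = 0.1, 2000

peak_I = I
for _ in range(steps):
    new_S = S - dt * beta * S * I
    new_I = I + dt * (beta * S * I - gamma * I)
    new_R = R + dt * gamma * I
    S, I, R = new_S, new_I, new_R
    peak_I = max(peak_I, I)

final_size = R                  # fraction of the population ever infected
```

With R0 = 3 the epidemic infects roughly 94% of the population and peaks with about 30% simultaneously infected; network structure, which CAM researchers study explicitly, changes both numbers.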

From the perspective of an applied mathematician, fluid mechanics encompasses a wealth of interesting problems. Fluid motion is governed by the Navier-Stokes equations; the apparent simplicity of these differential equations belies the range of fascinating phenomena that emerge in the motion of liquids and gases. Understanding problems in such disparate application areas as groundwater hydrology, combustion mechanics, ocean mixing, animal swimming or flight, or surface-tension-driven motion hinges on a deeper exploration of fluid mechanics. Attempts to understand fluid motion from a theoretical perspective lead to mathematical questions involving numerical analysis, dynamical systems, stochastic processes, and computational methods -- classically rich territory for the applied mathematician. CAM offers opportunities to work in many areas of fluids with researchers whose interests range throughout the engineering disciplines.

Dynamical systems theory studies models of how things change over time. Even the simplest nonlinear dynamical systems can generate phenomena of bewildering complexity. Because formulas that describe the long-time behavior of a system seldom exist, we rely on computer simulation to show how initial conditions determine the evolution of particular systems. Simulations of different systems sometimes display common patterns. One of the main goals of dynamical systems theory is to discover these patterns and characterize their properties. The theory can then be used to describe and interpret the dynamics of specific systems. It can also be used as the foundation for numerical algorithms that seek to analyze system behavior in ways that go beyond simulation.

Topics studied in dynamical systems span the natural and social sciences as well as many engineering disciplines: the vibration of molecules, the motion of planets, and the study of complex systems like the human brain or the power grid, to name a few. There is also a branch of the subject concerned with more mathematically pure problems, such as analyzing the iteration of quadratic functions. In recent years, CAM faculty and students have used dynamical systems to study the oscillations of bubbles, the flight of mosquitoes, the emergence of cooperation, and rapid evolution in predator-prey systems, among other phenomena. Cornell has a long history as a center for research in the subject, and CAM has been a focal point for that research. CAM faculty who regularly teach dynamical systems courses and advise students doing research in the subject include Steve Ellner, John Guckenheimer, John Hubbard, Richard Rand, James Sethna, John Smillie, Paul Steen, Steve Strogatz, Alex Vladimirsky, and Jane Wang. Many students have taken advantage of the interdisciplinary opportunities provided by CAM to engage in research that connects theory, simulation, and experiment.
Another significant research area is bifurcation theory, which studies how a system's dynamics change as its parameters are varied.
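The iteration of quadratic functions mentioned above, in the guise of the logistic map, gives a quick taste of both regular and chaotic dynamics. The parameter values below are standard illustrative choices: as r varies, the attractor bifurcates from a periodic cycle to chaos.

```python
def orbit(r, x0=0.2, burn=1000, keep=64):
    # Iterate the logistic map x -> r*x*(1-x), discard a transient,
    # and return the subsequent points of the orbit.
    x = x0
    for _ in range(burn):
        x = r * x * (1.0 - x)
    out = []
    for _ in range(keep):
        x = r * x * (1.0 - x)
        out.append(x)
    return out

# r = 3.2: the attractor is a period-2 cycle; r = 3.9: chaos.
periodic = orbit(3.2)
chaotic = orbit(3.9)
n_states_periodic = len({round(x, 6) for x in periodic})
n_states_chaotic = len({round(x, 6) for x in chaotic})
```

Tracking how the number of visited states changes with r is a crude numerical probe of exactly the bifurcations that bifurcation theory studies analytically.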

Combinatorics is the study of discrete objects and their configurations. It has connections to a variety of areas in pure mathematics, particularly algebra, geometry, and topology, and has applications including the design of codes and circuits, modeling computation, and designing and analyzing algorithms for navigation. Researchers at CAM work on topics such as studying incentives in online systems, designing algorithms to find near-optimal solutions to hard discrete optimization problems, determining good ways to send information through networks, and organizing bike-sharing systems in cities.
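One classic example of finding a near-optimal solution to a hard discrete problem is the maximal-matching heuristic for minimum vertex cover, which always returns a cover at most twice the optimal size. The example graph below is a hypothetical illustration.

```python
def vertex_cover_2approx(edges):
    # Greedy maximal matching: take both endpoints of each edge that
    # is not yet covered. A classic 2-approximation for vertex cover.
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

# Star graph on 6 vertices: the optimal cover is {0}, of size 1.
edges = [(0, i) for i in range(1, 6)]
cover = vertex_cover_2approx(edges)
valid = all(u in cover or v in cover for u, v in edges)
```

Here the heuristic returns two vertices against an optimum of one, exactly at its factor-2 guarantee; proving such guarantees is where the combinatorics lives.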

Some of the most fascinating problems in applied mathematics today concern the structure and dynamics of systems composed of many interacting parts. Think of the thousands of power plants in the electrical grid or the companies in the global economy; the billions of people on Facebook; or the tens of billions of neurons in the human brain. In each of these complex systems, the interactions between individual parts define a network. Faculty and students at CAM are using the tools of graph theory, statistical physics, machine learning, probability theory, and dynamical systems to investigate the complex architecture and collective behavior of the diverse networked systems in the world around us.

Artificial intelligence heralds the promise of computers that can exhibit human-like behavior, and it has progressed steadily by mastering a series of increasingly challenging benchmark tasks. In sharp contrast to the state of the field twenty years ago, today there are automated tools that perform facial recognition from a corpus of images, play world-championship chess, and even win at Jeopardy. One subfield has made especially remarkable advances: machine learning, in which the algorithms used for these tasks are “trained” by their experience in handling a series of inputs for the task at hand. Machine learning algorithms rely on sophisticated mathematics that ties high-dimensional statistics to non-convex optimization. Cornell is an international leader in machine learning and AI research. CAM facilitates collaborations in this inherently interdisciplinary field, which connects computer science, the mathematical and physical sciences, and engineering.
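A minimal example of "training by experience" is logistic regression fitted by gradient descent on the log loss. The tiny 1-D dataset below is hypothetical and linearly separable, so the trained model should classify it perfectly.

```python
import math

# Toy dataset: (input, label) pairs.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

w, b, eta = 0.0, 0.0, 0.5
for _ in range(2000):
    gw = gb = 0.0
    for x, y in data:
        p = 1.0 / (1.0 + math.exp(-(w * x + b)))   # predicted probability
        gw += (p - y) * x                          # gradient of the log loss
        gb += (p - y)
    w -= eta * gw / len(data)
    b -= eta * gb / len(data)

preds = [1 if 1.0 / (1.0 + math.exp(-(w * x + b))) > 0.5 else 0
         for x, _ in data]
```

Modern systems replace the single weight with millions of parameters and the log loss with task-specific objectives, but the training loop has the same shape.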

Mathematical models of natural phenomena often take the form of nonlinear partial differential equations (PDEs) and/or minimization problems. Their rigorous treatment is the historical root for the entire field of mathematical analysis. However, applied analysis has the distinctive feature that it develops not simply for its own sake, but with an eye toward finding solutions to concrete problems. Moreover, this relationship is symbiotic: mathematical models motivate general analytic techniques, while the analysis itself informs the modeling and computational experiments. For example, a sharp existence theorem can determine the ability of a model to predict an observed phenomenon. Or, by knowing the specific type of discontinuities present in the solution of a differential equation, one can often build a hybrid (analytic/numerical) approximation that is more accurate and efficient than what would result from a naïve/standard discretization. Employing various techniques from nonlinear functional analysis and PDEs, the calculus of variations, bifurcation theory, probability theory, stochastic processes, geometric group theory, numerical analysis and computational science, current research at Cornell in applied analysis and PDEs includes problems from nonlinear elasticity and thin structures, mechanics of materials, mathematical aspects of materials science, homogenization theory, optimal control and differential games, seismic imaging and inverse problems, heat diffusion on manifolds, condensed matter physics, and nano-scale electronic systems.

Applied algebra research at Cornell comes in several flavors. For computational scientists and engineers, numerical linear algebra is frequently an "inner loop bottleneck" that requires great ingenuity to overcome. The matrices are typically large and highly structured, especially if they arise from a discretized partial differential equation or an optimization problem. Despite advances in high-performance computing, algorithmic insights still rule the day, making linear equation solving and eigenvalue computation vibrant research areas. Information science applications are increasingly driving the field. For example, the PageRank algorithm involves solving an eigenvalue problem. Large datasets are sometimes assembled in multi-way arrays called tensors. Finding patterns in such a structure is a particularly important “big data” challenge for researchers in applied algebra. Abstract algebra is no less applied, with many timely problems springing up in computer science, operations research, and mathematics. The development of computer algebra systems has revolutionized the field, making it possible for researchers to tackle problems that were considered intractable just a short time ago.
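As noted above, PageRank amounts to an eigenvalue problem: the ranking vector is the dominant eigenvector of a stochastic matrix, which can be approximated by power iteration. Below is a sketch on a hypothetical four-page web; the link structure and damping factor are chosen for illustration.

```python
# links[i] lists the pages that page i links to. Page 2 receives
# links from three pages, so it should come out on top.
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
n, d = 4, 0.85                  # d is the usual damping factor

rank = [1.0 / n] * n
for _ in range(100):            # power iteration on the PageRank matrix
    new = [(1.0 - d) / n] * n
    for i, outs in links.items():
        share = d * rank[i] / len(outs)
        for j in outs:
            new[j] += share     # page i passes rank to its out-links
    rank = new

top_page = max(range(n), key=lambda i: rank[i])
```

On the real web the matrix has billions of rows, which is precisely why the structure-exploiting iterative methods of numerical linear algebra are indispensable.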

"Complexity" refers to the study of the efficiency of computer algorithms. In this a problem is specified, with the ultimate goals being to determine the greatest efficiency possible for algorithms designed to solve instances of the problem, and to design an algorithm possessing the optimal efficiency. For example, consider the problem of multiplying two square real-valued matrices. Here, an instance of the problem consists of two specific matrices. Applying the procedure everyone learns in linear algebra for multiplying matrices, one has an algorithm which multiplies two n-by-n matrices using n3 scalar multiplications, along with slightly fewer scalar additions, for a total number of arithmetic operations proportional to n3. However, this is not optimal for the problem, as algorithms have been devised for which the total number of arithmetic operations is at most proportional to np where p = 2.376. Regarding lower bounds, all that is known is that no algorithm for multiplying two square matrices can do so using fewer than n2 arithmetic operations in general. We thus say that the computational complexity of matrix multiplication lies somewhere between n2 and np for p = 2.376. A full formalization of complexity requires there be an underlying mathematical model of a computer, so that definitions and proofs can possess complete rigor. There are various mathematical models of computers used in complexity, the most prevalent being the Turing machine, but also important is the real number machine, which is the model that best fits the spirit of the example above. Complexity that is fully formalized is known as complexity theory. Complexity is widely done in science and engineering, even if not complexity theory. 
Indeed, it is standard when presenting a new algorithm — even algorithms aimed squarely at applications — to argue the superiority of the algorithm by bounding its running time as a function of key problem parameters (e.g., np for p = 2.376), and showing the bound beats the bounds previously established for competitor algorithms. Referring to an algorithm's "complexity" has thusly become commonplace.
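The n^3 operation count for the classical method can be checked mechanically. The sketch below multiplies two matrices while counting scalar multiplications; the matrices themselves are arbitrary examples.

```python
def matmul(A, B):
    # Classical three-loop matrix multiplication, counting the
    # scalar multiplications it performs.
    n = len(A)
    mults = 0
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
                mults += 1
    return C, mults

n = 4
I = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
A = [[float(i + j) for j in range(n)] for i in range(n)]
C, mults = matmul(A, I)          # multiplying by I returns A itself
```

Asymptotically faster schemes such as Strassen's trade some of these multiplications for extra additions by exploiting algebraic identities.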
