Research
Statistics is the science of data. Estimation facilitates learning from data. Quantifying, controlling and describing uncertainty provides a basis for inference, modeling and decision making that is central to all theoretical and applied sciences. Quantitative analysis and statistical reasoning have become ubiquitous, and the prevalence of "big data" is rapidly changing the way we study and view our world. CAM's statistics research programs span from mathematical statistics, computational statistics, and machine learning to the development of statistical methods for astrophysics, ecology, economics, emergency medical services, financial modeling, genomics, high dimensional and functional data, neurobiology, risk management and spatio-temporal data.
Images and other sensing modalities are increasingly key to science, medicine, engineering, and many other fields, and hence computational methods for processing and extracting information from such sensor data are of critical importance. Extracting useful information from raw, noisy data involves a wide range of mathematical techniques including inverse problems, optimization, modeling and prediction, discrete algorithms, and methods for high-level image understanding. In the Center for Applied Mathematics, researchers are creating new algorithms for these and other problems, solidly grounded in principled theory, and using these algorithms in a range of applications, from studying bird and insect flight, to reconstructing volume data from medical scans, to automatically reconstructing 3D geometry from millions of 2D photos on the Internet.
Scientific computing can be thought of as the application of high-performance numerical algorithms to large-scale computational problems arising in science and engineering, and is therefore ubiquitous in the work of applied mathematicians at Cornell. Numerical analysis is the development of such methods as well as the study of their accuracy, stability, and complexity, and hence is more specialized. Typical problem classes studied in numerical analysis include ordinary and partial differential equations, continuous optimization problems, and the linear algebra tasks that underlie these and other computations. All of these topics are being investigated in various settings by CAM researchers. The range of scientific computing at Cornell can be gauged from the homepages of the faculty of the graduate field of computational science and engineering at http://www.cse.cornell.edu/people.html. Examples include the study of solids and structures under uncertainty; the role of rapid evolution in the dynamics of food webs in evolutionary biology; simulation optimization in call center staffing and ambulance deployment; understanding turbulent and reactive flows in combustion; investigation of material structure across multiple length and time scales; and the study of complex systems applied to insect flight.
Probability is the study of randomness and uncertainty. The field of stochastic processes deals with randomness as it develops dynamically, and it can be thought of as the study of collections of related, uncertain events. Research in this area finds laws governing randomness; familiar examples include the law of large numbers and the central limit theorem. Some research in probability is highly theoretical, and is connected to a number of areas of mathematics including functional analysis, measure theory and partial differential equations. Some research in probability and stochastic processes involves modeling of real systems such as service/queuing systems, financial phenomena, networks, manufacturing and other physical systems. There is significant interaction between the research being done in probability and stochastic processes, and that in other research areas of CAM.
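The law of large numbers mentioned above can be seen directly in simulation. The following sketch (illustrative only; the function name and sample size are not from the source) flips a fair coin many times and shows the empirical frequency of heads settling near 1/2:

```python
import random

def heads_frequency(n_flips, seed=0):
    """Simulate n_flips fair coin flips and return the fraction of heads.

    By the law of large numbers, this fraction converges to 1/2 as
    n_flips grows; the seed is fixed only to make the sketch repeatable.
    """
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips
```

For example, `heads_frequency(100_000)` lands within about a percent of 0.5, while very small samples fluctuate much more widely.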
Optimization means choosing the best among a set of options. It arises in engineering when trying to choose the best among different system designs or courses of action, and in the physical sciences when trying to predict how nature will behave. It also arises in statistics, when trying to describe reality in a way that best fits available data. Optimization becomes challenging, and mathematically interesting, when the number of options is too large to allow evaluating each one individually. In such situations, optimization methods can use a problem's structure to quickly find the best option without evaluating each one. Challenges in optimization also arise when options cannot be evaluated perfectly, or when evaluating an option takes a long time. Researchers within CAM consider many different kinds of optimization problems, each with its own special structure and applications, including convex optimization, combinatorial optimization, continuous optimization, optimization via simulation, and global optimization of expensive functions.
At Cornell, we apply the sophisticated mathematics of probability theory, quantum theory, nonlinear dynamics and bifurcation theory, and differential geometry to wide-ranging problems of interest in physics and complex systems. These techniques have been central to our understanding of the synchronization of fireflies, infectious disease dynamics in complex populations, algorithms for 3D X-ray imaging of cellular structures, categorization of topological quantum materials, and the effectiveness of simple models in describing our complex world.
Mathematical Finance is the field of mathematics that studies financial markets. Topics studied include market trading mechanisms, called market microstructure; corporate management decision making, called corporate finance; investment management; and derivative securities. In each of these areas, sophisticated mathematics is utilized for modeling purposes. The theory of stochastic processes, stochastic optimization, partial differential equations, and simulation methods are just some of the mathematical tools employed. For example, in the area of derivatives, stochastic calculus is used to price a call option on a common stock. A call option is a financial security that gives its owner the right to buy a common stock at a fixed price on or before a fixed future date. Using stochastic calculus, the price of a call option can be characterized as the expected value of a nonlinear and random payoff at a future date. Numerical methods, such as Monte Carlo simulation, are often used to compute these expected values.
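As a concrete illustration of pricing a call option as a discounted expected payoff, the sketch below prices a European call by Monte Carlo under the standard Black-Scholes assumption that the stock follows geometric Brownian motion. The function name and all parameter values are illustrative, not from the source:

```python
import math
import random

def mc_call_price(s0, strike, rate, vol, maturity, n_paths=100_000, seed=42):
    """Monte Carlo price of a European call under geometric Brownian motion.

    The terminal stock price is S_T = S0 * exp((r - vol^2/2)*T + vol*sqrt(T)*Z)
    with Z standard normal, and the option price is the discounted expected
    payoff E[max(S_T - K, 0)], estimated here by averaging simulated paths.
    """
    rng = random.Random(seed)
    drift = (rate - 0.5 * vol ** 2) * maturity
    diffusion = vol * math.sqrt(maturity)
    payoff_sum = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp(drift + diffusion * z)
        payoff_sum += max(s_t - strike, 0.0)
    return math.exp(-rate * maturity) * payoff_sum / n_paths
```

With illustrative inputs such as `mc_call_price(100, 100, 0.05, 0.2, 1.0)`, the estimate falls close to the Black-Scholes closed-form value of roughly 10.45, with sampling error shrinking as the number of paths grows.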
Mechanism design is the problem of designing a game so as to guarantee that players playing rationally will produce outcomes that are desirable from the point of view of the mechanism designer. Auctions can be viewed as instances of mechanisms. We may be interested in designing auctions that, for example, maximize the revenue of a seller or encourage truthful behavior on the part of buyers. Computer scientists have become interested in algorithmic mechanism design, where the focus is on designing mechanisms with good properties that can be implemented in polynomial time. More recently, there has been work on designing extremely simple mechanisms that may not be optimal but perform well (in some appropriate sense) in equilibrium.

Combining learning theory with game theory: Online learning theory provides powerful models (such as the famous multi-armed bandit problem) for reasoning about processes that adapt their behavior based on past observations. New research questions emerge when one places online learning in a game-theoretic context. For example, what can be said about the dynamics of systems composed of multiple learners interacting in either a competitive or a cooperative environment? If the learner's observations depend partly on data provided by selfish users, can we design learning algorithms whose behavior is aligned with the users' incentives, and what impact does this have on convergence rates? How should one design algorithms for settings where the learner cannot take actions directly, but instead depends on myopic agents who can be encouraged via incentive payments to explore the space of alternatives, as when an online retailer attempts to learn the quality of products by eliciting consumer ratings?

Adding computation and language issues to game theory: We consider models of game theory that explicitly charge for computation and take seriously the language used by agents. This turns out to have a major impact on solution concepts like Nash equilibrium.
The approach seems to capture important intuitions, and can deal with a number of well-known problematic examples. Moreover, there are deep connections between this approach and cryptographic protocol security. Thus, thinking in terms of computational games can lead to insights and new approaches in security.
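The multi-armed bandit model mentioned above can be made concrete with a simple epsilon-greedy strategy: explore a random arm occasionally, otherwise exploit the arm with the best empirical mean so far. This is a standard textbook algorithm, sketched here with hypothetical arm rewards and parameters not taken from the source:

```python
import random

def epsilon_greedy(arm_means, n_rounds=10_000, epsilon=0.1, seed=1):
    """Play a Bernoulli multi-armed bandit with an epsilon-greedy policy.

    With probability epsilon, explore a uniformly random arm; otherwise
    exploit the arm with the highest empirical mean reward observed so far
    (unpulled arms are tried first). Returns the per-arm pull counts.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    pulls = [0] * n_arms
    rewards = [0.0] * n_arms
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms),
                      key=lambda a: rewards[a] / pulls[a] if pulls[a] else float("inf"))
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        pulls[arm] += 1
        rewards[arm] += reward
    return pulls
```

Run on two hypothetical arms with success probabilities 0.2 and 0.8, the learner concentrates the vast majority of its pulls on the better arm while still exploring the worse one occasionally.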
Lee Segel, an eminent applied mathematician and a founder of modern mathematical biology, observed that “mathematical biology sounds like a narrow specialty, but in fact it encompasses all of biology and most of mathematics”. Much current research is driven by the flood of “big data” (next-generation genomic and whole-transcriptome sequencing, dozens of environmental variables measured globally in nearly continuous time) and “small data” (e.g., individual ion channels in nerve cells, motion-tracking of insect wings) that open new phenomena to quantitative modeling, and by the increasing challenges of biosphere sustainability. Mathematics plays an important role at all levels of biological organization and the full range is represented in CAM, including: identifying the structure and parameters of biochemical and genetic networks; identifying genes associated with complex diseases and differences among individuals; how insects walk, hover, and turn in flight; the biomechanics of muscular tissue and the effects of muscular degeneration; how infectious diseases spread on contact networks; how rapid evolution affects population and ecosystem dynamics; and finding improved solutions to conservation, environmental and sustainability problems such as nature reserve design, invasive species management, pest and disease management in agriculture, environmental remediation and carbon sequestration.
From the perspective of an applied mathematician, fluid mechanics encompasses a wealth of interesting problems. Fluid motion is governed by the Navier-Stokes equations; the apparent simplicity of these differential equations belies the range of fascinating phenomena that emerge in the motion of liquids and gases. Understanding problems in such disparate application areas as groundwater hydrology, combustion mechanics, ocean mixing, animal swimming or flight, or surface tension driven motion, hinges on a deeper exploration of fluid mechanics. Attempts to understand fluid motion from a theoretical perspective lead to mathematical questions involving numerical analysis, dynamical systems, stochastic processes and computational methods. This is classically rich territory for the applied mathematician and CAM offers opportunities to work in many areas of fluids with researchers whose interests range throughout the engineering disciplines.
Dynamical systems problems range from the vibrations of molecules to planetary motion and span the breadth of the physical, biological and social sciences as well as engineering. Research in the subject stretches from investigation of realistic models of complex systems like the brain and the power grid to mathematically rigorous investigations of highly abstract systems such as the iteration of quadratic functions. In the past few years, CAM faculty and students have used dynamical systems to study the oscillations of bubbles, the flight of mosquitoes, the emergence of cooperation, and rapid evolution in predator-prey systems, among other phenomena. Cornell has a long history as a center for research in the subject, and CAM has been a focal point for that research. CAM faculty who regularly teach dynamical systems courses and serve as advisors of students doing research in the subject are Steve Ellner, John Guckenheimer, John Hubbard, Richard Rand, James Sethna, John Smillie, Paul Steen, Steve Strogatz, Alex Vladimirsky and Jane Wang. Many students have taken advantage of the interdisciplinary opportunities provided by CAM to engage in research that connects experiment, theory and simulation.

Dynamical systems theory studies models of how things change in time. Even the simplest nonlinear dynamical systems can generate phenomena of bewildering complexity. Because formulas that describe the long-time behavior of a system seldom exist, we rely on computer simulation to show how initial conditions evolve for particular systems. Simulations with many different systems display common patterns that have been observed repeatedly. One of the main goals of dynamical systems theory is to discover these patterns and characterize their properties. The theory can then be used to describe and interpret the dynamics of specific systems. It can also be used as the foundation for numerical algorithms that seek to analyze system behavior in ways that go beyond simulation.
Throughout the theory, dependence of dynamical behavior upon system parameters has been an important topic. Bifurcation theory is the part of dynamical systems theory that systematically studies how systems change with varying parameters.
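A classic example of this parameter dependence is the iteration of a quadratic function, the logistic map x -> r*x*(1-x). The sketch below (illustrative; the function name, initial condition, and iteration counts are assumptions) simulates the long-run orbit at different parameter values, where a fixed point at one value of r gives way to a period-2 oscillation at another, the simplest bifurcation in this family:

```python
def logistic_orbit(r, x0=0.2, n_transient=500, n_keep=8):
    """Iterate the logistic map x -> r*x*(1-x) from x0.

    Discards n_transient iterates so the orbit settles onto its long-run
    behavior, then returns the next n_keep iterates (rounded for easy
    comparison). A fixed point shows up as one repeated value, a period-2
    cycle as two alternating values.
    """
    x = x0
    for _ in range(n_transient):
        x = r * x * (1 - x)
    orbit = []
    for _ in range(n_keep):
        x = r * x * (1 - x)
        orbit.append(round(x, 4))
    return orbit
```

At r = 2.5 the orbit settles onto the single fixed point 0.6, while at r = 3.2 it alternates between two values: a qualitative change in long-run behavior produced by varying one parameter.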
Discrete mathematics is a broad subfield of applied mathematics that deals with the topic of enumerating and processing finite sets of objects. It draws on a wide variety of areas of mathematics, including geometry, algebra, and analysis, and in turn has a wide variety of applications from designing codes and circuits, to modeling computation, to algorithms for finding directions in a road network and finding good ways to do viral marketing. Researchers in CAM in this area work on topics such as studying incentives in online systems, designing algorithms to find near-optimal solutions to hard discrete optimization problems, determining good ways to send information through networks, and figuring out how to organize and implement city-wide bike-sharing systems.
Some of the most fascinating problems in applied mathematics today concern the structure and dynamics of systems composed of many interacting parts. Think of the thousands of power plants in the electricity grid or the companies in the global economy; the billions of people on Facebook; or the hundreds of billions of neurons in the human brain. In each of these complex systems, the individual parts interact with a subset of the others, and the pattern of connections between them defines a network. Faculty and students at CAM are using the tools of graph theory, statistical physics, machine learning, probability theory, and dynamical systems to look into the complex architecture and collective behavior of the diverse networked systems in the world around us.
Artificial intelligence heralds the promise of computers that can exhibit human-like behavior and has progressed steadily by mastering a series of increasingly challenging benchmark tasks. In sharp contrast to the state of the field twenty years ago, today there are automated tools to perform facial recognition from a corpus of images, play world-championship chess, and even win at Jeopardy. One particular subfield has made especially remarkable advances, and that is the area of machine learning, in which the algorithms used for these tasks are “trained” by their experience in handling a series of inputs for the task at hand. Machine learning algorithms rely upon sophisticated new mathematics that ties problems in high-dimensional statistics to new approaches to non-convex optimization. Cornell is an international leader in AI research, including machine learning. CAM facilitates collaborations in this inherently interdisciplinary field, which connects computer science, the mathematical and physical sciences, and engineering.
Mathematical models of natural phenomena often present themselves in the form of nonlinear partial differential equations (PDEs) and/or minimization problems. Their rigorous treatment is the historical root for the entire field of mathematical analysis. However, applied analysis has the distinctive feature that it develops not simply for its own sake, but with an eye toward finding effective solutions to concrete problems. Moreover, this relationship is symbiotic: mathematical models motivate general analytic techniques, while the analysis itself informs the modeling and computational experiments. For example, a sharp existence theorem can reveal either the adequacy or the inadequacy of a model to predict an observed phenomenon. Or by knowing the specific type of discontinuities present in the solution of a differential equation, one can often build a hybrid (analytic/numerical) approximation that is more accurate and efficient than what would result from a naïve/standard discretization. Employing various techniques of nonlinear functional analysis and PDEs, the calculus of variations, bifurcation theory, probability theory, stochastic processes, geometric group theory, numerical analysis and computational science, current research at Cornell in Applied Analysis and PDEs includes problems from nonlinear elasticity and thin structures, mechanics of materials, mathematical aspects of materials science, homogenization theory, optimal control and differential games, seismic imaging and inverse problems, heat diffusion on manifolds, condensed matter physics, and nano-scale electronic systems.
Applied algebra research at Cornell comes in several flavors. For computational scientists and engineers, numerical linear algebra is frequently an "inner loop bottleneck" that requires great ingenuity to overcome. The matrices are typically large and highly structured, especially if they arise from a discretized partial differential equation or an optimization problem. Despite advances in high performance architectures, algorithmic insights still rule the day, and that explains why research in linear equation solving and eigenvalue computation is such a vibrant area. Increasingly, information science applications are driving the field. For example, the computation of PageRank is an eigenvalue problem. Large datasets are sometimes assembled into high-dimensional arrays called tensors. Finding patterns in such a structure is a particularly important “big data” challenge for researchers in applied algebra. Abstract algebra is no less applied, with many timely problems springing up in computer science, operations research, and mathematics. The development of computer algebra systems has revolutionized the field, making it possible for researchers to tackle problems that were considered intractable just a short time ago.
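The PageRank eigenvalue computation mentioned above is typically carried out by power iteration. The following sketch runs it on a tiny hypothetical link graph (the function name, damping factor, and graph are illustrative, and it assumes every page has at least one outgoing link):

```python
def pagerank(links, damping=0.85, n_iter=100):
    """Power iteration for PageRank on a small link graph.

    links maps each page to the list of pages it links to. The returned
    ranks approximate the dominant eigenvector of the damped link matrix:
    each iteration redistributes rank along links, mixed with a uniform
    "teleportation" term of weight (1 - damping).
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(n_iter):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p, outlinks in links.items():
            share = damping * rank[p] / len(outlinks)
            for q in outlinks:
                new_rank[q] += share
        rank = new_rank
    return rank
```

On a three-page example such as `{"a": ["b"], "b": ["a", "c"], "c": ["a"]}`, the ranks sum to 1 and the most heavily linked-to page ends up with the largest rank, reflecting convergence of the iteration to the dominant eigenvector.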
"Complexity" refers to the study of the efficiency of computer algorithms. In this a problem is specified, with the ultimate goals being to determine the greatest efficiency possible for algorithms designed to solve instances of the problem, and to design an algorithm possessing the optimal efficiency. For example, consider the problem of multiplying two square real matrices. Here, an instance of the problem consists of two specific matrices. Applying the procedure everyone learns in linear algebra for multiplying matrices, one has an algorithm which multiplies two n-by-n matrices using n^3 scalar multiplications, along with slightly fewer scalar additions, for a total number of arithmetic operations proportional to n^3. However, this is not optimal for the problem, as algorithms have been devised for which the total number of arithmetic operations is at most proportional to n^p where p = 2.376. Regarding lower bounds, all that is known is that no algorithm for multiplying two square matrices can do so using fewer than n^2 arithmetic operations in general. We thus say that the computational complexity of matrix multiplication lies somewhere between n^2 and n^p for p = 2.376. A full formalization of complexity requires there be an underlying mathematical model of a computer, so that definitions and proofs can possess complete rigor. There are various mathematical models of computers used in complexity, the most prevalent being the Turing machine, but also important is the real number machine, which is the model that best fits the spirit of the example above. Complexity that is fully formalized is known as complexity theory. Complexity is widely done in science and engineering, even if not complexity theory. 
Indeed, it is standard when presenting a new algorithm — even algorithms aimed squarely at applications — to argue superiority of the algorithm by bounding its running time as a function of key problem parameters (e.g., n^p for p = 2.376), and showing the bound beats the bounds previously established for competitor algorithms. Referring to an algorithm's "complexity" has thus become commonplace.