# Leonard Schulman

**discipline** Computer science & informatics

### Research project

Topic 1: Algorithms and Information Theory for Causal Inference

Can people draw rational conclusions from data in the absence of well-controlled scientific experiments that differentiate causal from correlative relationships? This question is of profound significance for society. Conceptually, it raises the exciting prospect of extending the scientific method, rigorously, beyond the traditional framework of controlled or natural experiments.

Our starting point in this work is the framework of structured causal (graphical) models in probability theory. Historically, causality has been teased apart from correlation through controlled experiments. Our goal is to enable statistically sound decision-making even in certain situations where controlled experiments are impossible and natural experiments unavailable.

There are many reasons that we must often make do with passive observation. These include ethical considerations, especially with regard to human subjects; governance constraints; and the uniqueness of the system and the impossibility of re-running history, as in the case of climate.

Fortunately, there is an alternative. The groundbreaking work of J. Pearl and others in the 1990s provides a rigorous and general mathematical framework for inferring causation from a combination of data and assumptions. Despite abundant work since then, the approach has so far had little effect on practice. I argue that this is because of limitations in the theory, some of which I hope to address with the proposed research.

The framework is known as "structured causal models" and rests upon the mathematical formalism of directed graphical models. The idea is that each vertex of a directed graph is associated with a variable of interest: it may be a choice of action, an outcome, or an environmental variable. By physical considerations one hopes to show that certain of these variables cannot influence others; the corresponding directed edges may then be omitted from the graph. The existing theory allows us to determine whether it is possible to establish, from passive observation alone, whether a given intervention variable can have a statistical effect on an outcome. However, the theory is limited by its fragility. First, very few graphs, and generally only very sparse ones, admit causal identifiability. Second, the verification required regarding the absence of an edge is absolute: even the slightest possible effect requires the edge to be retained.
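To make the identifiability idea concrete, here is a minimal numerical sketch (not from the text; all variables, graph structure, and probabilities are illustrative). In the three-vertex graph Z → X, Z → Y, X → Y with Z fully observed, the interventional distribution P(Y=1 | do(X=x)) is identified from passive observation by the classical back-door adjustment, and generally differs from the naive observational conditional P(Y=1 | X=x) because Z confounds X and Y:

```python
# Toy model on binary variables with graph  Z -> X, Z -> Y, X -> Y.
# Back-door adjustment identifies the causal effect of X on Y from
# observational probabilities alone:
#     P(Y=1 | do(X=x)) = sum_z P(z) * P(Y=1 | x, z).
# The numbers below are made up for illustration.

p_z = 0.3                      # P(Z=1)
p_x_given_z = [0.2, 0.8]       # P(X=1 | Z=z), indexed by z
p_y_given_xz = [[0.1, 0.5],    # P(Y=1 | X=x, Z=z), indexed [x][z]
                [0.4, 0.9]]

def p_y_do_x(x):
    """Interventional P(Y=1 | do(X=x)) via back-door adjustment over Z."""
    return sum(pz * p_y_given_xz[x][z]
               for z, pz in enumerate([1 - p_z, p_z]))

def p_y_cond_x(x):
    """Observational P(Y=1 | X=x); differs because Z confounds X and Y."""
    num = den = 0.0
    for z, pz in enumerate([1 - p_z, p_z]):
        px = p_x_given_z[z] if x == 1 else 1 - p_x_given_z[z]
        num += pz * px * p_y_given_xz[x][z]
        den += pz * px
    return num / den

print(p_y_do_x(1), p_y_cond_x(1))   # the two quantities do not coincide
```

The gap between the two printed numbers is exactly the confounding bias that adjustment removes. The fragility discussed above enters when an edge such as Z → Y is wrongly assumed absent: the adjustment formula then silently identifies the wrong quantity, with no error bars to signal the failure.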

I propose to: (I) Develop more robust identifiability theorems, which will relax the model verification requirements in exchange for "error bars" on the identified causal effect. (II) Study whether causal inference is well-conditioned with respect to perturbations that respect the underlying semi-Markovian graphical model. (III) Develop an algorithm to bound the condition number of a given semi-Markovian model. (IV) Develop new algorithms which, for a suitable class of graphs, require as input only the probabilities of relatively likely events. This would remedy the current need for extremely extensive and accurate statistics as input to the identification procedure.

Topic 2: Coding Theory and High-Dimensional Combinatorics

I have a long-standing interest in coding theory, beginning with my proof of the interactive coding theorem and my invention and proof of existence of tree codes. In 2014 I proposed (with Cris Moore) a new approach to the explicit construction of error-correcting codes, which in particular would be powerful enough to give tree codes if certain results in analytic number theory hold. These results (bounds on exponential sums) appear to be far beyond the current state of the art in number theory. However, quite a few variations of the construction are possible, and for some of them I will seek collaboration with algebraists interested in explicit constructions. For this purpose Jerusalem is the perfect place for me to spend a year, due in particular to the presence of Alex Lubotzky, who has made great contributions in this vein, as well as mathematicians with related interests such as Nati Linial and Gil Kalai.

Relatedly, I have become interested in what have recently been called analog error-correcting codes: these are linear subspaces based on the foundational Dvoretzky theorem in analysis, which in its essence is quite similar to the Shannon coding theorem. I am interested in developing explicit constructions of these objects (only randomized constructions are known) and, further, a "real-time" version which would enable, for power-limited transmissions, an analogue of the interactive coding theorems mentioned above. This raises rather difficult questions in mathematical analysis and is at the same time closely related to problems in signal processing and control theory.
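The flavor of the known randomized constructions can be sketched in a few lines (a toy illustration, not the actual constructions under study; dimensions, the Gaussian encoder, and the noise level are all assumptions). A message in R^k is embedded into R^n via a random linear map that is nearly an isometry with high probability, and a noisy received word is decoded by least squares:

```python
import numpy as np

# Toy randomized "analog" code: encode x in R^k into R^n (n > k) with a
# random near-isometric linear map, add channel noise, decode by least
# squares. Only randomized constructions of this kind are known; the
# open problem discussed above is to make such constructions explicit.

rng = np.random.default_rng(42)
k, n = 8, 64
G = rng.standard_normal((n, k)) / np.sqrt(n)   # random encoder, nearly an isometry

x = rng.standard_normal(k)                     # message
noise = 0.01 * rng.standard_normal(n)          # power-limited channel noise
y = G @ x + noise                              # received word

x_hat, *_ = np.linalg.lstsq(G, y, rcond=None)  # least-squares decoding
print(np.linalg.norm(x_hat - x))               # small reconstruction error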

### Biography

Leonard J. Schulman received the B.Sc. in Mathematics in 1988 and the Ph.D. in Applied Mathematics in 1992, both from the Massachusetts Institute of Technology. Since 2000 he has been on the faculty of the California Institute of Technology. He has also held appointments at UC Berkeley, the Weizmann Institute of Science, and the Georgia Institute of Technology. He is the director of the Caltech Center for the Mathematics of Information, is on the faculty of the Institute for Quantum Information and Matter, and serves as Editor-in-Chief of the SIAM Journal on Computing. His research is in several overlapping areas: algorithms and communication protocols; combinatorics and probability; coding and information theory; quantum computation.

### Selected publications

“Stability of Causal Inference,” Leonard J. Schulman and Piyush Srivastava. Proc. UAI, 2016, to appear. Plenary presentation.

“The Adversarial Noise Threshold for Distributed Protocols,” William M. Hoza and Leonard J. Schulman. ArXiv 1412.8097. Proc. 27th SODA 240-258, 2016.

“Allocation of Divisible Goods under Lexicographic Preferences,” Leonard J. Schulman and Vijay V. Vazirani. ArXiv 1206.4366. Proc. FSTTCS 543-559, 2015.

“Symbolic integration and the complexity of computing averages,” Leonard J. Schulman, Alistair Sinclair and Piyush Srivastava. Proc. 56th FOCS 1231-1245, 2015.

“Analysis of a classical matrix preconditioning algorithm,” Leonard J. Schulman and Alistair Sinclair. ArXiv 1504.03026. Proc. 47th STOC 831-840, 2015.
