research statement

Steven J. Miller


Research Statement: Steven J Miller
Steven.J.Miller@williams.edu
May 30, 2008

My main interests are analytic number theory and random matrix theory (especially the distribution of zeros of L-functions and the eigenvalues of random matrix ensembles); I am also studying equidistribution problems in analysis and probability, and working on applied problems in probability, statistics, graph theory, cryptography, sabermetrics and linear programming where the tools and techniques of number theory can successfully be applied. These have led me to study computational and numerical methods, and I have written numerous programs to investigate these problems, ranging from zeros of L-functions to constructing elliptic curves with rank to linear programming problems. Papers and talks are available at http://www.williams.edu/go/math/sjmiller/

1 Research Interests: Summary

Since Riemann’s investigations 150 years ago, zeros of L-functions have been known to be intimately connected to solutions to many problems in number theory. In the last few decades finer properties of the zeros have helped us understand problems such as the observed preponderance of primes congruent to 3 mod 4 over 1 mod 4, as well as the growth of the class number. Random Matrix Theory has become a powerful tool to model the behavior of these zeros, suggesting both the answers and new questions to ask.

My primary interest is in the distribution of zeros near the central point for families of L-functions, especially families of elliptic curves with rank over Q(T). Related to this, I am also investigating constructions for moderate to high rank one-parameter families of elliptic curves, lower order corrections to n-level densities of elliptic curves, and the influence of forced zeros at the central point on the distribution of the first zero above the central point.
Additionally, I am studying the low zeros of Dirichlet characters with square-free modulus (which have applications to how primes are distributed in arithmetic progressions) and Rankin-Selberg convolutions of GL_n and GL_m families of L-functions (which highlight how the behavior of complex families can be understood in terms of the behavior of the building blocks).

I am investigating numerous problems in Random Matrix Theory and Random Graphs, especially ensembles with few degrees of freedom (order N independent matrix elements, instead of order N^2). These provide fascinating windows to see new behavior and have numerous applications (k-regular graphs are used to construct cheap and efficient networks). Along these lines, I am also studying several problems on the boundary of Probability Theory, Number Theory and Analysis, such as proving that the distribution of the first digits of |L(s, f)| near the critical line and of iterates of the 3x + 1 map follows Benford’s Law of digit bias (the first digit is a 1 about 30% of the time). These problems have led to results ranging from the distribution of digits of order statistics to a generalization of the central limit theorem for random variables modulo 1. With some colleagues and students I am extending these results and working on applications (I have been in contact with the Criminal Investigative Division of the IRS, and am organizing a conference on Benford’s law). Using recent strong concentration results I have just proved a conjecture in additive number theory comparing the size of the sumset to that of the difference set, with fascinating behavior at the critical threshold.
I am also interested in several applied projects in Probability, Statistics, Linear Algebra and Cryptography, such as closed-form Bayesian inferences for the multinomial logit model, a binary integer linear programming problem for movie distributors, bounding incomplete multiple exponential sums arising in Computer Science, extreme cases of the Cramer-Rao inequality, modeling baseball games, and determining the security of certain signature schemes in cryptography, as well as studying the propagation of viruses in networks. I am also interested in computational aspects of these problems, writing algorithms to investigate many of these topics, from zeros of elliptic curve L-functions and moments of Dirichlet L-functions over function fields to random matrix theory and graph theory to Bayesian inference and linear programming.

2 Summary of Thesis Results

Following Brumer-Heath-Brown [BH-B], Iwaniec-Luo-Sarnak [ILS], Katz-Sarnak [KaSa1, KaSa2], Rubinstein [Rub] and Silverman [Si], I used the 1- and 2-level densities to study the distribution of low-lying zeros for one-parameter rational families of elliptic curves of rank r over Q(T); these densities are defined by summing test functions at scaled zeros of the L-functions. Katz and Sarnak [KaSa1, KaSa2] predict that, to each family of L-functions F, there is an associated symmetry group G(F) (a classical compact group) which governs the distribution of the low-lying zeros. In other words, the behavior of zeros in a family of L-functions near the central point is well modeled by the behavior of eigenvalues near 1 of a classical compact group. For families of elliptic curves of rank 0 over Q(T), we expect G(F) to be SO(even) if all curves have even functional equation, O if half are even and half odd, and SO(odd) if all are odd. If the family has rank r over Q(T), the densities are trivially modified to take into account the r expected zeros at the central point, and we will still refer to these as O, SO(even), SO(odd).
The 1-level densities for the Unitary and Symplectic groups are distinguishable from those of the three Orthogonal groups for test functions of arbitrarily small support; unfortunately, the three orthogonal groups all agree for functions supported in (−1, 1). The 2-level densities for the orthogonal groups, however, are distinguishable for functions supported in arbitrarily small neighborhoods of the origin. Modulo standard conjectures, for small support I showed the densities agree with Katz and Sarnak’s predictions [Mil1]. The difficulty is that the logarithm of the analytic conductors must be extremely well controlled or oscillatory behavior drowns out the main term. In particular, it is not enough to confine the logarithms of the conductors to lie in [log N^d, log 2N^d] as N → ∞. The conductors are controlled by careful sieving and deriving explicit formulas via Tate’s algorithm. Further, the densities confirm that the curves’ L-functions behave in a manner consistent with having r zeros at the central point, as predicted by the Birch and Swinnerton-Dyer conjecture. By studying the 2-level densities of some constant sign families, we find the first examples of families of elliptic curves where we can distinguish SO(even) from SO(odd) symmetry.

Similar to the GUE universality Rudnick and Sarnak [RS] found in studying n-level correlations of L-functions, our universality follows from the sums of a_t(p)^2 in our families. For non-constant j(T), this follows from a Sato-Tate law proved by Michel [Mic]; however, for many of our families we are able to show this by a direct calculation. The effect of the rank over Q(T) surfaces through sums of a_t(p). Finally, while the n-level densities for these families are universal, potential lower order correction terms have been observed in several families. These family dependent corrections are of size 1/log N; unfortunately, trivial estimation of the errors leads to terms of size log log N / log N.
I have recently completed a detailed analysis for many families [Mil5], where these corrections can be isolated. These corrections show how the arithmetic of the family enters as lower order corrections; in particular, families with and without complex multiplication behave differently. I am currently exploring the relation between these lower order terms and finer properties of the behavior of low zeros for small conductors.

3 Representative Subset of Previous and Current Research

3.1 Influence of Zeros at the Central Point

My main ongoing research project involves the observed repulsion of zeros near the central point by zeros at the central point. By the Birch and Swinnerton-Dyer Conjecture, if an elliptic curve has geometric rank r, its L-function should vanish to order r at the central point, and these curves offer an exciting laboratory to test the conjectures of Random Matrix Theory. I have introduced two ‘natural’ models for the random matrix analogues into the literature, what I call the independent model (where the forced zeros do not interact with the remaining zeros) and the interaction model (where they do) [Mil3]. In my thesis I proved that as the conductors tend to infinity the distribution of zeros agrees with the independent model (which is the same as the interaction model with no forced zeros). The interaction model is related to the classical Bessel kernels of RMT, and gives a very different prediction as to the behavior of the first few zeros when there are forced zeros. My students (at Princeton, AIM and Ohio State) and I wrote code to construct large numbers of elliptic curves and study the effect on the location of the first zero above the central point. Extensive calculations and theoretical modeling are in progress; this project is joint with Eduardo Dueñez, Jon Keating and Nina Snaith (professors, University of Bristol) and their student Duc Khiem Huynh.
Unlike the excess rank investigations, however, as we increase the conductor we see a marked change in the data; specifically, the repulsion decreases. The detailed numerical investigations I have run [Mil3] provide several promising clues as to what the correct theory is. In particular, we observe that the repulsion increases with rank, decreases with the size of the conductor, and all the zeros are shifted by the same amount. This suggests the right model for finite conductors is the interaction model, with parameters a function of the average rank for conductors of a given size. As excess rank has long been observed for finite conductors, this explains how we can have repulsion in rank 0 families over Q(T) (where there are no zeros at the central point to repel other zeros!). Using these observations, we are currently working on finding the correct model for finite conductors. This is similar to Keating and Snaith’s observation [KeSn1, KeSn2] that zeros of L-functions at height T should not be modeled by the infinite scaling limits of matrices, but by N × N matrices with N ≈ log T. I am working on the analogous problem for function fields with Salman Butt (graduate student, University of Texas at Austin) and Chris Hall (postdoc, Michigan).

3.2 Families of L-functions and n-Level Density

Assuming GRH, the non-trivial zeros of any L-function lie on its critical line, and therefore it is possible to investigate the spacing statistics of its normalized zeros. The general philosophy, borne out by many examples [CFKRS, KaSa2, KeSn3, ILS, MS], is that the statistical behavior of eigenvalues of random matrices (resp., random matrix ensembles) is similar to that of the critical zeros of L-functions (resp., families of L-functions).
This is borne out in a number of cases, such as all Dirichlet characters, quadratic Dirichlet characters, L(s, ψ) with ψ a character of the ideal class group of the imaginary quadratic field Q(√−D) (D > 3 square-free and congruent to 3 modulo 4), families of elliptic curves, weight k level N cuspidal newforms, and symmetric powers of GL_2 L-functions; see [FI, Gao, Gü, HR, ILS, KaSa2, Ro, Rub, Yo2].

Eduardo Dueñez (postdoc at the University of Texas at San Antonio) and I have determined the n-level densities for a family on GL_6, φ × sym^2 f, where f varies over all weight k cuspidal forms of full level and φ is a Maass form [DM1]. A folklore conjecture predicted that this family would have symplectic symmetry (because all functional equations are even and there is no corresponding family with odd signs), but in fact it has SO(even) symmetry, showing that the n-level densities are more than just a theory of the sign of the functional equations. We generalized to the Rankin-Selberg convolution of families of GL_n and GL_m L-functions [DM2]. We see a universality similar to that observed in Rudnick-Sarnak [RS] for the n-level correlations; again, the controlling quantity is the second moment of the a_π(p)’s. For nice families of L-functions, we show that we may assign a symmetry constant c_F which is 0 (resp., 1 or −1) if the family F has Unitary (resp., Symplectic or Orthogonal) symmetry, and if F × G is the Rankin-Selberg convolution of the two families, then c_{F×G} = c_F · c_G. We are currently extending our results to as general families as possible.

Independently of Hughes-Rudnick [HR] and in more generality, I have calculated the 1-level density for families of Dirichlet L-functions. Consider the family of primitive Dirichlet characters with square-free conductor in [N, 2N]. Then as N → ∞, the 1-level density agrees with the Katz-Sarnak predictions (Unitary) for test functions supported in (−2, 2).
By assuming some standard conjectures on the error terms of primes in arithmetic progression and building on the work of Goldston and Vaughan [GV], I have extended these results to larger support, and am investigating the implications of assuming larger support on standard conjectures on the distribution of primes in arithmetic progressions.

Chris Hughes (professor, York) and I have extended the results of Iwaniec-Luo-Sarnak [ILS]. In particular, consider the family of weight k cuspidal newforms of prime level N. ILS calculate the 1-level density (splitting the family by sign); in [HuMi] we extended the calculations to the n-level densities for all n. Obstructions emerge, but we are able to calculate for large enough support to see new features, as well as derive a new and significantly more tractable formula for the nth centered moments and improved bounds for high vanishing at the central point. The difficulty here is one that plagues the subject (see for example [Gao, Rub, RS]), namely that the formulas from random matrix theory and number theory look different, and involved combinatorics are required to show equality. We surmount this through an analysis of multidimensional integrals of the non-diagonal terms in the Bessel-Kloosterman expansion of the Petersson formula, and by generalizing the work of Soshnikov and Hughes-Rudnick through desymmetrizing certain integrals to handle the combinatorics.

3.3 Lower Order, Family-Dependent Corrections to n-Level Densities

Numerous papers (see for example [Be, BeKe, BBLM, BoKe, CFKRS, CS, Ke]) have investigated lower order terms in the behavior of L-functions. One of the many applications of these terms is to understanding the observed excess rank in families of elliptic curves, which I observed in my thesis [Mil2]. A more delicate analysis of the error terms and the sums of a_t(p)^k yields potential lower order, family dependent corrections to the n-level densities for families of elliptic curves.
I have derived an alternate version of Riemann’s explicit formula which is more tractable for investigations of low lying zeros of GL_2 L-functions. In particular, the quantities encountered are now weighted averages of the moments of the Fourier coefficients, and not the Satake parameters. Thus the lower order terms should be different for families with and without complex multiplication. This allows us to break the universality observed among the main terms of families of elliptic curves. I have verified this in [Mil5] by isolating these lower order terms for many families, which allows us to see how the arithmetic of the family enters. I am currently exploring the implications of these terms on the observed numerical data of zeros of elliptic curve L-functions.

Recently Conrey, Farmer and Zirnbauer [CFZ1, CFZ2] formulated a conjecture (the L-functions Ratios Conjecture) for sums of ratios of L-functions. Applications range from formulas for moments to n-level densities with very detailed error terms [CS]. This powerful conjecture reproduces many previous conjectures, with sharp and detailed error predictions. As there are to date few instances where its predictions have been tested, I decided to examine its predictions for the family of quadratic Dirichlet L-functions. I was able to find perfect agreement with the phenomenally sharp error terms (i.e., up to square-root cancelation) for the 1-level density for test functions with suitably restricted support [Mil6]. I am currently performing similar investigations for orthogonal families of cuspidal newforms, where new features emerge.

3.4 Constructing Elliptic Curves with Moderate Rank

Álvaro Lozano-Robledo (postdoc, Cornell University), Scott Arms (graduate student, Ohio State) and I have generalized the methods of my dissertation to construct families of elliptic curves of moderate rank over Q(T) [ALM].
By studying special families (not in Weierstrass form) that are quadratic (or special quartics) in T, we obtain a double sum (on t and x mod p) of a_t(p) which, by a theorem of Rosen and Silverman [RSi], leads to a family with moderate rank. We obtain rank 6 rational elliptic surfaces without having to specialize points. We have constructed other surfaces of rank at least 8, and are continuing to extend the methods.

3.5 Random Matrix Theory and Random Graphs

Recently Friedman [Fr] proved Alon’s conjecture [Al] for many families of d-regular graphs, namely that given any ε > 0 “most” graphs have second largest eigenvalue at most 2√(d − 1) + ε. These graphs have important applications in communication network theory (allowing the construction of superconcentrators and nonblocking networks), coding theory and cryptography. As many of these applications depend on the size of the second largest eigenvalue, it is natural to investigate its distribution, which I have done with Tim Novikoff (graduate student, Cornell) and Anthony Sabelli (undergraduate, Brown) [MNS]. There is reason to believe the answer is given by the β = 1 Tracy-Widom distribution [TW]. Unfortunately, normalized to have mean 0 and variance 1, most data sets are consistent with all three Tracy-Widom distributions (and even the standard normal); this is similar to the closeness of Wigner’s surmise to the GOE level spacing statistics. We have derived more sensitive statistical tests which detect differences between the normalized distributions. The results are consistent only with the β = 1 case. We are extending our results by parallelizing our code.

Chris Hammond (graduate student, Michigan) and I [HaMi] investigated the ensemble of Real Symmetric Toeplitz matrices, with entries independent, identically distributed random variables drawn from a probability density p(x) with mean 0, variance 1, and finite higher moments. We have shown that the density of normalized eigenvalues tends to a universal limit.
This new measure is almost a Gaussian; the deviations can be interpreted as arising from obstructions to systems of Diophantine equations. The distribution has unbounded support, and the ratio of its moments to the Gaussian’s tends to zero. By imposing additional symmetries (requiring the first row to be a palindrome), I showed [MMS] that the Diophantine obstructions vanish, and the resulting measure is a Gaussian (joint with John Sinsheimer, graduate student at Stony Brook SUNY, and Adam Massey, graduate student at UCLA).

Most results in Random Matrix Theory assume that the second moment of p(x) is finite. I guided Inna Zakharevich [Za] (graduate student, MIT) in investigating N × N real symmetric matrices with entries chosen from distributions with infinite variance. Truncating and rescaling these distributions at f(N) and normalizing the eigenvalues by g(N), we find new universal distributions for some choices, and recover the semi-circle (the answer for real symmetric matrices arising from finite variance) for other choices.

To each d-regular graph, we may attach a real symmetric matrix (the adjacency matrix). The density of eigenvalues for d-regular graphs is not the semi-circle, but rather a new universal distribution (Kesten’s measure / McKay’s Law). Leo Goldmakher (graduate student, Michigan) and I studied fattening the space of d-regular random graphs. Rather than having a_ij in the adjacency matrix equal 1 if vertices i and j are connected (and 0 otherwise), we fix a probability density w(x), and if i and j are connected, assign a value to a_ij through this weight function. Different choices of the weight function lead to different densities of eigenvalues; for example, we have observed that if the weight function is the semi-circle, then the first 8 moments of the density of eigenvalues agree with those of the semi-circle! We are currently trying to prove this for the higher moments.
3.6 Benford’s Law, Values of L-Functions and the 3x + 1 Problem

While looking through tables of logarithms in the late 1800s, Newcomb noticed a surprising fact: certain pages were significantly more worn than others. People were referencing numbers whose logarithm started with 1 more frequently than other digits. In 1938 Benford [Ben] observed the same digit bias in a wide variety of phenomena: the probability that the first digit is d in base B is log_B(1 + 1/d). See [Hi1, Rai] for a description and history. Many diverse systems have been shown to satisfy Benford’s law, ranging from recurrence relations [BrDu] to n! and the binomial coefficients (n choose k) (0 ≤ k ≤ n) [Dia] to iterates of power, exponential and rational maps and Newton’s method [BBH, Hi2].

There are numerous applications of Benford’s Law. It is used in computer science in analyzing round-off errors (see page 255 of [Knu] and [BH]), in determining the optimal way to store numbers [Ha], and in accounting to detect tax fraud [Nig1, Nig2]. See [Hu] for a detailed bibliography of the field.

Alex Kontorovich (postdoc, Brown) and I [KoMi] showed that the values of |L(s, f)| near the critical line obey Benford’s Law. A similar result holds for the 3x + 1 problem (the proof is complicated by the discrete nature of the system, and involves the rate of convergence to equidistribution of n log_B 2 mod 1, which depends on the irrationality type of log_B 2). I have further explored Benford’s Law with Mark Nigrini (professor of accounting at St. Michael’s College) [MN1, MN2, NM1, NM2]. Benford’s Law is used by the IRS to detect fraudulent data; unfortunately people realize this and now intelligently fabricate data. We have developed more advanced tests to detect data fraud.
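As a small numerical illustration of the first-digit law log_B(1 + 1/d) stated above, the following sketch compares the Benford probabilities in base 10 with the observed leading digits of the powers 2^n, whose equidistribution of n log_10 2 mod 1 is the mechanism mentioned in the text. The function names and the choice of the first 1000 powers of 2 are illustrative assumptions, not taken from the papers cited.

```python
import math


def benford_probability(d, base=10):
    """Benford probability that the leading digit equals d in the given base."""
    return math.log(1 + 1 / d, base)


def first_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])


def leading_digit_frequencies(values):
    """Observed frequency of each leading decimal digit 1..9 in values."""
    counts = [0] * 10
    for v in values:
        counts[first_digit(v)] += 1
    total = len(values)
    return {d: counts[d] / total for d in range(1, 10)}


# Illustrative data set (an assumption of this sketch): 2^1, ..., 2^1000.
powers_of_two = [2 ** n for n in range(1, 1001)]
observed = leading_digit_frequencies(powers_of_two)

for d in range(1, 10):
    print(d, round(observed[d], 3), round(benford_probability(d), 3))
```

The nine Benford probabilities sum to 1 because the sum of log_10((d + 1)/d) telescopes to log_10 10.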
In our investigations we have discovered numerous results in pure analysis, ranging from the distribution of order statistics of general random variables to the most general version of the Modulo 1 Central Limit Theorem (here the sums are taken modulo 1, so there is no need to subtract the sample mean or divide by the sample standard deviation to obtain a quantity with a nice limiting distribution). I am currently supervising four undergraduates on Benford’s Law projects, investigating (among other things) how the irrationality measure affects the rate of convergence. I have also been contacted by the Criminal Investigation Division of the IRS about applications of Benford’s Law, and am organizing a conference on Benford’s Law (Santa Fe, December 2007).

3.7 Additive Number Theory

Peter Hegarty (professor, Chalmers University of Technology) and I studied the relationship between the sizes of the sum and difference sets attached to a subset of {0, 1, ..., N}, chosen randomly according to a binomial model with parameter p(N) with N^{-1} = o(p(N)). We showed [HeMi] that the random subset is almost surely difference dominated, as N → ∞, for any choice of p(N) tending to zero, thus proving a conjecture of Martin and O’Bryant [MO]. The proofs use recent strong concentration results of Kim and Vu [KiVu, Vu1, Vu2]. There is a threshold phenomenon regarding the ratio of the size of the difference set to that of the sumset. If p(N) = o(N^{-1/2}) then almost all sums and differences in the random subset are almost surely distinct, and in particular the difference set is almost surely about twice as large as the sumset. If N^{-1/2} = o(p(N)) then both the sum and difference sets almost surely have size (2N + 1) − O(p(N)^{-2}), and so the ratio in question is almost surely very close to one.
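The comparison of sum and difference sets described above can be explored numerically with a short sketch. It assumes the binomial model means that each element of {0, 1, ..., N} is included independently with probability p; the function names, the seed, the value N = 2000 and the three sample exponents are illustrative choices only.

```python
import random


def random_subset(n, p, rng):
    """Include each element of {0, 1, ..., n} independently with probability p."""
    return [k for k in range(n + 1) if rng.random() < p]


def sumset_size(a):
    """Number of distinct sums x + y with x, y in a."""
    return len({x + y for x in a for y in a})


def difference_set_size(a):
    """Number of distinct differences x - y with x, y in a."""
    return len({x - y for x in a for y in a})


rng = random.Random(0)  # fixed seed so repeated runs agree
N = 2000
# Sample p(N) below, at, and above the N^(-1/2) threshold.
for exponent in (-0.75, -0.5, -0.25):
    p = N ** exponent
    a = random_subset(N, p, rng)
    s, d = sumset_size(a), difference_set_size(a)
    if s > 0:
        print(f"p = N^{exponent}: |A| = {len(a)}, |A+A| = {s}, "
              f"|A-A| = {d}, ratio = {d / s:.3f}")
```

A single random draw at one value of N only suggests the trend; the almost sure statements in the text concern the limit N → ∞.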
If p(N) = c·N^{-1/2} then as c increases from zero to infinity (i.e., as the threshold is crossed), the same ratio almost surely decreases continuously from two to one according to an explicitly given function of c. Our results extend to the comparison of the generalized difference sets attached to an arbitrary pair of binary linear forms, and we are looking at other extensions.

3.8 Incomplete Exponential Sums

Exponential sums have a rich history, and estimates of their size have numerous applications, ranging from uniform distribution to solutions to Diophantine equations to L-functions to the Circle Method, to name a few. Consider the following incomplete exponential sum:

    S(f, n, q) = \sum_{x_1 = \pm 1} \cdots \sum_{x_n = \pm 1} x_1 \cdots x_n \, e^{2\pi i f(x_1, \ldots, x_n)/q},

with f a non-homogeneous quadratic. Proving non-trivial exponentially decreasing upper bounds for S(f, n, m) will provide insight into the computational complexity of a class of boolean circuits needed to compute the parity of n binary inputs. A theorem showing that the norm of S(f, n, m) is at most c^n, where c < 1, will show that the size (number of binary gates) of these circuits computing parity has to grow exponentially fast. These lower bound results are of great interest to the theoretical computer science community.

Using Ramsey-theoretic techniques, Alon and Beigel [AB] proved that for each fixed n, d and m there exists a positive constant b_{d,m,n} such that |S(f, n, m)| < b_{d,m,n} and lim_{n→∞} b_{d,m,n} = 0; note the resulting sequences converge very slowly to 0. In terms of computational complexity, this only tells us that the minimum circuit size required to compute parity of n bits tends to infinity with n. It is of far more interest, from the computational point of view, to show exponentially fast growth in minimum circuit size. This is generally interpreted as showing that parity circuits of the required kind cannot feasibly be built.
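For small n the sum S(f, n, q) defined above can be evaluated by brute force over all 2^n sign vectors, which gives a feel for how its size behaves. The sketch below is only an illustration: the particular quadratic f, the modulus q = 3, and the normalization by the trivial bound 2^n are assumptions made here, not choices taken from [DMRS] or [AB].

```python
import cmath
import itertools


def incomplete_exponential_sum(f, n, q):
    """Brute-force S(f, n, q): sum over x in {+1, -1}^n of
    x_1 * ... * x_n * exp(2*pi*i*f(x)/q)."""
    total = 0j
    for x in itertools.product((1, -1), repeat=n):
        sign = 1
        for xi in x:
            sign *= xi
        total += sign * cmath.exp(2j * cmath.pi * f(x) / q)
    return total


def example_quadratic(x):
    """Hypothetical non-homogeneous quadratic: sum_{i<j} x_i x_j + sum_i x_i."""
    n = len(x)
    pairs = sum(x[i] * x[j] for i in range(n) for j in range(i + 1, n))
    return pairs + sum(x)


# Compare |S| with the trivial bound 2^n for a few small n.
for n in range(1, 11):
    s = incomplete_exponential_sum(example_quadratic, n, 3)
    print(n, abs(s) / 2 ** n)
```

The running time is proportional to 2^n evaluations of f, so this approach is only practical for small n.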
We have obtained sharp bounds on average, and have solved the problem for n small; this project is joint with Eduardo Dueñez, and with Amitabha Roy and Howard Straubing (professors of computer science, Boston College) [DMRS].

3.9 Empirical Bayes Inference in the Multinomial Logit Model

Whether measured by the 20,000+ hits from a www.google.com search or the 1000+ hits on www.jstor.org, the multinomial logit (MNL) model plays a very prominent role in many literatures as a basis for probabilistic inferences. One of the recent advances regarding the MNL model is the ability to incorporate heterogeneity into the response coefficients; unfortunately, this leads to increased numerical computation. Once one combines the MNL kernel, a Bernoulli random variable with logit link function, with a heterogeneity distribution, closed-form inference is unavailable due to the non-conjugacy of the product Bernoulli likelihood and the heterogeneity distribution (prior). Eric Bradlow (professor of marketing and statistics, University of Pennsylvania), Kevin Dayaratna (graduate student in marketing at the University of Maryland) and I [MBD] derived a closed-form solution to the heterogeneous MNL problem; unfortunately, the closed-form expansion requires too many terms to be computationally feasible at present. We reduce the number of computations by several orders of magnitude by rewriting the expansion in terms of the number of solutions to systems of Diophantine equations. This allows us to have one very long initial calculation, with all subsequent calculations involving only 10 or 20 terms (instead of 10^8 and higher), and is now applicable for some problems.
3.10 Binary Integer Linear Programming

With Joshua Eliashberg (professor, Wharton), Sanjeev Swami (professor, Indian Institute of Technology Kanpur), Chuck Weinberg (professor, University of British Columbia) and Berend Wierenga (professor, Erasmus University Rotterdam), I solved a binary integer linear programming problem to allow movie theaters to optimally schedule movies each day, taking into account a variety of managerial constraints in reel time. We are currently expanding our model to include additional constraints, and it is being implemented at movie theaters in Amsterdam.

3.11 Dynamical Systems

Leo Kontorovich (postdoc, Weizmann Institute), Amitabha Roy and I are studying various models for the propagation of viruses in different systems. We are primarily concerned with how infections are transmitted in various networks. For certain configurations we have derived a differential equation whose fixed points answer the problem, and numerical and theoretical investigations suggest what the limiting behavior should be. We have developed techniques to analyze the resulting equations and can determine the limiting behavior in most cases.

3.12 Cryptography

Jeff Hoffstein and Jill Pipher (professors, Brown University) and I guided undergraduates and graduate students in investigating the security of lattice based cryptosystems. I am currently analyzing data related to this problem.

3.13 Sabermetrics

Sabermetrics is the application of mathematical tools to analyze baseball. The field really took off with the work of Bill James (who later helped build the Red Sox championship teams of ’04 and ’07) in the 70’s and 80’s, and has since become a major force in the baseball industry [Le]. The subject poses numerous questions of both theoretical and practical interest, and it is often highly non-trivial to derive a mathematically tractable model for a baseball event which captures the essential features.
It has been noted that in many professional sports leagues a good predictor of a team’s end of season won-loss percentage is Bill James’ Pythagorean Formula RS_obs^γ / (RS_obs^γ + RA_obs^γ), where RS_obs (resp. RA_obs) is the observed average number of runs scored (allowed) per game and γ is a constant for the league; for baseball the best agreement is when γ is about 1.82. This formula is often used in the middle of a season to determine if a team is performing above or below expectations, and to estimate their future standings (the principles involved, however, have enormous application, as the question could just as easily be asked about which mutual funds or stocks are over- or underperforming, and thus determine when to buy or sell). In [Mil4] I showed how this formula is a consequence of a reasonable model of a baseball game. I have presented this result at numerous conferences, discussed my model with Bill James at the Boston Red Sox, and am running an undergraduate research class on improving the model (and related questions) in the Spring of 2008.

References

[Al] N. Alon, Eigenvalues and expanders, Combinatorica 6 (1986), no. 2, 83–96.
[AB] N. Alon and R. Beigel, Lower bounds for approximations by low degree polynomials over Z_m. In Sixteenth Annual IEEE Conference on Computational Complexity, IEEE Computer Society Press (2001), 184–187.
[ALM] S. Arms, A. Lozano-Robledo and S. J. Miller, Constructing One-Parameter Families of Elliptic Curves over Q(T) with Moderate Rank, Journal of Number Theory 123 (2007), no. 2, 388–402.
[Ben] F. Benford, The law of anomalous numbers, Proceedings of the American Philosophical Society 78 (1938), 551–572.
[BBH] A. Berger, Leonid A. Bunimovich and T. Hill, One-dimensional dynamical systems and Benford’s Law, Trans. Amer. Math. Soc. 357 (2005), no. 1, 197–219.
[BH] A. Berger and T. Hill, Newton’s method obeys Benford’s law, The Amer. Math. Monthly 114 (2007), no. 7, 588–601.
[Be] M. V.
Berry, Semiclassical formula for the number variance of the Riemann zeros, Nonlinearity 1 (1988), 399–407.
[BeKe] M. V. Berry and J. P. Keating, The Riemann zeros and eigenvalue asymptotics, Siam Review 41 (1999), no. 2, 236–266.
[BBLM] E. Bogomolny, O. Bohigas, P. Leboeuf and A. G. Monastra, On the spacing distribution of the Riemann zeros: corrections to the asymptotic result, Journal of Physics A: Mathematical and General 39 (2006), no. 34, 10743–10754.
[BoKe] E. B. Bogomolny and J. P. Keating, Gutzwiller's trace formula and spectral statistics: beyond the diagonal approximation, Phys. Rev. Lett. 77 (1996), no. 8, 1472–1475.
[BrDu] J. Brown and R. Duncan, Modulo one uniform distribution of the sequence of logarithms of certain recursive sequences, Fibonacci Quarterly 8 (1970), 482–486.
[BH-B] A. Brumer and R. Heath-Brown, The average rank of elliptic curves V, preprint.
[CFKRS] B. Conrey, D. Farmer, P. Keating, M. Rubinstein and N. Snaith, Integral moments of L-functions, Proc. London Math. Soc. (3) 91 (2005), no. 1, 33–104.
[CFZ1] J. B. Conrey, D. W. Farmer and M. R. Zirnbauer, Autocorrelation of ratios of L-functions, preprint. http://arxiv.org/abs/0711.0718
[CFZ2] J. B. Conrey, D. W. Farmer and M. R. Zirnbauer, Howe pairs, supersymmetry, and ratios of random characteristic polynomials for the classical compact groups, preprint. http://arxiv.org/abs/math-ph/0511024
[CS] J. B. Conrey and N. C. Snaith, Applications of the L-functions Ratios Conjecture, to appear in the Proc. London Math. Soc. http://arxiv.org/abs/math/0509480
[Dia] P. Diaconis, The distribution of leading digits and uniform distribution mod 1, Ann. Probab. 5 (1979), 72–81.
[DM1] E. Dueñez and S. J. Miller, The low lying zeros of a GL(4) and a GL(6) family of L-functions, Compositio Mathematica 142 (2006), no. 6, 1403–1425.
[DM2] E. Dueñez and S. J. Miller, The effect of convolving families of L-functions on the underlying group symmetries, preprint. http://arxiv.org/abs/math/0607688
[DMRS] E. 
Dueñez, S. J. Miller, A. Roy and H. Straubing, Incomplete quadratic exponential sums in several variables, Journal of Number Theory 116 (2006), no. 1, 168–199.
[FI] E. Fouvry and H. Iwaniec, Low-lying zeros of dihedral L-functions, Duke Math. J. 116 (2003), no. 2, 189–217.
[Fr] J. Friedman, A proof of Alon's second eigenvalue conjecture, Proceedings of the Thirty-Fifth Annual ACM Symposium on Theory of Computing, 720–724 (electronic), ACM, New York, 2003.
[Gao] P. Gao, N-level density of the low-lying zeros of quadratic Dirichlet L-functions, Ph.D. thesis, University of Michigan, 2005.
[GV] D. A. Goldston and R. C. Vaughan, On the Montgomery-Hooley asymptotic formula, Sieve methods, exponential sums and their applications in number theory (ed. G. R. H. Greaves, G. Harman and M. N. Huxley, Cambridge University Press, 1996), 117–142.
[Gü] A. Güloğlu, Low-Lying Zeros of Symmetric Power L-Functions, Internat. Math. Res. Notices 2005, no. 9, 517–550.
[Ha] R. W. Hamming, On the distribution of numbers, Bell Syst. Tech. J. 49 (1970), 1609–1625.
[HaMi] C. Hammond and S. J. Miller, Distribution of eigenvalues for the ensemble of real symmetric Toeplitz matrices, Journal of Theoretical Probability 18 (2005), no. 3, 537–566.
[HeMi] P. Hegarty and S. J. Miller, When almost all sets are difference dominated, preprint. http://arxiv.org/abs/0707.3417
[Hi1] T. Hill, The first-digit phenomenon, American Scientist 86 (1996), 358–363.
[Hi2] T. Hill, A statistical derivation of the significant-digit law, Statistical Science 10 (1996), 354–363.
[HuMi] C. Hughes and S. J. Miller, Low-lying zeros of L-functions with orthogonal symmetry, Duke Mathematical Journal 136 (2007), no. 1, 115–172.
[HR] C. Hughes and Z. Rudnick, Linear Statistics of Low-Lying Zeros of L-functions, Quart. J. Math. Oxford 54 (2003), 309–333.
[Hu] W. Hurlimann, Benford's Law from 1881 to 2006: a bibliography, http://arxiv.org/abs/math/0607168.
[ILS] H. Iwaniec, W. Luo and P. Sarnak, Low lying zeros of families of L-functions, Inst. Hautes Études Sci. Publ. Math. 91, 2000, 55–131.
[KaSa1] N. Katz and P. Sarnak, Random Matrices, Frobenius Eigenvalues and Monodromy, AMS Colloquium Publications 45, AMS, Providence, 1999.
[KaSa2] N. Katz and P. Sarnak, Zeros of zeta functions and symmetries, Bull. AMS 36, 1999, 1–26.
[Ke] J. P. Keating, Statistics of quantum eigenvalues and the Riemann zeros, in Supersymmetry and Trace Formulae: Chaos and Disorder, eds. I. V. Lerner, J. P. Keating & D. E. Khmelnitskii (Plenum Press), 1–15.
[KeSn1] J. P. Keating and N. C. Snaith, Random matrix theory and ζ(1/2+it), Comm. Math. Phys. 214 (2000), no. 1, 57–89.
[KeSn2] J. P. Keating and N. C. Snaith, Random matrix theory and L-functions at s = 1/2, Comm. Math. Phys. 214 (2000), no. 1, 91–110.
[KeSn3] J. P. Keating and N. C. Snaith, Random matrices and L-functions, Random matrix theory, J. Phys. A 36 (2003), no. 12, 2859–2881.
[KiVu] J. H. Kim and V. H. Vu, Concentration of multivariate polynomials and its applications, Combinatorica 20 (2000), 417–434.
[KoMi] A. Kontorovich and S. J. Miller, Benford's Law, Values of L-functions and the 3x + 1 Problem, Acta Arith. 120 (2005), 269–297.
[Knu] D. Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Addison-Wesley, third edition, 1997.
[Le] M. Lewis, Moneyball: The Art of Winning an Unfair Game, W. W. Norton & Company, 2004.
[MO] G. Martin and K. O'Bryant, Many sets have more sums than differences, to appear in Proceedings of CRM-Clay Conference on Additive Combinatorics, Montréal 2006.
[MS] F. Mezzadri and N. C. Snaith (eds.), Recent perspectives in random matrix theory and number theory, LMS Lecture Note Series, 322, Cambridge University Press, Cambridge, 2005.
[Mic] P. Michel, Rang moyen de familles de courbes elliptiques et lois de Sato-Tate, Monat. Math. 120 (1995), 127–136.
[Mil1] S. J. 
Miller, 1- and 2-level densities for families of elliptic curves: Evidence for the underlying group symmetries, Compositio Mathematica 140 (2004), no. 4, 952–992.
[Mil2] S. J. Miller, Variation in the number of points on elliptic curves and applications to excess rank, C. R. Math. Rep. Acad. Sci. Canada 27 (2005), no. 4, 111–120.
[Mil3] S. J. Miller, Investigations of zeros near the central point of elliptic curve L-functions, Experimental Mathematics 15 (2006), no. 3, 257–279.
[Mil4] S. J. Miller, A derivation of the Pythagorean Won-Loss Formula in baseball, Chance Magazine 20 (2007), no. 1, 40–48 (an abridged version appeared in The Newsletter of the SABR Statistical Analysis Committee 16 (February 2006), no. 1, 17–22).
[Mil5] S. J. Miller, Lower order terms in the 1-level density for families of holomorphic cuspidal newforms, preprint. http://arxiv.org/abs/0704.0924
[Mil6] S. J. Miller, A symplectic test of the L-functions Ratios Conjecture, to appear in IMRN. http://arxiv.org/abs/0704.0927
[MBD] S. J. Miller, E. T. Bradlow and K. Dayaratna, Closed-form Bayesian inference for the logit model via polynomial expansions, Quantitative Marketing and Economics 4 (2006), no. 2, 173–206.
[MMS] S. J. Miller, A. Massey and J. Sinsheimer, Distribution of eigenvalues of real symmetric palindromic Toeplitz matrices and circulant matrices, Journal of Theoretical Probability 20 (2007), no. 3, 637–662.
[MNS] S. J. Miller, T. Novikoff and A. Sabelli, The distribution of the second largest eigenvalue in families of random regular graphs, to appear in Experimental Mathematics. http://arxiv.org/abs/math/0611649
[MN1] S. J. Miller and M. Nigrini, The Modulo 1 Central Limit Theorem and Benford's Law for Products, to appear in the International Journal of Algebra. http://arxiv.org/abs/math/0607686
[MN2] S. J. Miller and M. Nigrini, Order Statistics and Shifted Almost Benford Behavior, preprint. http://arxiv.org/abs/math/0601344
[Mon] H. Montgomery, The pair correlation of zeros of the zeta function, Analytic Number Theory, Proc. Sympos. Pure Math. 24, Amer. Math. Soc., Providence, 1973, 181–193.
[Nig1] M. Nigrini, Digital Analysis and the Reduction of Auditor Litigation Risk, pages 69–81 in Proceedings of the 1996 Deloitte & Touche / University of Kansas Symposium on Auditing Problems, ed. M. Ettredge, University of Kansas, Lawrence, KS, 1996.
[Nig2] M. Nigrini, The Use of Benford's Law as an Aid in Analytical Procedures, Auditing: A Journal of Practice & Theory 16 (1997), no. 2, 52–67.
[NM1] M. Nigrini and S. J. Miller, Benford's Law applied to hydrology data: results and relevance to other geophysical data, Mathematical Geology 39 (2007), no. 5, 469–490.
[NM2] M. Nigrini and S. J. Miller, Data diagnostics using second order tests of Benford's Law, preprint.
[OS1] A. E. Özlük and C. Snyder, Small zeros of quadratic L-functions, Bull. Austral. Math. Soc. 47 (1993), no. 2, 307–319.
[OS2] A. E. Özlük and C. Snyder, On the distribution of the nontrivial zeros of quadratic L-functions close to the real axis, Acta Arith. 91 (1999), no. 3, 209–228.
[Rai] R. A. Raimi, The first digit problem, Amer. Math. Monthly 83 (1976), no. 7, 521–538.
[RR] G. Ricotta and E. Royer, Statistics for low-lying zeros of symmetric power L-functions in the level aspect, preprint. http://arxiv.org/abs/math/0703760
[RSi] M. Rosen and J. Silverman, On the rank of an elliptic surface, Invent. Math. 133 (1998), 43–67.
[Ro] E. Royer, Petits zéros de fonctions L de formes modulaires, Acta Arith. 99 (2001), no. 2, 147–172.
[Rub] M. Rubinstein, Low-lying zeros of L-functions and random matrix theory, Duke Math. J. 109 (2001), 147–181.
[RS] Z. Rudnick and P. Sarnak, Zeros of principal L-functions and random matrix theory, Duke Journal of Math. 81, 1996, 269–322.
[Si] J. Silverman, The average rank of an algebraic family of elliptic curves, J. reine angew. Math. 504 (1998), 227–236.
[TW] C. Tracy and H. Widom, Distribution functions for largest eigenvalues and their applications, ICM Vol. I (2002), 587–596.
[Vu1] V. H. Vu, New bounds on nearly perfect matchings of hypergraphs: Higher codegrees do help, Random Structures and Algorithms 17 (2000), 29–63.
[Vu2] V. H. Vu, Concentration of non-Lipschitz functions and Applications, Random Structures and Algorithms 20 (2002), no. 3, 262–316.
[Yo1] M. Young, Lower-order terms of the 1-level density of families of elliptic curves, Internat. Math. Res. Notices 2005, no. 10, 587–633.
[Yo2] M. Young, Low-lying zeros of families of elliptic curves, J. Amer. Math. Soc. 19 (2006), no. 1, 205–250.
[Za] I. Zakharevich, A generalization of Wigner's law, Comm. Math. Phys. 268 (2006), no. 2, 403–414.
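
Supplementary sketch for Section 3.13. The Pythagorean Formula described there is simple enough to evaluate directly, and the following minimal Python sketch shows one way a mid-season comparison of observed and predicted winning percentage might be computed. It is only an illustration of the formula as stated in the text, not the model derived in [Mil4]. The function names `pythagorean_expectation` and `performance_gap`, and the sample team numbers, are hypothetical; the default exponent 1.82 is the baseball value quoted in Section 3.13.

```python
def pythagorean_expectation(runs_scored, runs_allowed, gamma=1.82):
    """Predicted winning percentage RS^gamma / (RS^gamma + RA^gamma).

    runs_scored and runs_allowed are observed average runs per game;
    gamma = 1.82 is the baseball value quoted in the text.
    """
    if runs_scored <= 0 or runs_allowed <= 0:
        raise ValueError("average runs per game must be positive")
    rs_pow = runs_scored ** gamma
    ra_pow = runs_allowed ** gamma
    return rs_pow / (rs_pow + ra_pow)


def performance_gap(wins, losses, runs_scored, runs_allowed, gamma=1.82):
    """Observed winning percentage minus the Pythagorean prediction.

    A positive value suggests the team is ahead of what its run totals
    predict; a negative value suggests it is behind.
    """
    observed = wins / (wins + losses)
    return observed - pythagorean_expectation(runs_scored, runs_allowed, gamma)


if __name__ == "__main__":
    # Hypothetical mid-season numbers, chosen only for illustration.
    expected = pythagorean_expectation(5.0, 4.5)
    gap = performance_gap(wins=50, losses=31, runs_scored=5.0, runs_allowed=4.5)
    print(f"expected winning percentage: {expected:.3f}")
    print(f"observed minus expected:     {gap:+.3f}")
```

Two sanity checks follow from the formula itself: a team that scores and allows the same number of runs is predicted to win half its games, and swapping runs scored with runs allowed gives the complementary percentage.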
