Technion, IEM faculty - Statistics Seminar

Speaker: Richard Olshen, Stanford University
Title: Successive normalization of rectangular sets of data
Date: 03/07/2011
Time: 14:30
Place: Bloomfield-527

Abstract: <http://ie.technion.ac.il/seminar_files/1307369931_Olshen.pdf>

Or read this:

Standard statistical techniques often require transforming data to have mean 0 and standard deviation 1. Typically, this process of “standardization” or “normalization” is applied across subjects when each subject produces a single number. High-throughput genomic data often come as a rectangular array, where each coordinate in one direction concerns a subject with, for example, case or control status; and each coordinate in the other designates “outcome” for a specific feature, for example, a “gene” or “gene fragment.” When analyzing data that come as rectangular arrays, it may be helpful if both subjects and features are “on the same footing.” This entails a need to standardize across both rows and columns.

We have investigated convergence of what seems to us a natural approach to successive normalization that we learned from Bradley Efron. The process applies when the array has at least three rows and at least three columns. It involves successive polishing, by row (say), and then by column. A polish of rows (columns) means first subtracting off means by row (column) and then dividing the resultant rows (columns) by respective row (column) standard deviations. The process is iterated, beginning by rows, then columns, then rows; alternatively beginning with columns and proceeding analogously.

Not only does this process converge for (Lebesgue) almost all initial arrays, but convergence is also very fast in the number of iterations. Limiting arrays have row and column means 0 and row and column standard deviations 1. Research thus far will be summarized, and plots that make clear the rapidity of convergence will be shown. Particularly illustrative graphics will be given for the 3×3 case.
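The polishing procedure described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the speakers' code: the function names (`polish`, `transpose`, `successive_normalization`), the use of the population standard deviation (dividing by n rather than n - 1), and the stopping rule based on the largest entrywise change between sweeps are all choices made for the example.

```python
import math
import random


def polish(lines):
    """Subtract each line's mean, then divide by that line's standard deviation."""
    out = []
    for line in lines:
        mean = sum(line) / len(line)
        centered = [x - mean for x in line]
        # Assumption: population standard deviation (divide by n, not n - 1).
        sd = math.sqrt(sum(c * c for c in centered) / len(line))
        if sd == 0.0:
            raise ValueError("constant row or column; the polish is undefined")
        out.append([c / sd for c in centered])
    return out


def transpose(rows):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(col) for col in zip(*rows)]


def successive_normalization(array, max_iter=1000, tol=1e-12):
    """Alternate row and column polishes, beginning with rows.

    Returns (normalized_array, number_of_sweeps), where one sweep is a
    row polish followed by a column polish. The stopping rule (largest
    entrywise change below `tol`) is a hypothetical choice for this sketch.
    """
    current = [[float(x) for x in row] for row in array]
    for sweep in range(1, max_iter + 1):
        after_rows = polish(current)
        # Polishing columns is polishing the rows of the transpose.
        after_cols = transpose(polish(transpose(after_rows)))
        change = max(
            abs(a - b)
            for new_row, old_row in zip(after_cols, current)
            for a, b in zip(new_row, old_row)
        )
        current = after_cols
        if change < tol:
            return current, sweep
    return current, max_iter


# Usage: a random 5 x 6 array (at least three rows and three columns).
random.seed(0)
data = [[random.gauss(0.0, 1.0) for _ in range(6)] for _ in range(5)]
normalized, sweeps = successive_normalization(data)
print("sweeps used:", sweeps)
```

Starting with columns instead of rows amounts to calling the same function on the transposed array and transposing the result.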
Extensions of conclusions will be given. All research in the area of this talk is collaborative with Stanford colleague Bala Rajaratnam.

---------------------------------------------------------
Technion Math. Net (TECHMATH)
Editor: Michael Cwikel <techm@math.technion.ac.il>
Announcement from: <ynardi@ie.technion.ac.il>