Is There a Better Way to Rank Business Schools?

Patrick Perry and Keith Riegert, NYU Stern School of Business – April 4, 2016

A lot of people at Stern are angry about the data collection error that caused the school's U.S. News and World Report full-time MBA program ranking to drop from #11 last year to #20 this year. Stern failed to report the number of students who submitted GMAT scores to the school, and rather than alert the school to this omission, U.S. News chose to estimate the missing value. Some forensic data analysis shows that the omission caused U.S. News to effectively replace Stern's average GMAT score (720) with a much lower value (560). We don't definitively know that this is what happened, but the evidence is compelling.

The missing-data debacle got us interested more generally in how business school rankings are determined in the first place. U.S. News describes the process on their website. They measure a set of eight attributes for each school and then reduce these eight values to a single numerical score by taking a weighted combination. Some people have taken issue with the sensitivity of rankings to the specific choices of weights used to combine the individual attributes. These critics certainly have a point, but to us, the more fundamental issue is that many of the measurements that U.S. News uses to rank business schools are themselves problematic.

While attributes like “Mean Starting Salary” and “Mean Undergraduate GPA” are, without a doubt, useful for prospective students deciding whether to apply to a school (and for assessing their chances of being accepted), these attributes may not have much to say about the quality of a school itself. Here is a breakdown of some of the specific measurement issues we see:

Mean Starting Salary, a measure of the mean salary of students entering the workforce after graduation, is informative for students considering the return on investment of attending a certain school, but it tends to be more a function of geography and student career choice than school quality.
Employment Rate at Graduation and Employment Rate Three Months After Graduation are, again, informative and relevant for students considering a program's return on investment, but they have little bearing on the value of the school. Furthermore, at a time when many of the nation's top schools are fostering innovation and entrepreneurship programs (straying from the traditional “corporate recruiting model” of MBA programs), these attributes may be outdated.
Mean Undergraduate GPA is not comparable across different applicant pools (a 3.0 GPA from Harvard doesn't mean the same thing as a 3.0 GPA from DeVry).
Acceptance Rate is not comparable for schools with different enrollments (Harvard enrolls roughly 1900 students; Stanford enrolls 800; Haas enrolls 500). Furthermore, shifting national trends in the number and quality of business school applications each year can have varying effects on schools, unrelated to school quality (for example, over the past few years, there has been a national decline in the number of applicants to part-time MBA programs, but schools located in or near Silicon Valley have been less affected by this trend).

It's likely that U.S. News tries to adjust these measurements to fix some of the issues mentioned above, but it is not clear how they are doing this or how effective these adjustments are.

For ranking the “best” business schools, there are three attributes that, while by no means perfect, seem to be more trustworthy:

Mean GMAT Score measures the quality of the student body at the start of the MBA. This also quantifies the perceived value of the school to each year's incoming class (the higher a student’s GMAT score, the more potential program options they have for getting their MBA).
Peer Assessment Score and Recruiter Assessment Score both measure the quality of the student body at the end of the MBA. The former is based on business school deans' ratings of their competitors, and the latter is based on corporate recruiters' and company contacts' assessments.

We don't have much insight into the different psychologies of business school deans and corporate recruiters, but empirically, “Peer Assessment” and “Recruiter Assessment” are strongly correlated with each other (correlation 0.83), and it seems to us that these numbers are two different measurements of the same attribute. To combine both into a single “Combined Assessment Score”, we standardize each to have the same standard deviation and take the average of the two numbers. To make the result more interpretable, we transform back to the original 1-5 scale; this effectively puts a weight of 45% on peer assessment and 55% on recruiter assessment.
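
As a rough illustration of the combination step, here is a minimal Python sketch. It assumes that transforming back to the 1-5 scale means rescaling the averaged scores so that the two weights sum to one. The function name `combined_assessment` is ours, and the five sample rows are taken from the table in this article, so the weights it prints will not match the 45%/55% figure, which is based on all ranked schools.

```python
from statistics import stdev

def combined_assessment(peer, recruiter):
    """Blend two assessment scores after putting them on a common scale.

    Assumption: averaging p/sd_p and r/sd_r, then rescaling so the weights
    sum to one, is what "transform back to the 1-5 scale" means.
    """
    sd_p, sd_r = stdev(peer), stdev(recruiter)
    # Weights are proportional to 1/sd, so the less variable score gets more weight.
    w_peer = (1 / sd_p) / (1 / sd_p + 1 / sd_r)
    w_recruiter = 1 - w_peer
    combined = [w_peer * p + w_recruiter * r for p, r in zip(peer, recruiter)]
    return combined, w_peer, w_recruiter

# Sample rows from the table: Stanford, NYU Stern, Texas-Austin, Texas-Dallas, Florida.
peer = [4.8, 4.2, 3.9, 2.9, 3.3]
recruiter = [4.5, 4.0, 4.1, 4.2, 2.7]

combined, w_peer, w_recruiter = combined_assessment(peer, recruiter)
for p, r, c in zip(peer, recruiter, combined):
    print(f"peer={p:.1f} recruiter={r:.1f} combined={c:.2f}")
print(f"weights: peer={w_peer:.2f}, recruiter={w_recruiter:.2f}")
```

Because the weights sum to one, a school rated 5 by both groups would still score 5 on the combined scale.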

Here's a scatter plot of the “Mean GMAT Score” and “Combined Assessment Score” for the ranked U.S. News business schools. Each point represents a school, with the x and y coordinates giving the respective attribute values. (Three schools have average GMAT scores below 600, and they have been excluded from this plot.)

[Figure: scatter plot of Mean GMAT Score and Combined Assessment Score for the ranked schools]

If we want to use these two attributes to rank the schools, there is a natural choice, which is to rescale the attributes appropriately and then give each equal weight. (This process has a geometric interpretation, which is that each point in the scatter plot gets mapped to the closest spot on the dashed line, preserving as much of the variability in the data as possible.)
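
One way to see why equal weights are the natural choice is sketched below. It assumes that the dashed line is the first principal axis of the two attributes once each has been rescaled to standard deviation 1, which is our reading of the geometric description. The symbols z_1, z_2 (the rescaled attributes) and r (their correlation) are introduced here for illustration.

```latex
\[
R = \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix},
\qquad
R \begin{pmatrix} 1 \\ 1 \end{pmatrix} = (1 + r) \begin{pmatrix} 1 \\ 1 \end{pmatrix},
\qquad
R \begin{pmatrix} 1 \\ -1 \end{pmatrix} = (1 - r) \begin{pmatrix} 1 \\ -1 \end{pmatrix}.
\]
% For r > 0 the direction with the most variance is (1, 1)/\sqrt{2},
% and the position of a school along that line is
\[
\frac{z_1 + z_2}{\sqrt{2}},
\]
% which orders the schools exactly as \tfrac{1}{2} z_1 + \tfrac{1}{2} z_2 does.
```

Under that reading, any positive correlation between the two rescaled attributes leads to the same equal-weight direction, so the ranking does not depend on the exact value of r.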

Rescaling is necessary because GMAT scores (ranging from 200-800) are not directly comparable to assessment scores (ranging from 1-5). To make the values comparable, we subtract off the mean of each attribute and divide by its standard deviation. This ensures that each rescaled attribute has mean 0 and standard deviation 1. We then give each rescaled attribute a weight of 50%. To make the values more interpretable, after we compute the scores, we re-center and rescale them to have mean 75 and standard deviation 10, then round the values. The rounding process induces some ties between schools, but otherwise this transformation does not affect the final ranking.
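
The rescaling steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used for the article: the function names are ours, the sample standard deviation is used because the article does not say which variant was applied, and rounding to one decimal place is inferred from the table. The five sample rows come from the table, so the scores printed here are relative to that small subset and will differ from the published ones.

```python
from statistics import mean, stdev

def standardize(values):
    """Rescale values to mean 0 and (sample) standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def simplified_scores(gmat, assessment, center=75.0, spread=10.0):
    """Equal-weight blend of two standardized attributes on a 75 +/- 10 scale."""
    z_gmat = standardize(gmat)
    z_assess = standardize(assessment)
    blended = [0.5 * g + 0.5 * a for g, a in zip(z_gmat, z_assess)]
    # Re-standardize the blend, then shift and stretch it to the reporting scale.
    return [round(center + spread * z, 1) for z in standardize(blended)]

# Sample rows from the table (a subset, so results differ from the article).
schools = ["Stanford", "NYU Stern", "Texas-Austin", "Texas-Dallas", "Florida"]
gmat = [733, 720, 694, 678, 681]
peer = [4.8, 4.2, 3.9, 2.9, 3.3]
recruiter = [4.5, 4.0, 4.1, 4.2, 2.7]
# Combined assessment using the article's 45%/55% weights.
assessment = [0.45 * p + 0.55 * r for p, r in zip(peer, recruiter)]

scores = simplified_scores(gmat, assessment)
for name, score in sorted(zip(schools, scores), key=lambda t: -t[1]):
    print(f"{name:14s} {score:5.1f}")
```

The final shift to mean 75 and standard deviation 10 is an increasing linear map, which is why it changes the reported numbers without changing the order of the schools.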

The final formula for determining a school's score is

(Simplified Score) =  -39.14 + 0.125 (Mean GMAT) + 4.26 (Peer Assessment) + 5.28 (Recruiter Assessment)
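
To make the formula concrete, the short Python sketch below applies it to four rows of the table. The function name `simplified_score` is ours. Because the published coefficients are rounded, the computed values agree with the tabulated scores only to within about 0.1.

```python
def simplified_score(mean_gmat, peer, recruiter):
    """Apply the article's published (rounded) coefficients."""
    return -39.14 + 0.125 * mean_gmat + 4.26 * peer + 5.28 * recruiter

# (school, peer, recruiter, mean GMAT, score reported in the table)
rows = [
    ("Stanford University", 4.8, 4.5, 733, 96.6),
    ("Harvard University", 4.8, 4.6, 725, 96.2),
    ("New York University (Stern)", 4.2, 4.0, 720, 89.8),
    ("University of Texas - Dallas", 2.9, 4.2, 678, 80.1),
]

for school, peer, recruiter, gmat, reported in rows:
    computed = simplified_score(gmat, peer, recruiter)
    print(f"{school:30s} computed={computed:6.2f} reported={reported:5.1f}")

# Share of the assessment weight that falls on the peer score.
peer_share = 4.26 / (4.26 + 5.28)
print(f"peer share of assessment weight: {peer_share:.3f}")
```

The peer share of the two assessment coefficients comes out at roughly 45%, consistent with the weights quoted for the Combined Assessment Score.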

The following table shows the rank and score for the top 50 schools as determined by this simplified method. We've also included the mean GMAT scores, the assessment scores, and the aggregate scores and rankings as reported by U.S. News.

Rank	School	Peer Assessment	Recruiter Assessment	Mean GMAT	Simplified Score	U.S. News Score	U.S. News Rank
1	Stanford University	4.8	4.5	733	96.6	98	2
2	Harvard University	4.8	4.6	725	96.2	100	1
3	University of Pennsylvania (Wharton)	4.7	4.5	732	96.1	97	4
4	University of Chicago (Booth)	4.7	4.5	726	95.3	98	2
5	Northwestern University (Kellogg)	4.6	4.5	724	94.7	96	5
6	Massachusetts Institute of Technology (Sloan)	4.7	4.5	716	94.1	96	5
7	University of California - Berkeley (Haas)	4.6	4.3	725	93.7	94	7
8	Yale University	4.3	4.3	721	91.9	90	8
9	Dartmouth College (Tuck)	4.3	4.3	717	91.4	90	8
10	Columbia University	4.4	4.1	715	90.6	89	10
11	New York University (Stern)	4.2	4.0	720	89.8	72	20
12	University of Michigan - Ann Arbor (Ross)	4.4	4.1	708	89.7	86	12
13	University of California - Los Angeles (Anderson)	4.1	3.9	713	88.0	79	15
14	Duke University (Fuqua)	4.3	4.1	696	87.8	86	12
15	University of Virginia (Darden)	4.1	4.0	706	87.6	87	11
16	Cornell University (Johnson)	4.1	4.1	697	87.0	80	14
17	University of Texas - Austin (McCombs)	3.9	4.1	694	85.8	76	16
18	University of North Carolina - Chapel Hill (Kenan-Flagler)	3.9	3.7	701	84.6	76	16
19	Carnegie Mellon University (Tepper)	4.0	3.8	690	84.2	75	18
20	Georgetown University (McDonough)	3.6	3.8	692	82.7	69	22
21	Washington University in St. Louis (Olin)	3.6	3.6	695	82.0	71	21
22	Emory University (Goizueta)	3.7	3.8	678	81.4	73	19
23	Vanderbilt University (Owen)	3.5	3.6	690	81.0	69	22
24	University of Southern California (Marshall)	3.8	3.6	679	80.9	64	31
25	University of Texas - Dallas	2.9	4.2	678	80.1	61	37
26	Indiana University (Kelley)	3.8	3.6	668	79.5	69	22
27	University of Notre Dame (Mendoza)	3.5	3.5	682	79.4	66	25
28	University of Minnesota - Twin Cities (Carlson)	3.5	3.4	680	78.7	65	27
29	Rice University (Jones)	3.4	3.5	676	78.3	66	25
30	University of Washington (Foster)	3.4	3.2	688	78.2	65	27
30	Georgia Institute of Technology (Scheller)	3.2	3.6	678	78.2	63	34
32	University of Wisconsin - Madison	3.6	3.4	669	77.7	65	27
33	Arizona State University (Carey)	3.4	3.4	672	77.2	62	35
34	Ohio State University (Fisher)	3.5	3.4	664	76.7	65	27
34	University of California - Davis	3.2	3.2	683	76.7	56	45
36	Brigham Young University (Marriott)	3.0	3.5	674	76.3	64	31
37	Michigan State University (Broad)	3.3	3.4	664	75.8	62	35
38	University of Maryland - College Park (Smith)	3.4	3.4	658	75.5	57	41
39	Texas A&M University - College Station (Mays)	3.3	3.5	654	75.1	64	31
39	Boston University (Questrom)	3.1	3.0	682	75.1	57	41
39	University of Alabama (Manderson)	2.7	3.4	679	75.1	50	53
42	University of Illinois - Urbana-Champaign	3.4	3.3	654	74.5	60	39
43	Boston College (Carroll)	3.2	3.2	664	74.3	53	50
44	University of Florida (Hough)	3.3	2.7	681	74.2	61	37
45	University of California - Irvine (Merage)	3.1	3.4	656	74.0	54	48
46	University of Rochester (Simon)	3.2	3.0	667	73.6	60	39
46	University of Iowa (Tippie)	3.1	3.0	670	73.6	56	45
48	Southern Methodist University (Cox)	3.1	3.2	656	72.9	54	48
49	Purdue University - West Lafayette (Krannert)	3.4	3.4	635	72.6	55	47
50	Pennsylvania State University - University Park (Smeal)	3.2	3.5	636	72.4	57	41

The simplified GMAT/assessment ranking method gives reasonable results, comparable to U.S. News in most cases. The U.S. News ranking method is much more complicated. Which ranking is better? We're obviously biased, but in this situation, as in most other situations, we prefer the simpler method. With the U.S. News system, we have concerns about many of the inputs, and we don't have much confidence in the weights used to compute the final scores. With the simplified method based on GMAT and assessment, we have confidence in all of the inputs and we can understand how they relate to the final score.

Patrick Perry (@ptrckprry) is an Assistant Professor of Information, Operations, and Management Sciences at NYU Stern.

Keith Riegert (@KeithRiegert) is an MBA candidate at NYU Stern and an editor at the Stern Opportunity.

The raw data used to compute the rankings was originally collected and reported by U.S. News. The plot and the table in this article were produced using the R software environment. The data and source code are available for download.

Appendix

Quality measures

To validate our intuition that many of the factors used in the U.S. News ranking are poor proxies for school quality, we looked at scatter plots of each of the seven other attributes versus the schools' mean GMAT scores. In these plots, each point represents a school.

[Figure: scatter plots of the other U.S. News attributes versus Mean GMAT Score]

From these plots, it is clear that “Employment Rate at Graduation”, “Employment Rate Three Months After Graduation”, “Mean Undergraduate GPA”, and “Acceptance Rate” are poor measures of school quality in that they appear only slightly related to “Mean GMAT Score” (arguably the most reliable measure of school quality).

Despite its strong relationship with “Mean GMAT Score”, we still feel uncomfortable including “Mean Starting Salary” as a ranking factor due to its strong dependence on geography (the highest quality schools tend to be located in cities with high-paying jobs).