Andrew Abbott on brute-force research, the future of libraries, and what makes good research good

Andrew Abbott, the Gustavus F. and Ann M. Swift Distinguished Service Professor at the University of Chicago, edits the American Journal of Sociology. Abbott has twice chaired the University of Chicago’s Library Board and played a central role in planning the university’s Joe and Rika Mansueto Library. His latest book is Digital Paper: A Manual for Research and Writing with Library and Internet Materials.

In Digital Paper, you describe the process of research as “nonlinear” and recommend the simultaneous (or overlapping) tasks of designing, collecting bibliography, scanning and materials search, reading, maintaining files, analyzing retrieved material, and writing. You show how a complex project would get rolling and feed and build and coalesce into a successful article or monograph. But what about a project that can’t fly? Is there a typical point at which a researcher discovers a project’s fatal weakness? When does someone know it’s time to give up?

It’s time to give up when the project is just going nowhere. A student, however, needs to talk to an advisor. Students are often ready to give up in humanities or social science research because they get sucked into the infinite quality of social reality and get lost. They see so many interesting questions in a project that they don’t know where to start. That’s why they should always have solid empirical and theoretical puzzles as a starting point. With those, you can tell when the project is going nowhere.

But often the stuck student (I saw one today) simply needs to be told “just pick one major thread that runs through your material and write it up. Everybody knows that life is complicated, and that everything has multiple causes. But what we do is trace one particular thread that is not excessively complicated, and use reflection about that thread to discipline our thinking about an abstract, theoretical issue.”

It’s like me answering this question—I could go on and on about it, but I just have to focus on one particular thing—the lost student. Myself, I can always tell when a project doesn’t work. I get bored with it and want to stop. Then I go on to another project, get bored with that, and eventually I circle back around to the first one, and have all kinds of new ideas and excitement about it. It’s true that because of this pattern I do have perhaps a dozen major projects that were never finished despite a lot of effort: a book on education, a huge study of musicians that produced only one published paper and left two other papers sitting idle, a project on occupational therapy that’s still dragging on after more than a decade, etc. But by circling around I get a lot of things done. Generally, I give up on something because everything else is going better! A student can’t afford that procedure.

You use the wonderful verb to brute force to describe a particular phase of handling source material:

Think of Darwin. Darwin was up to his eyeballs in data. Every day he slogged through data on finches in the Galapagos, on the results of cattle-breeding, and so on. Think of Simone de Beauvoir when she wrote The Second Sex. Her book assembles and digests whole continents of data. We may remember such writers for their theories. But they spent their days brute-forcing through records, statistics, histories, and reports. That is where theory comes from.

Can you describe a moment when brute forcing led you to a theory?

Oh, sure—brute force. I can remember sitting on the second floor of the Regenstein Library for about three months, over on the east side, filling out a one-page form on every psychiatrist or neurologist I could locate in Who’s Who in American Medicine for 1925. I coded an entire database of about a thousand psychiatrists out of that damn book, augmented it with four or five other sources, and then created secondary forms for all the psychiatric faculties on which these people served, for all the major societies of which these people were members, and so forth. In effect, I created by hand an entire relational database and all the major subtables of that database. I ended up spending several days in the basement of the New York Academy of Medicine filling out the faculty data from the gargantuan collection of medical school announcements they had.
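For readers who want to picture the structure Abbott describes, here is a minimal sketch in Python (using the standard sqlite3 module) of a relational layout of that kind: one main table of practitioners plus subtables for faculty appointments and society memberships. The table and column names are hypothetical illustrations, not Abbott’s actual coding scheme.

```python
import sqlite3

# A toy relational layout mirroring the hand-built database described above:
# one main table of practitioners, plus subtables linking them to faculties
# and societies. All names and fields are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE psychiatrists (
    id         INTEGER PRIMARY KEY,
    name       TEXT,
    birth_year INTEGER,
    source     TEXT        -- e.g. a biographical directory entry
);
CREATE TABLE faculty_appointments (
    psychiatrist_id INTEGER REFERENCES psychiatrists(id),
    school          TEXT,
    years           TEXT
);
CREATE TABLE society_memberships (
    psychiatrist_id INTEGER REFERENCES psychiatrists(id),
    society         TEXT
);
""")

# The kind of question such a structure supports: which schools' faculties
# overlap most with a given society's membership? (Empty result here, since
# no rows have been inserted.)
query = """
SELECT f.school, COUNT(*) AS members
FROM faculty_appointments AS f
JOIN society_memberships AS s ON s.psychiatrist_id = f.psychiatrist_id
WHERE s.society = ?
GROUP BY f.school
ORDER BY members DESC;
"""
rows = conn.execute(query, ("Some Learned Society",)).fetchall()
```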

I didn’t get a theory out of this material, but I got lists of every possible contingency that could happen to a psychiatrist (like being killed by his patient), of all the sanitaria in the US (which often turned out to be the same buildings under many different names), of the kinds of people who tended to start private mental hospitals, and so on. When you brute force things, you are always noticing stuff as you go along, even as you are absolutely dying of boredom. It’s all that stuff you notice as you go along that gradually floats together in your mind. It seldom happens in the archives themselves. It happens when you are somewhere else, thinking, or when your mind is empty and relaxed.

Now it may very well be true that I would have gotten as many insights if I had only gone through half that much data. But I would not have been as secure in my judgment as I was when I had seen nearly everything. That’s a good feeling.

When you are doing brute force, actually doing the stuff, you can’t afford to let your theoretical mind run. You are too busy making sure that you see everything, that you don’t go to sleep or start thinking about what’s for dinner. It’s very hard work to keep really focal attention on detailed material where you are looking for rare things. But it loads all kinds of stuff into your mind that just explodes into inspiration.

Brute force can also produce a ton of leads if you happen to find an unexpectedly rich source early in the project. So when I wrote a paper on the history of the use of libraries, I decided early on to look through the titles of all the MA and PhD theses ever written for the University of Chicago Library School. There were seven hundred of them. The number of research leads, theoretical ideas, and empirical hypotheses suggested in that exercise was heroic—a long afternoon, but with a spectacular payoff.

The back cover of Digital Paper points out the overwhelming mass of information available to researchers and says that the book will explain “how scholars [can] produce groundbreaking research using the physical and electronic resources available in the modern university research library.” At a time when being up-to-the-minute is touted as a virtue, what does it mean that one reviewer praises the book for being “without a hint of trendiness”?

Good research is good in terms of the values of some group of experts. It’s not necessarily good because it’s new. To say that the best research is always “up-to-the-minute” research is more or less to assume cumulation and progress. In the natural sciences that seems a useful assumption, because the natural sciences are loosely built around a progressive ideal of making explanation more and more precise. But that’s not how the humanities and most of the social sciences are built. They’re not really cumulative and nobody expects to find the absolutely correct interpretation of Madame Bovary or the meaning of La Gioconda’s smile or the uniquely true reason the battle of Austerlitz was a French victory. Most library research isn’t done in a cumulative framework. It’s about producing beautiful new (and newly beautiful) assemblages of scholarly material, but not really about replacing old ones the way good science replaces bad science. Yes, we cumulate in little local areas—developing a definitive Chaucer was a cumulative enterprise. But remember, the humanities were subsequently vastly enriched by disassembling precisely that definitive text.

Thus, although library researchers believe in rigor, they don’t really have a set of rules about what the whole enterprise of rigorous library research should be doing, because they don’t have the simple guideline of cumulation. Should we be writing about as many people as possible? Making the canon inclusive (and of whom)? Fostering more and more diverse approaches? Not losing sight of classic problems and issues? Emphasizing perfection of writing or pictures or presentation?

So sometimes we try to make novelty just for its own sake—hence the worry about trendiness, which is just false up-to-the-minute-ness. But groundbreaking work is wonderful because of something that has value in terms of those unwritten, perhaps unknown, and certainly untheorized rules that govern the enterprise. I’m writing about them myself, but I don’t know of other people who are. And they are where the real answers are to be found.

Some people might be surprised to know that the University of Chicago recently built a gigantic, state-of-the-art repository for physical books and papers at a time when more and more resources are available in digital formats. Are other libraries going in the same direction, or is Chicago’s new Mansueto Library bucking the trend? What was the thinking behind Chicago’s decision?

Chicago is clearly bucking the trend, which is to put more and more physical materials in storage, whether or not they are available in digital formats. Remember that in effect nothing published since 1923 is available for free digitally (except US government documents) and that’s 90 percent of the collection of most university libraries. And there still is no trend among scholars to do serious reading on the screen. Serious reading is still done with physical materials, printed off the web if need be.

Chicago decided to keep all the print on campus, and to put into Mansueto the unbrowsable materials (such as special collections), the things usually consulted en masse (long runs of government documents), and the things already digitized (print journals).

Why keep them on site? And why keep 4.5 million items in Regenstein? Because of the following calculation. Between 1998 and 2006, of the 4.5 million items in Regenstein, fully 21 percent circulated to a patron. Of those, half circulated only once. Obviously, these were little-used research materials charged out by experts. You might think that those materials could be easily housed off campus. But think again! In eight years there are 2,920 days. And 450,000 items over 2,920 days means that more than 150 times a day some scholar found something he or she wanted to charge on the shelf—not in storage. And of course, that scholar probably looked at three or four more books before choosing the one charged. So it may not look as if the library is heavily used, but in fact it is.
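To make the arithmetic behind that calculation explicit (a back-of-the-envelope check; the 450,000 presumably refers to the roughly half of circulating items that went out only once): 21 percent of 4.5 million is about 945,000 items that circulated at all, about half of those circulated only once, and spreading those single checkouts over the 2,920 days in eight years still gives more than 150 a day.

```python
# Back-of-the-envelope check of the circulation figures quoted above.
# The inputs come from the interview; the rounding is mine.
total_items = 4_500_000          # items in Regenstein
circulated = 0.21 * total_items  # 21 percent circulated, 1998-2006
once_only = circulated / 2       # half of those circulated only once
days = 8 * 365                   # eight years = 2,920 days

print(f"Circulated at least once: {circulated:,.0f}")                    # ~945,000
print(f"Circulated only once:     {once_only:,.0f}")                     # ~472,500
print(f"Single-checkout items charged per day: {once_only / days:.0f}")  # ~162
```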

Regenstein is the best working research library in the world because its entire collection is available for real work all the time. And Mansueto guarantees that that will be true until the early 2020s. If, in the future, everything truly does go digital and nobody reads physical artifacts (very unlikely, in my view, but possible), then we can use Mansueto for the 3 million items for which there is no financially feasible digitization plan (obscure pamphlets, periodicals in out-of-the-way languages, and the like).

That’s why we built Mansueto.

What do you imagine when you think of scholarly research a hundred years from now?

It’s completely impossible to guess what scholarship will be like in one hundred years. For one thing, it depends on international politics—the world could be under a single authoritarian government in a hundred years, with the power, like the Chinese emperors, to destroy all preexisting learning. And if by then everything has been digitized, it will all be destroyable at a single shot. Or perhaps the emperor will want to change it all so it says what is ideologically convenient. The digital knowledge paradise is vulnerable to a thousand kinds of subversion that printed books—in multiple unalterable physical copies—are not. I don’t believe for a second in the invulnerability claimed by the digital messiahs.

As for me, I simply hope that there are a few real libraries left and a few real scholars left to work in them, producing real thoughts, in an artisanal manner, for the purpose of truly understanding their world. But I’m not terribly hopeful. And posterity, of course, is always right, so they won’t care what we think. They will think they have the best knowledge system in history.
